Description of problem: This has been happening for several weeks now, since kernel 3.13.x I think. Different 3.13.x kernels show this problem. It happens within 20 minutes of a reboot, and seems to happen when MS Windows clients access the server via smb.

Additional info: reporter: libreport-2.1.12

BUG: soft lockup - CPU#5 stuck for 22s! [smbd:9182]
Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache fuse ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc snd_hda_codec_hdmi iTCO_wdt snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iTCO_vendor_support snd_page_alloc ftdi_sio gpio_ich snd_timer coretemp kvm_intel snd soundcore lpc_ich mfd_core kvm e1000e ptp ppdev parport_pc pps_core microcode i2c_i801 serio_raw i5000_edac edac_core i5k_amb ioatdma parport dca acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd sunrpc binfmt_misc uinput raid1 radeon i2c_algo_bit drm_kms_helper ttm ata_generic drm usb_storage pata_acpi i2c_core analog gameport joydev
CPU: 5 PID: 9182 Comm: smbd Not tainted 3.13.6-100.fc19.i686.PAE #1
Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
task: c8ca5a00 ti: d281c000 task.ti: d281c000
EIP: 0060:[<c09c9042>] EFLAGS: 00200202 CPU: 5
EIP is at _raw_spin_lock+0x12/0x30
EAX: e73dbdcc EBX: 00000000 ECX: 00000001 EDX: 00001c1c
ESI: 00000000 EDI: e0909070 EBP: d281de4c ESP: d281de4c
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 80050033 CR2: b76a4000 CR3: 1fcab000 CR4: 000007f0
Stack:
 d281de9c c05ac559 00000004 efee29a0 c8ca5a00 00000204 00000001 e73dbdcc
 00000000 e090909c 00000001 e73dbd78 e73dbd78 d281de8c c06a641b d7a78d00
 d281de9c d7a78d00 fffffff5 d281ded0 d281df90 c0574092 cb701900 d281ded0
Call Trace:
 [<c05ac559>] __break_lease+0x259/0x330
 [<c06a641b>] ? lockref_put_or_lock+0xb/0x30
 [<c0574092>] SYSC_renameat+0x292/0x370
 [<c05ac1f0>] ? do_lock_file_wait.part.19+0x30/0xb0
 [<c05aac9e>] ? locks_free_lock+0x4e/0x60
 [<c0576bf7>] SyS_rename+0x37/0x40
 [<c09d054d>] sysenter_do_call+0x12/0x28
Code: 66 90 89 d8 f0 81 28 00 00 10 00 74 05 e8 2b cb cd ff 89 c8 5b 5d c3 66 90 55 89 e5 66 66 66 66 90 ba 00 01 00 00 f0 66 0f c1 10 <0f> b6 ce 38 d1 75 04 5d c3 f3 90 0f b6 10 38 ca 75 f7 5d c3 8d
Created attachment 880725 [details] File: dmesg
Hi, this system has been having problems in areas other than smbd as well. It is an oldish dual-Xeon system with 8 CPUs reported (hyperthreading enabled). It is doing this a few times a day. It was completely fine for years before a relatively recent kernel update.

Apr 1 15:38:28 beam kernel: [ 7024.124002] BUG: soft lockup - CPU#3 stuck for 22s! [yum:20897]
Apr 1 15:38:28 beam kernel: [ 7024.124002] Modules linked in: vfat fat fuse ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc snd_hda_codec_hdmi coretemp kvm_intel kvm snd_hda_codec_realtek iTCO_wdt snd_hda_intel iTCO_vendor_support gpio_ich snd_hda_codec snd_hwdep ppdev snd_seq snd_seq_device cdc_acm snd_pcm snd_page_alloc snd_timer snd lpc_ich i5000_edac e1000e i2c_i801 edac_core ioatdma mfd_core nfsd auth_rpcgss microcode soundcore ptp nfs_acl lockd serio_raw pps_core sunrpc uinput i5k_amb acpi_cpufreq dca parport_pc parport binfmt_misc raid1 radeon i2c_algo_bit drm_kms_helper ttm ata_generic drm usb_storage pata_acpi i2c_core analog gameport joydev
Apr 1 15:38:28 beam kernel: [ 7024.124002] CPU: 3 PID: 20897 Comm: yum Not tainted 3.13.7-100.fc19.i686.PAE #1
Apr 1 15:38:28 beam kernel: [ 7024.124002] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
Apr 1 15:38:28 beam kernel: [ 7024.124002] task: d2d1ba80 ti: d3a80000 task.ti: d3a80000
Apr 1 15:38:28 beam kernel: [ 7024.124002] EIP: 0060:[<c05ac599>] EFLAGS: 00200246 CPU: 3
Apr 1 15:38:28 beam kernel: [ 7024.124002] EIP is at __break_lease+0x259/0x330
Apr 1 15:38:28 beam kernel: [ 7024.124002] EAX: c6b4767c EBX: 00000000 ECX: 00000001 EDX: 00000101
Apr 1 15:38:28 beam kernel: [ 7024.124002] ESI: 00000000 EDI: d2fc63f0 EBP: d3a81e9c ESP: d3a81e54
Apr 1 15:38:28 beam kernel: [ 7024.124002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 1 15:38:28 beam kernel: [ 7024.124002] CR0: 80050033 CR2: 45f43080 CR3: 0787e000 CR4: 000007f0
Apr 1 15:38:28 beam kernel: [ 7024.124002] Stack:
Apr 1 15:38:28 beam kernel: [ 7024.124002] 00000004 eaa6ccb0 d2d1ba80 00000204 00000001 c6b4767c 00000000 d2fc641c
Apr 1 15:38:28 beam kernel: [ 7024.124002] 00000001 c6b47628 c86a5198 d3a81e8c c06a646b c86dc080 d3a81e9c c86dc080
Apr 1 15:38:28 beam kernel: [ 7024.124002] fffffff5 d3a81ed0 d3a81f90 c05740a2 c6a69a00 d3a81ed0 ea853000 ea856000
Apr 1 15:38:28 beam kernel: [ 7024.124002] Call Trace:
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c06a646b>] ? lockref_put_or_lock+0xb/0x30
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c05740a2>] SYSC_renameat+0x292/0x370
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c059ffb7>] ? fsnotify+0x1f7/0x2d0
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c0753bc2>] ? tty_write_unlock+0x32/0x40
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c0576c47>] SyS_rename+0x37/0x40
Apr 1 15:38:28 beam kernel: [ 7024.124002] [<c09d088d>] sysenter_do_call+0x12/0x28
Apr 1 15:38:28 beam kernel: [ 7024.124002] Code: ff 8b 17 85 d2 0f 94 c2 85 db 0f 85 ac 00 00 00 84 d2 0f 85 a2 00 00 00 8b 45 d4 8d 55 e0 e8 7f 32 ee ff 8b 45 cc e8 d7 cd 41 00 <89> f8 e8 b0 eb ff ff 85 db 78 4e 0f 84 a5 00 00 00 8b 45 dc 8b
Thanks for the report. The most likely culprit is the new delegation code that went into 3.13. Turning off leases might help. I'm not sure how to do that with Samba, but "echo 0 >/proc/sys/fs/leases-enable" before starting Samba will probably do the job. Do you nfs-export this filesystem as well? Are you seeing other symptoms (e.g., clients hanging?), or are the log warnings the only issue?
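For reference, the knob suggested above can also be set via sysctl, and made persistent; this is just a sketch of the standard spellings (the sysctl.d file name below is an example, not an existing file):

```shell
# Disable new file leases at runtime (does not survive a reboot);
# run this before smbd starts so it never acquires leases:
echo 0 > /proc/sys/fs/leases-enable

# Equivalent sysctl spelling:
sysctl -w fs.leases-enable=0

# Persistent across reboots (example file name):
echo 'fs.leases-enable = 0' > /etc/sysctl.d/90-no-leases.conf
```

Note that this only prevents new leases from being taken; leases already held when the knob is flipped remain in effect.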
Thanks for the reply. OK, I will try turning off leases.

Yes, the system is a server/workstation and exports a number of file systems to a number of clients. NFS is used heavily and I haven't noticed any issues that appear to be related to NFS accesses. Other, non-Xeon, Fedora 19 servers that we have appear OK. All are running the same system, although samba is not used with these.

Log warning messages during samba accesses from the one MS Windows system we have seem to be the main cause. However, it happened during a createrepo on the local file systems today. The createrepo process locked up and the warning message appeared. The createrepo stayed locked up; I killed it with CTRL-C, but it did not fully die for some time.
Just did a new search: Bug 1038929 may be a duplicate.
(In reply to Terry Barnaby from comment #4) > Ok, will try turning of leases. > > Yes, the system is a server/workstation and exports a number of file systems > to a number of clients. NFS is used heavily and I haven't noticed any issues > that appear to be related to NFS accesses. And the problem occurred on a filesystem that is used by both Samba and NFS clients? > Other, non Xeon, Fedora 19 servers that we have appear ok. All are running > the same system. Although samba is not used with these. It's more likely that this is triggered by Samba and/or NFS than by different hardware, so I think that the fact that Samba was not running on those servers is probably the relevant fact. > Log warning messages during samba accesses from the one MSWindows system we > have seems to be the main cause. However it happened during a createrepo on > the local file systems today. The createrepo process locked up and the > warning message appeared. The createrepo stayed locked up. I killed it with > CTRL-C, but it did not fully die for some time. So createrepo was run directly on the server's filesystem, not over NFS or SMB? Could createrepo have been reading from or writing to files that had also been recently accessed by SMB or NFS clients?
(In reply to J. Bruce Fields from comment #6) > (In reply to Terry Barnaby from comment #4) > > Ok, will try turning of leases. > > > > Yes, the system is a server/workstation and exports a number of file systems > > to a number of clients. NFS is used heavily and I haven't noticed any issues > > that appear to be related to NFS accesses. > > And the problem occurred on a filesystem that is used by both Samba and NFS > clients? Yes, all these file systems are exported for NFS. Some portions of these file systems are exported by samba. > > > Other, non Xeon, Fedora 19 servers that we have appear ok. All are running > > the same system. Although samba is not used with these. > > It's more likely that this is triggered by Samba and/or NFS than by > different hardware, so I think that the fact that Samba was not running on > those servers is probably the relevant fact. > > > Log warning messages during samba accesses from the one MSWindows system we > > have seems to be the main cause. However it happened during a createrepo on > > the local file systems today. The createrepo process locked up and the > > warning message appeared. The createrepo stayed locked up. I killed it with > > CTRL-C, but it did not fully die for some time. > > So createrepo was run directly on the server's filesystem, not over NFS or > SMB? Yes. > > Could createrepo have been reading from or writing to files that had also > been recently accessed by SMB or NFS clients? The file system that createrepo was running on could be accessed by NFS clients but not samba clients. The yum directory that createrepo was updating is used for locally built RPM's that other systems will update via a yum update (and yumbackend).
Just looked through /var/log/messages on my home server (running same Fedora19 with same update level but a basic Intel Core2 CPU 6300) and this shows a couple of similar lock ups. This has smbd enabled but should not have been in use at that time. The system NFS exports a number of file systems to a number of home systems. In this case it looks like a MythTv mythpreviewgen had problems. mythpreviewgen would have been directly accessing the local file system on the server, but other clients would have been accessing the same file system via NFS and mythbackend. Certainly the file mythpreviewgen would have been creating could have been accessed by the clients over NFS ... I will try and setup a test to get this fault to occur on demand. Apr 1 17:10:11 king kernel: [373928.040000] BUG: soft lockup - CPU#0 stuck for 22s! [mythpreviewgen:2695] Apr 1 17:10:11 king kernel: [373928.040000] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle bridge stp llc tun rfcomm bnep saa7134_alsa isl6421 cx24116 cx88_dvb cx88_vp3054_i2c rc_hauppauge tda1004x saa7134_dvb videobuf_dvb coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support rc_dib0700_rc5 ppdev microcode i2c_i801 serio_raw ir_lirc_codec lirc_dev ir_mce_kbd_decoder ir_sanyo_decoder ir_sony_decoder ir_rc6_decoder ir_jvc_decoder ir_nec_decoder ir_rc5_decoder tda827x tda8290 tuner r820t cx88_alsa cx8800 cx8802 cx88xx rtl2832 snd_hda_codec_via saa7134 btcx_risc snd_hda_intel tveeprom dvb_usb_dib0700 videobuf_dma_sg btusb dib8000 videobuf_core snd_hda_codec dib7000m v4l2_common snd_hwdep bluetooth dib0090 dvb_usb_rtl28xxu dib0070 dib7000p dib3000mc dibx000_common dvb_usb rfkill videodev snd_seq snd_seq_device rtl2830 dvb_usb_v2 media snd_pcm dvb_core rc_core lpc_ich r8169 mii i2c_algo_bit parport_pc parport i2c_core snd_page_alloc mfd_core snd_timer snd soundcore acpi_cpufreq 
asus_atk0110 nfsd binfmt_misc auth_rpcgss nfs_acl lockd uinput sunrpc ata_generic pata_acpi raid1 pata_jmicron analog gameport joydev Apr 1 17:10:11 king kernel: [373928.040000] CPU: 0 PID: 2695 Comm: mythpreviewgen Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 1 17:10:11 king kernel: [373928.040000] Hardware name: System manufacturer System Product Name/P5QL/EPU, BIOS 0315 11/10/2008 Apr 1 17:10:11 king kernel: [373928.040000] task: e4849680 ti: e7e0c000 task.ti: e7e0c000 Apr 1 17:10:11 king kernel: [373928.040000] EIP: 0060:[<c05ac520>] EFLAGS: 00200286 CPU: 0 Apr 1 17:10:11 king kernel: [373928.040000] EIP is at __break_lease+0x1e0/0x330 Apr 1 17:10:11 king kernel: [373928.040000] EAX: cc4678ec EBX: ef32ee00 ECX: ef32ee0c EDX: ef32ee0c Apr 1 17:10:11 king kernel: [373928.040000] ESI: 00000000 EDI: ef32e9a0 EBP: e7e0df20 ESP: e7e0ded8 Apr 1 17:10:11 king kernel: [373928.040000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 1 17:10:11 king kernel: [373928.040000] CR0: 80050033 CR2: 4e394d90 CR3: 280fd000 CR4: 000007f0 Apr 1 17:10:11 king kernel: [373928.040000] Stack: Apr 1 17:10:11 king kernel: [373928.040000] 00000004 ef32ee00 e4849680 00000204 00000001 cc4678ec 00000000 ef32e9cc Apr 1 17:10:11 king kernel: [373928.040000] 00000001 cc467898 cc467898 cc4678ec e7e0df20 c057f1fe e7e0df20 e55c6080 Apr 1 17:10:11 king kernel: [373928.040000] fffffff5 cc467898 e7e0dfa4 c05737e1 00000000 0a0ed0c8 ffffff9c 00000000 Apr 1 17:10:11 king kernel: [373928.040000] Call Trace: Apr 1 17:10:11 king kernel: [373928.040000] [<c057f1fe>] ? iput+0x2e/0x130 Apr 1 17:10:11 king kernel: [373928.040000] [<c05737e1>] do_unlinkat+0x151/0x200 Apr 1 17:10:11 king kernel: [373928.040000] [<c045af05>] ? irq_exit+0x55/0xb0 Apr 1 17:10:11 king kernel: [373928.040000] [<c0411f65>] ? 
do_IRQ+0x45/0xb0 Apr 1 17:10:11 king kernel: [373928.040000] [<c0576645>] SyS_unlink+0x15/0x20 Apr 1 17:10:11 king kernel: [373928.040000] [<c09d088d>] sysenter_do_call+0x12/0x28 Apr 1 17:10:11 king kernel: [373928.040000] Code: 00 b9 01 00 00 00 31 f6 b8 00 ef e6 c0 89 4d d8 e8 66 ce 41 00 89 fa 89 d8 e8 8d f3 ff ff 80 05 00 ef e6 c0 01 8b 45 cc 80 00 01 <8b> 0f 85 c9 8b 4d d8 0f 94 c0 84 c9 0f 84 16 01 00 00 84 c0 0f Apr 1 17:10:13 king sh[656]: abrt-dump-oops: Found oopses: 1 Apr 1 17:10:13 king sh[656]: abrt-dump-oops: Creating problem directories Apr 1 17:10:14 king abrt-dump-oops: Reported 1 kernel oopses to Abrt Apr 1 17:10:15 king mythbackend[27408]: 2014-04-01 17:10:15.078274 I Child PID 2695 killed with Terminated Apr 1 17:10:17 king mythbackend[27408]: 2014-04-01 17:10:17.079700 I Child PID 2695 killed with Killed
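The __break_lease frames in all of these traces belong to the kernel's file-lease machinery: a process holding an F_SETLEASE lease is sent a signal and must release the lease before a conflicting operation can proceed, and the opener is blocked in the kernel in the meantime. A minimal sketch of that round trip from user space (a hypothetical standalone demo on Linux, not the reporter's actual workload; it assumes the filesystem supports leases):

```python
import fcntl
import os
import signal
import tempfile
import time

# When another process opens a leased file in a conflicting mode, the
# kernel's lease-break path sends the lease holder SIGIO and blocks the
# opener until the holder releases the lease (or lease-break-time expires).
got_break = []
signal.signal(signal.SIGIO, lambda signum, frame: got_break.append(signum))

path = os.path.join(tempfile.mkdtemp(), "leased-file")
with open(path, "w") as f:
    f.write("data\n")

fd = os.open(path, os.O_RDONLY)
fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_RDLCK)        # take a read lease
assert fcntl.fcntl(fd, fcntl.F_GETLEASE) == fcntl.F_RDLCK

pid = os.fork()
if pid == 0:
    # Child: opening for write conflicts with the read lease, so this
    # open() blocks inside the kernel until the lease is released.
    os.close(os.open(path, os.O_WRONLY))
    os._exit(0)

while not got_break:          # wait for the SIGIO lease-break notification
    time.sleep(0.05)
fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_UNLCK)        # release; child unblocks
_, status = os.waitpid(pid, 0)
os.close(fd)
```

The rename and unlink callers in the traces above appear consistent with 3.13's new delegation code, which enters the same break path from vfs_rename()/vfs_unlink() on behalf of NFSv4 delegations.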
I have managed to trigger some more lockups. On the server I was running:

while true
do
	createrepo /dist/fedora_19/beamExtra
done

On a client with NFS mounts I was running (with the /dist/fedora_19/beamExtra repo in its repo list):

while true
do
	yum check-update
done

and

while true
do
	for f in /dist/fedora_19/beamExtra/repodata/*
	do
		# echo "Do: $f"
		cat $f > /dev/null
	done
done

There was other work being done as well, other systems were connected and in use, and there was an MS Windows system connected via samba but not being used.

Apr 2 09:34:44 beam kernel: [ 5516.161002] BUG: soft lockup - CPU#6 stuck for 23s! [genpkgmetadata.:17353] Apr 2 09:34:44 beam kernel: [ 5516.161002] Modules linked in: vfat fat fuse ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc snd_hda_codec_hdmi snd_hda_codec_realtek coretemp kvm_intel kvm e1000e snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd cdc_acm ppdev soundcore iTCO_wdt iTCO_vendor_support parport_pc gpio_ich lpc_ich i5000_edac microcode mfd_core ptp serio_raw edac_core ioatdma pps_core parport i2c_i801 acpi_cpufreq i5k_amb dca nfsd auth_rpcgss nfs_acl lockd sunrpc uinput binfmt_misc raid1 radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic usb_storage pata_acpi i2c_core analog gameport joydev Apr 2 09:34:44 beam kernel: [ 5516.161002] CPU: 6 PID: 17353 Comm: genpkgmetadata.
Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:34:44 beam kernel: [ 5516.161002] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:34:44 beam kernel: [ 5516.161002] task: eaf21680 ti: ebc5c000 task.ti: ebc5c000 Apr 2 09:34:44 beam kernel: [ 5516.161002] EIP: 0060:[<c09c9382>] EFLAGS: 00200202 CPU: 6 Apr 2 09:34:44 beam kernel: [ 5516.161002] EIP is at _raw_spin_lock+0x12/0x30 Apr 2 09:34:44 beam kernel: [ 5516.161002] EAX: c0e6ef00 EBX: df3f9f50 ECX: 0000000f EDX: 00007575 Apr 2 09:34:44 beam kernel: [ 5516.161002] ESI: 00000000 EDI: df3f9f50 EBP: ebc5dec4 ESP: ebc5dec4 Apr 2 09:34:44 beam kernel: [ 5516.161002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 2 09:34:44 beam kernel: [ 5516.161002] CR0: 80050033 CR2: b76860b0 CR3: 1f409000 CR4: 000007f0 Apr 2 09:34:44 beam kernel: [ 5516.161002] Stack: Apr 2 09:34:44 beam kernel: [ 5516.161002] ebc5ded0 c05ab165 00000000 ebc5df20 c05ac5a0 00000004 df3f9a80 eaf21680 Apr 2 09:34:44 beam kernel: [ 5516.161002] 00000204 00000001 c85180dc 00000000 df3f9f7c 00000001 c8518088 c8518088 Apr 2 09:34:44 beam kernel: [ 5516.161002] c85180dc ebc5df20 c057f1fe ebc5df20 c856fd00 fffffff5 c8518088 ebc5dfa4 Apr 2 09:34:44 beam kernel: [ 5516.161002] Call Trace: Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c05ab165>] locks_delete_block+0x15/0x30 Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c05ac5a0>] __break_lease+0x260/0x330 Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c057f1fe>] ? 
iput+0x2e/0x130 Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c05737e1>] do_unlinkat+0x151/0x200 Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c0576645>] SyS_unlink+0x15/0x20 Apr 2 09:34:44 beam kernel: [ 5516.161002] [<c09d088d>] sysenter_do_call+0x12/0x28 Apr 2 09:34:44 beam kernel: [ 5516.161002] Code: 66 90 89 d8 f0 81 28 00 00 10 00 74 05 e8 3b c8 cd ff 89 c8 5b 5d c3 66 90 55 89 e5 66 66 66 66 90 ba 00 01 00 00 f0 66 0f c1 10 <0f> b6 ce 38 d1 75 04 5d c3 f3 90 0f b6 10 38 ca 75 f7 5d c3 8d Apr 2 09:34:45 beam sh[743]: abrt-dump-oops: Found oopses: 1 Apr 2 09:34:45 beam sh[743]: abrt-dump-oops: Creating problem directories Apr 2 09:34:46 beam abrt-dump-oops: Reported 1 kernel oopses to Abrt Apr 2 09:34:47 beam abrt-server[17395]: Duplicate: core backtrace Apr 2 09:34:47 beam abrt-server[17395]: DUP_OF_DIR: /var/tmp/abrt/oops-2014-03-05-16:02:14-24348-0 Apr 2 09:34:47 beam abrt-server[17395]: Deleting problem directory oops-2014-04-02-09:34:45-17394-0 (dup of oops-2014-03-05-16:02:14-24348-0) Apr 2 09:34:47 beam dbus-daemon[788]: dbus[788]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Apr 2 09:34:47 beam dbus[788]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Apr 2 09:34:47 beam ypserv[827]: refused connect from 192.168.1.1:41294 to procedure ypproc_match (beamnet,shadow.byname;-1) Apr 2 09:34:47 beam dbus-daemon[788]: dbus[788]: [system] Successfully activated service 'org.freedesktop.problems' Apr 2 09:34:47 beam dbus[788]: [system] Successfully activated service 'org.freedesktop.problems' Apr 2 09:34:47 beam ypserv[827]: refused connect from 192.168.1.1:60534 to procedure ypproc_match (beamnet,shadow.byname;-1) Apr 2 09:35:12 beam kernel: [ 5544.161003] BUG: soft lockup - CPU#6 stuck for 22s! 
[genpkgmetadata.:17353] Apr 2 09:35:12 beam kernel: [ 5544.161003] Modules linked in: vfat fat fuse ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc snd_hda_codec_hdmi snd_hda_codec_realtek coretemp kvm_intel kvm e1000e snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd cdc_acm ppdev soundcore iTCO_wdt iTCO_vendor_support parport_pc gpio_ich lpc_ich i5000_edac microcode mfd_core ptp serio_raw edac_core ioatdma pps_core parport i2c_i801 acpi_cpufreq i5k_amb dca nfsd auth_rpcgss nfs_acl lockd sunrpc uinput binfmt_misc raid1 radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic usb_storage pata_acpi i2c_core analog gameport joydev Apr 2 09:35:12 beam kernel: [ 5544.161003] CPU: 6 PID: 17353 Comm: genpkgmetadata. Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:35:12 beam kernel: [ 5544.161003] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:35:12 beam kernel: [ 5544.161003] task: eaf21680 ti: ebc5c000 task.ti: ebc5c000 Apr 2 09:35:12 beam kernel: [ 5544.161003] EIP: 0060:[<c05ac50a>] EFLAGS: 00200246 CPU: 6 Apr 2 09:35:12 beam kernel: [ 5544.161003] EIP is at __break_lease+0x1ca/0x330 Apr 2 09:35:12 beam kernel: [ 5544.161003] EAX: c0e6ef00 EBX: df3f9a80 ECX: 000000d6 EDX: 0000d6d6 Apr 2 09:35:12 beam kernel: [ 5544.161003] ESI: 00000000 EDI: df3f9f50 EBP: ebc5df20 ESP: ebc5ded8 Apr 2 09:35:12 beam kernel: [ 5544.161003] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 2 09:35:12 beam kernel: [ 5544.161003] CR0: 80050033 CR2: b76860b0 CR3: 1f409000 CR4: 000007f0 Apr 2 09:35:12 beam kernel: [ 5544.161003] Stack: Apr 2 09:35:12 beam kernel: [ 5544.161003] 00000004 df3f9a80 eaf21680 00000204 00000001 c85180dc 00000000 df3f9f7c Apr 2 09:35:12 beam kernel: [ 5544.161003] 00000001 c8518088 c8518088 c85180dc ebc5df20 c057f1fe ebc5df20 c856fd00 Apr 2 09:35:12 
beam kernel: [ 5544.161003] fffffff5 c8518088 ebc5dfa4 c05737e1 00000000 0a4ca200 ffffff9c 00000000 Apr 2 09:35:12 beam kernel: [ 5544.161003] Call Trace: Apr 2 09:35:12 beam kernel: [ 5544.161003] [<c057f1fe>] ? iput+0x2e/0x130 Apr 2 09:35:12 beam kernel: [ 5544.161003] [<c05737e1>] do_unlinkat+0x151/0x200 Apr 2 09:35:12 beam kernel: [ 5544.161003] [<c0576645>] SyS_unlink+0x15/0x20 Apr 2 09:35:12 beam kernel: [ 5544.161003] [<c09d088d>] sysenter_do_call+0x12/0x28 Apr 2 09:35:12 beam kernel: [ 5544.161003] Code: 0f 01 00 00 8d 47 2c 89 f3 89 45 d4 8b 43 50 85 c0 0f 85 14 01 00 00 b9 01 00 00 00 31 f6 b8 00 ef e6 c0 89 4d d8 e8 66 ce 41 00 <89> fa 89 d8 e8 8d f3 ff ff 80 05 00 ef e6 c0 01 8b 45 cc 80 00 Apr 2 09:35:13 beam sh[743]: abrt-dump-oops: Found oopses: 1 Apr 2 09:35:13 beam sh[743]: abrt-dump-oops: Creating problem directories Apr 2 09:35:13 beam abrt-server[17487]: Duplicate: core backtrace Apr 2 09:35:13 beam abrt-server[17487]: DUP_OF_DIR: /var/tmp/abrt/oops-2014-04-01-15:33:54-18737-0 Apr 2 09:35:13 beam abrt-server[17487]: Deleting problem directory oops-2014-04-02-09:35:13-17485-0 (dup of oops-2014-04-01-15:33:54-18737-0) Apr 2 09:35:14 beam abrt-dump-oops: Reported 1 kernel oopses to Abrt Apr 2 09:35:17 beam kernel: [ 5549.754003] INFO: rcu_sched self-detected stall on CPU { 6} (t=60000 jiffies g=994011 c=994010 q=106923) Apr 2 09:35:17 beam kernel: [ 5549.754003] sending NMI to all CPUs: Apr 2 09:35:17 beam kernel: [ 5549.754003] NMI backtrace for cpu 6 Apr 2 09:35:17 beam kernel: [ 5549.754003] CPU: 6 PID: 17353 Comm: genpkgmetadata. 
Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:35:17 beam kernel: [ 5549.754003] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:35:17 beam kernel: [ 5549.754003] task: eaf21680 ti: ebc5c000 task.ti: ebc5c000 Apr 2 09:35:17 beam kernel: [ 5549.754003] EIP: 0060:[<c06a4b21>] EFLAGS: 00200046 CPU: 6 Apr 2 09:35:17 beam kernel: [ 5549.754003] EIP is at __const_udelay+0x1/0x20 Apr 2 09:35:17 beam kernel: [ 5549.754003] EAX: 00418958 EBX: 00002710 ECX: 00000008 EDX: 000000ff Apr 2 09:35:17 beam kernel: [ 5549.754003] ESI: c0c57240 EDI: f75d1820 EBP: ebc5dd58 ESP: ebc5dd48 Apr 2 09:35:17 beam kernel: [ 5549.754003] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 2 09:35:17 beam kernel: [ 5549.754003] CR0: 80050033 CR2: b76860b0 CR3: 1f409000 CR4: 000007f0 Apr 2 09:35:17 beam kernel: [ 5549.754003] Stack: Apr 2 09:35:17 beam kernel: [ 5549.754003] ebc5dd58 c043e4e2 c0b5adc5 c0c57240 ebc5dda0 c04a9938 c0b68014 0000ea60 Apr 2 09:35:17 beam kernel: [ 5549.754003] 000f2adb 000f2ada 0001a1ab 00000000 00000000 00000000 c0c57240 c0cc034c Apr 2 09:35:17 beam kernel: [ 5549.754003] f75d1820 c0c57240 00000006 eaf21680 00000000 00000006 ebc5ddb4 c046245c Apr 2 09:35:17 beam kernel: [ 5549.754003] Call Trace: Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c043e4e2>] ? arch_trigger_all_cpu_backtrace+0x72/0x90 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c04a9938>] rcu_check_callbacks+0x388/0x5a0 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c046245c>] update_process_times+0x3c/0x60 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c04b4236>] tick_sched_handle.isra.12+0x26/0x60 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c04b42a7>] tick_sched_timer+0x37/0x70 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c0475c18>] ? __remove_hrtimer+0x38/0x90 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c0475e5d>] __run_hrtimer+0x6d/0x190 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c04b4270>] ? 
tick_sched_handle.isra.12+0x60/0x60 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c0476a58>] hrtimer_interrupt+0x1e8/0x2a0 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c043c4ff>] local_apic_timer_interrupt+0x2f/0x60 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c045ae7e>] ? irq_enter+0x3e/0x70 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c043c6d3>] smp_apic_timer_interrupt+0x33/0x50 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c09c9b6c>] apic_timer_interrupt+0x34/0x3c Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c05a007b>] ? fsnotify+0x2bb/0x2d0 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c09c938a>] ? _raw_spin_lock+0x1a/0x30 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c05ac50a>] ? __break_lease+0x1ca/0x330 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c057f1fe>] ? iput+0x2e/0x130 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c05737e1>] do_unlinkat+0x151/0x200 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c0576645>] SyS_unlink+0x15/0x20 Apr 2 09:35:17 beam kernel: [ 5549.754003] [<c09d088d>] sysenter_do_call+0x12/0x28 Apr 2 09:35:17 beam kernel: [ 5549.754003] Code: 00 00 00 8d bc 27 00 00 00 00 48 75 fd 48 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 66 66 66 66 90 ff 15 90 3c c8 c0 5d c3 55 <89> e5 64 8b 15 1c 2e d8 c0 c1 e0 02 69 d2 fa 00 00 00 f7 e2 8d Apr 2 09:35:17 beam kernel: [ 5549.754224] NMI backtrace for cpu 4 Apr 2 09:35:17 beam kernel: [ 5549.754229] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:35:17 beam kernel: [ 5549.754232] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:35:17 beam kernel: [ 5549.754235] task: f705ba80 ti: f70b8000 task.ti: f70b8000 Apr 2 09:35:17 beam kernel: [ 5549.754238] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 4 Apr 2 09:35:17 beam kernel: [ 5549.754242] EIP is at native_safe_halt+0x5/0x10 Apr 2 09:35:17 beam kernel: [ 5549.754245] EAX: ffffffed EBX: 53ea0dc0 ECX: 00000000 EDX: 00000000 Apr 2 09:35:17 beam kernel: [ 5549.754248] ESI: 00000000 EDI: 00000004 EBP: f70b9f50 
ESP: f70b9f50 Apr 2 09:35:17 beam kernel: [ 5549.754250] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 2 09:35:17 beam kernel: [ 5549.754253] CR0: 8005003b CR2: b776c000 CR3: 267ae000 CR4: 000007f0 Apr 2 09:35:17 beam kernel: [ 5549.754255] Stack: Apr 2 09:35:17 beam kernel: [ 5549.754257] f70b9f64 c0417cbc 53ea0dc0 00000000 f70b8000 f70b9f6c c04183e6 f70b9f90 Apr 2 09:35:17 beam kernel: [ 5549.754265] c04a0ee9 f70b9f88 c043bfdb af5e7923 53ea0dc0 1ae265da 00000000 00000000 Apr 2 09:35:17 beam kernel: [ 5549.754271] f70b9fb4 c043a6ca 943bfe68 cc7bbef3 aa9bd741 deba0abb 442bf33a 1ae265da Apr 2 09:35:17 beam kernel: [ 5549.754278] Call Trace: Apr 2 09:35:17 beam kernel: [ 5549.754284] [<c0417cbc>] default_idle+0x1c/0xa0 Apr 2 09:35:17 beam kernel: [ 5549.754288] [<c04183e6>] arch_cpu_idle+0x26/0x30 Apr 2 09:35:17 beam kernel: [ 5549.754294] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210 Apr 2 09:35:17 beam kernel: [ 5549.754298] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130 Apr 2 09:35:17 beam kernel: [ 5549.754303] [<c043a6ca>] start_secondary+0x20a/0x2d0 Apr 2 09:35:17 beam kernel: [ 5549.754305] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d Apr 2 09:35:17 beam kernel: [ 5549.754346] NMI backtrace for cpu 2 Apr 2 09:35:17 beam kernel: [ 5549.754357] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:35:17 beam kernel: [ 5549.754360] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:35:17 beam kernel: [ 5549.754364] task: f705b180 ti: f70b4000 task.ti: f70b4000 Apr 2 09:35:17 beam kernel: [ 5549.754368] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 2 Apr 2 09:35:17 beam kernel: [ 5549.754375] EIP is at native_safe_halt+0x5/0x10 Apr 2 09:35:17 beam kernel: [ 5549.754378] EAX: ffffffed EBX: 43923243 ECX: 00000000 EDX: 00000000 Apr 2 09:35:17 beam kernel: [ 5549.754381] ESI: 
00000000 EDI: 00000002 EBP: f70b5f50 ESP: f70b5f50 Apr 2 09:35:17 beam kernel: [ 5549.754385] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 2 09:35:17 beam kernel: [ 5549.754388] CR0: 8005003b CR2: b39a0000 CR3: 21df1000 CR4: 000007f0 Apr 2 09:35:17 beam kernel: [ 5549.754391] Stack: Apr 2 09:35:17 beam kernel: [ 5549.754393] f70b5f64 c0417cbc 43923243 00000000 f70b4000 f70b5f6c c04183e6 f70b5f90 Apr 2 09:35:17 beam kernel: [ 5549.754403] c04a0ee9 f70b5f88 c043bfdb 6f61d708 43923243 5f072e3f 00000000 00000000 Apr 2 09:35:17 beam kernel: [ 5549.754410] f70b5fb4 c043a6ca a5320739 a18da4af 7e2ef0de 249aa942 a8c6b58e 5f072e3f Apr 2 09:35:17 beam kernel: [ 5549.754418] Call Trace: Apr 2 09:35:17 beam kernel: [ 5549.754427] [<c0417cbc>] default_idle+0x1c/0xa0 Apr 2 09:35:17 beam kernel: [ 5549.754432] [<c04183e6>] arch_cpu_idle+0x26/0x30 Apr 2 09:35:17 beam kernel: [ 5549.754438] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210 Apr 2 09:35:17 beam kernel: [ 5549.754445] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130 Apr 2 09:35:17 beam kernel: [ 5549.754450] [<c043a6ca>] start_secondary+0x20a/0x2d0 Apr 2 09:35:17 beam kernel: [ 5549.754452] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d Apr 2 09:35:17 beam kernel: [ 5549.754497] NMI backtrace for cpu 0 Apr 2 09:35:17 beam kernel: [ 5549.754503] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.7-100.fc19.i686.PAE #1 Apr 2 09:35:17 beam kernel: [ 5549.754506] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009 Apr 2 09:35:17 beam kernel: [ 5549.754509] task: c0c3ea00 ti: c0c32000 task.ti: c0c32000 Apr 2 09:35:17 beam kernel: [ 5549.754513] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 0 Apr 2 09:35:17 beam kernel: [ 5549.754517] EIP is at native_safe_halt+0x5/0x10 Apr 2 09:35:17 beam kernel: [ 5549.754520] EAX: ffffffed EBX: 8513c1aa ECX: 00000000 EDX: 00000000 Apr 2 
09:35:17 beam kernel: [ 5549.754524] ESI: 0008f800 EDI: 00000000 EBP: c0c33f80 ESP: c0c33f80
Apr 2 09:35:17 beam kernel: [ 5549.754527] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 2 09:35:17 beam kernel: [ 5549.754530] CR0: 8005003b CR2: b47ff208 CR3: 21df1000 CR4: 000007f0
Apr 2 09:35:17 beam kernel: [ 5549.754532] Stack:
Apr 2 09:35:17 beam kernel: [ 5549.754534] c0c33f94 c0417cbc 8513c1aa 0008f800 c0c32000 c0c33f9c c04183e6 c0c33fc0
Apr 2 09:35:17 beam kernel: [ 5549.754543] c04a0ee9 00000000 00200282 aef34454 8513c1aa c0d223a0 0008f800 c0d8f800
Apr 2 09:35:17 beam kernel: [ 5549.754551] c0c33fc8 c09bbae2 c0c33fec c0cc5af2 000000fb ffffffff ffffffff c0cc556c
Apr 2 09:35:17 beam kernel: [ 5549.754559] Call Trace:
Apr 2 09:35:17 beam kernel: [ 5549.754564] [<c0417cbc>] default_idle+0x1c/0xa0
Apr 2 09:35:17 beam kernel: [ 5549.754569] [<c04183e6>] arch_cpu_idle+0x26/0x30
Apr 2 09:35:17 beam kernel: [ 5549.754574] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210
Apr 2 09:35:17 beam kernel: [ 5549.754579] [<c09bbae2>] rest_init+0x62/0x70
Apr 2 09:35:17 beam kernel: [ 5549.754586] [<c0cc5af2>] start_kernel+0x3bd/0x3c3
Apr 2 09:35:17 beam kernel: [ 5549.754590] [<c0cc556c>] ? repair_env_string+0x51/0x51
Apr 2 09:35:17 beam kernel: [ 5549.754594] [<c0cc5393>] i386_start_kernel+0x12e/0x131
Apr 2 09:35:17 beam kernel: [ 5549.754596] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d
Apr 2 09:35:17 beam kernel: [ 5549.754644] NMI backtrace for cpu 3
Apr 2 09:35:17 beam kernel: [ 5549.754655] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.13.7-100.fc19.i686.PAE #1
Apr 2 09:35:17 beam kernel: [ 5549.754658] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
Apr 2 09:35:17 beam kernel: [ 5549.754662] task: f705b600 ti: f70b6000 task.ti: f70b6000
Apr 2 09:35:17 beam kernel: [ 5549.754666] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 3
Apr 2 09:35:17 beam kernel: [ 5549.754673] EIP is at native_safe_halt+0x5/0x10
Apr 2 09:35:17 beam kernel: [ 5549.754677] EAX: ffffffed EBX: 27ad2caf ECX: 00000000 EDX: 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754680] ESI: 00000000 EDI: 00000003 EBP: f70b7f50 ESP: f70b7f50
Apr 2 09:35:17 beam kernel: [ 5549.754683] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 2 09:35:17 beam kernel: [ 5549.754686] CR0: 8005003b CR2: b496a000 CR3: 26583000 CR4: 000007f0
Apr 2 09:35:17 beam kernel: [ 5549.754689] Stack:
Apr 2 09:35:17 beam kernel: [ 5549.754691] f70b7f64 c0417cbc 27ad2caf 00000000 f70b6000 f70b7f6c c04183e6 f70b7f90
Apr 2 09:35:17 beam kernel: [ 5549.754701] c04a0ee9 f70b7f88 c043bfdb 2874cfbe 27ad2caf e21fcd79 00000000 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754709] f70b7fb4 c043a6ca 4fece81f b07d9ec7 5f9feb03 284e9064 3ec34a08 e21fcd79
Apr 2 09:35:17 beam kernel: [ 5549.754717] Call Trace:
Apr 2 09:35:17 beam kernel: [ 5549.754725] [<c0417cbc>] default_idle+0x1c/0xa0
Apr 2 09:35:17 beam kernel: [ 5549.754731] [<c04183e6>] arch_cpu_idle+0x26/0x30
Apr 2 09:35:17 beam kernel: [ 5549.754737] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210
Apr 2 09:35:17 beam kernel: [ 5549.754743] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130
Apr 2 09:35:17 beam kernel: [ 5549.754748] [<c043a6ca>] start_secondary+0x20a/0x2d0
Apr 2 09:35:17 beam kernel: [ 5549.754751] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d
Apr 2 09:35:17 beam kernel: [ 5549.754795] NMI backtrace for cpu 1
Apr 2 09:35:17 beam kernel: [ 5549.754801] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.7-100.fc19.i686.PAE #1
Apr 2 09:35:17 beam kernel: [ 5549.754804] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
Apr 2 09:35:17 beam kernel: [ 5549.754807] task: f705ad00 ti: f70b2000 task.ti: f70b2000
Apr 2 09:35:17 beam kernel: [ 5549.754811] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 1
Apr 2 09:35:17 beam kernel: [ 5549.754815] EIP is at native_safe_halt+0x5/0x10
Apr 2 09:35:17 beam kernel: [ 5549.754818] EAX: ffffffed EBX: fe901306 ECX: 00000000 EDX: 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754821] ESI: 00000000 EDI: 00000001 EBP: f70b3f50 ESP: f70b3f50
Apr 2 09:35:17 beam kernel: [ 5549.754824] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 2 09:35:17 beam kernel: [ 5549.754828] CR0: 8005003b CR2: b7312000 CR3: 267ae000 CR4: 000007f0
Apr 2 09:35:17 beam kernel: [ 5549.754830] Stack:
Apr 2 09:35:17 beam kernel: [ 5549.754832] f70b3f64 c0417cbc fe901306 00000000 f70b2000 f70b3f6c c04183e6 f70b3f90
Apr 2 09:35:17 beam kernel: [ 5549.754840] c04a0ee9 f70b3f88 c043bfdb e5cfb025 fe901306 a38e2d61 00000000 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754848] f70b3fb4 c043a6ca c145ad8a bd81511b 6d9d931e 1bd278dc 97e53bc6 a38e2d61
Apr 2 09:35:17 beam kernel: [ 5549.754856] Call Trace:
Apr 2 09:35:17 beam kernel: [ 5549.754862] [<c0417cbc>] default_idle+0x1c/0xa0
Apr 2 09:35:17 beam kernel: [ 5549.754867] [<c04183e6>] arch_cpu_idle+0x26/0x30
Apr 2 09:35:17 beam kernel: [ 5549.754871] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210
Apr 2 09:35:17 beam kernel: [ 5549.754877] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130
Apr 2 09:35:17 beam kernel: [ 5549.754881] [<c043a6ca>] start_secondary+0x20a/0x2d0
Apr 2 09:35:17 beam kernel: [ 5549.754884] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d
Apr 2 09:35:17 beam kernel: [ 5549.754931] NMI backtrace for cpu 7
Apr 2 09:35:17 beam kernel: [ 5549.754941] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.13.7-100.fc19.i686.PAE #1
Apr 2 09:35:17 beam kernel: [ 5549.754944] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
Apr 2 09:35:17 beam kernel: [ 5549.754948] task: f705c800 ti: f70be000 task.ti: f70be000
Apr 2 09:35:17 beam kernel: [ 5549.754952] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 7
Apr 2 09:35:17 beam kernel: [ 5549.754959] EIP is at native_safe_halt+0x5/0x10
Apr 2 09:35:17 beam kernel: [ 5549.754963] EAX: ffffffed EBX: 0f13f92d ECX: 00000000 EDX: 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754966] ESI: 00000000 EDI: 00000007 EBP: f70bff50 ESP: f70bff50
Apr 2 09:35:17 beam kernel: [ 5549.754969] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 2 09:35:17 beam kernel: [ 5549.754972] CR0: 8005003b CR2: b50a7000 CR3: 2cc3a000 CR4: 000007f0
Apr 2 09:35:17 beam kernel: [ 5549.754975] Stack:
Apr 2 09:35:17 beam kernel: [ 5549.754977] f70bff64 c0417cbc 0f13f92d 00000000 f70be000 f70bff6c c04183e6 f70bff90
Apr 2 09:35:17 beam kernel: [ 5549.754987] c04a0ee9 f70bff88 c043bfdb 49aa38fe 0f13f92d a3b7e692 00000000 00000000
Apr 2 09:35:17 beam kernel: [ 5549.754995] f70bffb4 c043a6ca 65d35ad1 f89db0ba 6f0e2a73 c78cfe1f 342ab19c a3b7e692
Apr 2 09:35:17 beam kernel: [ 5549.755003] Call Trace:
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c0417cbc>] default_idle+0x1c/0xa0
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c04183e6>] arch_cpu_idle+0x26/0x30
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c043a6ca>] start_secondary+0x20a/0x2d0
Apr 2 09:35:17 beam kernel: [ 5549.755003] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d
Apr 2 09:35:17 beam kernel: [ 5549.755003] NMI backtrace for cpu 5
Apr 2 09:35:17 beam kernel: [ 5549.755003] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 1.211 msecs
Apr 2 09:35:17 beam kernel: [ 5549.755003] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 3.13.7-100.fc19.i686.PAE #1
Apr 2 09:35:17 beam kernel: [ 5549.755003] Hardware name: Supermicro X7DA8/X7DA8, BIOS 2.1a 01/06/2009
Apr 2 09:35:17 beam kernel: [ 5549.755003] task: f705bf00 ti: f70ba000 task.ti: f70ba000
Apr 2 09:35:17 beam kernel: [ 5549.755003] EIP: 0060:[<c0447865>] EFLAGS: 00200286 CPU: 5
Apr 2 09:35:17 beam kernel: [ 5549.755003] EIP is at native_safe_halt+0x5/0x10
Apr 2 09:35:17 beam kernel: [ 5549.755003] EAX: ffffffed EBX: 5ec57104 ECX: 00000000 EDX: 00000000
Apr 2 09:35:17 beam kernel: [ 5549.755003] ESI: 00000000 EDI: 00000005 EBP: f70bbf50 ESP: f70bbf50
Apr 2 09:35:17 beam kernel: [ 5549.755003] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 2 09:35:17 beam kernel: [ 5549.755003] CR0: 8005003b CR2: b50a7000 CR3: 2ad09000 CR4: 000007f0
Apr 2 09:35:17 beam kernel: [ 5549.755003] Stack:
Apr 2 09:35:17 beam kernel: [ 5549.755003] f70bbf64 c0417cbc 5ec57104 00000000 f70ba000 f70bbf6c c04183e6 f70bbf90
Apr 2 09:35:17 beam kernel: [ 5549.755003] c04a0ee9 f70bbf88 c043bfdb 41fb9fe2 5ec57104 f3634b4d 00000000 00000000
Apr 2 09:35:17 beam kernel: [ 5549.755003] f70bbfb4 c043a6ca 6fa0f530 7f7988b6 a1948175 7787977b f48909c3 f3634b4d
Apr 2 09:35:17 beam kernel: [ 5549.755003] Call Trace:
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c0417cbc>] default_idle+0x1c/0xa0
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c04183e6>] arch_cpu_idle+0x26/0x30
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c04a0ee9>] cpu_startup_entry+0x1c9/0x210
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c043bfdb>] ? setup_APIC_timer+0xab/0x130
Apr 2 09:35:17 beam kernel: [ 5549.755003] [<c043a6ca>] start_secondary+0x20a/0x2d0
Apr 2 09:35:17 beam kernel: [ 5549.755003] Code: bc 27 00 00 00 00 55 89 e5 fa 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb 5d c3 8d 76 00 8d bc 27 00 00 00 00 55 89 e5 fb f4 <5d> c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 f4 5d c3 8d 76 00 8d
Apr 2 09:35:17 beam kernel: [ 5549.755003] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 1.352 msecs
My suspicion is that __break_lease is getting stuck continually doing a "goto retry;". Looking at this code in time_out_leases():

	if (past_time(fl->fl_break_time))
		lease_modify(before, F_UNLCK);

...and past_time() uses the time_after() macro. Should that be using time_after_eq()? It seems like if fl_break_time == jiffies then we should go ahead and break the lease, right? That doesn't explain the softlockups, but it doesn't quite look right to me...
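For anyone following along, here is a quick user-space sketch of that distinction. The macro bodies mirror the well-known definitions in include/linux/jiffies.h; past_time_model() is a hypothetical stand-in for the past_time() helper discussed above, not the kernel's actual code:

```c
/* time_after(a, b): true only if a is strictly after b (wrap-safe).
 * time_after_eq(a, b): also true when a == b. */
#include <stdbool.h>

#define time_after(a, b)     ((long)((b) - (a)) < 0)
#define time_after_eq(a, b)  ((long)((a) - (b)) >= 0)

/* Has the lease's break deadline passed?  With time_after(), the answer
 * is "not yet" at the exact jiffy of expiry; time_after_eq() would say
 * "yes" and break the lease one tick earlier. */
static bool past_time_model(unsigned long jiffies_now, unsigned long then)
{
    if (then == 0)
        return false;   /* 0 means "no deadline set" */
    return time_after(jiffies_now, then);
}
```

So at fl_break_time == jiffies the current check leaves the lease in place for one more tick, which is the off-by-one being questioned above.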
Pulled down the kernel that spewed the backtraces in comment #9:

0xc059cc80 is in __break_lease (fs/locks.c:1390).
1385			spin_unlock(&inode->i_lock);
1386			error = wait_event_interruptible_timeout(new_fl->fl_wait,
1387							!new_fl->fl_next, break_time);
1388			spin_lock(&inode->i_lock);
1389			locks_delete_block(new_fl);
1390			if (error >= 0) {
1391				if (error == 0)
1392					time_out_leases(inode);
1393				/*
1394				 * Wait for the next conflicting lease that has not been

(gdb) list *(__break_lease+0x1ca)
0xc059cbea is in __break_lease (fs/locks.c:603).
598	/* Must be called with i_lock held. */
599	static void locks_insert_block(struct file_lock *blocker,
600					struct file_lock *waiter)
601	{
602		spin_lock(&blocked_lock_lock);
603		__locks_insert_block(blocker, waiter);
604		spin_unlock(&blocked_lock_lock);
605	}
606	
607	/*

(gdb) list *(__break_lease+0x1e0)
0xc059cc00 is in __break_lease (fs/locks.c:1386).
1381			if (break_time == 0)
1382				break_time++;
1383		}
1384		locks_insert_block(flock, new_fl);
1385		spin_unlock(&inode->i_lock);
1386		error = wait_event_interruptible_timeout(new_fl->fl_wait,
1387						!new_fl->fl_next, break_time);
1388		spin_lock(&inode->i_lock);
1389		locks_delete_block(new_fl);
1390		if (error >= 0) {

...so it does seem somewhat likely that we're continually cycling in __break_lease via the "goto retry;" loop. How that occurs, I'm not sure. One hypothesis: perhaps this is a livelock where some caller is continually getting a delegation and we try to break it, but by the time we get around to rescanning the i_lock list again, another lease has appeared?
Thanks for the reproducer. I was able to trigger the problem with that. FWIW, the yum command on the client is unnecessary, just a job that reads all of the files is sufficient. After discussing this with Bruce, I have a possible approach to fix it, but it means adding a field to the generic inode structure. We'll have to see if that's acceptable. If not, then maybe we can stick it in the union with i_pipe field or something.
Here's the stacktrace from the one I reproduced. I'm fairly sure the problem is as we suspect: the server is just too aggressive about giving out delegations. If we've had one broken recently, we should back off on giving out others on that inode for a bit.

[76822.564007] INFO: rcu_sched self-detected stall on CPU { 1} (t=65000 jiffies g=67021 c=67020 q=15460)
[76822.564007] sending NMI to all CPUs:
[76822.564007] NMI backtrace for cpu 1
[76822.564007] CPU: 1 PID: 6301 Comm: genpkgmetadata. Not tainted 3.15.0-0.rc0.git2.1.fc21.x86_64 #1
[76822.564007] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[76822.564007] task: ffff8801181f1ae0 ti: ffff880098698000 task.ti: ffff880098698000
[76822.564007] RIP: 0010:[<ffffffff81057724>] [<ffffffff81057724>] flat_send_IPI_all+0x94/0xc0
[76822.564007] RSP: 0018:ffff88011b003dd0 EFLAGS: 00010006
[76822.564007] RAX: 0000000000000000 RBX: 0000000000000046 RCX: 000000000f000000
[76822.564007] RDX: 0000000000000c00 RSI: ffffffff81c28b80 RDI: 0000000000000300
[76822.564007] RBP: ffff88011b003de8 R08: 0000000000000001 R09: 0000000000000001
[76822.564007] R10: 0000000000000000 R11: ffff88011b003b0e R12: 0000000000000800
[76822.564007] R13: 000000000000000f R14: ffffffff81c7c980 R15: 0000000000000001
[76822.564007] FS: 00007f3d5b7f1740(0000) GS:ffff88011b000000(0000) knlGS:0000000000000000
[76822.564007] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[76822.564007] CR2: 00007f3d5b792210 CR3: 00000000a3d5d000 CR4: 00000000000006e0
[76822.564007] Stack:
[76822.564007] 0000000000002710 ffffffff81c7c980 ffffffff81d864a0 ffff88011b003e00
[76822.564007] ffffffff81052424 ffff88011b1cede0 ffff88011b003e58 ffffffff8111a4d4
[76822.564007] ffff88011b003e58 ffffffff810de7bc 0000000000000046 0000000000003c64
[76822.564007] Call Trace:
[76822.564007] <IRQ>
[76822.564007] [<ffffffff81052424>] arch_trigger_all_cpu_backtrace+0x64/0xa0
[76822.564007] [<ffffffff8111a4d4>] rcu_check_callbacks+0x584/0x850
[76822.564007] [<ffffffff810de7bc>] ? account_process_tick+0x11c/0x1b0
[76822.564007] [<ffffffff810a89a7>] update_process_times+0x47/0x70
[76822.564007] [<ffffffff81126855>] tick_sched_handle.isra.19+0x25/0x60
[76822.564007] [<ffffffff81127061>] tick_sched_timer+0x41/0x60
[76822.564007] [<ffffffff810c85b6>] __run_hrtimer+0x86/0x440
[76822.564007] [<ffffffff81127020>] ? tick_sched_do_timer+0x40/0x40
[76822.564007] [<ffffffff810c9147>] hrtimer_interrupt+0x107/0x260
[76822.564007] [<ffffffff81050337>] local_apic_timer_interrupt+0x37/0x60
[76822.564007] [<ffffffff817e9d6f>] smp_apic_timer_interrupt+0x3f/0x60
[76822.564007] [<ffffffff817e86b2>] apic_timer_interrupt+0x72/0x80
[76822.564007] <EOI>
[76822.564007] [<ffffffff810fca1a>] ? lock_acquire+0xba/0x1d0
[76822.564007] [<ffffffff81291249>] ? locks_delete_block+0x19/0x30
[76822.564007] [<ffffffff817dcace>] _raw_spin_lock+0x3e/0x80
[76822.564007] [<ffffffff81291249>] ? locks_delete_block+0x19/0x30
[76822.564007] [<ffffffff81291249>] locks_delete_block+0x19/0x30
[76822.564007] [<ffffffff81292978>] __break_lease+0x378/0x400
[76822.564007] [<ffffffff810fa18d>] ? trace_hardirqs_on+0xd/0x10
[76822.564007] [<ffffffff812448fa>] do_unlinkat+0x1da/0x2c0
[76822.564007] [<ffffffff8111cfc9>] ? current_kernel_time+0x69/0xd0
[76822.564007] [<ffffffff810fa0b5>] ? trace_hardirqs_on_caller+0x105/0x1d0
[76822.564007] [<ffffffff810fa18d>] ? trace_hardirqs_on+0xd/0x10
[76822.564007] [<ffffffff81248196>] SyS_unlink+0x16/0x20
[76822.564007] [<ffffffff817e79a9>] system_call_fastpath+0x16/0x1b
[76822.564007] Code: 48 8b 35 f0 c0 d2 00 44 8b 66 3c ff 96 50 01 00 00 44 89 e9 c1 e1 18 89 0c 25 10 a3 57 ff 44 89 e2 80 ce 04 89 14 25 00 a3 57 ff <f6> c7 02 74 16 e8 52 2a 0a 00 48 89 df 57 9d 0f 1f 44 00 00 5b
Created attachment 882803 [details] patch -- locks: add new i_last_lease_break field to struct inode I've been experimenting with this patch today. It does seem to help some -- once the initial round of leases is cleared you stop seeing the softlockups. It doesn't seem to be sufficient though -- that initial round of lease breaks seems to take a very long time, and you still see: [ 452.044004] BUG: soft lockup - CPU#0 stuck for 22s! [genpkgmetadata.:1618] ...so, I need to get to the bottom of why that is. The delegation recalls shouldn't take so long. I generally only see 4-5 in /proc/locks, but it can take many seconds for them to be cleared. Also, I think looking at the last_lease_break time isn't sufficient. If there hasn't been a delegation set on it, but there were recently active writers then you probably still don't want to give out a delegation either. We should probably be setting that value whenever we would have broken the lease, even if there isn't one presently.
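To make the backoff idea above concrete, here is a toy user-space model. The i_last_lease_break field name matches the attached patch, but the lease_break_time window, the HZ value, and the delegation_allowed() helper are hypothetical names for this sketch, not the actual patch:

```c
/* Toy model: remember (in jiffies) when a lease on this inode was last
 * broken, and refuse to hand out new delegations for a hold-off window
 * afterwards. */
#include <stdbool.h>

#define time_before(a, b)  ((long)((a) - (b)) < 0)

#define HZ 1000                                   /* ticks per second (assumed) */
static const unsigned long lease_break_time = 45; /* hold-off window, seconds */

struct toy_inode {
    unsigned long i_last_lease_break;  /* jiffies of the last lease break */
};

/* Deny a delegation while we are still inside the hold-off window. */
static bool delegation_allowed(const struct toy_inode *inode,
                               unsigned long jiffies_now)
{
    unsigned long holdoff_end =
        inode->i_last_lease_break + lease_break_time * HZ;
    return !time_before(jiffies_now, holdoff_end);
}
```

As noted above, the same timestamp would probably also need to be bumped whenever a lease *would* have been broken, so that recently written files don't immediately get a new delegation either.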
Looking at network traces, the callbacks get replies very quickly, so it's not a case of the client taking too long to respond. One possibility is that the server is taking too long to process the CB_RECALL reply?
No, I had it wrong -- the CB_RECALL replies come in quickly, but the client is taking a long time to issue a DELEGRETURN. Very odd...
For some strange reason I'm unable to reproduce the softlockup today. I'm not sure what changed. Still, I can see that the client is, in some cases, taking excessively long to return the delegation. That's at least part of the problem, I suspect. There are some tracepoints in the delegreturn code that might be helpful. For my own notes:

trace_nfs4_recall_delegation
trace_nfs4_delegreturn
trace_nfs4_delegreturn_exit
Glad to see you have recreated the fault and are working on it, thanks. Haven't seen the issue since running:

echo 0 > /proc/sys/fs/leases-enable

If I can test anything here let me know.
Ok, I've figured out part of the problem on the client. The 4.0 backchannel code seems to be broken, in particular if:

- you have the client and server set up for krb5
- you mount a non-canonical name for the server

The problem seems to be in check_gss_callback_principal, which does this:

	/* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
	if (memcmp(p, "nfs@", 4) != 0)
		return 0;
	p += 4;
	if (strcmp(p, clp->cl_hostname) != 0)
		return 0;

...but the principal and the clp->cl_hostname are not guaranteed to be the same if you don't mount the exact hostname that's in the SPN. I'm not certain what the fix is for this. Terry, do you have secure NFS set up at all? It's probably still possible to hit the softlockups even with AUTH_SYS on the backchannel, but this does seem to help exacerbate the problem.
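To illustrate why a non-canonical mount trips that check, here is a user-space restatement of the quoted logic. The hostnames are made-up examples and principal_matches() is a hypothetical wrapper, not the kernel function:

```c
#include <string.h>

/* Returns 1 if the GSS callback principal "nfs@<host>" matches the
 * hostname the client used at mount time, 0 otherwise.  Mirrors the
 * comparison quoted above. */
static int principal_matches(const char *p, const char *cl_hostname)
{
    /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
    if (memcmp(p, "nfs@", 4) != 0)
        return 0;
    p += 4;
    /* A short name or alias fails this exact-match comparison, so the
     * backchannel is refused and delegations can't be recalled promptly. */
    if (strcmp(p, cl_hostname) != 0)
        return 0;
    return 1;
}
```

So mounting "server" while the SPN holds "nfs@server.example.com" leaves the callback principal unmatched even though both names reach the same machine.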
Not using any form of secure NFS. We use a basic NFS setup, although we do use "bind" mounts on the server/workstation. On the one site we have one main server and about 6 clients, all but one running Fedora 19 (one MSWindows system), all generally lightly loaded on a lightly loaded Gigabit local network.

Server fstab portion:

UUID=848bc851-ea5a-4f30-b3c5-0e6fc8a8e6cd /data ext4 defaults 1 2
UUID=34d22e5f-402c-429d-b132-2a4ac6e22079 /data1 ext4 defaults 1 2
/data/home /home auto bind 0 0
/data/beam /usr/beam auto bind 0 0
/data/src /src auto bind 0 0
/data1/opt /opt auto bind 0 0
/data1/dist /dist auto bind 0 0
/data1/scratch /scratch auto bind 0 0

Client fstab portions:

beam.beamnet:/home /home nfs exec,suid,async,rw 0 0
beam.beamnet:/usr/beam /usr/beam nfs exec,suid,async,rw 0 0
beam.beamnet:/src /src nfs exec,suid,async,rw 0 0
beam.beamnet:/scratch /scratch nfs exec,suid,async,rw 0 0
beam.beamnet:/dist /dist nfs exec,suid,async,rw 0 0
beam.beamnet:/var/cache/yum /var/cache/yum nfs exec,suid,async,rw 0 0
beam.beamnet:/opt /opt nfs exec,suid,async,rw 0 0
Ok, I've made some progress. The issue seems to be that we're making this call:

	error = wait_event_interruptible_timeout(new_fl->fl_wait,
					!new_fl->fl_next, break_time);

...with a break_time of 0. That prevents the task from ever going to sleep, and that causes it to spin in a tight loop. The logic that sets the break_time is pretty convoluted, so I'm still working out the best way to fix it.
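A toy user-space model of that spin, assuming only that a zero timeout means "return immediately without sleeping" (which is how wait_event_interruptible_timeout behaves); toy_wait() and retry_loop() are illustrative stand-ins, not kernel code:

```c
/* Model of the wait: a zero timeout "times out" at once, so the condition
 * never gets a chance to become true; any positive timeout "sleeps", during
 * which the lease holder gets to release. */
static int toy_wait(long timeout, int *lease_released)
{
    if (timeout == 0)
        return 0;           /* returned immediately: never slept */
    *lease_released = 1;    /* slept long enough for the holder to release */
    return 1;
}

/* Model of the "goto retry;" cycle in __break_lease(), capped so the demo
 * terminates; the real loop spins until the soft-lockup watchdog fires. */
static int retry_loop(long break_time, int max_iters)
{
    int released = 0, iters = 0;

    while (!released && iters < max_iters) {
        toy_wait(break_time, &released);
        iters++;
    }
    return iters;   /* number of trips around the loop */
}
```

With break_time == 0 the loop burns through every iteration without ever blocking; with any nonzero timeout it completes on the first pass.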
Ah-ha! In nfsd_break_deleg_cb:

	/*
	 * We don't want the locks code to timeout the lease for us;
	 * we'll remove it ourself if a delegation isn't returned
	 * in time:
	 */
	fl->fl_break_time = 0;

...so that line is the culprit. Unfortunately, it was added there to plug another race, so before we can remove it we'll have to see if we need to take steps to mitigate this in other ways. The commit that added it was this one:

commit 0272e1fd9f4fa8a43357c168e081744f99e67195
Author: J. Bruce Fields <bfields.edu>
Date:   Wed Sep 12 18:56:12 2007 -0400

    knfsd: let nfsd manage timing out its own leases

...it's probably obsolete now, but we'll need to look carefully.
Ok, I think the right thing to do for this is to just ensure that we don't pass a timeout of 0 to wait_event_interruptible. Patch sent for that upstream. I'll merge it into my linux-next branch today and send it on to Linus next week unless someone objects.
Terry, I have a test kernel building with this patch now: http://koji.fedoraproject.org/koji/taskinfo?taskID=6746470 ...would you be able to test it out and see if it prevents the softlockups? Please don't forget to reenable leases after you install it. Disabling them shouldn't be required with this patched kernel.
I am on holiday so will try this next week. Thanks.
Have been testing this kernel for 4 days now with both the test examples and the original MSWindows smb shares. There have been no problems, so I consider the fix you have applied works. Many thanks. Any idea when this may make it into the Fedora release kernels?
Not sure yet. Linus pulled this in over the weekend so it's in mainline now. It's marked for stable kernels, so F19 should get it within the next few weeks?
Many thanks, and thanks for your work and excellent communications on what has been happening. I wish all of the bugs I have reported to various bug lists had such an excellent response. Thanks again.
Just piling on here... I started seeing this issue within a few minutes of a reboot after doing a yum update on a F20 system yesterday. "BUG: soft lockup - CPU#5 stuck for 23s! [10.0.0.254-mana:3582]" where 10.0.0.254 is the address of a NFSv4 server. Funny thing is I was already at 3.13.10-200.fc20.x86_64 and did not see the problem. After yum installed 3.14.2-200.fc20.x86_64 (and a bunch of other things) I got the issue but rebooting back to 3.13.10-200.fc20.x86_64 didn't fix it. Something other than the kernel update must have triggered the issue but I don't see anything in the update package list that would have obviously triggered it. Anyway, turning off "leases-enable" does fix the soft lockup so I'll keep an eye out for the patch. Any info about where to find it? linux.git commit id perhaps? Thanks.
The commit id is f1c6bb2cb8b81013e8979806f8e15e3d53efb96d. This is a long-standing bug, so the patch is applicable to a wide range of kernels, but there were some changes merged into 3.13 that might make this occur more frequently so it's not surprising that we have more reports of this.
Thanks Jeff! Applied the patch to 3.14.2 and rebooted with fs.leases-enable=1 and not seeing any issues.
I'm a bit of a newbie; how do I apply this patch? I get this error on my Fedora 19 box when copying files from a Windows client and was hoping this fixes it. Thanks.
I'll backport the patch today. The upstream stable maintainer is behind and it will be a release or two before it hits 3.14.y stable (3.13.y is EOL already).
Patch applied in Fedora git for F20. F19 should pick it up when it rebases to 3.14.y later this week.
Thanks Josh. FWIW, this patch should apply just fine to even very old kernels if you want to pull it in for v3.13 as well. This bug was there for a long time...
I would, but we aren't doing another build of 3.13.y in F19. It's being rebased today I think, so applying it before the rebase wouldn't result in anything productive. Thanks much for working through the issue.
kernel-3.14.3-200.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.14.3-200.fc20
Package kernel-3.14.3-200.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.14.3-200.fc20'
as soon as you are able to, then reboot. Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-6122/kernel-3.14.3-200.fc20
then log in and leave karma (feedback).
kernel-3.14.3-200.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.14.3-200.fc20 does not fix the issue for me, I'm having issues booting Fedora Xen DomUs with that kernel version: BUG: soft lockup - CPU#1 stuck for 22s! [systemd-udevd:157] INFO: rcu_sched self-detected stall on CPU { 1} This happens right after grub attempts to load the kernel. However, the Dom0 itself boots fine with it. kernel-3.13.10-200.fc20.x86_64 is the last version that boots up in the DomUs.
Looks like this patch did not make it into F19's next kernel kernel-3.14.3-100.fc19. Any idea when it will be available for F19 ?
Yeah, reopening bug. I'm guessing that Josh didn't notice that this was an F19 bug. This is fixed in F20, but I don't think F19 has the patch yet.
(In reply to Robby from comment #40)
> kernel-3.14.3-200.fc20 does not fix the issue for me, I'm having issues
> booting Fedora Xen DomUs with that kernel version:
> BUG: soft lockup - CPU#1 stuck for 22s! [systemd-udevd:157]
> INFO: rcu_sched self-detected stall on CPU { 1}
>
> This happens right after grub attempts to load the kernel. However, the Dom0
> itself boots fine with it.
>
> kernel-3.13.10-200.fc20.x86_64 is the last version that boots up in the
> DomUs.

Do you have the stack trace? I suspect that this is unrelated to this bug (which is rather specific to the lease handling code).
(In reply to Jeff Layton from comment #42) > Yeah, reopening bug. I'm guessing that Josh didn't notice that this was an > F19 bug. This is fixed in F20, but I don't think F19 has the patch yet. It's there in git and in the 3.14.3 rebase for F19 in updates-testing.
(In reply to Terry Barnaby from comment #41)
> Looks like this patch did not make it into F19's next kernel
> kernel-3.14.3-100.fc19.
> Any idea when it will be available for F19 ?

Do you say that because you tested the f19 3.14.3 rebase and you still hit the issue? The patch is included:

#rhbz 1082586
Patch25075: locks-allow-__break_lease-to-sleep-even-when-break_t.patch

#rhbz 1082586
ApplyPatch locks-allow-__break_lease-to-sleep-even-when-break_t.patch

and went in with Fedora commit:

commit 511f256d9fb8703f883adfe92f4a0d46289a14b8
Author: Justin M. Forbes <jforbes>
Date:   Thu May 8 07:24:27 2014 -0500

    Linux v3.14.3

as I said it would in comment #34.
No I haven't tested it. I looked at the info for kernel-3.14.3-100.fc19 at: https://admin.fedoraproject.org/updates/FEDORA-2014-6167/kernel-3.14.3-100.fc19?_csrf_token=4cd1047c552a95b179371477782e0c98f5f2225a But there was no mention in the Bugs Fixed section or the change log on this bug. The same info for kernel-3.14.3-200.fc20 (mentioned above) does have info on the bug in the Bugs Fixed section and the change log. So I assumed that the fix has not been included in kernel-3.14.3-100.fc19 which is updates-testing as yet.
OK, well the patch is included. Please test the f19 3.14.3 to make sure it fixes your issue.
My mistake then. Let's reclose this bug since the patch is apparently in, and just reopen it if it turns out not to fix Terry's issue for some reason.
kernel-3.14.3-100.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/FEDORA-2014-6167/kernel-3.14.3-100.fc19
kernel-3.14.4-100.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/kernel-3.14.4-100.fc19
This does fix the "stuck for 22s!" problem. However I have found a problem with this kernel that "may" be related to the bug fix.

Scenario: Two Fedora 19 systems, both running kernel-PAE-3.14.3-100.fc19.i686. The client has mounted directories using NFSv4 from the server. There is a directory "dir1" with a few files in it (5 say).

On the client run: cp -a dir1 dir2
On the client run: cd dir2
On the client run: cp filea filea1

The copy locks up. ps -alx gives:

0 1002 2978 1633 20 0 5124 576 nfs4_h D+ pts/4 0:00 cp kingnet.conf kingnet1.conf

While the cp is going on, on the client run: ls

The ls locks up. ps -alx gives:

0 1002 3054 2981 20 0 5136 588 iterat D+ pts/11 0:00 ls

While the cp is going on, on the server run: cp fileb fileb1

The copy returns the error:

cp: cannot create regular file ‘beamnet1.conf’: Cannot allocate memory

There are no messages in /var/log/messages on either system. There is plenty of memory available on both systems (> 3G free). There is plenty of disk space available. Rebooting the client has no effect (cp still locks up). Restarting nfs-server on the server has no effect. A reboot of the server clears the problem. Instantly repeatable.

A "cp -r" does not have this problem, so it seems to be something to do with file metadata.
kernel-3.14.4-100.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
The problem in comment 51 is still in kernel-3.14.4-100.fc19.
Seems to still be present with /proc/sys/fs/leases-enable = 0, so I assume it is not related. Opened bug 1100941 for this one: https://bugzilla.redhat.com/show_bug.cgi?id=1100941