Can you describe the hardware on which this occurs, specifically the number and type of processors/cores? We've been tracking an intermittent occurrence of what may be the same problem, and would appreciate as much information as possible to allow us to reproduce the problem more reliably.
I have so far been unable to reproduce a crash with the same version of systemtap & kernel. I'll continue to investigate.
Created attachment 463163 [details]
potential patch

Please apply this trial patch to your copy of the systemtap runtime:

# cd /usr/share/systemtap
# patch -p1 < foo.patch

Then rerun your test case (with --poison-cache one time, in order to invalidate prior cached versions of the .ko's built with the old runtime):

# stap --poison-cache test.stp -c ./test

Add -DDEBUG_TRANS and/or -DSTAP_DEBUG_PRINTK if the problem recurs, to assist the next round of debugging.
Another item to test is whether the crash occurs near startup vs. shutdown. Please try:

stap test.stp -c 'sleep 100'
These tests should be run on whatever machine displays the original problem. Another few variants to test, with the patched runtime:

stap test.stp -c 'sleep 2'
stap test.stp -c '/bin/true'    # to represent 0 time
Can you describe the NFS client/server setup on the machine where this problem occurs, such as which NFS daemons are running, which exports/mounts are configured, and which of them are connected?
It would be helpful to compare the probes generated by the previous (working?) systemtap-1.1 version and the new 1.3 version. Could you run stap -k -p4 .... on the same machine with both the previous systemtap and the current one, then pack up the two /tmp/stap* directories into bugzilla attachments?
At this point any fix will likely not make the RHEL 5.6 release timeframe, but the issue will continue to be worked. This issue will move to z-stream so that the fix can be made available for RHEL 5.6 when completed.
It appears that the residual crash may be related to the delayed_work instance used by runtime/transport/transport.c, associated with the procfs .ctl file.
Created attachment 468977 [details] potential patch #2 Elements of this patch have been committed to upstream systemtap, and found to help the problem. Please test it and let us know.
OK, thanks for testing. Can you attach a new console panic log to this bug? Can you describe the workload and NFS configuration of the machine, at the time the test script is run? Do you run the script in isolation (with no other scripts running near the same time)?
We have never been able to reproduce this problem without a highly concurrent execution of systemtap, so if the newest patch indeed still shows the problem, we don't have a fix yet. (I believe QA would have a similar difficulty in verifying a fix.) I'm trying to work out some other monitoring script/tool to run on the affected machine at the same time as this problem triggers, to gather more information. (It would be ideal to get remote logon access to it.)
Does kdump work on the affected machine(s), so that we could collect a crash image?
Since local reproduction of this exact problem has still not been successful, I'm hoping to supply you with another timer-event-monitoring script to run in the background while the problematic script run is attempted. The hope is that this secondary script would save some output in at least the kernel crash buffer, which we could later try to extract. One possible timer-monitoring script could be something like:

stap -t -v -e ' probe kernel.trace("timer*") { log($$name . " " . $$parms$$ . " " . symdata($timer->function)) }' -d kernel -o LOGFILE -c 'SHELL SCRIPT RUNNING OTHER STAP SCRIPT'
The log file would be interesting for a non-crashing case, but mainly for comparing with a crashing case. Is the kernel panic not reproducible with the timer-monitoring script also running?
I am continuing to analyze this problem on another RH internal rhts machine (sun-x4200-01.rhts). I found an interesting combination of load that triggers what looks like the same problem.

In this scenario, the testpro/test program is run in a tight shell loop for a couple of minutes. No apparent problem. Then, in a separate window, I started a pretty vanilla systemtap script:

probe timer.s(1) { println (%{ jiffies %}) }

A minute or two later, the same timer-list corruption BUG_ON showed up. It appears that there is a problem with nfslock's and systemtap's frequent kernel timer operations, but only in unison (!).

Then, a further test, with the test binary being run in an even tighter loop:

# while true; do date; ./testpro/test; done

showed a kernel crash, with no systemtap activity whatsoever. This is finally evidence that, while systemtap's presence seems to amplify the problem, it is not actually responsible.

See /var/crash/2011-02-19-12:12 on that machine, whose RHTS reservation is now extended till about Feb. 22. I'll archive the vmcore somewhere else too. I'm reassigning this bug to the kernel.
Changing the summary line, to remove the possibly-distracting mention of systemtap.
I've been able to reproduce this too by running the reproducer in a tight loop:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at lib/list_debug.c:31
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 1
Modules linked in: nfsd exportfs nfs_acl autofs4 hidp rfcomm l2cap bluetooth rpcsec_gss_krb5 auth_rpcgss testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi des lockd sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy virtio_net ata_piix libata i2c_piix4 scsi_mod i2c_core pcspkr virtio_balloon tpm_tis tpm serio_raw tpm_bios dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 25099, comm: lockd Tainted: GF 2.6.18-247.el5.jtltest.136debug #1
RIP: 0010:[<ffffffff80165c28>]  [<ffffffff80165c28>] __list_add+0x49/0x65
RSP: 0000:ffff8100081a5ee0  EFLAGS: 00010086
RAX: 0000000000000058 RBX: ffff81003ffb8000 RCX: 0000000000000058
RDX: ffff810007a8cec0 RSI: 0000000000000000 RDI: ffffffff803327e0
RBP: ffffffff885b50a0 R08: 0000000000000002 R09: ffffffff80017f85
R10: ffffffff80017f85 R11: 0000000000000000 R12: ffff81003ffb8000
R13: 000000010595e254 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8100028cf4c8(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000036df6d4670 CR3: 0000000012166000 CR4: 00000000000006e0
Process lockd (pid: 25099, threadinfo ffff8100081a4000, task ffff810007a8cec0)
Stack: ffff81003106fa48 ffffffff8001df6d ffff81003106fa48 0000000000000286
 000000010595e254 ffff81003df2d6d0 ffff81003106fa48 0000000000000003
 ffffffff8859e37e ffffffff8859e493 ffff81003df2d6d0 0000000000000000
Call Trace:
 [<ffffffff8001df6d>] __mod_timer+0xf2/0x10e
 [<ffffffff8859e37e>] :lockd:lockd+0x0/0x2c6
 [<ffffffff8859e493>] :lockd:lockd+0x115/0x2c6
 [<ffffffff80061079>] child_rip+0xa/0x11
 [<ffffffff800606a8>] restore_args+0x0/0x30
 [<ffffffff800d9697>] zone_statistics+0x3e/0x6d
 [<ffffffff8859e37e>] :lockd:lockd+0x0/0x2c6
 [<ffffffff8006106f>] child_rip+0x0/0x11
-------------------------------[snip]--------------------------------

Turning up nfsd_debug and nlm_debug and running the reproducer shows this:

nfsd: creating service: port 2049 tcp 0 udp 0
nfsd: allocating 32 readahead buffers.
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
NFS locking service started (ver 0.5).
lockd: shutting down host module
lockd: nuking all hosts...
lockd: host garbage collection
lockd: nlmsvc_mark_resources
nfsd: freeing readahead buffers.
nfsd: last server has exited
nfsd: unexporting all filesystems

...so it looks like even though we're not asking to start any threads, we still end up starting and then shutting down lockd anyway. That should probably be fixed, but it doesn't seem immediately harmful. What's less clear is how we're corrupting the timer list. lockd does set a timer to handle the grace period, so perhaps the problem is in how that is being handled.

That said, this is a legacy interface and it isn't generally used under normal circumstances. The exception is when the machine is first booted: if nfsd.ko is not yet plugged in, then we'll use nfsctl() from nfsd. All subsequent accesses are generally done via /proc/fs/nfsd, however. nfsctl() is restricted to root as well, so an unprivileged user shouldn't be able to exploit this. I simply don't see why this bug is marked urgent.
Can you explain the possible attack vector or normal usecase that would cause someone to trip over this?
Ok, I went and commented out the timer ops that touch nlm_grace_period_timer, rebuilt lockd.ko, and reran the test in a tight loop. It survived for about an hour and a half, which means that the race is probably somewhere in that code. I wonder if we're getting into a state where a new lockd is being started before the old lockd has finished going down. lockd_down() does this:

	/*
	 * Wait for the lockd process to exit, but since we're holding
	 * the lockd semaphore, we can't wait around forever ...
	 */
	clear_thread_flag(TIF_SIGPENDING);
	wait_event_timeout(lockd_exit, nlmsvc_pid == 0, HZ);

...so I think it's possible, but it seems like we'd probably get a printk if that happens.
Yes, when I change that wait_event_timeout() to a wait_event(), it seems to survive. Also, when I looked back through the log before the panic, I see this:

	lockd_down: lockd failed to exit, clearing pid

...so yeah, it looks like the old lockd wasn't all the way down before the new one came up. I think the right thing to do is just not to time out while waiting for lockd to come down. I think it's better to risk having lockd_down() hang than to have two lockd's going at the same time.

An alternate idea would be to move nlm_grace_period_timer to the stack. If each lockd had its own private copy, then this would be unlikely to happen. But we could still end up with races: the old timer could pop just after the new one is added, and you'd end up with the grace period being far too short.
Created attachment 483556 [details] patch -- make lockd_down wait indefinitely for lockd to come down This patch seems to fix the problem for me. The idea is to get rid of the timeout when lockd_down is waiting for lockd to come down. Would you be able to test this against your reproducer and see if it fixes the problem for you too?
Patch(es) available in kernel-2.6.18-250.el5 Detailed testing feedback is always welcomed.
Hi,

Does anyone know a stable reproduction method? I tried to reproduce on RHEL5.6GA using the steps below:

# gcc -g -o test test.c    (test.c is from the attachment)
# cat test.stp
probe syscall.nfsservctl {
    log(sprint(cmd))
    log(sprint(argp_uaddr))
    log(sprint(resp_uaddr))
}
# cat test.sh
#!/bin/bash
stap -v test.stp -c ./test
# while true; do date; stap -t -v -e ' probe kernel.trace("timer*") { log($$name . " " . $$parms$$ . " " . symdata($timer->function))} ' -d kernel -o LOGFILE -c ./test.sh; done

It's still running and hasn't crashed the kernel yet, so is this the correct reproducer?

# uname -a
Linux athlon4.rhts.eng.bos.redhat.com 2.6.18-238.el5 #1 SMP Sun Dec 19 14:24:47 EST 2010 i686 athlon i386 GNU/Linux
# rpm -qa | grep systemtap
systemtap-runtime-1.3-4.el5
systemtap-1.3-4.el5
(In reply to comment #77)
> It's still running and not crash kernel yet, so is it correct reproducer?

I can answer my own question: I got a kernel crash just now after running it for a period of time (less than one hour):

list_add corruption. prev->next should be c081fa54, but was 00000000
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
invalid opcode: 0000 [#1] SMP
last sysfs file: /module/xfrm_nalgo/sections/__versions
Modules linked in: stap_6f0ab4882faa5855489d842eabf8c101_14436(U) stap_79061a3e611ba6498b7c01ffbe675c4d_6937(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh scsi_mod video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi ac lp snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_device floppy snd_timer snd_page_alloc snd_util_mem snd_hwdep snd pcspkr emu10k1_gp soundcore gameport 3c59x i2c_amd756 mii i2c_core tpm_tis tpm parport_pc parport tpm_bios dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04f3e19>]    Tainted: G S   VLI
EFLAGS: 00010046   (2.6.18-238.el5 #1)
EIP is at __list_add+0x39/0x52
eax: 00000048   ebx: c081fa54   ecx: 00000096   edx: 00000000
esi: f0d55550   edi: f0d55550   ebp: 00d66178   esp: dc2bdfa0
ds: 007b   es: 007b   ss: 0068
Process lockd (pid: 13363, ti=dc2bd000 task=dbe61550 task.ti=dc2bd000)
Stack: c0650890 c081fa54 00000000 c081ef80 f0d55550 c081ef80 c042e064 00000000
       00000282 00d66178 d7fe1200 dc3808e0 00000000 f0d47f56 f0d47e69 00000000
       00000000 c0405c87 d7fe1200 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<c042e064>] __mod_timer+0xcf/0xe4
 [<f0d47f56>] lockd+0xed/0x26f [lockd]
 [<f0d47e69>] lockd+0x0/0x26f [lockd]
 [<c0405c87>] kernel_thread_helper+0x7/0x10
=======================
Code: 74 17 50 52 68 53 08 65 c0 e8 bf 20 f3 ff 0f 0b 1a 00 05 08 65 c0 83 c4 0c 8b 06 39 d8 74 17 50 53 68 90 08 65 c0 e8 a2 20 f3 ff <0f> 0b 1f 00 05 08 65 c0 83 c4 0c 89 7b 04 89 1f 89 77 04 89 3e
EIP: [<c04f3e19>] __list_add+0x39/0x52 SS:ESP 0068:dc2bdfa0
<0>Kernel panic - not syncing: Fatal exception

So I will test on the new kernel and paste results later.
Verified result: I ran the reproducer from comment #77 for almost two hours on the i386 and x86_64 arches; the servers work fine and no other issues have been found so far. If any extra testing is needed, please let me know.

[root@athlon4 ~]# uname -a
Linux athlon4.rhts.eng.bos.redhat.com 2.6.18-256.el5 #1 SMP Thu Apr 7 19:59:40 EDT 2011 i686 athlon i386 GNU/Linux
Hi,

I retested the newest kernel, 2.6.18-264.el5, on RHEL5.7-Server-20110525.0 x86_64, and got a kernel crash again. The console output is shown below; please help check it:

BUG: soft lockup - CPU#2 stuck for 60s! [init:1]
CPU 2:
Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 1, comm: init Tainted: G ---- 2.6.18-264.el5 #1
RIP: 0010:[<ffffffff80064bca>]  [<ffffffff80064bca>] .text.lock.spinlock+0x0/0x30
RSP: 0018:ffff810107b73f18  EFLAGS: 00000286
RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: ffff810107b3a000
RDX: 0000000000000000 RSI: ffff81000100ea80 RDI: ffffffff80321a80
RBP: ffff810107b73e90 R08: ffff810219fe3cf8 R09: ffff8103cdb1e000
R10: 0000000000000001 R11: ffff810107b3bb20 R12: ffffffff8005dc8e
R13: ffff81000100ea80 R14: ffffffff80079390 R15: ffff810107b73e90
FS:  00002b16274b25b0(0000) GS:ffff810107b193c0(0000) knlGS:00000000f7f166c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000042ae4000 CR3: 0000000217805000 CR4: 00000000000006e0
Call Trace:
 <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1
 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b
 [<ffffffff80012562>] __do_softirq+0x89/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006d636>] do_softirq+0x2c/0x7d
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI> [<ffffffff887d432b>] :stap_b1a252a1d53a35fa01915881e7637775_6997:lookup_bad_addr+0x35/0xbf
 [<ffffffff887d432b>] :stap_b1a252a1d53a35fa01915881e7637775_6997:lookup_bad_addr+0x35/0xbf
 [<ffffffff887d68f4>] :stap_b1a252a1d53a35fa01915881e7637775_6997:function__dwarf_pretty_print_0+0x167/0x5f6
 [<ffffffff887d8e56>] :stap_b1a252a1d53a35fa01915881e7637775_6997:probe_1996+0xb0/0x2e8
 [<ffffffff887d7809>] :stap_b1a252a1d53a35fa01915881e7637775_6997:enter_tracepoint_probe_2+0x17c/0x307
 [<ffffffff80048482>] init_timer+0x23/0x51
 [<ffffffff8006387c>] schedule_timeout+0x77/0xad
 [<ffffffff80049603>] add_wait_queue+0x24/0x34
 [<ffffffff80099dc9>] process_timeout+0x0/0x5
 [<ffffffff8002dbe4>] pipe_poll+0x2d/0x90
 [<ffffffff80011ba9>] do_select+0x3dc/0x43c
 [<ffffffff8001ed44>] __pollwait+0x0/0xe2
 [<ffffffff8008e85a>] default_wake_function+0x0/0xe
 [<ffffffff8012d57a>] avc_has_perm_noaudit+0x35f/0x38a
 [<ffffffff8012e039>] avc_has_perm+0x46/0x58
 [<ffffffff8012e039>] avc_has_perm+0x46/0x58
 [<ffffffff8012e90d>] inode_has_perm+0x56/0x63
 [<ffffffff8000a851>] __link_path_walk+0xfa8/0xfd1
 [<ffffffff800ec0e9>] core_sys_select+0x1bc/0x265
 [<ffffffff8002cc54>] mntput_no_expire+0x19/0x89
 [<ffffffff8001b035>] cp_new_stat+0xe5/0xfd
 [<ffffffff80016a61>] sys_select+0x153/0x17c
 [<ffffffff8005d116>] system_call+0x7e/0x83
BUG: soft lockup - CPU#3 stuck for 60s!
[swapper:0] CPU 3: Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Tainted: G ---- 2.6.18-264.el5 #1 RIP: 0010:[<ffffffff80064bcc>] [<ffffffff80064bcc>] .text.lock.spinlock+0x2/0x30 RSP: 0000:ffff810107b9ff18 EFLAGS: 00000286 RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: 00000000000000ff RDX: 0000000000000000 RSI: ffff81023706ba80 RDI: ffffffff80321a80 RBP: ffff810107b9fe90 R08: 000000000301fa9c R09: ffff810107b97e48 R10: 0000000000000001 R11: 0000000000000010 R12: ffffffff8005dc8e R13: ffff81023706ba80 R14: ffffffff80079390 R15: ffff810107b9fe90 FS: 00002b7694a366e0(0000) GS:ffff8102371cc6c0(0000) knlGS:00000000f7f698d0 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 000000399027b110 CR3: 0000000000201000 CR4: 00000000000006e0 Call Trace: <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b [<ffffffff80012562>] __do_softirq+0x89/0x133 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006d636>] do_softirq+0x2c/0x7d [<ffffffff8006bdf3>] default_idle+0x0/0x50 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> 
[<ffffffff8006be1c>] default_idle+0x29/0x50 [<ffffffff80048fc8>] cpu_idle+0x95/0xb8 [<ffffffff80078a9a>] start_secondary+0x479/0x488 BUG: soft lockup - CPU#6 stuck for 60s! [swapper:0] CPU 6: Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Tainted: G ---- 2.6.18-264.el5 #1 RIP: 0010:[<ffffffff80064bcf>] [<ffffffff80064bcf>] .text.lock.spinlock+0x5/0x30 RSP: 0018:ffff81022fc43f18 EFLAGS: 00000286 RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: ffff81022fc38fd8 RDX: 0000000000000000 RSI: ffff81000101f780 RDI: ffffffff80321a80 RBP: ffff81022fc43e90 R08: 000000000301fa9e R09: ffff81022fc3fe48 R10: 0000000000000001 R11: 0000000000000010 R12: ffffffff8005dc8e R13: ffff81000101f780 R14: ffffffff80079390 R15: ffff81022fc43e90 FS: 00002abf63f4e4a0(0000) GS:ffff810107bf5dc0(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002abf63249980 CR3: 0000000000201000 CR4: 00000000000006e0 Call Trace: <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b [<ffffffff80012562>] __do_softirq+0x89/0x133 
[<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006d636>] do_softirq+0x2c/0x7d [<ffffffff8006bdf3>] default_idle+0x0/0x50 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8006be1c>] default_idle+0x29/0x50 [<ffffffff80048fc8>] cpu_idle+0x95/0xb8 [<ffffffff80078a9a>] start_secondary+0x479/0x488 BUG: soft lockup - CPU#0 stuck for 60s! [swapper:0] CPU 0: Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Tainted: G ---- 2.6.18-264.el5 #1 RIP: 0010:[<ffffffff80064bcf>] [<ffffffff80064bcf>] .text.lock.spinlock+0x5/0x30 RSP: 0000:ffffffff804a9f18 EFLAGS: 00000286 RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: 00000000000000ff RDX: 0000000000000000 RSI: ffff810001006400 RDI: ffffffff80321a80 RBP: ffffffff804a9e90 R08: 000000000301fa9c R09: ffffffff8045bee8 R10: 0000000000000001 R11: 0000000000000010 R12: ffffffff8005dc8e R13: ffff810001006400 R14: ffffffff80079390 R15: ffffffff804a9e90 FS: 000000004171e940(0000) GS:ffffffff8042a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000042ae2000 CR3: 000000042addb000 CR4: 
00000000000006e0 Call Trace: <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b [<ffffffff80012562>] __do_softirq+0x89/0x133 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006d636>] do_softirq+0x2c/0x7d [<ffffffff8006bdf3>] default_idle+0x0/0x50 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8006be1c>] default_idle+0x29/0x50 [<ffffffff80048fc8>] cpu_idle+0x95/0xb8 [<ffffffff80465809>] start_kernel+0x220/0x225 [<ffffffff8046522f>] _sinittext+0x22f/0x236 BUG: soft lockup - CPU#5 stuck for 60s! [swapper:0] CPU 5: Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Tainted: G ---- 2.6.18-264.el5 #1 RIP: 0010:[<ffffffff80064bcc>] [<ffffffff80064bcc>] .text.lock.spinlock+0x2/0x30 RSP: 0018:ffff81022fc17f18 EFLAGS: 00000286 RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: ffff810237234fd8 RDX: 0000000000000000 RSI: ffff810237074100 RDI: ffffffff80321a80 RBP: ffff81022fc17e90 R08: 000000000301fa9e R09: ffff81022fc13e48 R10: 0000000000000001 R11: 0000000000000246 R12: ffffffff8005dc8e R13: ffff810237074100 R14: 
ffffffff80079390 R15: ffff81022fc17e90 FS: 00002aab92c1caf0(0000) GS:ffff810237255e40(0000) knlGS:00000000f7fbe6c0 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002b013a08da08 CR3: 000000022e4ac000 CR4: 00000000000006e0 Call Trace: <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b [<ffffffff80012562>] __do_softirq+0x89/0x133 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006d636>] do_softirq+0x2c/0x7d [<ffffffff8006bdf3>] default_idle+0x0/0x50 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8006be1c>] default_idle+0x29/0x50 [<ffffffff80048fc8>] cpu_idle+0x95/0xb8 [<ffffffff80078a9a>] start_secondary+0x479/0x488 BUG: soft lockup - CPU#4 stuck for 60s! [kstopmachine:7930] CPU 4: Modules linked in: stap_ce81c73dec031bff99dc861d2e5c0fd8_14132(U) stap_b1a252a1d53a35fa01915881e7637775_6997(U) nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand powernow_k8 freq_table mperf be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_piix4 i2c_core tpm_tis tpm amd64_edac_mod bnx2 k10temp tpm_bios serio_raw edac_mc sg hwmon pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 7930, comm: kstopmachine Tainted: G ---- 2.6.18-264.el5 #1 RIP: 0010:[<ffffffff80064bcc>] [<ffffffff80064bcc>] .text.lock.spinlock+0x2/0x30 RSP: 0000:ffff810107bcff18 EFLAGS: 00000286 RAX: 000000000000bbe3 RBX: ffffffff80321a00 RCX: 00000000000000ff RDX: 0000000000000000 RSI: 
ffff810001017100 RDI: ffffffff80321a80 RBP: ffff810107bcfe90 R08: 000000000301fa9c R09: ffffffff8005e2fc R10: 0000000000000001 R11: ffff810418706860 R12: ffffffff8005dc8e R13: ffff810001017100 R14: ffffffff80079390 R15: ffff810107bcfe90 FS: 00002b536f20daf0(0000) GS:ffff810107b19140(0000) knlGS:00000000f7cd46c0 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000424cf958 CR3: 000000021631e000 CR4: 00000000000006e0 Call Trace: <IRQ> [<ffffffff800a0e3a>] __rcu_process_callbacks+0xe4/0x1a1 [<ffffffff80096d3d>] tasklet_action+0x8f/0x12b [<ffffffff80012562>] __do_softirq+0x89/0x133 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006d636>] do_softirq+0x2c/0x7d [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8000ad2a>] kmem_cache_alloc+0x6e/0x76 [<ffffffff8001f9a6>] copy_process+0x7d0/0x15eb [<ffffffff800a0b1a>] alloc_pid+0x228/0x2c4 [<ffffffff80031074>] do_fork+0x69/0x1c1 [<ffffffff800a2be7>] keventd_create_kthread+0x0/0xc4 [<ffffffff80063177>] wait_for_completion+0x8f/0xa2 [<ffffffff800a2be7>] keventd_create_kthread+0x0/0xc4 [<ffffffff8005df3d>] kernel_thread+0x81/0xeb [<ffffffff800a2be7>] keventd_create_kthread+0x0/0xc4 [<ffffffff800b3713>] stopmachine+0x0/0x62 [<ffffffff8005dfa7>] child_rip+0x0/0x11 [<ffffffff800b35f7>] do_stop+0x75/0x191 [<ffffffff800b3582>] do_stop+0x0/0x191 [<ffffffff80032732>] kthread+0xfe/0x132 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff800a2be7>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032634>] kthread+0x0/0x132 [<ffffffff8005dfa7>] child_rip+0x0/0x11
Additional info: I ran the reproducer in a tight loop for about three hours.
I suspect this is a separate problem, because timers don't appear in any of the backtraces. I'd appreciate it if you opened a new bug to track this, along with the two systemtap scripts that were apparently being run together. This part hints at a locking problem somewhere else in the system:

 [<ffffffff887d432b>] :stap_b1a252a1d53a35fa01915881e7637775_6997:lookup_bad_addr+0x35/0xbf
Agreed. This doesn't look related to the original issue. Note that it's not actually necessary to use systemtap to reproduce the original problem. I reproduced it without systemtap by running the script in a tight loop. systemtap might make the problem more reproducible by changing the timing, but it really has nothing to do with this particular bug.
Changing the status to Verified and filing a new bug, bug 709190, to track the locking problem, per the above comments.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Under certain circumstances, a crash in the kernel could occur due to a race condition in the lockd_down() function, which could time out before the lockd process had fully come down, allowing a second lockd to start while the first was still running. With this update, lockd_down() waits indefinitely for lockd to exit, and the kernel no longer crashes.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-1065.html