Description of problem: apparent race condition in runtime/procfs.c:_stp_rmdir_proc_module when killing two staprun processes with a kill -9. Version-Release number of selected component (if applicable): RHEL4U6 systemtap-0.6.2-1.el4 kernel-2.6.9-67.0.15.ELsmp How reproducible: Unsure Steps to Reproduce: 1. Start two systemtap scripts at the same time 2. Use kill -9 with multiple process ids on the command line to kill at the same time 3. See if you get the panic Actual results: Expected results: Additional info: Crash occurs in fs/proc/generic.c:remove_proc_entry, calling the BUG_ON(de->subdir) macro. Here is the code fragment: for (p = &parent->subdir; *p; p=&(*p)->next ) { if (!proc_match(len, fn, *p)) continue; de = *p; *p = de->next; de->next = NULL; if (S_ISDIR(de->mode)) parent->nlink--; proc_kill_inodes(de); de->nlink = 0; BUG_ON(de->subdir); if (!atomic_read(&de->count)) free_proc_entry(de); else { de->deleted = 1; printk("remove_proc_entry: %s/%s busy, count=%d\n", parent->name, de->name, atomic_read(&de->count)\ ); } break; SYSTEM MAP: /var/crash/crash_analysis/debug_x86_64/boot/System.map-2.6.9-67.0.15.ELsmp DEBUG KERNEL: /var/crash/crash_analysis/debug_x86_64/usr/lib/debug/lib/modules/2.6.9-67.0.15.ELsmp/vmlinux (2.6.9-67.0.15.ELsmp) DUMPFILE: /var/crash/crash_dumps/ap12.1236890303.dmp CPUS: 8 DATE: Thu Mar 12 16:38:23 2009 UPTIME: 25 days, 02:31:04 LOAD AVERAGE: 7.48, 8.65, 8.42 TASKS: 916 NODENAME: chip6.acs.census.gov RELEASE: 2.6.9-67.0.15.ELsmp VERSION: #1 SMP Tue Apr 22 13:58:43 EDT 2008 MACHINE: x86_64 (2193 Mhz) MEMORY: 33 GB PANIC: "vmdump" PID: 2868 COMMAND: "staprun" TASK: 104bf77a7f0 [THREAD_INFO: 1040baf0000] CPU: 2 STATE: TASK_RUNNING (PANIC) ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at generic:712 invalid operand: 0000 [1] SMP CPU 2 Modules linked in: stap_a5058ef9256374d79b204f5489fce41b_6002(U) stap_034dd3d28d1d58dff817bd97d399e87f_3085(U) loop sr_mod ip_vs nfsd exportfs parport_pc lp parport autofs4 i2c_dev i2c_core lock_dlm(U) gfs(U) lock_harness(U) nfs lockd dlm(U) cman(U) md5 ipv6 ipt_state ip_conntrack iptable_filter ip_tables hw_random k8_edac edac_mc nfs_acl sunrpc egenera_nmi(U) egenera_veth(U) sd_mod egenera_vscsi(U) scsi_mod egenera_vmdump(U) egenera_dumpdev(U) egenera_ipmi(U) egenera_base(U) egenera_virtual_bus(U) egenera_fs(U) dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod Pid: 2868, comm: staprun Tainted: PF 2.6.9-67.0.15.ELsmp RIP: 0010:[<ffffffff801b0f4e>] <ffffffff801b0f4e>{remove_proc_entry+340} RSP: 0000:000001040baf1e48 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000102381de500 RCX: 0000000000000000 RDX: 000000000000000a RSI: 000001021c68b219 RDI: ffffffff804f4700 RBP: 000001021c68b180 R08: ffffffffa03c71b5 R09: 0000000000000080 R10: 0000000000000080 R11: ffffffff801d232b R12: 00000102381de400 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000002a9557f440(0000) GS:ffffffff804f3680(0000) knlGS:00000000f6e03ba0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000003c105142d8 CR3: 0000000438fc0000 CR4: 00000000000006e0 Process staprun (pid: 2868, threadinfo 000001040baf0000, task 00000104bf77a7f0) Stack: ffffffffa03cdb00 0000000000000000 ffffffff803f0c60 ffffffffa03c71b5 ffffffff803ed1a0 ffffffffa03cdb00 0000000000000000 ffffffffa03c2875 0000010440000000 ffffffffa03c28b2 Call Trace:<ffffffffa03c2875>{:stap_034dd3d28d1d58dff817bd97d399e87f_3085:_stp_rmdir_proc_module+182} <ffffffffa03c28b2>{:stap_034dd3d28d1d58dff817bd97d399e87f_3085:_stp_unregister_ctl_channel+48} <ffffffffa03c6e13>{:stap_034dd3d28d1d58dff817bd97d399e87f_3085:_stp_transport_close+30} <ffffffff8014ed74>{sys_delete_module+487} <ffffffff8011036e>{tracesys+113} <ffffffff801103ce>{tracesys+209} Code: 0f 0b 72 de 32 80 ff ff ff ff c8 02 8b 85 80 00 00 00 85 c0 RIP <ffffffff801b0f4e>{remove_proc_entry+340} RSP <000001040baf1e48> crash> bt 2868 PID: 2868 TASK: 104bf77a7f0 CPU: 2 COMMAND: "staprun" #0 [1040baf1c40] dump_execute at ffffffffa010c483 #1 [1040baf1c80] __do_vmdump at ffffffffa010c556 #2 [1040baf1ca0] try_crashdump at ffffffff8014c7dd #3 [1040baf1cb0] die at ffffffff80111c04 #4 [1040baf1cd0] do_invalid_op at ffffffff80111fcc #5 [1040baf1d90] error_exit at ffffffff80110d91 [exception RIP: remove_proc_entry+340] RIP: ffffffff801b0f4e RSP: 000001040baf1e48 RFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000102381de500 RCX: 0000000000000000 RDX: 000000000000000a RSI: 000001021c68b219 RDI: ffffffff804f4700 RBP: 000001021c68b180 R8: ffffffffa03c71b5 R9: 0000000000000080 R10: 0000000000000080 R11: ffffffff801d232b R12: 00000102381de400 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 #6 [1040baf1e40] remove_proc_entry at ffffffff801b0f3f #7 [1040baf1e80] _stp_rmdir_proc_module at ffffffffa03c2875 #8 [1040baf1e90] _stp_unregister_ctl_channel at ffffffffa03c28b2 #9 [1040baf1ec0] _stp_transport_close at ffffffffa03c6e13 #10 [1040baf1ee0] sys_delete_module at ffffffff8014ed74 #11 [1040baf1f80] tracesys at ffffffff801103ce (via system_call) RIP: 0000003c0fbc9e79 RSP: 0000007fbffff768 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: ffffffff801103ce RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000509060 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000200 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 00000000000000b0 CS: 0033 SS: 002b
The fix for http://sourceware.org/bugzilla/show_bug.cgi?id=5042 seemed to cause this problem. Currently, the code only checks to see that /proc/systemtap is not being used before deleting /proc/systemtap. But, it may be unused while another module has a directory under /proc/systemtap. The solution is to add a guard to ensure that there are no other files under /proc/systemtap. diff --git a/runtime/procfs.c b/runtime/procfs.c index d6b7533..dfbe2de 100644 --- a/runtime/procfs.c +++ b/runtime/procfs.c @@ -59,7 +59,8 @@ static void _stp_rmdir_proc_module(void) /* modules to use it, even though it will not show up */ /* in directory listings. */ - if (atomic_read(&_stp_proc_stap->count) == 0) { + if ((atomic_read(&_stp_proc_stap->count) == 0) && + (!_stp_proc_stap->subdir)) { remove_proc_entry("systemtap", NULL); _stp_proc_stap = NULL; }
Fixed in upstream commit 83eaf9b. Besides checking to make sure there are no other files in /proc/systemtap before removal, the transport directory is locked before removal of '/proc/systemtap/${MODULE}' also.
Created attachment 398842 [details] Proposed patch
Here's a link to the upstream commit (that I mentioned earlier) that more fully fixes this problem: <http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=83eaf9b5f3d1f1287b7962efc6bc703357b984ba> In includes (basically) the same thing as your proposed patch plus better locking.
Created attachment 398912 [details] Updated patch Ported upstream commit 83eaf9b to 4.9. Upstream uses _stp_lock_transport_dir(), but 4.9 uses _stp_lock_debugfs() - so I used 4.9 locking.
Your updated patch looks reasonable to backport to systemtap-0.6.2.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2011-0229.html