Bug 510282
Summary: | systemtap panics kernel when killing concurrent staprun processes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Marc Milgram <mmilgram> |
Component: | systemtap | Assignee: | Frank Ch. Eigler <fche> |
Status: | CLOSED ERRATA | QA Contact: | BaseOS QE <qe-baseos-auto> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 5.3 | CC: | dsmith, fche, mjw, mmilgram, pmuller, tao |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | 490234 | Environment: | |
Last Closed: | 2010-03-30 09:05:37 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 515829 | ||
Bug Blocks: | 499522 |
Description
Marc Milgram
2009-07-08 15:08:49 UTC
A customer hit this issue in Red Hat Enterprise Linux 5.3:

After several stap processes that include procfs probe points are stopped, a kernel panic sometimes occurs. The panic occurred at proc_match@fs/proc/generic.c:755 because of an invalid pointer dereference. The kernel panic message is attached as panic_message.log.

In addition, the following BUG messages are printed to the console. This happens because one running stap process removes the /proc/systemtap directory while another stap process is still using it. These BUG messages are always produced in this situation, but the kernel panic does not always occur.

-------------------------------------------------------------------------------
BUG: warning at fs/proc/generic.c:764/remove_proc_entry() (Tainted: G )

Call Trace:
 [<ffffffff80101d38>] remove_proc_entry+0x17d/0x1e3
 [<ffffffff88498bc5>] :stap_6c500b418cf00cac628a4a64c97f2716_312:_stp_rmdir_proc_module+0x79/0xca
 [<ffffffff88498cb0>] :stap_6c500b418cf00cac628a4a64c97f2716_312:systemtap_module_exit+0x63/0xcc
 [<ffffffff88498d95>] :stap_6c500b418cf00cac628a4a64c97f2716_312:_stp_cleanup_and_exit+0x77/0x79
 [<ffffffff88498e28>] :stap_6c500b418cf00cac628a4a64c97f2716_312:_stp_work_queue+0x91/0x96
 [<ffffffff8004d139>] run_workqueue+0x94/0xe4
 [<ffffffff800499ba>] worker_thread+0x0/0x122
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80049aaa>] worker_thread+0xf0/0x122
 [<ffffffff8008a461>] default_wake_function+0x0/0xe
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032360>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032262>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
-------------------------------------------------------------------------------

Version-Release number of selected component:
RHEL5.3 GA
kernel-2.6.18-128.el5
systemtap-0.7.2-2.el5

How reproducible:
After stopping several running stap processes that include procfs probe points, the issue can be reproduced within several tries.

Steps to Reproduce:
1. Start two stap scripts that include procfs probe points, such as the following:

---- procfs.stp -------
probe procfs("a").read{
  $value = "a";
}
-----------------------

2. Stop (by kill or Ctrl-C) the two stap processes.

The issue is reproduced by the following script.

--- procfs_panic_reproduce.sh -----------------------------
#!/bin/sh
while :
do
    stap -e 'probe procfs("a").read{$value="a"}' &
    sleep 1
    stap -e 'probe procfs("b").read{$value="a"}' &
    sleep 1
    for i in `pgrep stap`
    do
        kill -INT $i
        sleep 1
    done
done
-------------------------------------------------

Actual results:
Kernel panic sometimes occurs.

Expected results:
Kernel panic does not occur.

Business impact:
SystemTap is a very useful tool for analyzing kernel behavior, and supporting it is important for enterprise users. This problem undermines the convenience of SystemTap.

Hardware info:
Hardware independent

Upstream systemtap includes a fix for this, which may be backported/rebased for 5.5.

Event posted on 09-30-2009 10:53am EDT by mmilgram

Hi Furuta-san,

We are waiting for approval. The BZ is on the GSS 5.5 proposed list. I asked jwest for more information, and he indicated that he would look into it. There is not much more that I can do. As far as I can tell, there is not even a decision yet between rebasing and porting the individual fix.

Sorry that I can't be more helpful.

Marc Milgram

Internal Status set to 'Waiting on Engineering'

This event sent from IssueTracker by mmilgram
issue 314430

I've tested the testcase in comment #1 against the fix in commit 83eaf9b. Everything worked correctly. This fix is not present in systemtap-0.9.7-5.el5, but could be backported.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0308.html
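
The race described in this report (one stap process removing the shared /proc/systemtap directory while another still has an entry under it) can be modelled in userspace with ordinary directories. The sketch below is only an illustration of the general idea "remove the shared parent only once it is empty"; it is not the change made in commit 83eaf9b, and every name in it (STAP_ROOT, module_start, module_stop, stap_a, stap_b) is hypothetical. It relies on the fact that rmdir refuses to delete a non-empty directory.

```shell
#!/bin/sh
# Userspace model only: plain directories stand in for /proc/systemtap.
# STAP_ROOT, module_start and module_stop are hypothetical names.

STAP_ROOT="${TMPDIR:-/tmp}/stap_model.$$/systemtap"

# Register a module: create the shared parent if needed, then a private child.
module_start() {
    mkdir -p "$STAP_ROOT/$1"
}

# Unregister a module: remove only its own child, then try to remove the
# shared parent. rmdir fails on a non-empty directory, so the parent
# survives while any other module still has an entry under it.
module_stop() {
    rmdir "$STAP_ROOT/$1" || return 1
    rmdir "$STAP_ROOT" 2>/dev/null || true
}

module_start stap_a
module_start stap_b

module_stop stap_a
[ -d "$STAP_ROOT" ] && echo "after first exit: parent kept"

module_stop stap_b
[ -d "$STAP_ROOT" ] || echo "after second exit: parent removed"

# Clean up the temporary top-level directory used by this model.
rmdir "$(dirname "$STAP_ROOT")" 2>/dev/null || true
```

The model does not cover a new module starting at the same moment the last one exits; an in-kernel implementation would presumably need a lock or reference count around both steps.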