Bug 229112

Summary: Kernel oops when tracing multiple processes using ptrace
Product: [Fedora] Fedora
Reporter: Magnus Vesterlund <magnus_vesterlund>
Component: kernel
Assignee: Roland McGrath <roland>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 6
CC: cebbert, davej, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-06-25 11:43:04 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 232676    
Attachments:
  Test case (flags: none)
  Oops message, tracing 64-bit process. (flags: none)
  Oops message, tracing 64-bit process. (flags: none)
  Oops message, tracing 32-bit process. (flags: none)
  Oops message from 2.6.20-1.2933.fc6 (flags: none)
  Five oops messages in .tar.gz. (flags: none)
  five oops messages as plain text (flags: none)

Description Magnus Vesterlund 2007-02-17 04:32:51 EST
Description of problem:
I get a kernel oops when tracing multiple processes using ptrace and terminating
one of the tracing processes with ctrl-c. The computer is then completely hung.

Version-Release number of selected component (if applicable):
2.6.19-1.2911.fc6 for x86_64

How reproducible:
Happens almost every time the attached test case is run.

Steps to Reproduce:
1. Untar the attached test case, "cd ptrace-oops", "make".
2. Open 4 terminals, cd to ptrace-oops in all of them.
3. Run "./target" in a terminal, run "./monitor <pid>" with the pid printed by
the target program in another terminal. Do the same thing in the other pair of
terminals.
4. Press ctrl-c in one of the terminals running the monitor program.
  
Actual results:
Kernel oops, completely hung computer.

Expected results:
No kernel oops.

Additional info:
Comment 1 Magnus Vesterlund 2007-02-17 04:32:51 EST
Created attachment 148265
Test case
Comment 2 Chuck Ebbert 2007-02-17 13:20:18 EST
Please post the oops messages.
Comment 3 Magnus Vesterlund 2007-02-18 08:40:09 EST
I am getting a couple of different oops messages. They also differ slightly
between tracing a 64-bit process and a 32-bit process. Attaching a few samples.
Comment 4 Magnus Vesterlund 2007-02-18 08:41:32 EST
Created attachment 148287
Oops message, tracing 64-bit process.
Comment 5 Magnus Vesterlund 2007-02-18 08:42:45 EST
Created attachment 148288
Oops message, tracing 64-bit process.
Comment 6 Magnus Vesterlund 2007-02-18 08:44:17 EST
Created attachment 148289
Oops message, tracing 32-bit process.
Comment 7 Chuck Ebbert 2007-03-20 11:11:58 EDT
Major ptrace/utrace update is in 2.6.20-1.2933.fc6.
Please test.
Comment 8 Magnus Vesterlund 2007-03-20 17:33:30 EDT
It still oopses and locks up the machine, but with a different oops message.
Comment 9 Magnus Vesterlund 2007-03-20 17:35:12 EDT
Created attachment 150537
Oops message from 2.6.20-1.2933.fc6
Comment 10 Chuck Ebbert 2007-03-26 10:54:53 EDT
Test kernels (version 1.2937) for this issue are at:

http://people.redhat.com/cebbert

Please test and report back.
Comment 11 Magnus Vesterlund 2007-03-26 16:40:07 EDT
This kernel also oopses, but not immediately when I press ctrl-c, as the
earlier kernels did. The oops usually comes a few seconds later.

I get more varied oops messages, but they usually have this call trace:

Call Trace:
 <IRQ>  [<ffffffff80295c95>] __rcu_process_callbacks+0x12d/0x1bc
 [<ffffffff80295d47>] rcu_process_callbacks+0x23/0x43
 [<ffffffff8028c3fc>] tasklet_action+0x53/0x9d
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 [<ffffffff80211fc0>] __do_softirq+0x55/0xc3
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff8028c2ea>] ksoftirqd+0x0/0xbf
 [<ffffffff802684d2>] do_softirq+0x2c/0x85
 [<ffffffff8028c349>] ksoftirqd+0x5f/0xbf
 [<ffffffff80231852>] kthread+0xd0/0xff
 [<ffffffff8025aec8>] child_rip+0xa/0x12
 [<ffffffff80231782>] kthread+0x0/0xff
 [<ffffffff8025aebe>] child_rip+0x0/0x12

Just tell me if you want a few complete oops messages.
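
For anyone comparing several oops messages by hand, the call trace lines above follow the usual kernel layout: a return address in "[<...>]", then "symbol+0xoffset/0xsize". The following is only a minimal sketch for pulling those fields out of pasted trace text. The names Frame, FRAME_RE, parse_call_trace and SAMPLE are hypothetical helpers made up for this illustration; they are not part of the attached test case or of any kernel tool.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    """One call trace entry (hypothetical helper type for this sketch)."""
    address: int  # return address printed inside [<...>]
    symbol: str   # kernel symbol name
    offset: int   # offset of the address into the symbol
    size: int     # total size of the symbol


# Matches e.g. "[<ffffffff80295c95>] __rcu_process_callbacks+0x12d/0x1bc".
# Markers such as <IRQ> and <EOI> in front of the address are skipped.
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\S+?)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Return the frames found in pasted oops text, in order of appearance."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                Frame(
                    address=int(match.group(1), 16),
                    symbol=match.group(2),
                    offset=int(match.group(3), 16),
                    size=int(match.group(4), 16),
                )
            )
    return frames


# Three lines copied from the call trace in this comment.
SAMPLE = """\
 <IRQ>  [<ffffffff80295c95>] __rcu_process_callbacks+0x12d/0x1bc
 [<ffffffff80295d47>] rcu_process_callbacks+0x23/0x43
 <EOI>  [<ffffffff8028c2ea>] ksoftirqd+0x0/0xbf
"""

if __name__ == "__main__":
    for frame in parse_call_trace(SAMPLE):
        print(f"{frame.address:#x} {frame.symbol} +{frame.offset:#x}/{frame.size:#x}")
```

Comparing the lists of symbol names from two oops messages is then a quick way to see whether they share the same RCU/softirq path or differ.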
Comment 12 Chuck Ebbert 2007-03-26 16:41:59 EDT
Yes, we need the full oops messages.
Comment 13 Magnus Vesterlund 2007-03-26 16:56:58 EDT
Created attachment 150956
Five oops messages in .tar.gz.
Comment 14 Chuck Ebbert 2007-03-26 17:03:17 EDT
Created attachment 150959
five oops messages as plain text
Comment 15 Chuck Ebbert 2007-06-22 10:52:36 EDT
Magnus, kernel 2962 has a new utrace update. Can you test it?
Comment 16 Magnus Vesterlund 2007-06-23 03:51:03 EDT
The problem seems to be fixed in 2962; I cannot reproduce the oops.

I also tried kernel-2.6.21-1.3228.fc7.x86_64, but I can still reproduce the oops
with that kernel.
Comment 17 Chuck Ebbert 2007-06-25 11:43:04 EDT
I just applied the same fixes to the F7 kernel, so it should be OK in the next
release.