Bug 229112

Summary: Kernel oops when tracing multiple processes using ptrace
Product: [Fedora] Fedora
Reporter: Magnus Vesterlund <magnus_vesterlund>
Component: kernel
Assignee: Roland McGrath <roland>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 6
CC: cebbert, davej, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-06-25 15:43:04 UTC
Bug Blocks: 232676    
Attachments:
Description                              Flags
Test case                                none
Oops message, tracing 64-bit process.    none
Oops message, tracing 64-bit process.    none
Oops message, tracing 32-bit process.    none
Oops message from 2.6.20-1.2933.fc6      none
Five oops messages in .tar.gz.           none
five oops messages as plain text         none

Description Magnus Vesterlund 2007-02-17 09:32:51 UTC
Description of problem:
I get a kernel oops when tracing multiple processes using ptrace and terminating
one of the tracing processes with ctrl-c. The computer is then completely hung.

Version-Release number of selected component (if applicable):
2.6.19-1.2911.fc6 for x86_64

How reproducible:
Happens almost every time the attached test case is run.

Steps to Reproduce:
1. Untar the attached test case, "cd ptrace-oops", "make".
2. Open 4 terminals, cd to ptrace-oops in all of them.
3. Run "./target" in one terminal, then run "./monitor <pid>" in another
terminal, using the pid printed by the target program. Do the same in the
other pair of terminals.
4. Press ctrl-c in one of the terminals running the monitor program.
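
The attached test case is not reproduced in this report. As a rough sketch only, a monitor of this kind might look something like the following, assuming it attaches to the target with PTRACE_ATTACH and resumes it after every stop. The names parse_pid and monitor are hypothetical and are not taken from the attachment; the real test case may work differently.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: parse a positive decimal pid from a string.
 * Returns -1 if the string is empty, not a number, or out of range. */
pid_t parse_pid(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v <= 0 || v > INT_MAX)
        return -1;
    return (pid_t)v;
}

/* Hypothetical monitor loop (an assumption, not the attached code):
 * attach to pid, then resume it after each stop until it exits.
 * Returns 0 when the target has exited, -1 on any ptrace/waitpid error.
 * Pressing ctrl-c while this loop runs kills the tracer while it is
 * still attached, which is the situation described in step 4. */
int monitor(pid_t pid)
{
    int status;

    assert(pid > 0);

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;

    for (;;) {
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return 0;
        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);

            /* Simplification: swallow every SIGSTOP (including the one
             * caused by the attach) and forward all other signals. */
            if (sig == SIGSTOP)
                sig = 0;
            if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1)
                return -1;
        }
    }
}
```

A main function would then call monitor(parse_pid(argv[1])) after checking that parse_pid did not return -1.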
  
Actual results:
Kernel oops, completely hung computer.

Expected results:
No kernel oops.

Additional info:

Comment 1 Magnus Vesterlund 2007-02-17 09:32:51 UTC
Created attachment 148265 [details]
Test case

Comment 2 Chuck Ebbert 2007-02-17 18:20:18 UTC
Please post the oops messages.

Comment 3 Magnus Vesterlund 2007-02-18 13:40:09 UTC
I am getting a couple of different oops messages. They also differ slightly
between tracing a 64-bit process and a 32-bit process. Attaching a few samples.

Comment 4 Magnus Vesterlund 2007-02-18 13:41:32 UTC
Created attachment 148287 [details]
Oops message, tracing 64-bit process.

Comment 5 Magnus Vesterlund 2007-02-18 13:42:45 UTC
Created attachment 148288 [details]
Oops message, tracing 64-bit process.

Comment 6 Magnus Vesterlund 2007-02-18 13:44:17 UTC
Created attachment 148289 [details]
Oops message, tracing 32-bit process.

Comment 7 Chuck Ebbert 2007-03-20 15:11:58 UTC
Major ptrace/utrace update is in 2.6.20-1.2933.fc6.
Please test.

Comment 8 Magnus Vesterlund 2007-03-20 21:33:30 UTC
It still oopses and locks up the machine, but with a different oops message.

Comment 9 Magnus Vesterlund 2007-03-20 21:35:12 UTC
Created attachment 150537 [details]
Oops message from 2.6.20-1.2933.fc6

Comment 10 Chuck Ebbert 2007-03-26 14:54:53 UTC
Test kernels (version 1.2937) for this issue are at:

http://people.redhat.com/cebbert

Please test and report back.


Comment 11 Magnus Vesterlund 2007-03-26 20:40:07 UTC
This kernel also oopses, but not immediately when I press ctrl-c as the
earlier kernels did. The oops usually comes a few seconds later.

I get more varied oops messages, but they usually have this call trace:

Call Trace:
 <IRQ>  [<ffffffff80295c95>] __rcu_process_callbacks+0x12d/0x1bc
 [<ffffffff80295d47>] rcu_process_callbacks+0x23/0x43
 [<ffffffff8028c3fc>] tasklet_action+0x53/0x9d
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 [<ffffffff80211fc0>] __do_softirq+0x55/0xc3
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff8028c2ea>] ksoftirqd+0x0/0xbf
 [<ffffffff802684d2>] do_softirq+0x2c/0x85
 [<ffffffff8028c349>] ksoftirqd+0x5f/0xbf
 [<ffffffff80231852>] kthread+0xd0/0xff
 [<ffffffff8025aec8>] child_rip+0xa/0x12
 [<ffffffff80231782>] kthread+0x0/0xff
 [<ffffffff8025aebe>] child_rip+0x0/0x12

Just tell me if you want a few complete oops messages.

Comment 12 Chuck Ebbert 2007-03-26 20:41:59 UTC
Yes, we need the full oops messages.


Comment 13 Magnus Vesterlund 2007-03-26 20:56:58 UTC
Created attachment 150956 [details]
Five oops messages in .tar.gz.

Comment 14 Chuck Ebbert 2007-03-26 21:03:17 UTC
Created attachment 150959 [details]
five oops messages as plain text

Comment 15 Chuck Ebbert 2007-06-22 14:52:36 UTC
Magnus, kernel 2962 has a new utrace update. Can you test it?

Comment 16 Magnus Vesterlund 2007-06-23 07:51:03 UTC
The problem seems to be fixed in 2962; I cannot reproduce the oops.

I also tried kernel-2.6.21-1.3228.fc7.x86_64, but I can still reproduce the oops
with that kernel.

Comment 17 Chuck Ebbert 2007-06-25 15:43:04 UTC
I just applied the same fixes to the F7 kernel, so it should be OK in the next
release.