Bug 229112 - Kernel oops when tracing multiple processes using ptrace
Summary: Kernel oops when tracing multiple processes using ptrace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Roland McGrath
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 232676
 
Reported: 2007-02-17 09:32 UTC by Magnus Vesterlund
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2007-06-25 15:43:04 UTC
Type: ---
Embargoed:


Attachments
Test case (880 bytes, application/octet-stream)
2007-02-17 09:32 UTC, Magnus Vesterlund
Oops message, tracing 64-bit process. (11.58 KB, text/plain)
2007-02-18 13:41 UTC, Magnus Vesterlund
Oops message, tracing 64-bit process. (5.32 KB, text/plain)
2007-02-18 13:42 UTC, Magnus Vesterlund
Oops message, tracing 32-bit process. (5.24 KB, text/plain)
2007-02-18 13:44 UTC, Magnus Vesterlund
Oops message from 2.6.20-1.2933.fc6 (5.22 KB, text/plain)
2007-03-20 21:35 UTC, Magnus Vesterlund
Five oops messages in .tar.gz. (2.90 KB, application/octet-stream)
2007-03-26 20:56 UTC, Magnus Vesterlund
five oops messages as plain text (14.93 KB, text/plain)
2007-03-26 21:03 UTC, Chuck Ebbert

Description Magnus Vesterlund 2007-02-17 09:32:51 UTC
Description of problem:
I get a kernel oops when tracing multiple processes using ptrace and terminating
one of the tracing processes with ctrl-c. The computer is then completely hung.

Version-Release number of selected component (if applicable):
2.6.19-1.2911.fc6 for x86_64

How reproducible:
Happens almost every time the attached test case is run.

Steps to Reproduce:
1. Untar the attached test case, "cd ptrace-oops", "make".
2. Open 4 terminals, cd to ptrace-oops in all of them.
3. Run "./target" in one terminal, then run "./monitor <pid>" in another, using
the pid printed by the target program. Repeat this in the other pair of
terminals.
4. Press ctrl-c in one of the terminals running the monitor program.
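
For readers without the attachment, here is a minimal sketch of what a
ptrace-based monitor along these lines might look like. It is NOT the attached
test case: the function names parse_pid, monitor_attach and monitor_loop are
hypothetical, and the real "monitor" program may work differently. The sketch
only assumes the standard Linux ptrace(2) and waitpid(2) calls.

```c
/* Hypothetical sketch of a ptrace monitor; not the attached test case. */
#include <errno.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Parse a positive decimal pid from a string; returns -1 on bad input. */
pid_t parse_pid(const char *s)
{
    char *end = NULL;
    long v;

    if (s == NULL || *s == '\0')
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v <= 0 || v > 0x7fffffffL)
        return -1;
    return (pid_t)v;
}

/* Attach to pid and wait for the stop that the attach causes.
 * Returns 0 on success, -1 on error. */
int monitor_attach(pid_t pid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}

/* Resume the tracee after every stop until it exits or is killed.
 * The initial attach stop is swallowed (sig == 0); later signals are
 * passed on to the tracee. Returns 0 when the tracee is gone, -1 on error. */
int monitor_loop(pid_t pid)
{
    int status;
    int sig = 0;

    for (;;) {
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return 0;
        sig = WIFSTOPPED(status) ? WSTOPSIG(status) : 0;
    }
}
```

A main() that calls parse_pid(argv[1]), then monitor_attach() and
monitor_loop(), would correspond to "./monitor <pid>" in step 3. Such a
program installs no SIGINT handler, so ctrl-c kills the tracer without a
PTRACE_DETACH, which is presumably the exit path step 4 exercises.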
  
Actual results:
Kernel oops, completely hung computer.

Expected results:
No kernel oops.

Additional info:

Comment 1 Magnus Vesterlund 2007-02-17 09:32:51 UTC
Created attachment 148265
Test case

Comment 2 Chuck Ebbert 2007-02-17 18:20:18 UTC
Please post the oops messages.

Comment 3 Magnus Vesterlund 2007-02-18 13:40:09 UTC
I am getting a couple of different oops messages. They also differ slightly
between tracing a 64-bit process and a 32-bit process. Attaching a few samples.

Comment 4 Magnus Vesterlund 2007-02-18 13:41:32 UTC
Created attachment 148287
Oops message, tracing 64-bit process.

Comment 5 Magnus Vesterlund 2007-02-18 13:42:45 UTC
Created attachment 148288
Oops message, tracing 64-bit process.

Comment 6 Magnus Vesterlund 2007-02-18 13:44:17 UTC
Created attachment 148289
Oops message, tracing 32-bit process.

Comment 7 Chuck Ebbert 2007-03-20 15:11:58 UTC
Major ptrace/utrace update is in 2.6.20-1.2933.fc6.
Please test.

Comment 8 Magnus Vesterlund 2007-03-20 21:33:30 UTC
It still oopses and locks up the machine, but with a different oops message.

Comment 9 Magnus Vesterlund 2007-03-20 21:35:12 UTC
Created attachment 150537
Oops message from 2.6.20-1.2933.fc6

Comment 10 Chuck Ebbert 2007-03-26 14:54:53 UTC
Test kernels (version 1.2937) for this issue are at:

http://people.redhat.com/cebbert

Please test and report back.


Comment 11 Magnus Vesterlund 2007-03-26 20:40:07 UTC
This kernel also oopses, but not immediately when I press ctrl-c as the
earlier kernels did. The oops usually comes a few seconds later.

I get more varied oops messages, but they usually have this call trace:

Call Trace:
 <IRQ>  [<ffffffff80295c95>] __rcu_process_callbacks+0x12d/0x1bc
 [<ffffffff80295d47>] rcu_process_callbacks+0x23/0x43
 [<ffffffff8028c3fc>] tasklet_action+0x53/0x9d
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 [<ffffffff80211fc0>] __do_softirq+0x55/0xc3
 [<ffffffff8025b23c>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff8028c2ea>] ksoftirqd+0x0/0xbf
 [<ffffffff802684d2>] do_softirq+0x2c/0x85
 [<ffffffff8028c349>] ksoftirqd+0x5f/0xbf
 [<ffffffff80231852>] kthread+0xd0/0xff
 [<ffffffff8025aec8>] child_rip+0xa/0x12
 [<ffffffff80231782>] kthread+0x0/0xff
 [<ffffffff8025aebe>] child_rip+0x0/0x12

Just tell me if you want a few complete oops messages.

Comment 12 Chuck Ebbert 2007-03-26 20:41:59 UTC
Yes, we need the full oops messages.


Comment 13 Magnus Vesterlund 2007-03-26 20:56:58 UTC
Created attachment 150956
Five oops messages in .tar.gz.

Comment 14 Chuck Ebbert 2007-03-26 21:03:17 UTC
Created attachment 150959
five oops messages as plain text

Comment 15 Chuck Ebbert 2007-06-22 14:52:36 UTC
Magnus, kernel 2962 has a new utrace update. Can you test it?

Comment 16 Magnus Vesterlund 2007-06-23 07:51:03 UTC
The problem seems to be fixed in 2962; I cannot reproduce the oops.

I also tried kernel-2.6.21-1.3228.fc7.x86_64, but I can still reproduce the oops
with that kernel.

Comment 17 Chuck Ebbert 2007-06-25 15:43:04 UTC
I just applied the same fixes to the F7 kernel, so it should be OK in the next
release.

