Bug 462704 - PTRACE_KILL does not kill the child process; instead, the child starts running freely.
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: All
OS: Linux
Priority: high
Severity: urgent
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Depends On: 455060
Reported: 2008-09-18 09:58 EDT by Anton Arapov
Modified: 2014-06-18 04:02 EDT
Doc Type: Bug Fix
Last Closed: 2008-10-16 08:32:03 EDT

Attachments
reproducer (1.21 KB, application/octet-stream)
2008-09-18 09:58 EDT, Anton Arapov

Description Anton Arapov 2008-09-18 09:58:17 EDT
Created attachment 317081

+++ This bug was initially created as a clone of Bug #455060 +++

When the parent process sends PTRACE_KILL to a child that has been
stopped by SIGTRAP (after the child requested tracing with
PTRACE_TRACEME), the child is not killed. Instead, it starts running
freely.

This kernel bug is present on FC7 and on RH 5, 5u1, and 5u2, using x86,
x86-64, or Power processors. The problem is not present on, for
example, SUSE 10.1, 10.2, and RH 4u5. This suggests to us that the
working systems have kernels at or below 2.6.16, while the failing
systems have kernels at or above 2.6.18.

The problem reproduces with both the gcc and PGI compilers. The
reproducer here uses gcc 4.3.0.

The reproducer package consists of two programs: the 'user' code
simplestat_g.out, and the master 'Debugger' code (test_TV.c). First,
the master code forks a child, and the child calls
PTRACE_TRACEME. The child then execs ./simplestat_g.out. The master
blocks in wait() and, as soon as it returns, sends PTRACE_KILL to the
child. As a result, the child should exit, and simplestat_g.out
should never actually get to run.
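
The sequence described above can be sketched roughly as follows. This is not the attached test_TV.c; it is a minimal illustration written for this description, and the function name trace_exec_and_kill and its return convention are assumptions made for the sketch.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not taken from test_TV.c.
 * Returns the signal that terminated the child (SIGKILL is expected),
 * 0 if the child ran to a normal exit (the behaviour reported here),
 * or -1 if the tracing setup itself failed. */
int trace_exec_and_kill(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced, then exec. The exec stops with SIGTRAP. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execl(path, path, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: wait for the exec-time stop. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSTOPPED(status))
        return -1; /* child exited before it could be traced */
    if (WSTOPSIG(status) != SIGTRAP) {
        /* Unexpected stop: clean up and report a setup failure. */
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    /* Ask the kernel to kill the stopped child. */
    if (ptrace(PTRACE_KILL, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    /* Collect the child's final status. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0; /* exited normally, so the program was allowed to run */
}
```

Calling trace_exec_and_kill("./simplestat_g.out") from a small main() and printing the result should distinguish the two cases: a return value of SIGKILL means the child was killed as intended, while 0 means it ran to completion.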

We suspect that this might be a race condition in the kernel, possibly
a race condition between setting a SIGKILL signal against the child
process and letting it run so it gets killed.
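
If the suspicion above is right, one way for a debugger to sidestep PTRACE_KILL entirely would be to send SIGKILL with kill(2) and then reap the child. The sketch below assumes that approach; the helper name kill_and_reap is hypothetical, and we have not verified that this avoids the problem on the failing kernels.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send SIGKILL directly instead of using PTRACE_KILL.
 * Returns the terminating signal, 0 if the child exited normally before
 * the signal took effect, or -1 on error. */
int kill_and_reap(pid_t pid)
{
    int status;

    if (kill(pid, SIGKILL) == -1)
        return -1;

    for (;;) {
        if (waitpid(pid, &status, 0) != pid)
            return -1;
        if (WIFSIGNALED(status))
            return WTERMSIG(status);
        if (WIFEXITED(status))
            return 0;
        /* A stop was reported (possible while tracing); keep waiting. */
    }
}
```

In the reproducer's flow, the parent would call this after wait() reports the SIGTRAP stop, in place of the PTRACE_KILL request.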

This kernel problem prevents the TotalView Debugger from debugging any
programs compiled with '-static' on these platforms. We consider this
a critical kernel bug and hope that it will be fixed with very high
priority.

For more details, please see the reproducer codes, particularly



# User's prog, w/ (or w/o) -static, e.g. here statically linked
/home/compilers/gnu/gcc/4.3.0/x86_64-linux/bin/gcc -g -static -o simplestat_g.out simple.c -lm

# Mini Debugger prog, executing simplestat_g and trying to PTRACE_KILL it
/home/compilers/gnu/gcc/4.3.0/x86_64-linux/bin/gcc -o a.out test_TV.c



Sample output:

FAILING execution, RH 5u1, x86-64:

rhel51-x8664:/home/seppo/Bugs/Bug_11153 > ./a.out
CHILD: PTRACE_TRACEME at 0 :: return code 0 
PARENT: WAIT status 1407 from PID 28777 
PARENT: status -> CHILD stopped, by signal  5 
PARENT: Sent PTRACE_KILL to 28777 :: return code 0 

rhel51-x8664:/home/seppo/Bugs/Bug_11153 > counter 0
counter 1
counter 2
counter 3
counter 4
counter 5
counter 6
counter 7
counter 8
counter 9

rhel51-x8664:/home/seppo/Bugs/Bug_11153 > 

rhel51-x8664:/home/seppo/Bugs/Bug_11153 > uname -a
Linux rhel51-x8664.totalviewtech.com 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19
EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
rhel51-x8664:/home/seppo/Bugs/Bug_11153 > 

SUCCESSFUL execution, SUSE 10 SP1, x86-64:

gari:/home/seppo/Bugs/Bug_11153 > ./a.out
CHILD: PTRACE_TRACEME at 0 :: return code 0 
PARENT: WAIT status 1407 from PID 29369 
PARENT: status -> CHILD stopped, by signal  5 
PARENT: Sent PTRACE_KILL to 29369 :: return code 0 

gari:/home/seppo/Bugs/Bug_11153 > uname -a
Linux gari #1 SMP Wed May 3 04:53:23 UTC 2006 x86_64 x86_64
x86_64 GNU/Linux
gari:/home/seppo/Bugs/Bug_11153 > 
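
For reference, the "WAIT status 1407" printed in both runs is the raw wait() status. It can be decoded with the standard macros from <sys/wait.h>: 1407 is 0x057f, where the low byte 0x7f marks a stopped child and the next byte, 5, is the stopping signal (SIGTRAP on Linux). A small sketch, with a hypothetical helper name:

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical helper: returns the stopping signal encoded in a wait()
 * status, or -1 if the status does not describe a stopped child. */
int stop_signal_of(int status)
{
    if (!WIFSTOPPED(status))
        return -1;
    return WSTOPSIG(status);
}
```

Passing 1407 to this helper yields 5, which matches the "CHILD stopped, by signal 5" line in the output above.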
Comment 1 John Poelstra 2008-09-23 18:58:56 EDT
Is there a compelling reason why the key comment to this bug is private?

Does this bug apply to Fedora rawhide?
Comment 2 Anton Arapov 2008-09-24 04:24:09 EDT
shouldn't be private... overlooked this.

yes, it applies to rawhide since it has Roland's utrace patch.
Comment 3 Roland McGrath 2008-10-14 15:04:14 EDT
I don't think the current rawhide kernel has this problem.  Can you verify?
Comment 4 Anton Arapov 2008-10-16 08:29:57 EDT
has been fixed in 2.6.27-13.fc10.x86_64
now, works just fine. :)
