Bug 246330

Summary: ptrace loses track of a forked child
Product: [Fedora] Fedora Reporter: John D. Ramsdell <ramsdell>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: low    
Version: 7 CC: roland
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.22.1-27.fc7 Doc Type: Bug Fix
Last Closed: 2007-07-23 19:03:19 UTC Type: ---
Attachments:
Description Flags
Shows ptrace losing track of a forked process none

Description John D. Ramsdell 2007-06-30 13:41:55 UTC
Description of problem:

I wrote a program that uses ptrace to obtain the pid of each forked
child before the child starts running.  It is used to perform system
call tracing while following forks using the audit system similar to
"strace -f".  The program was working correctly in October of 2006 on
an up-to-date Fedora Core 6 system.
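
For readers without the attachment, here is a minimal sketch of the
technique described above. It is not the attached ptracefork.c: the
helper name count_traced_forks and the use of raise(SIGSTOP) to
synchronize with the tracer are assumptions made for illustration. It
relies on PTRACE_O_TRACEFORK to attach new children automatically and
on PTRACE_GETEVENTMSG to read the new child's pid.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: trace a child that forks once and return the
 * number of fork events seen by the tracer, or -1 on error. */
int count_traced_forks(void)
{
    int status, forks = 0;
    pid_t pid, child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        /* Tracee: ask to be traced, then stop so that the tracer can
         * set its options before the first fork happens. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);
        pid_t grandchild = fork();
        if (grandchild == 0)
            _exit(0);
        waitpid(grandchild, NULL, 0);
        _exit(0);
    }

    /* Tracer: wait for the initial stop, then enable fork following. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACEFORK) < 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }
    ptrace(PTRACE_CONT, child, NULL, NULL);

    /* Loop until no children or tracees remain (waitpid fails). */
    while ((pid = waitpid(-1, &status, 0)) > 0) {
        if (WIFEXITED(status) || WIFSIGNALED(status))
            continue;                   /* this process is gone */
        int sig = WSTOPSIG(status);
        if (sig == SIGTRAP) {
            if ((status >> 16) == PTRACE_EVENT_FORK) {
                unsigned long newpid = 0;
                /* The new pid is available before the child runs. */
                ptrace(PTRACE_GETEVENTMSG, pid, NULL, &newpid);
                printf("Process %d forked %lu\n", (int)pid, newpid);
                forks++;
            }
            sig = 0;                    /* never deliver the trap */
        } else if (sig == SIGSTOP) {
            sig = 0;                    /* initial stop of a new tracee */
        }
        /* Resume, passing through any other signal (e.g. SIGCHLD). */
        ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
    }
    return forks;
}
```

Calling count_traced_forks() from main should report a single fork
event on a kernel that handles PTRACE_O_TRACEFORK correctly. The loop
handles stops from any pid in any order, because the new child's
SIGSTOP can be reported before or after the parent's fork event.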

On Fedora 7, children that never fork die unexpectedly.  Each such
child stops twice with SIGSTOP, and by the time the tracer tries to
continue it the second time it has disappeared: ptrace(PTRACE_CONT, ...)
fails with "No such process".

Version-Release number of selected component (if applicable):

Linux goo 2.6.21-1.3228.fc7 #1 SMP Tue Jun 12 15:37:31 EDT 2007 i686 i686 i386
GNU/Linux

How reproducible:

Use enclosed test program

Steps to Reproduce:
1. make ptracefork
2. ./ptracefork
Actual results:

[ramsdell@goo ~]$ uname -a
Linux goo 2.6.21-1.3228.fc7 #1 SMP Tue Jun 12 15:37:31 EDT 2007 i686 i686 i386
GNU/Linux
[ramsdell@goo ~]$ ./ptracefork
Child is 29160

Wait status for 29160 is 1407 (0x57f)
Process 29160 stopped with signal = 5
Set options on 29160 due to SIGTRAP without a child

Wait status for 29160 is 66943 (0x1057f)
Process 29160 stopped with signal = 5
Process 29160 forked 29161

Wait status for 29161 is 4991 (0x137f)
Process 29161 stopped with signal = 19

Wait status for 29160 is 66943 (0x1057f)
Process 29160 stopped with signal = 5
Process 29160 forked 29162

Wait status for 29162 is 4991 (0x137f)
Process 29162 stopped with signal = 19

Wait status for 29161 is 4991 (0x137f)
Process 29161 stopped with signal = 19
ptrace(PTRACE_CONT, ...): No such process
[ramsdell@goo ~]$ 

Expected results:

bash-3.2$ make ptracefork
cc     ptracefork.c   -o ptracefork
bash-3.2$ uname -a
Linux oolong 2.4.21-47.0.1.ELsmp #1 SMP Fri Oct 13 17:56:20 EDT 2006 i686 i686
i386 GNU/Linux
bash-3.2$ ./ptracefork
Child is 20502

Wait status for 20502 is 1407 (0x57f)
Process 20502 stopped with signal = 5
Set options on 20502 due to SIGTRAP without a child

Wait status for 20502 is 4991 (0x137f)
Process 20502 stopped with signal = 19

Wait status for 20502 is 66943 (0x1057f)
Process 20502 stopped with signal = 5
Process 20502 forked 20503

Wait status for 20503 is 4991 (0x137f)
Process 20503 stopped with signal = 19

Wait status for 20502 is 66943 (0x1057f)
Process 20502 stopped with signal = 5
Process 20502 forked 20504

Wait status for 20503 is 4991 (0x137f)
Process 20503 stopped with signal = 19

Wait status for 20503 is 0 (0x0)

Wait status for 20504 is 4991 (0x137f)
Process 20504 stopped with signal = 19

Wait status for 20504 is 4991 (0x137f)
Process 20504 stopped with signal = 19

Wait status for 20504 is 1407 (0x57f)
Process 20504 stopped with signal = 5
Process 20504 forked 20503

Wait status for 20504 is 4991 (0x137f)
Process 20504 stopped with signal = 19
      1       1       4

Wait status for 20504 is 0 (0x0)

Wait status for 20502 is 4479 (0x117f)
Process 20502 stopped with signal = 17

Wait status for 20502 is 0 (0x0)
bash-3.2$ 

Additional info:
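
The wait statuses in the traces above can be decoded by hand. As a
sketch of how, assuming the Linux/glibc status layout (low byte 0x7f
for a stop, stop signal in bits 8-15, ptrace event code in bits 16 and
up), the following helpers classify a status. The names stop_event and
describe_status are made up for this note and do not come from the
test program.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Ptrace event code carried above bit 15 of a stop status, or 0 if
 * the status is not a stop or carries no event. */
int stop_event(int status)
{
    return WIFSTOPPED(status) ? (status >> 16) & 0xffff : 0;
}

/* Write a one-line description of a wait status into buf. */
const char *describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else if (WIFSTOPPED(status) && stop_event(status) == PTRACE_EVENT_FORK)
        snprintf(buf, len, "fork event, stop signal %d", WSTOPSIG(status));
    else if (WIFSTOPPED(status))
        snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
    else
        snprintf(buf, len, "unrecognized status 0x%x", status);
    return buf;
}
```

With these, 66943 (0x1057f) is a SIGTRAP stop carrying
PTRACE_EVENT_FORK, 4991 (0x137f) is a stop with signal 19 (SIGSTOP on
i386), 4479 (0x117f) is a stop with signal 17 (SIGCHLD on i386), and 0
is a normal exit with code 0.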

Comment 1 John D. Ramsdell 2007-06-30 13:41:56 UTC
Created attachment 158292 [details]
Shows ptrace losing track of a forked process

Comment 2 Chuck Ebbert 2007-07-02 22:27:54 UTC
Kernel 3241 has new ptrace code -- we should have a copy available for testing
tomorrow.


Comment 3 John D. Ramsdell 2007-07-03 11:30:17 UTC
I just finished setting up a rawhide machine.

[ramsdell@drawlight audit]$ uname -a
Linux drawlight 2.6.21-1.3243.fc8 #1 SMP Sat Jun 30 18:36:29 EDT 2007 i686 i686
i386 GNU/Linux
[ramsdell@drawlight audit]$ 

Testing shows that the bug disappeared with this version of the kernel.

Please add my test program to that which is used to perform regression testing
for the kernel.


Comment 4 Chuck Ebbert 2007-07-03 17:08:49 UTC
(In reply to comment #3)
> I just finished setting up a rawhide machine.
> 
> [ramsdell@drawlight audit]$ uname -a
> Linux drawlight 2.6.21-1.3243.fc8 #1 SMP Sat Jun 30 18:36:29 EDT 2007 i686 i686
> Testing shows that the bug disappeared with this version of the kernel.
> 
> Please add my test program to that which is used to perform regression testing
> for the kernel.

If the bug is fixed in rawhide then it's probably also fixed in the 1.3241.fc7
kernel, since they have the same utrace patches now.



Comment 5 Chuck Ebbert 2007-07-06 20:35:59 UTC
Patches went into kernel 2.6.21-1.3255.fc7

Comment 6 John D. Ramsdell 2007-07-23 18:58:15 UTC
Yum update on my Fedora 7 machine provided me with kernel 2.6.22.1-27.fc7.  This
kernel does not exhibit the bug, so the utrace patches must have fixed my
problem too.

Thanks all.