Bug 142899

Summary: waitpid stops seeing child process once gdb attaches to it
Product: Red Hat Enterprise Linux 2.1 Reporter: Mikhail Kruk <mkruk>
Component: kernel Assignee: Jim Paradis <jparadis>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1 CC: jakub, mwesley, peterm, riel
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-12-05 22:53:03 UTC Type: ---

Description Mikhail Kruk 2004-12-14 22:38:50 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.3)
Gecko/20040910

Description of problem:
I have a simple program (included under "Additional info") which forks
a child. The child sleeps and the parent calls waitpid(childpid,
&status, WNOHANG) in a loop. While the child sleeps, waitpid returns 0
like it should. Then I attach to the child with a debugger. When I do
that, the parent's behavior changes: waitpid starts returning -1 and
errno is set to 10 (ECHILD, no child processes). Once I resume the
child in the debugger, waitpid goes back to normal behavior.
This is bad because there is no way to tell with waitpid whether the
child process is being debugged or is gone.
RH ES 3 does not have this problem.
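
One possible way to tell the two cases apart when waitpid fails with ECHILD is to probe the pid with kill(pid, 0), which performs only the existence and permission checks and sends no signal. The sketch below is an assumption-laden illustration, not something verified on the kernel in this report; process_exists is a hypothetical helper name. Note that a zombie still counts as existing, and that a pid can in principle be reused by an unrelated process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a process with this pid currently
 * exists, 0 if it does not, and -1 on an unexpected error. */
int process_exists(pid_t pid)
{
	if (kill(pid, 0) == 0)
		return 1;	/* exists and we are allowed to signal it */
	if (errno == EPERM)
		return 1;	/* exists, but we lack permission to signal it */
	if (errno == ESRCH)
		return 0;	/* no such process */
	return -1;		/* anything else, e.g. an invalid argument */
}
```

In the polling loop, a parent could call process_exists(childpid) whenever waitpid returns -1 with errno set to ECHILD: a result of 1 suggests the child is still around (for example, stopped under a debugger), while 0 suggests it is really gone.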

I also tried changing WNOHANG to WNOHANG | WUNTRACED 
and looking at WIFSTOPPED() but that didn't change anything and
WIFSTOPPED is always 0.

Version-Release number of selected component (if applicable):
glibc-2.2.4-32.18, kernel-2.4.9-e.57

How reproducible:
Always

Steps to Reproduce:
1. Run the sample program
2. Attach to the child process with gdb
3. Profit!
    

Actual Results:
waited: pid 0 status: 134514001 errno: 0 0
...
waited: pid -1 status: 134514001 errno: 10 0
...
waited: pid 1736 status: 0 errno: 0 0

Expected Results:
waited: pid 0 status: 134514001 errno: 0 0
...
waited: pid 1736 status: 0 errno: 0 0

Additional info:

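#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <errno.h>

int main(void)
{
	pid_t pid, wpid;
	int status;

	pid = fork();
	if (pid < 0)
	{
		perror("fork");
		return 1;
	}
	if (pid != 0)
	{
		/* Parent: poll the child once a second without blocking. */
		printf("parent of child: %d\n", (int)pid);
		while (1)
		{
			errno = 0;
			wpid = waitpid(pid, &status, WNOHANG | WUNTRACED);
			/* status is written only when waitpid returns a pid;
			 * otherwise the value printed is whatever was there. */
			printf("waited: pid: %d status: %d errno: %d %d\n",
			       (int)wpid, status, errno, WIFSTOPPED(status));
			/* Stop polling once the child has really terminated. */
			if (wpid == pid && (WIFEXITED(status) || WIFSIGNALED(status)))
				break;
			sleep(1);
		}
	}
	else
	{
		/* Child: sleep for 15 seconds, leaving time to attach gdb. */
		int i;
		for (i = 0; i < 15; i++)
		{
			sleep(1);
		}
		printf("child done\n");
		exit(0);
	}
	return 0;
}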

Comment 1 Jakub Jelinek 2004-12-15 09:22:15 UTC
glibc's waitpid just results in a syscall with no argument changes whatsoever.

Comment 3 Jakub Jelinek 2004-12-20 14:47:19 UTC
What I meant is that the waitpid issues are not a bug in glibc, but in
the kernel (if they are a bug at all), because all glibc does when the
user calls waitpid is issue a syscall.

Comment 5 Jim Paradis 2005-12-05 22:53:03 UTC
This issue is outside the scope of the current support status for RHEL2.1.  No
fix is planned.