Bug 137154

Summary: "waitid(POSIX Interface)" cannot run properly.
Product: Red Hat Enterprise Linux 4
Reporter: L3support <linux-sid>
Component: kernel
Assignee: Jason Baron <jbaron>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.0
CC: davej, drepper, halligan, jakub, jturner, knoel, nagahama, pleiades-si, roland, tao, tburke
Hardware: ia64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-06-08 15:12:49 UTC

Description L3support 2004-10-26 10:58:56 UTC
Description of problem:
"waitid" does not work properly when
we run our test program, which calls "waitid", on RHEL4.
(Please refer to "Steps to Reproduce" below for the test program.)

The following error messages are shown when we run the test program on RHEL4.

opt = 2, errno = 95
waitid 1 error: Operation not supported

We investigated the error message. The results are as follows.
 -- "errno 95" is "EOPNOTSUPP", and "opt 2" is "WUNTRACED".
 -- The manual does not describe the error code "EOPNOTSUPP" for "waitid".

Moreover, even when we specified "WSTOPPED" instead of
"WUNTRACED" as the 4th argument of "waitid",
the result was the same abnormal end.

We think that there is no error in the test program,
because "waitid" runs properly when we run the test program on RHEL3.

Is "waitid" not compatible between RHEL3 and RHEL4?

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce:
1. # gcc ./waitid_samp.c -o waitid_samp.o
2. # ./waitid_samp.o

-----
// test program: waitid_samp.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <errno.h>

int main(int argc, char **argv)
{
	pid_t pid;
	siginfo_t infop;

	memset(&infop, 0x00, sizeof(infop));

	if( ( pid = fork()) == 0 ){
		sleep(120);
		printf(" child now exit\n");
		exit(0);
	}
	else {
		if( pid < 0 ){
			perror("fork error");
			exit(1);
		}
	}

	if (kill(pid, SIGSTOP) < 0) {
		perror("kill 1 error");
		exit(2);
	}

	//if (waitid(P_ALL, 0, &infop, WSTOPPED) < 0) {
	if (waitid(P_ALL, 0, &infop, WUNTRACED) < 0) {
		//printf(" opt = %d, errno = %d\n",WSTOPPED,errno);
		printf(" opt = %d, errno = %d\n",WUNTRACED,errno);
		perror("waitid 1 error");
		exit(3);
	}

	if (kill(pid, SIGCONT) < 0) {
		perror("kill 2 error");
		exit(4);
	}

	//if (waitid(P_ALL, 0, &infop, WEXITED) < 0) {
	if (waitid(P_ALL, 0, &infop, WUNTRACED) < 0) {
		//printf(" opt = %d, errno = %d\n",WEXITED,errno);
		printf(" opt = %d, errno = %d\n",WUNTRACED,errno);
		perror("waitid 2 error");
		exit(5);
	}

	printf(" parent now exit\n");
	exit(0);
}
-----
 
Actual results:
opt = 2, errno = 95
waitid 1 error: Operation not supported

Expected results:
"waitid" runs properly on RHEL4.

Additional info:

Comment 1 L3support 2004-11-19 07:31:45 UTC
Could you tell us the status of this problem?

Comment 2 L3support 2004-12-03 04:09:09 UTC
We'd like to know the status of this problem.

Comment 3 Ulrich Drepper 2004-12-10 04:53:57 UTC
WUNTRACED is not a flag that can be used with waitid().  In fact, it has
the same value as WSTOPPED.  Therefore, in the example code, the second
waitid() call just waits again for stopped processes.  There are none,
so it waits until the child process is gone.  After that it might fail.
This is what I see with a recent kernel.

Why you see EOPNOTSUPP I can only guess.  You probably have a kernel
without the necessary system call.  Try again with a recent kernel.

Comment 4 Roland McGrath 2004-12-13 20:29:07 UTC
I think the problem here turned out to be that RHEL4 glibc got built on some
architectures without the syscall number available, including ia64.
So what the test sees is the waitid emulation code using waitpid in libc.
That returns ENOTSUP (aka EOPNOTSUPP) when called without WEXITED in the flag
bits.  On some architectures, I think the kernel may still be missing the
syscall itself.

Comment 5 L3support 2005-01-13 10:10:50 UTC
Thank you for your comment.

We tried running the test with a recent kernel, but
the result was the same abnormal end.
---------------------------------
opt = 2, errno = 95
waitid 1 error: Operation not supported
---------------------------------

We would like to know
why the results on RHEL3 and RHEL4 are different.

Is the cause a bug or a feature?
---------------------------------
RHEL3: normal end
RHEL4: abnormal end (refer to above)
---------------------------------

Comment 6 Ulrich Drepper 2005-01-13 10:24:55 UTC
There is a difference since RHEL3 has only an incomplete emulation of
waitid() while the RHEL4 kernel has a real implementation.  As said in
comment #4, the glibc version seems to have been built with incomplete
headers, so that the syscall isn't recognized.

Still, the main problem is that incorrect flags are used.  There is no
guarantee whatsoever that RHEL3 and 4 are bug-compatible, i.e., broken
software is not guaranteed to work on RHEL4 even though it might have
worked on RHEL3.

This bug should be closed as NOTABUG.

Comment 7 Jakub Jelinek 2005-01-13 10:47:21 UTC
glibc-2.3.3-90 and above use the waitid system call on {i386,x86_64,ia64} if
available in the kernel (s390* and ppc* don't have a syscall number for
waitid syscall assigned yet even in the current upstream sources).
Were you testing with glibc-2.3.3-90 or later and a recent RHEL4 kernel?

Comment 8 RHEL4-L3support 2005-01-19 01:36:41 UTC
Adding the RHEL4-L3support team in Japan.

Comment 9 JoAnne K. Halligan 2005-01-23 22:48:38 UTC
This bug is awaiting an update and additional information from Fujitsu. 


Comment 10 Jay Turner 2005-01-24 15:07:24 UTC
The same behavior occurs on IA64 with:

glibc-2.3.4-2.ia64
kernel-2.6.9-5.EL.ia64

But the point still remains that WUNTRACED isn't a valid flag for waitid().

Comment 15 Jakub Jelinek 2005-01-24 21:50:22 UTC
Unfortunately, it seems even 2.6.9-5.EL is missing
http://linux.bkbits.net:8080/linux-2.5/gnupatch@418bd5c9Extop70uFoZXcWLWN_Cz1g
(as well as the addition of that syscall for ppc*/s390*, but for those it
first needs an upstream change).

Comment 22 Tim Powers 2005-06-08 15:12:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-420.html