Red Hat Bugzilla – Bug 127849
application breaks under RHEL3, possibly because of SIGCHLD workaround
Last modified: 2007-11-30 17:07:02 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Description of problem:
Application 'nxserver' (compiled for RH9) is exiting unexpectedly
immediately after the following error message from the kernel is logged:
kernel: application bug: nxserver(4788) has SIGCHLD set to SIG_IGN but calls wait().
Jul 14 11:12:17 is-fletch kernel: (see the NOTES section of 'man 2 wait'). Workaround activated
I understand that you are not supporting NX, but something you have
changed in the kernel may well be breaking a properly working
application. Could it be something that was backported into the
Sometimes the parent process that is involved in this call considers
the child to have closed normally and reports the child process
terminating normally. Sometimes the parent process loses track of
the child process completely -- whether that is due to the particular
workaround chosen or whether the child process actually aborts is
anyone's guess. Possibly it's related to the child process
exiting before the call to waitpid() is even initiated by the parent.
Maybe some new behavior exhibited in the signal handling prevents a
zombie from being created.
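For reference, here is a minimal sketch of the "child exits before waitpid()" case on a current Linux system with SIGCHLD at its default disposition. It is not taken from nxserver; the helper name late_wait() is hypothetical, and waitid() with WNOWAIT is assumed to be available (it is not on the 2.4-based RHEL3 kernel). The point is only that, with the default disposition, an early exit is not lost: the child stays a zombie until the parent collects it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with `code`,
 * make sure it has already terminated, and only then call waitpid().
 * Returns the exit code collected by the late waitpid(), -errno on a
 * system call failure, or 256 if the child did not exit normally. */
int late_wait(int code)
{
    /* Force the default disposition so that zombies are retained. */
    if (signal(SIGCHLD, SIG_DFL) == SIG_ERR)
        return -errno;

    pid_t pid = fork();
    if (pid == -1)
        return -errno;
    if (pid == 0)
        _exit(code);            /* child: exit right away */

    /* Block until the child has exited, but leave it unreaped (WNOWAIT). */
    siginfo_t info;
    while (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1) {
        if (errno != EINTR)
            return -errno;
    }

    /* The child is long gone, yet its status is still there to collect. */
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -errno;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : 256;
}
```

If a parent written like this still loses its child, the early exit alone does not explain it; the SIGCHLD disposition is the more likely suspect.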
(I've read the notes in the man page as well as whatever I could find
about this issue online.)
The vendor of 'nx' (nomachine) is participating in looking for the
source of this bug.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL3
2. Download the eval 'nxserver' and nxclient from www.nomachine.com
3. Install them and establish a connection from a remote machine to the
server you just configured.
4. Observe /var/log/messages and notice that correctly authenticated
sessions never start up an X environment. They either shut down
immediately or hang for around 60 seconds and then shut down.
Actual Results: See above.
Expected Results: An X Windows login session should have been
established and the selected X environment should have started up
(either Gnome, KDE or other)
I have seen references to this kernel message elsewhere, but very
little specific information on the workaround.
Hello, Jim. When the 'nxserver' application is compiled on RHEL3,
does the problem still occur? (I'm not sure whether we guarantee
application-binary-compatibility with RHL 9, but it would be nice
to remove this variable from the equation.)
The SIGCHLD issue is that it's not valid to call wait() (and by
extension library functions that call wait()) when you've set SIGCHLD
to SIG_IGN. That will cause deadlocks in case the child gets reaped by
init before the wait() executes. Older kernels sort of kinda tolerated
this; NPTL does not, but tries to work around it somewhat in our kernels.
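To make the difference concrete, here is a small sketch (not from nxserver or the kernel sources; the helper name run_child() is hypothetical) that forks a child and waits for it, with SIGCHLD either ignored or left at its default. It assumes a current Linux kernel, where the NOTES section of wait(2) says that ignored children are discarded without becoming zombies and the wait call fails with ECHILD. The RHEL3 workaround discussed here may behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with `code`,
 * then wait for it. If ignore_sigchld is nonzero, SIGCHLD is set to
 * SIG_IGN first; otherwise it is set to SIG_DFL.
 * Returns the child's exit code, -errno if a system call fails, or
 * 256 if the child did not exit normally. */
int run_child(int ignore_sigchld, int code)
{
    if (signal(SIGCHLD, ignore_sigchld ? SIG_IGN : SIG_DFL) == SIG_ERR)
        return -errno;

    pid_t pid = fork();
    if (pid == -1)
        return -errno;
    if (pid == 0)
        _exit(code);            /* child: exit right away */

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -errno;      /* ECHILD when SIGCHLD is ignored */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : 256;
}
```

With the default disposition the parent gets the exit code back. With SIG_IGN the status is thrown away and the parent only sees the ECHILD failure, so an application that needs the exit status should leave SIGCHLD at SIG_DFL (or install a handler) rather than ignore it.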