Bug 114447 - exit system call does not exit though process in status end and parent waits
Product: Red Hat Linux
Classification: Retired
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2004-01-28 06:10 EST by Albert Fluegel
Modified: 2007-04-18 13:02 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-09-30 11:41:49 EDT
Description Albert Fluegel 2004-01-28 06:10:31 EST
Description of problem:
Sometimes a process is in the exit() system call and is in
status end (according to ps -o ...,wchan,...), and the parent
has called wait4..., but its child does not end.
This is a very annoying problem. How often it happens seems to
depend on the kernel version. With 2.4.18-27 and earlier it
happened occasionally, say for every 1000th program started on
200 machines (i.e. statistically every 200000th process). With
2.4.20-8 it happened for roughly every 100th process on one
machine, i.e. every 500th process. With 2.4.20-20 we didn't see
it for quite some time. With 2.4.20-28 the problem is back at
about the same rate as with 2.4.18-27. The machines are all
dual-processor systems. The likelihood of the problem increases
strongly and nonlinearly with processor speed: on 2.8 GHz Xeons
or faster it happens MUCH more often than on slower machines.
Further observations, whose significance is unclear:
It seems to happen only when the processes are started as
sub-processes of rshd (with some shells in between). The
problem often occurred when installing RPMs with the rpm
command. I've seen very interesting behaviour here: whether
the problem showed up depended on the current working directory
in which the rpm program was started. Invoked this way, the exit
hung nearly every time:
(pwd is e.g. /)
rpm -i <options> /path/to/some/NFS/directory/kernel-some-version.rpm
but this way it worked:
cd /path/to/some/NFS/directory && rpm -i <options> kernel-some-version.rpm

Version-Release number of selected component (if applicable):
see above

How reproducible:
Spread lots of jobs across a lot of machines using rsh.

Steps to Reproduce:
1. See above. Sorry that I don't provide the rsh-based scripts and so on here.
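
Since the rsh-based scripts are not included, the following is only a rough local stand-in for the load pattern, not the reporter's setup: it forks many short-lived children on one machine and counts how many the parent fails to reap within a deadline. The function name count_unreaped and its parameters are hypothetical, rshd and the intermediate shells are not involved, and it is not known whether this would trigger the hang.

```python
import os
import time


def count_unreaped(n_children: int = 50, timeout_s: float = 5.0) -> int:
    """Fork n_children that exit at once; return how many are not reaped in time."""
    pending = set()
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: call exit immediately
        pending.add(pid)

    deadline = time.monotonic() + timeout_s
    while pending and time.monotonic() < deadline:
        for pid in list(pending):
            # With WNOHANG, wait4() reports pid 0 if the child has not ended yet.
            reaped_pid, _status, _rusage = os.wait4(pid, os.WNOHANG)
            if reaped_pid == pid:
                pending.discard(pid)
        time.sleep(0.01)
    # Anything still pending is a child that never finished exiting in time.
    return len(pending)
```

On a healthy kernel count_unreaped() should return 0; a non-zero result would correspond to children that stay stuck while exiting.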
Actual results:
With a certain probability, exit() does not complete and the process never ends.

Expected results:
exit() completes, and the parent gets the exit data through some wait call.
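
As a minimal sketch of the expected behaviour, assuming a plain fork without rshd or shells in between: the child calls exit, and the parent's wait4() returns with the child's exit status. The function name spawn_and_reap is hypothetical.

```python
import os


def spawn_and_reap(exit_code: int = 7) -> int:
    """Fork a child that exits immediately; return the exit code the parent reaps."""
    pid = os.fork()
    if pid == 0:
        # Child: terminate at once through the exit system call.
        os._exit(exit_code)
    # Parent: block in wait4() until the kernel reports the child's termination.
    reaped_pid, status, _rusage = os.wait4(pid, 0)
    assert reaped_pid == pid
    # Return -1 if the child did not terminate through a normal exit.
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

In the failing case described above, the wait4() call in the parent would simply never return.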

Additional info:
see above.
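
To collect the state mentioned in the description (status and wchan as shown by ps) for a suspect process, a small helper along these lines might be used. This is a sketch that assumes a procps-style ps accepting -o and -p; the function name ps_state is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def ps_state(pid: int) -> Optional[str]:
    """Return 'PID STAT WCHAN' for pid as reported by ps, or None if unavailable."""
    if shutil.which("ps") is None:
        return None
    # Separate -o options avoid ambiguity in how "name=" headers are parsed.
    result = subprocess.run(
        ["ps", "-o", "pid=", "-o", "stat=", "-o", "wchan=", "-p", str(pid)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    for line in result.stdout.splitlines():
        fields = line.split()
        if fields and fields[0] == str(pid):
            return " ".join(fields)
    return None
```

Calling ps_state() repeatedly on the hanging child and on its waiting parent would show whether the reported values change over time.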
Comment 1 Bugzilla owner 2004-09-30 11:41:49 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
