Bug 60921

Summary: ps -e sometimes does not list a process that is alive
Product: [Retired] Red Hat Linux
Reporter: Wan-Teh Chang <wtc>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Aaron Brown <abrown>
Severity: low
Priority: medium    
Version: 6.2   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-09-30 15:39:25 UTC

Description Wan-Teh Chang 2002-03-09 01:51:31 UTC
Description of problem:
We are using the command
  ps -e | grep $PID >/dev/null
to determine if the process $PID is still alive.

We found that sometimes this command exits with a nonzero
status (which means $PID is not found in the output of
ps -e) even though the process $PID is alive.
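
A side note on the check itself: a bare grep $PID also matches any line
that merely contains those digits (PID 123 matches 1234), which can only
produce false positives, not the false negative reported here. A minimal
sketch of an exact match on the first column follows; pid_listed is a
hypothetical helper name, not something from our scripts.

```shell
#!/bin/sh
# pid_listed (hypothetical helper): reads `ps -e` style output on stdin
# and succeeds only if the first column of some line equals the given pid.
pid_listed() {
    awk -v pid="$1" '$1 == pid { found = 1 } END { exit !found }'
}

# Usage: check that the current shell appears in the listing.
ps -e | pid_listed "$$" || echo "process not detectable"
```

This does not change what ps -e reports; it only removes substring
matches from the test.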

How reproducible:
Sometimes

Steps to Reproduce:
1. Start a process.
2. The primordial thread of the process writes the pid to a file.
3. Read the file to get the pid.
4. Execute the command
     ps -e | grep $PID >/dev/null || echo "process not detectable"
5. Execute the command
     kill $PID
   to terminate the process (even if step 4 reported it as not
   detectable).
6. The process has a signal handler for SIGTERM that writes
   "SIGTERM received\n" to fd 1.

Sometimes, we see both the "process not detectable" and
"SIGTERM received" messages, indicating that ps -e fails
to list a process that is still alive.
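
The steps above can be sketched as a single script. This is only an
approximation under stated assumptions: the monitored process is replaced
by a single-threaded shell child (the real one is multithreaded), and the
temporary file names are hypothetical.

```shell
#!/bin/sh
# Stand-in for the monitored process (assumption: a plain shell child,
# not the real multithreaded program from the report).
pidfile=$(mktemp)
outfile=$(mktemp)

sh -c '
    # Install the SIGTERM handler before publishing the pid.
    trap "echo SIGTERM received; exit 0" TERM
    echo $$ > "$1"
    while :; do sleep 1; done
' child "$pidfile" > "$outfile" &

# Step 3: wait for the pid file to be written, then read it.
while [ ! -s "$pidfile" ]; do sleep 1; done
PID=$(cat "$pidfile")

# Step 4: the check from the report.
ps -e | grep $PID >/dev/null || echo "process not detectable"

# Step 5: terminate the process regardless of the result above.
kill $PID
wait

# Step 6: show what the child wrote to fd 1.
result=$(cat "$outfile")
echo "$result"
rm -f "$pidfile" "$outfile"
```

Seeing both "process not detectable" and "SIGTERM received" from one run
would correspond to the failure described above. Since the failure is
rare, a single run is unlikely to show it.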

Additional info:

This problem occurs very rarely on Linux.  It never occurs
on other Unix platforms (Solaris, HP-UX, AIX, and OSF1).

The process we are monitoring is a multithreaded process
that creates 8 threads.  I don't know if this piece of
information is relevant.

We are now using the command
  kill -0 $PID >/dev/null 2>/dev/null || echo "process not detectable"
to detect whether the process $PID is still alive.  This
method has been reliable on Linux and other Unix platforms.
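
A minimal sketch of this workaround wrapped in a function; is_alive is a
hypothetical name, and the sleep process only stands in for the real one.
Note that kill -0 also fails when the caller lacks permission to signal
the target, so this assumes the monitor and the process run as the same
user.

```shell
#!/bin/sh
# is_alive (hypothetical helper): kill -0 delivers no signal; it only
# checks that the pid exists and that we are allowed to signal it.
is_alive() {
    kill -0 "$1" >/dev/null 2>/dev/null
}

# Usage: probe a background process before and after terminating it.
sleep 30 &
pid=$!
is_alive "$pid" || echo "process not detectable"

kill "$pid"
wait "$pid" 2>/dev/null || true   # reap the child; its status is nonzero
is_alive "$pid" || echo "process $pid is gone"
```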

Comment 1 Michael K. Johnson 2002-03-27 23:32:55 UTC
procps just shows what the kernel tells it; I suspect that if there is
a bug here, it's in the kernel.
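
Since procps builds its listing from /proc, one way to narrow this down
might be to look for the /proc entry directly at the moment ps misses the
process. A rough sketch, assuming /proc is mounted; proc_entry_exists and
its optional second argument are hypothetical.

```shell
#!/bin/sh
# proc_entry_exists (hypothetical helper): succeeds if a directory named
# after the pid exists under the proc root (default /proc). The optional
# second argument is only there so the helper can be pointed elsewhere.
proc_entry_exists() {
    [ -d "${2:-/proc}/$1" ]
}

# Usage: check the current shell's own entry.
if proc_entry_exists "$$"; then
    echo "/proc/$$ exists"
else
    echo "no /proc entry for $$ (or /proc is not mounted)"
fi
```

If the entry is present while ps -e omits the process, that would point
at how the directory is enumerated rather than at a missing entry.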

Comment 2 Bugzilla owner 2004-09-30 15:39:25 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/