Bug 60921 - ps -e sometimes does not list a process that is alive
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 6.2
Hardware: i686
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Aaron Brown
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-03-09 01:51 UTC by Wan-Teh Chang
Modified: 2008-08-01 16:22 UTC (History)
CC List: 0 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:39:25 UTC
Embargoed:



Description Wan-Teh Chang 2002-03-09 01:51:31 UTC
Description of problem:
We are using the command
  ps -e | grep $PID >/dev/null
to determine if the process $PID is still alive.

We found that sometimes this command exits with a nonzero
status (which means $PID is not found in the output of
ps -e) even though the process $PID is alive.

How reproducible:
Sometimes

Steps to Reproduce:
1. Start a process.
2. The primordial thread of the process writes the pid to a file.
3. Read the file to get the pid.
4. Execute the command
     ps -e | grep $PID >/dev/null || echo "process not detectable"
5. Execute the command
     kill $PID
   to terminate the process (whether or not ps -e listed it).
6. The process has a signal handler for SIGTERM that writes
   "SIGTERM received\n" to fd 1.

Sometimes, we see both the "process not detectable" and
"SIGTERM received" messages, indicating that ps -e fails
to list a process that is still alive.
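
The steps above can be sketched as a single script. This is only an illustration: the child started here is a hypothetical single-threaded stand-in for the real multithreaded process, and the names pidfile, outfile and result are made up for the example. Because the problem is rare, one run of this script is not expected to show it.

```shell
#!/bin/sh
# Sketch of the reproduction steps. The child is a hypothetical stand-in
# for the monitored process, which is not part of this report.

pidfile=$(mktemp)
outfile=$(mktemp)

# Steps 1, 2 and 6: start a process that installs a SIGTERM handler,
# then writes its pid to a file. Its stdout (fd 1) goes to $outfile.
sh -c '
    trap "echo \"SIGTERM received\"; exit 0" TERM
    echo $$ > "$0"
    while :; do sleep 1; done
' "$pidfile" > "$outfile" &

# Step 3: wait until the pid has been written, then read it.
while [ ! -s "$pidfile" ]; do sleep 1; done
PID=$(cat "$pidfile")

# Step 4: the check from the report.
ps -e | grep "$PID" >/dev/null || echo "process not detectable"

# Step 5: terminate the process regardless of what ps reported.
kill "$PID"
wait "$!"

# If the handler ran, the process was alive when the signal was sent.
result=$(cat "$outfile")
echo "$result"
rm -f "$pidfile" "$outfile"
```

Seeing both "process not detectable" and "SIGTERM received" from one run would correspond to the failure described here.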

Additional info:

This problem occurs very rarely on Linux.  It never occurs
on other Unix platforms (Solaris, HP-UX, AIX, and OSF1).

The process we are monitoring is a multithreaded process
that creates 8 threads.  I don't know if this piece of
information is relevant.

We are now using the command
  kill -0 $PID >/dev/null 2>/dev/null || echo "process not detectable"
to detect whether the process $PID is still alive.  This
method has been reliable on Linux and other Unix platforms.
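
A minimal sketch of this check wrapped in a function follows. The function name is_alive and the sleep child are assumptions made for the example. Signal 0 delivers nothing; kill only performs the existence and permission checks. Note that kill -0 also fails when the process exists but belongs to another user, so the sketch assumes the caller is allowed to signal $PID.

```shell
#!/bin/sh
# is_alive PID: succeed if signal 0 can be sent to PID.
# Assumes the caller has permission to signal PID.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Usage example with a throwaway child process.
sleep 30 &
child=$!

is_alive "$child" && echo "process is alive"

kill "$child"
wait "$child" 2>/dev/null   # reap the child so it no longer exists

is_alive "$child" || echo "process not detectable"
```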

Comment 1 Michael K. Johnson 2002-03-27 23:32:55 UTC
procps just shows what the kernel tells it; I suspect that if there is
a bug here, it's in the kernel.

Comment 2 Bugzilla owner 2004-09-30 15:39:25 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


