Description of problem:
We are using the command ps -e | grep $PID >/dev/null to determine whether the process $PID is still alive. Sometimes this command exits with a nonzero status (meaning $PID was not found in the output of ps -e) even though the process $PID is alive.

How reproducible:
Sometimes

Steps to Reproduce:
1. Start a process.
2. The primordial thread of the process writes the pid to a file.
3. Read the file to get the pid.
4. Execute the command ps -e | grep $PID >/dev/null || echo "process not detectable"
5. Execute the command kill $PID to terminate the process (even if it is not alive).
6. The process has a signal handler for SIGTERM that writes "SIGTERM received\n" to fd 1.

Sometimes we see both the "process not detectable" and "SIGTERM received" messages, indicating that ps -e failed to list a process that was still alive.

Additional info:
This problem occurs very rarely on Linux. It never occurs on other Unix platforms (Solaris, HP-UX, AIX, and OSF1). The process we are monitoring is multithreaded and creates 8 threads; I don't know whether that is relevant.

We are now using the command kill -0 $PID >/dev/null 2>/dev/null || echo "process not detectable" to detect whether the process $PID is still alive. This method has been reliable on Linux and the other Unix platforms.
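For reference, here is a minimal sketch of the kill -0 check described above, wrapped in a helper. The function name is_alive and the sleep test process are illustrative assumptions, not part of the original setup. The sketch also assumes the caller is allowed to signal the target: kill -0 fails with a permission error when the process exists but belongs to another user, so a failure here does not always mean the process is gone.

```shell
#!/bin/sh
# is_alive PID (hypothetical helper name): returns 0 if signal 0 can be
# delivered to PID. Signal 0 performs the existence and permission checks
# without actually sending a signal.
is_alive() {
    kill -0 "$1" >/dev/null 2>&1
}

# Usage sketch with a throwaway background process standing in for the
# monitored process.
sleep 30 &
pid=$!

if is_alive "$pid"; then
    echo "process alive"
fi

kill "$pid"
wait "$pid" 2>/dev/null   # reap the child so its pid is released

is_alive "$pid" || echo "process not detectable"
```

Unlike the ps -e | grep pipeline, this looks up the pid directly and does not depend on scanning a listing of all processes.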
procps just shows what the kernel tells it; I suspect that if there is a bug here, it's in the kernel.
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/