Bug 835838 - Process does not exist, but "/bin/kill -0" and "kill -0" return 0
Summary: Process does not exist, but "/bin/kill -0" and "kill -0" return 0
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: util-linux-ng
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Karel Zak
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-06-27 09:13 UTC by David Tonhofer
Modified: 2012-06-30 17:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-28 10:51:49 UTC
Target Upstream Version:
Embargoed:


Attachments
Bash session (7.47 KB, text/plain), 2012-06-27 09:13 UTC, David Tonhofer

Description David Tonhofer 2012-06-27 09:13:14 UTC
Created attachment 594724
Bash session

Description of problem:
=======================

A process with PID 6156 has been forcefully terminated. The process is not in the process list.

However, "kill -0" insists that the process exists and does so as long as a SIGTERM has not been sent.

See bash session attached.

Version-Release number of selected component (if applicable):
=============================================================

procps-3.2.8-23.el6.x86_64

How reproducible:
=================

Not reproduced yet.

Expected results:
=================

No invisible process

Comment 2 Jaromír Cápík 2012-06-28 09:59:33 UTC
Hello David.

The kill command is a bit tricky. It exists in two separate upstream sources (procps and util-linux) and it's currently disabled in the procps build to avoid conflicts. As it's shipped with util-linux only, I'm going to change the component to util-linux.

Regards,
Jaromir.

Comment 3 Karel Zak 2012-06-28 10:51:49 UTC
David,

as you can see from your strace output there is no issue with kill(1).

It seems that the 6156 in your example is thread ID and the TID is
possible to use for kill(2) syscall.

Threads are not included in the "ls /proc" output (they are not returned by
readdir()) to avoid performance problems on systems with a huge number
of threads, but you can address the threads directly (e.g. ls /proc/<TID>).

For example gnome-terminal with four threads:

$ ps -eLf | grep 2554
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CM
kzak      2554  1994  2554  0    4 Jun27 ?        00:00:02 gnome-terminal 
kzak      2554  1994  2557  0    4 Jun27 ?        00:00:00 gnome-terminal 
kzak      2554  1994  2558  0    4 Jun27 ?        00:00:00 gnome-terminal 
kzak      2554  1994  2561  0    4 Jun27 ?        00:00:00 gnome-terminal 

The important column is LWP (the thread ID).

$ ls /proc/2554/task/                                                      
2554  2557  2558  2561

Let's play with thread 2557:

$ cat /proc/2554/task/2557/comm 
gnome-terminal

$ ls /proc | grep 2557   # <<< nothing !

but the task is accessible from the top-level /proc directory if
the full path is specified:

$ cat /proc/2557/comm                                                     
gnome-terminal

... and now kill:

$ strace -e kill kill -0 2557                                               
kill(2557, SIG_0)                       = 0
+++ exited with 0 +++
 
Success. Not a bug from my point of view.
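
The demonstration in this comment can be generalised into a small helper. The following is only a sketch, assuming a Linux /proc filesystem and a POSIX shell; the function name hidden_tids is made up for illustration and is not part of util-linux or procps. It prints the thread IDs of a process that do not show up in a plain listing of /proc but can still be addressed directly as /proc/<TID>.

```shell
# hidden_tids PID (hypothetical helper): print the thread IDs of PID that are
# missing from a plain /proc directory listing but still exist as /proc/<TID>.
hidden_tids() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -d "$task" ] || continue      # glob did not match: no such PID
        tid=${task##*/}                 # keep only the last path component
        # Hidden means: not returned by readdir(), yet reachable by full path.
        if ! ls /proc | grep -qx "$tid" && [ -d "/proc/$tid" ]; then
            echo "$tid"
        fi
    done
}
```

For the gnome-terminal process shown above, "hidden_tids 2554" would be expected to list the three non-leader threads. "kill -0" succeeds for any of those IDs, which is the behaviour reported in this bug.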

Comment 4 David Tonhofer 2012-06-30 17:57:26 UTC
Hi Karel,

> It seems that the 6156 in your example is thread ID and the TID is possible to use for kill(2) syscall.

Forehead slap! I didn't think of this at all. Good grief. Maybe I'm too old school. 

The problem occurred because one of our scripts was of the opinion that the PID an earlier instance had written to a pidfile/lockfile was still valid and it just refused to continue. It seems that the space of PIDs is becoming somewhat constrained now that it is shared with TIDs. Time to move PID/TID to a sparsely populated 128bit space. Maybe.

On the other hand, there is still a problem somewhere.

The strace shows a call to "kill", but:

- The kill(2) manpage doesn't mention threads at all.
- There is a specially designed tgkill(2) to signal threads.
  Its manpage says: "(By contrast, kill(2) can only be used to send a
  signal to a process (i.e., thread group) as a whole, and the signal will be
  delivered to an arbitrary thread within that process.)"

So either the kill(2) and tgkill(2) manpages are wrong or incomplete, or there is an implementation problem with the kill syscall.

Best regards,

-- David
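
For the pidfile scenario described in this comment, one possible workaround is to check that the recorded ID is a thread-group leader instead of relying on "kill -0" alone. This is a minimal sketch, assuming a Linux /proc filesystem where /proc/<ID>/status contains a "Tgid:" line; the function name pid_is_process and the pidfile path in the usage note are hypothetical.

```shell
# pid_is_process PID (hypothetical helper): succeed only if PID is a
# thread-group leader, i.e. a process in its own right and not merely
# a thread of some other process.
pid_is_process() {
    pid=$1
    [ -r "/proc/$pid/status" ] || return 1      # no such PID or TID
    # For a thread, Tgid names the owning process and differs from the TID.
    tgid=$(awk '/^Tgid:/ { print $2 }' "/proc/$pid/status")
    [ "$tgid" = "$pid" ]
}
```

A script could then call something like pid_is_process "$(cat /var/run/example.pid)" before deciding that an earlier instance is still running. This only rules out a collision with a thread ID. It does not detect an unrelated process that has reused the same PID, and the check is still subject to races.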

