Bug 137387 - checkpid() does not check for the death of ALL threads belonging to a process.
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 2
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Bill Nottingham
QA Contact: Brock Organ
Reported: 2004-10-27 19:41 EDT by Ian Macdonald
Modified: 2014-03-16 22:49 EDT
Doc Type: Bug Fix
Last Closed: 2004-10-27 22:56:32 EDT

Attachments
This fixes the issue described in this bug. (267 bytes, patch)
2004-10-27 20:11 EDT, Ian Macdonald

Description Ian Macdonald 2004-10-27 19:41:40 EDT
Description of problem:

checkpid() in /etc/init.d/functions is broken. It reports a program as
stopped as soon as any one of its threads is no longer running. It should
report the program as stopped only when all of its threads have exited.

The problem shows up after an upgrade to OpenLDAP 2.2, where it is no
longer possible to shut down slurpd. slurpd fails to die on the TERM sent
from killproc(), but checkpid() erroneously reports that slurpd has been
shut down, because some (but not all) of its threads are no longer
present in /proc. As a result, the fallthrough to issuing a KILL never
happens.
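
The check described above can be illustrated with a minimal sketch. This is not the attached patch and not the code shipped in initscripts. The function name any_pid_alive and the PROC_ROOT override are hypothetical, introduced only for illustration. It assumes each thread of the daemon shows up as its own numeric entry under /proc, and it treats the program as still running while any one of those entries exists.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the code from /etc/init.d/functions.
# PROC_ROOT is an illustrative override; it defaults to the real /proc.

# Succeed (exit status 0) while at least one of the given PIDs still
# has a directory under /proc; fail (exit status 1) once all are gone.
any_pid_alive() {
    for pid in "$@"; do
        # A running process or thread has a /proc/<pid> directory.
        [ -d "${PROC_ROOT:-/proc}/$pid" ] && return 0
    done
    # None of the PIDs was found, so every thread has exited.
    return 1
}
```

With a check of this shape, a daemon that has lost only some of its threads is still reported as running, so a caller waiting for shutdown would not give up early.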


Version-Release number of selected component (if applicable):


How reproducible:

Every time.

Steps to Reproduce:
1. Install OpenLDAP 2.2 as a master server, using slurpd for
replication. You can use the RPM from FC3.
2. Attempt to shut down the LDAP service, using the init script.
  
Actual results:

Observe with ps(1) that slurpd is still running.

Expected results:

slurpd should have been KILLed when it refused to be TERMinated.

Additional info:

Although OpenLDAP 2.2 is not a part of FC2, this problem has the
potential to occur with any threaded daemon that responds to a TERM by
shutting down some, but not all, of its threads.

I haven't checked the initscripts in FC3, but I suspect the problem is
still there in checkpid().
Comment 1 Ian Macdonald 2004-10-27 20:11:23 EDT
Created attachment 105876
This fixes the issue described in this bug.

Using this version of checkpid(), daemons which stop some, but not all, threads
after receiving a TERM are later properly sent a KILL to finish the job off.
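
The TERM-then-KILL behaviour described here could look roughly like the sketch below. It is an illustration under stated assumptions, not the attached patch and not killproc() itself. The names any_pid_alive and stop_pids, the GRACE variable, and the one-second polling interval are all hypothetical. It relies on a Linux /proc filesystem.

```shell
#!/bin/sh
# Hypothetical sketch of TERM-then-KILL escalation. Names and delays are
# illustrative and are NOT taken from /etc/init.d/functions.

# Succeed while at least one of the given PIDs still exists under /proc.
any_pid_alive() {
    for pid in "$@"; do
        [ -d "/proc/$pid" ] && return 0
    done
    return 1
}

# Send TERM to every PID, poll for up to $GRACE seconds (default 3),
# then send KILL if any PID is still present.
stop_pids() {
    kill -TERM "$@" 2>/dev/null || true
    waited=0
    while [ "$waited" -lt "${GRACE:-3}" ]; do
        # Stop early once every PID has disappeared.
        any_pid_alive "$@" || return 0
        sleep 1
        waited=$((waited + 1))
    done
    # At least one PID survived TERM: fall through to KILL.
    kill -KILL "$@" 2>/dev/null || true
    return 0
}
```

Called as `stop_pids` with all of a daemon's PIDs, the KILL is skipped only when every PID has gone, which is the behaviour the comment above describes.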
Comment 2 Bill Nottingham 2004-10-27 22:56:32 EDT
This is fixed in current development packages (7.85 and later).
