Bug 137387 - checkpid() does not check for the death of ALL threads belonging to a process.
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Brock Organ
Reported: 2004-10-27 23:41 UTC by Ian Macdonald
Modified: 2014-03-17 02:49 UTC (History)

Doc Type: Bug Fix
Last Closed: 2004-10-28 02:56:32 UTC
Type: ---


Attachments
This fixes the issue described in this bug. (267 bytes, patch)
2004-10-28 00:11 UTC, Ian Macdonald

Description Ian Macdonald 2004-10-27 23:41:40 UTC
Description of problem:

checkpid() in /etc/init.d/functions is broken. It reports a program as
stopped as soon as any one of its threads is no longer running. It
should report the program as stopped only when all of its threads are
gone.

The bug surfaces after upgrading to OpenLDAP 2.2, where slurpd can no
longer be shut down through the init script. slurpd fails to die on the
TERM sent from killproc(), but checkpid() erroneously reports that
slurpd has been shut down, because some (but not all) of its threads
are no longer present in /proc. As a result, killproc() never falls
through to sending a KILL.
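
The intended check can be sketched as follows. This is not the attached
patch, only a minimal illustration under stated assumptions: the name
checkpid_any and the PROC_ROOT override are hypothetical, and the
exit-status convention (0 while at least one PID is still present) is
assumed rather than quoted from initscripts.

```shell
# Hypothetical sketch, not the code shipped in /etc/init.d/functions.
# Exit 0 ("still running") if ANY of the given PIDs has a directory
# under /proc; exit 1 only when none of them is left.
# PROC_ROOT is an assumption added so the function can be pointed at a
# fake directory tree for testing; it defaults to the real /proc.
checkpid_any() {
    local pid
    for pid in "$@"; do
        [ -d "${PROC_ROOT:-/proc}/$pid" ] && return 0
    done
    return 1
}
```

Called with the PID of every thread (for example, the list printed by
pidof), it keeps reporting the daemon as running until the last entry
disappears from /proc, so a partially terminated daemon is not mistaken
for a stopped one.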


Version-Release number of selected component (if applicable):


How reproducible:

Every time.

Steps to Reproduce:
1. Install OpenLDAP 2.2 as a master server, using slurpd for
replication. You can use the RPM from FC3.
2. Attempt to shut down the LDAP service, using the init script.
  
Actual results:

Observe with ps(1) that slurpd is still running.

Expected results:

slurpd should have been KILLed when it refused to be TERMinated.

Additional info:

Although OpenLDAP 2.2 is not a part of FC2, this problem has the
potential to occur with any threaded daemon that responds to a TERM by
shutting down some, but not all, of its threads.

I haven't checked the initscripts in FC3, but I suspect the problem is
still there in checkpid().

Comment 1 Ian Macdonald 2004-10-28 00:11:23 UTC
Created attachment 105876
This fixes the issue described in this bug.

Using this version of checkpid(), daemons which stop some, but not all, threads
after receiving a TERM are later properly sent a KILL to finish the job off.
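
The TERM-then-KILL escalation this relies on can be sketched roughly as
below. It is an illustration only, not the real killproc(): the function
name stop_with_escalation, the GRACE_SECS variable and its default of 3
seconds are hypothetical, and liveness is tested with kill -0 instead of
a /proc lookup to keep the example short.

```shell
# Hypothetical sketch, not the killproc() from /etc/init.d/functions.
# Send TERM to every PID, wait a grace period, then send KILL if at
# least one of the PIDs is still alive.
stop_with_escalation() {
    local grace="${GRACE_SECS:-3}"   # assumed default, not from initscripts
    local pid survivors=no

    kill -TERM "$@" 2>/dev/null
    sleep "$grace"

    # kill -0 delivers no signal; it only tests whether the PID exists.
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            survivors=yes
        fi
    done

    # Escalate only when something survived the TERM.
    if [ "$survivors" = yes ]; then
        kill -KILL "$@" 2>/dev/null
    fi
    return 0
}
```

The point of the sketch is the condition guarding the KILL: it has to
stay true while any PID survives, otherwise the escalation is skipped
and the daemon is left running.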

Comment 2 Bill Nottingham 2004-10-28 02:56:32 UTC
This is fixed in current development packages (7.85 and later).

