Bug 726596 - initscripts may kill different process while stop if it uses pid file
Summary: initscripts may kill different process while stop if it uses pid file
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 14
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-07-29 07:26 UTC by masanari iida
Modified: 2014-03-17 03:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-16 14:51:56 UTC
Type: ---


Description masanari iida 2011-07-29 07:26:40 UTC
Description of problem:
When stopping a service that uses a pid file, initscripts may
kill an unrelated process.

Version-Release number of selected component (if applicable):
initscripts-9.20.2-1.fc14.1.i686

How reproducible:
Always

Steps to Reproduce:
This example shows that the startup script always trusts the
contents of the pid file without checking them.
For a real-life scenario, see the "Additional info" section.

(1) Start sshd.

(2) Run top(1).

(3) Find the PID of top(1).
In this case, PID 3728.

(4) Edit /var/run/sshd.pid and replace the PID with top's PID.

(5) Check the status of sshd.
# /etc/init.d/sshd status
openssh-daemon (pid 3728) is running

(6) Stop sshd.
# /etc/init.d/sshd stop
Stopping sshd:
 
Actual results:
(5) The script reports that PID 3728 is running, although
that process is not sshd.

(6) The script kills top(1), because that process has PID 3728.

Expected results:
The startup script should check that the PID in the pid file
really belongs to the service the script manages.
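
One possible way to do such a check on Linux is to compare the
executable behind the PID with the binary the script expects. The
following is a minimal sketch, not code from initscripts:
pid_matches_exe is a hypothetical helper name, and it assumes /proc is
mounted and that the caller may read /proc/PID/exe.

```shell
#!/bin/sh
# Sketch only: pid_matches_exe is a hypothetical helper, not part of
# /etc/init.d/functions. It relies on the Linux /proc filesystem.

# pid_matches_exe PID EXPECTED_EXE
# Returns 0 only if PID is a live process whose executable is EXPECTED_EXE.
pid_matches_exe() {
    pid=$1
    expected=$2

    # Reject anything that is not a plain non-negative integer.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # No /proc entry means no such process.
    [ -d "/proc/$pid" ] || return 1

    # /proc/PID/exe is a symlink to the running binary. If the binary was
    # replaced on disk (for example by a package update), the kernel
    # appends " (deleted)" to the link target, so strip that suffix.
    actual=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 1
    actual=${actual% (deleted)}

    [ "$actual" = "$expected" ]
}
```

A "status" action could then call, for example,
pid_matches_exe "$(cat /var/run/sshd.pid)" /usr/sbin/sshd
(the binary path is an assumption here) and report "running" only when
the call succeeds. In the reproduction above, the call would fail
because PID 3728 resolves to the top binary rather than sshd.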

Additional info:
The following is an example of how this problem can happen
in real life.

1. A daemon process (in this case sshd) segfaults.
The pid file is not removed, because the stop script was not run.

2. The OS keeps running, and some other process reuses the PID
that sshd used to have.
That process keeps running like a daemon.

3. The admin stops sshd.
/etc/init.d/functions sees the /var/run/sshd.pid file,
reads the PID from it, and kills that process.


This symptom may happen on Red Hat Enterprise Linux as well.
It may affect any script that relies on a pid file
for "stop", "status" and "restart".
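
The stale pid file scenario above could be guarded against in the
"stop" path roughly as follows. This is a sketch under stated
assumptions, not the killproc implementation in /etc/init.d/functions:
safe_stop and its return codes are hypothetical, Linux /proc is
assumed, and the expected binary path must be supplied by the caller.

```shell
#!/bin/sh
# Sketch only: safe_stop is a hypothetical function. Linux /proc is assumed.

# safe_stop PIDFILE EXPECTED_EXE
# Return codes (chosen for this sketch):
#   0  the process was verified and sent SIGTERM
#   1  no usable pid file
#   2  the PID is gone or belongs to another program; nothing was killed
safe_stop() {
    pidfile=$1
    expected=$2

    [ -r "$pidfile" ] || return 1

    # Read the first line; tolerate a file without a trailing newline.
    pid=
    read -r pid < "$pidfile" || true
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Resolve the binary behind the PID; empty if the process is gone.
    actual=$(readlink "/proc/$pid/exe" 2>/dev/null) || actual=
    actual=${actual% (deleted)}

    if [ "$actual" != "$expected" ]; then
        # Either the daemon died without cleaning up, or the PID has been
        # reused by an unrelated process. Do not send any signal.
        echo "pid $pid in $pidfile is not $expected; not killing" >&2
        return 2
    fi

    kill -TERM "$pid"
}
```

A small window remains between the check and the kill in which the PID
could still be reused, so this narrows the problem rather than
eliminating it.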

Comment 1 masanari iida 2012-04-06 05:01:20 UTC
Hello again,
I had a troubleshooting case involving a 3rd-party (I mean neither Fedora
nor RHEL) startup script, related to "wrong pid in pid file".

In short, the 3rd-party script was not carefully coded in my case.

(Detail)
The 3rd-party startup script runs pidof immediately after starting the
target daemon. The daemon takes a few moments to finish initializing.
During initialization, the daemon forks some child processes,
and those child processes exit before the daemon is fully initialized.

If pidof is executed while the daemon is still initializing, it may find
multiple processes with different PIDs. pidof may then pick one of them
from /proc without knowing which one is the parent and which is a child.
In the bad scenario, pidof picks a child process's PID, which no longer
exists after the daemon finishes initializing.
When the script later stops the daemon, it uses the pid file, but the PID in
the pid file may by then be used by another process. The script then kills
that other process.

This case is a "blame the 3rd-party script" scenario.
But I believe this type of accident can be avoided if the stop script
verifies that the PID in the pid file really is the target process before
killing it.
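
For the pidof race described in this comment, a start script could at
least filter out PIDs that are children of another candidate before
recording anything. The sketch below is only an illustration, not the
3rd-party script's code: ppid_of and top_level_pids are hypothetical
helper names, and Linux /proc is assumed.

```shell
#!/bin/sh
# Sketch only: ppid_of and top_level_pids are hypothetical helpers.
# Linux /proc is assumed.

# ppid_of PID: print the parent PID recorded in /proc/PID/status.
ppid_of() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# top_level_pids PID...
# Print only the PIDs whose parent is not itself in the argument list,
# that is, drop processes forked by another candidate.
top_level_pids() {
    for candidate in "$@"; do
        # Skip candidates that have already exited.
        parent=$(ppid_of "$candidate") || continue
        is_child=no
        for other in "$@"; do
            if [ "$other" = "$parent" ]; then
                is_child=yes
                break
            fi
        done
        if [ "$is_child" = no ]; then
            echo "$candidate"
        fi
    done
    return 0
}
```

A script might call top_level_pids $(pidof mydaemon), where mydaemon is
a placeholder name. If the result is empty or contains more than one
PID, retrying after a short delay is safer than writing a guess to the
pid file.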

Comment 2 Fedora End Of Life 2012-08-16 14:52:00 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

