Bug 438449

Summary: /etc/rc.d/init.d/killall is racing with other "stops"
Product: [Fedora] Fedora Reporter: Michal Jaegermann <michal>
Component: openssh Assignee: Tomas Mraz <tmraz>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: low    
Version: rawhide CC: archimerged, notting
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: openssh-5.0p1-1.fc9 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-04-08 03:13:19 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 235706    
Attachments:
Description: Fixes this bug. /etc/init.d/sshd kills itself via killall sshd. Flags: none

Description Michal Jaegermann 2008-03-20 19:15:33 EDT
Description of problem:

It looks like, when a system is shutting down, the script /etc/rc.d/init.d/killall
is supposed to run last, after everything else has been stopped, doing
a final sweep for strays.  However, it appears that with the upstart
scheme it does not wait, and it races with other scripts, producing
messages on the screen like

/etc/rc6.d/S00killall: line 16: 2743 Terminated /etc/init.d/$subsys stop

If this is a purely cosmetic issue, then the message above can be prevented
by redirecting all stdout and stderr output of the 'for ... ; done' loop
in that script to /dev/null.  OTOH, if this really is a race, as those
error messages indicate, then what guarantees that something
will not be stopped too early?  'killall' does not do any checks
(with the exception that it will not touch 'network').
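
The redirection suggested above can be sketched as follows. This is only an illustration: the sweep function and its loop body are hypothetical stand-ins, not the contents of /etc/rc.d/init.d/killall. It shows only that a redirection placed after 'done' applies to every command inside the loop.

```shell
#!/bin/sh
# Hypothetical stand-in for the loop in the killall script.
# The redirection after `done` covers stdout and stderr of the whole loop.
sweep() {
    for subsys in "$@"; do
        echo "Stopping $subsys"            # would normally reach the console
        echo "error from $subsys" >&2      # would normally reach the console
    done >/dev/null 2>&1
}

sweep sshd crond
echo "sweep finished quietly"
```

This only hides the messages; it does nothing about the ordering question raised in the same paragraph.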

Another possibility would be to remove the script, but it may turn out
to be really handy when one wants to get something close
to a real single-user mode.

Version-Release number of selected component (if applicable):
upstart-0.3.9-9.fc9

How reproducible:
on every shutdown
Comment 1 archimerged Ark submedes 2008-04-07 14:17:30 EDT
Created attachment 301544 [details]
Fixes this bug.  /etc/init.d/sshd kills itself via killall sshd.

Add trap '' TERM before the killall and trap TERM after, so that the init script ignores the SIGTERM sent by killall sshd.
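
A minimal sketch of that idea, not the attached patch itself. The function names are hypothetical, each child 'sh -c' stands in for the init script being run with "stop", and 'kill -TERM $$' stands in for 'killall sshd' also matching the script's own process name. The sketch uses 'trap - TERM', the POSIX spelling for resetting the signal to its default.

```shell
#!/bin/sh
# Each child `sh -c` stands in for an init script being run with "stop".
# `kill -TERM $$` stands in for `killall sshd` also matching the script itself.

guarded_stop() {
    sh -c '
        trap "" TERM      # ignore SIGTERM while the broadcast happens
        kill -TERM $$     # the signal reaches this shell and is discarded
        trap - TERM       # restore the default SIGTERM handling
        echo "stop finished"
    '
}

unguarded_stop() {
    sh -c '
        kill -TERM $$     # this shell is terminated here
        echo "stop finished"
    '
}

guarded_stop
unguarded_stop || echo "unguarded stop exited with status $?"
```

The guarded variant runs to completion, while the unguarded one is cut off before its final echo, which is the kind of failure the calling script reports as "Terminated".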
Comment 2 archimerged Ark submedes 2008-04-07 14:23:25 EDT
component should be changed to openssh
Comment 3 Michal Jaegermann 2008-04-07 14:54:55 EDT
> component should be changed to openssh

Is this truly the whole problem, or did you find (thanks!) just one instance
where this is currently causing hiccups?  'killall' really used to
run as a final check, and this does not seem to be the case anymore.
Comment 4 archimerged Ark submedes 2008-04-07 15:14:11 EDT
I put set -xv at the front of /etc/init.d/killall and found that this is the script
that causes all of the "stopping <subsys> ... [ OK ]" messages on the console.

/usr/bin/killall is a completely different thing but it was related --
/etc/init.d/sshd wanted to kill /usr/sbin/sshd, but /usr/bin/killall kills the
script as well as any extra sshd's.  Seems to me I remember a bug just like this
circa Solaris 2.5 ...

The error message is somewhat misleading.  The failure was not on line 16 itself; line 16 is the
beginning of the compound statement where the 'terminated' status code came
back, and /etc/init.d/$subsys stop was the text of the command that got the bad
status.  When I changed that to bash -xv /etc/init.d/$subsys, the problem
confusingly went away (since the process was then named bash, not sshd).
Comment 5 Michal Jaegermann 2008-04-07 15:30:33 EDT
> /usr/bin/killall is a completely different thing ...

Yes, I know.  Maybe I should have been more careful, but I thought that
from the context it was clear that we are talking about killall in
/etc/init.d/. See also the title of this bug report.
Comment 6 archimerged Ark submedes 2008-04-07 15:33:25 EDT
The comment about "there shouldn't be any" at the top of /etc/init.d/killall is
no longer accurate and should be revised.  That script is part of the
initscripts rpm.  (Alternatively, if things were supposed to have been stopped by the
Knn links in /etc/rc6.d/, then there is a bug in upstart or something like
that.  /etc/rc6.d/S00killall used to run after the Knn scripts IIRC.)

Actually, it does sound like there is a bug, because killall kills subsystems in
alphabetical order, not in the order specified by the Knn symbolic links.

Probably need a new bug for that...
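
The ordering difference can be illustrated with a small sketch. Everything here is hypothetical: the directory tree, the subsystem names, and the K numbers are made up, and it is an assumption that the killall script finds subsystems by globbing a directory of per-subsystem files. The point is only that a glob sorts by name, while the Knn links sort by their number.

```shell
#!/bin/sh
# Hypothetical directory tree: the names and K numbers are made up.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/subsys" "$demo/rc6.d"
touch "$demo/subsys/atd" "$demo/subsys/crond" "$demo/subsys/sshd"
touch "$demo/rc6.d/K05atd" "$demo/rc6.d/K25sshd" "$demo/rc6.d/K60crond"

# Order produced by a plain glob over per-subsystem files (sorted by name).
alpha_order() {
    names=
    for f in "$1"/subsys/*; do
        names="$names${names:+ }${f##*/}"
    done
    echo "$names"
}

# Order produced by the Knn links (sorted by their two-digit number).
knn_order() {
    names=
    for f in "$1"/rc6.d/K*; do
        link=${f##*/}
        names="$names${names:+ }${link#K??}"
    done
    echo "$names"
}

echo "glob order: $(alpha_order "$demo")"
echo "Knn order:  $(knn_order "$demo")"
```

In this made-up tree the glob stops crond before sshd, while the K links would stop sshd first.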
Comment 7 Tomas Mraz 2008-04-08 03:13:19 EDT
I have patched the sshd init script in rawhide.