Red Hat Bugzilla – Bug 244655
Trying to restart a hung/frozen sshd daemon doesn't show correct status
Last modified: 2009-06-19 18:56:29 EDT
Description of problem:
-------------------------
openssh version: openssh-3.9p1-8.RHEL4.20

The sshd daemon froze, and when "service sshd restart" is run, it reports that both the stop and the start succeeded. In fact, the old sshd was never terminated.

# service sshd restart
Stopping sshd: [ OK ]
Starting sshd: [ OK ]
# service sshd start
Starting sshd: [ OK ]

Since the old sshd hasn't been killed, the new one fails when it tries to listen on port 22. This happens because the kill command used in killproc() returns 0 as soon as the signal has been sent. With that mechanism we can't detect whether a completely wedged process has really been killed. This is not obvious to the user, since the script displays OK. The failure only shows up in the log (/var/log/secure):

Sep 5 20:29:33 n27 sshd[30475]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
Sep 5 20:29:33 n27 sshd[30475]: fatal: Cannot bind any address.
Sep 5 20:29:33 n27 sshd[2841]: Received signal 15; terminating.

The restart(), start() and stop() functions in the sshd init script should handle this scenario. They (or the killproc function) should check for the sshd pid once more after sending the terminate signal and before trying to start a new sshd.

Additional info:
This has been corrected in RHEL5's initscripts. Instead of calling killproc $SSHD -TERM, we send KILL (the default). Can this also be applied in RHEL4?
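To illustrate the check being asked for, here is a minimal sketch of a stop helper that verifies the process is gone instead of trusting the exit status of kill. It is not the killproc code from the initscripts package and not the attached patch; the function name stop_and_verify, its arguments and the 5 second default timeout are assumptions made for this example, and it assumes the caller (root, for an init script) is allowed to signal the process.

```shell
# Hypothetical helper for illustration only; not the real killproc.
# Usage: stop_and_verify PID [TIMEOUT_SECONDS]
# Returns 0 only if the process no longer exists when the function returns.
stop_and_verify() {
    sv_pid=$1
    sv_timeout=${2:-5}      # assumed default: wait up to 5 seconds after TERM

    # If TERM cannot be delivered, treat the process as already gone
    # (assumes the caller has permission to signal it).
    kill -TERM "$sv_pid" 2>/dev/null || return 0

    # "kill -0" sends no signal; it only tests whether the pid still exists.
    sv_waited=0
    while [ "$sv_waited" -lt "$sv_timeout" ]; do
        kill -0 "$sv_pid" 2>/dev/null || return 0
        sleep 1
        sv_waited=$((sv_waited + 1))
    done

    # Still alive after the timeout: escalate to KILL, then check once more.
    kill -KILL "$sv_pid" 2>/dev/null || true
    sleep 1
    if kill -0 "$sv_pid" 2>/dev/null; then
        return 1            # process survived even KILL; report failure
    fi
    return 0
}
```

A stop() function built on something like this could print [ FAILED ] when the helper returns non-zero, so a wedged sshd would no longer be reported as stopped before the new one is started.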
Created attachment 157274: initscripts-sshd_stop.patch
Please let me know if you need more details about this.
Let's fix this in rhel-4.6.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2007-0703.html