Bug 244655 - Trying to restart a hung/frozen sshd daemon doesn't show correct status
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: openssh
Version: 4.5
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Tomas Mraz
QA Contact: Brian Brock
Keywords: OtherQA
Depends On:
Reported: 2007-06-18 12:10 UTC by Jose Plans
Modified: 2009-06-19 22:56 UTC (History)

Fixed In Version: RHSA-2007-0703
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2007-11-15 14:58:13 UTC

Attachments
initscripts-sshd_stop.patch (289 bytes, patch)
2007-06-18 12:10 UTC, Jose Plans

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2007:0703 normal SHIPPED_LIVE Moderate: openssh security and bug fix update 2007-11-15 14:57:55 UTC

Description Jose Plans 2007-06-18 12:10:49 UTC
Description of problem:

openssh version: openssh-3.9p1-8.RHEL4.20

The sshd daemon froze. When "service sshd restart" is run, it reports that the
stop and start succeeded, but in fact the old sshd was not terminated.

# service sshd restart
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]

# service sshd start
Starting sshd:                                             [  OK  ]

Since the old sshd hasn't been killed, the new one fails when it tries to listen
on port 22. This is because the kill command used in killproc() returns 0 as
soon as the signal has been sent, whether or not the process exits.

With that mechanism we can't detect whether a completely wedged process has
really been killed.
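
The point about kill returning 0 can be reproduced without sshd. The sketch
below is only an illustration: it uses a background "sleep" that ignores
SIGTERM as a stand-in for a wedged daemon (an assumption, the real sshd hang
may have a different cause), sends it TERM, and then uses "kill -0" to see
whether the pid is still present.

```shell
#!/bin/sh
# Stand-in for a wedged daemon: a process that ignores SIGTERM.
# (Assumption for illustration; not the actual sshd failure mode.)
( trap '' TERM; exec sleep 30 ) >/dev/null 2>&1 &
pid=$!
sleep 1                     # give the subshell time to install the trap

kill -TERM "$pid"
kill_status=$?              # 0 means only that the signal was sent

sleep 1
if kill -0 "$pid" 2>/dev/null; then
    still_alive=yes         # the process survived the TERM
else
    still_alive=no
fi
echo "kill exit status: $kill_status, still alive: $still_alive"

# Clean up the stand-in process.
kill -KILL "$pid" 2>/dev/null
wait "$pid" 2>/dev/null || true
```

An init script that looks only at the exit status of kill would print OK here
even though the process is still running.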

This is not obvious to the user, because the init script still displays OK.

The failure only shows up in the log (/var/log/secure):

Sep  5 20:29:33 n27 sshd[30475]: error: Bind to port 22 on failed: Address already in use.
Sep  5 20:29:33 n27 sshd[30475]: fatal: Cannot bind any address.
Sep  5 20:29:33 n27 sshd[2841]: Received signal 15; terminating.
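
Until the init script reports the failure itself, the log is the only place to
look. As a rough sketch, a check like the following could be run after a
restart. The function name bind_failed is hypothetical, and the pattern is
based only on the message shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given log contains an sshd
# "Bind to port 22 ... Address already in use" error.
bind_failed() {
    logfile=${1:-/var/log/secure}   # default path taken from this report
    grep -q 'sshd\[[0-9]*\]: error: Bind to port 22 .*Address already in use' "$logfile"
}
```

For example, "bind_failed && echo 'new sshd could not bind port 22'" would flag
the situation described here. It matches any such entry in the file, including
old ones, so it is a diagnostic aid rather than a fix.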

The restart(), start() and stop() functions in the sshd init script should handle
this scenario. They (or the killproc function) should check for the sshd pid once
more after sending the terminate signal and before trying to start a new sshd.
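
The check proposed above could look roughly like the following. This is a
sketch under stated assumptions, not the attached patch and not the shipped
killproc: the function name stop_and_verify, the 5 second default wait, and
the escalation to KILL are all illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: send TERM, poll until the pid disappears, escalate
# to KILL, and return non-zero if the process is still present afterwards.
stop_and_verify() {
    pid=$1
    timeout=${2:-5}         # seconds to wait after TERM (assumed default)

    # If the signal cannot be sent, treat the process as already gone.
    # (A permission error would look the same; real code should distinguish.)
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after TERM
        sleep 1
        timeout=$((timeout - 1))
    done

    # Still running after the grace period: escalate.
    kill -KILL "$pid" 2>/dev/null
    sleep 1
    if kill -0 "$pid" 2>/dev/null; then
        return 1            # still present: do not report OK or start a new sshd
    fi
    return 0
}
```

A stop() function built on this would report failure instead of OK when the
helper returns 1, and restart() would skip the start step in that case.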

Additional info:
This has been corrected in RHEL5's initscripts. Instead of calling killproc
$SSHD -TERM, the script lets killproc use its default, which sends KILL.

Can this also be applied to RHEL4?

Comment 1 Jose Plans 2007-06-18 12:10:49 UTC
Created attachment 157274

Comment 2 Jose Plans 2007-06-18 12:12:28 UTC
Please let me know if you need more details about this.

Comment 3 Tomas Mraz 2007-06-18 15:47:19 UTC
Let's fix this in rhel-4.6.

Comment 4 RHEL Product and Program Management 2007-06-18 15:55:31 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 errata-xmlrpc 2007-11-15 14:58:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

