
Bug 244655

Summary: Trying to restart a hung/frozen sshd daemon doesn't show correct status
Product: Red Hat Enterprise Linux 4
Component: openssh
Version: 4.5
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: low
Reporter: Jose Plans <jplans>
Assignee: Tomas Mraz <tmraz>
QA Contact: Brian Brock <bbrock>
CC: mmayer, tao
Keywords: OtherQA
Fixed In Version: RHSA-2007-0703
Doc Type: Bug Fix
Last Closed: 2007-11-15 14:58:13 UTC
Attachments: initscripts-sshd_stop.patch (flags: none)

Description Jose Plans 2007-06-18 12:10:49 UTC
Description of problem:
-------------------------

openssh version: openssh-3.9p1-8.RHEL4.20


The sshd daemon froze. When "service sshd restart" is run, it reports that both
the stop and the start succeeded, but in fact the old sshd was not terminated.


# service sshd restart
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]


# service sshd start
Starting sshd:                                             [  OK  ]


Since the old sshd has not been killed, the new one fails when it tries to
listen on port 22. The init script does not notice this because the kill
command used in killproc() returns 0 as soon as the signal has been sent,
whether or not the process actually exits.

With that mechanism we cannot detect whether a completely wedged process has
really been killed.
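
A minimal sketch of the behavior described above, not taken from the RHEL4
init scripts: a child that ignores SIGTERM stands in for the wedged sshd, and
is_alive is a hypothetical helper name. It shows that the exit status of kill
only says the signal was sent, while a separate "kill -0" check is needed to
see whether the process is still there.

```shell
#!/bin/sh
# Hypothetical helper (not part of /etc/init.d/functions):
# returns 0 if a process with the given PID still exists.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Stand-in for a wedged daemon: ignore SIGTERM, then exec sleep.
# An ignored signal stays ignored across exec.
sh -c 'trap "" TERM; exec sleep 30' >/dev/null 2>&1 &
pid=$!
sleep 1                 # give the child time to install the trap

kill -TERM "$pid"
term_status=$?          # 0 means "signal sent", nothing more

if is_alive "$pid"; then
    still_running=yes
else
    still_running=no
fi
echo "kill exit status: $term_status, still running: $still_running"

# Clean up the stand-in process.
kill -KILL "$pid"
wait "$pid" 2>/dev/null
```

The point of the sketch is only that a zero exit status from kill cannot be
used as proof that the daemon has stopped.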

This is not obvious to the user, since both steps display OK.

The failure only shows up in the log (/var/log/secure):

Sep  5 20:29:33 n27 sshd[30475]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
Sep  5 20:29:33 n27 sshd[30475]: fatal: Cannot bind any address.
Sep  5 20:29:33 n27 sshd[2841]: Received signal 15; terminating.


The restart(), start() and stop() functions in the sshd init script should
handle this scenario. They (or the killproc function) should check for the sshd
PID once more after sending the terminate signal and before trying to start a
new sshd.
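
The check suggested above could look roughly like the following sketch. It is
not the attached patch and not the RHEL5 killproc implementation;
stop_and_verify, its timeout argument and the escalation to KILL are
assumptions made for illustration.

```shell
#!/bin/sh
# stop_and_verify PID [TIMEOUT_SECONDS]
# Hypothetical helper: send TERM, poll until the process is gone,
# escalate to KILL after the timeout, and return non-zero only if
# the process is still present afterwards.
stop_and_verify() {
    _sv_pid=$1
    _sv_timeout=${2:-5}

    # If the signal cannot be sent, this sketch treats the process
    # as already stopped.
    kill -TERM "$_sv_pid" 2>/dev/null || return 0

    _sv_waited=0
    while [ "$_sv_waited" -lt "$_sv_timeout" ]; do
        # kill -0 sends no signal; it only tests that the PID exists.
        kill -0 "$_sv_pid" 2>/dev/null || return 0
        sleep 1
        _sv_waited=$((_sv_waited + 1))
    done

    # Still present after the timeout: escalate to KILL and re-check.
    kill -KILL "$_sv_pid" 2>/dev/null
    sleep 1
    if kill -0 "$_sv_pid" 2>/dev/null; then
        return 1        # could not stop it; the caller should report FAILED
    fi
    return 0
}

# Possible use in a stop() function (the PID file path is an assumption):
#   stop_and_verify "$(cat /var/run/sshd.pid)" && success || failure
```

With a check like this, stop() can report a failure instead of OK when the old
daemon is still holding port 22, and start() is not attempted against a port
that is still in use.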

Additional info:
This has been corrected in RHEL5's initscripts. Instead of calling killproc
$SSHD -TERM, the script sends KILL (the default).

Can this also be applied to RHEL4?

Comment 1 Jose Plans 2007-06-18 12:10:49 UTC
Created attachment 157274
initscripts-sshd_stop.patch

Comment 2 Jose Plans 2007-06-18 12:12:28 UTC
Please let me know if you need more details about this.

Comment 3 Tomas Mraz 2007-06-18 15:47:19 UTC
Let's fix this in rhel-4.6.


Comment 4 RHEL Program Management 2007-06-18 15:55:31 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 errata-xmlrpc 2007-11-15 14:58:13 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0703.html