Bug 1395836

Summary: systemctl restart/start sshd shows no error if start fails
Product: Red Hat Enterprise Linux 7 Reporter: Andrew Kuhlmann <akuhlman>
Component: openssh Assignee: Jakub Jelen <jjelen>
Status: CLOSED DUPLICATE QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2 CC: jjelen
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-21 13:27:52 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1291172    
Bug Blocks:    

Description Andrew Kuhlmann 2016-11-16 19:26:24 UTC
Description of problem:

The SSH daemon sporadically fails to restart because the PID file becomes out of sync with the process that is actually running.

Initially, around 80% of the time, the service comes up with the following warning:

$ systemctl status sshd
* sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2016-11-15 23:10:55 UTC; 19h ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 8071 (sshd)
   CGroup: /system.slice/sshd.service
           `-8071 /usr/sbin/sshd

Nov 15 23:10:55 server.local systemd[1]: Starting OpenSSH server daemon...
Nov 15 23:10:55 server.local systemd[1]: PID file /var/run/sshd.pid not readable (yet?) after start.
Nov 15 23:10:55 server.local systemd[1]: Started OpenSSH server daemon.

Around 20% of the time after a restart, especially after restarts in quick succession, the PID file does not reflect the process that is actually running.

This results in a state where sshd is running but systemd does not know its PID, so systemd is unable to kill it. Because the old sshd is still holding port 22 (or whichever port is configured), the newly started sshd is unable to bind and dies.
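
A minimal diagnostic sketch of the mismatch described above, assuming a Linux /proc filesystem and the PID file path /var/run/sshd.pid shown in the log output. The helper names (read_pidfile, pid_is_sshd, pidfile_is_stale) are hypothetical and are not part of openssh or systemd. The check only confirms that the recorded PID belongs to some sshd process; it cannot tell the listening daemon apart from a per-session sshd child.

```python
import os


def read_pidfile(path):
    """Return the PID stored in the file, or None if it is missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_sshd(pid, proc_root="/proc"):
    """Return True if <proc_root>/<pid>/comm names an sshd process (Linux only)."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as f:
            return f.read().strip() == "sshd"
    except OSError:
        # No such process, or it is not readable: treat as "not sshd".
        return False


def pidfile_is_stale(path="/var/run/sshd.pid", proc_root="/proc"):
    """Return True when the PID file is unreadable or points at a non-sshd process."""
    pid = read_pidfile(path)
    return pid is None or not pid_is_sshd(pid, proc_root)
```

Calling pidfile_is_stale() with its defaults on an affected host would be one way to detect the broken state before attempting another restart.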


Version-Release number of selected component (if applicable):
Latest RHEL 7.2 RPM.

How reproducible:
Happens very often during automated actions that require quick restarts of the sshd service.


Steps to Reproduce:
1. Restart the sshd service rapidly
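
The reproduction step above could be scripted roughly as follows. This is only a sketch: it assumes it is run as root on a test machine, that the unit is named sshd as in the status output, and that checking "systemctl is-active" after each restart is an adequate failure signal. The function names (systemctl, restart_loop) and the attempts/delay parameters are hypothetical.

```python
import subprocess
import time


def systemctl(*args):
    """Run systemctl with the given arguments; return (exit code, combined output)."""
    proc = subprocess.run(
        ["systemctl", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip()


def restart_loop(unit="sshd", attempts=20, delay=0.0, run=systemctl):
    """Restart `unit` repeatedly; return the indices of attempts that left it inactive."""
    failures = []
    for i in range(attempts):
        # The restart result is ignored on purpose: the bug summary says
        # that no error is shown even when the start fails.
        run("restart", unit)
        # "systemctl is-active" exits with 0 only when the unit is active.
        code, _state = run("is-active", unit)
        if code != 0:
            failures.append(i)
        if delay:
            time.sleep(delay)
    return failures
```

A non-empty list returned by restart_loop() would indicate that at least one restart left the service in a non-active state.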


Actual results:
The sshd service is in a broken state, but sshd is still running


Expected results:
sshd service should restart cleanly every time unless there is a config problem


Additional info:
This problem started after the change for bug 1291172 was included in 6.6.1p1-26

Comment 2 Jakub Jelen 2016-11-21 13:27:52 UTC
This is already reported in bug #1381997. We plan to address this issue soon.

The inability of systemd to track the running sshd process is making us add a systemd patch to sshd, which had worked just fine for years this way.

*** This bug has been marked as a duplicate of bug 1381997 ***