Bug 1291172 - systemctl restart/start sshd shows no error if start fails
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openssh
Version: 7.2
Hardware: All Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jakub Jelen
QA Contact: Stanislav Zidek
Depends On:
Blocks: 1203710 1296594 1313485 1395836
Reported: 2015-12-14 03:49 EST by Thorsten Scherf
Modified: 2016-11-16 14:26 EST

See Also:
Fixed In Version: openssh-6.6.1p1-26.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1398360 (view as bug list)
Environment:
Last Closed: 2016-11-03 16:18:45 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Thorsten Scherf 2015-12-14 03:49:07 EST
Description of problem:
systemctl restart|start sshd shows no error message if the restart/start fails.


[root@xevws029 ~]# systemctl start sshd
[root@xevws029 ~]# 
[root@xevws029 ~]# systemctl status sshd
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Fri 2015-12-11 10:13:31 CET; 6s ago
     Docs: man:sshd(8)
           man:sshd_config(5)
  Process: 38753 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=255)
 Main PID: 38753 (code=exited, status=255)
   CGroup: /system.slice/sshd.service
           ├─38552 sshd: root@pts/0
           ├─38556 -bash
           └─38754 systemctl status sshd

Dec 11 10:13:31 xevws029.xeop.de systemd[1]: Unit sshd.service entered failed state.
Dec 11 10:13:31 xevws029.xeop.de systemd[1]: sshd.service failed.

The start failed because of an invalid sshd_config, but systemctl should report that the start has failed.


Version-Release number of selected component (if applicable):
Latest RHEL7.2 RPM. 

How reproducible:
Put an invalid configuration directive into sshd_config and restart the server. No error message is printed on stdout. Check the server status and see that the server is actually not running.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

I think the problem is the "Type=simple" parameter which is used when the type is not explicitly defined. I think "Type=forking" would be a better match here.
Comment 2 Petr Lautrbach 2015-12-14 06:33:36 EST
(In reply to Thorsten Scherf from comment #0)
> I think the problem is the "Type=simple" parameter which is used when the
> type is not explicitly defined. I think "Type=forking" would be a better
> match here.

It probably doesn't matter whether the service is forking or simple in this case. 'Type=simple' is used with '/usr/sbin/sshd -D' as it's simpler for systemd to track a main sshd process which doesn't do fork() & exec().

But we could change sshd.service to run '/usr/sbin/sshd -t' before the sshd daemon:


# cat /etc/systemd/system/sshd.service.d/execstartpre.conf
[Service]
ExecStartPre=/usr/sbin/sshd -t $OPTIONS

# echo "NonSense 1" >> /etc/ssh/sshd_config

# systemctl restart sshd                                  
Job for sshd.service failed. See 'systemctl status sshd.service' and 'journalctl -xn' for details.
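
For anyone who wants to try this, here is a minimal sketch of creating the drop-in shown above from a script. The function name and its optional "root" prefix argument are hypothetical and exist only so the file can be written into a scratch directory first; nothing here is part of the openssh package.

```shell
#!/bin/sh
# Write the ExecStartPre drop-in from this comment.
# "root" is a hypothetical prefix argument (empty means the live system).
install_execstartpre_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/system/sshd.service.d"
    mkdir -p "$dir" || return 1
    # The quoted 'EOF' keeps $OPTIONS literal, so systemd expands it, not the shell.
    cat > "$dir/execstartpre.conf" <<'EOF'
[Service]
ExecStartPre=/usr/sbin/sshd -t $OPTIONS
EOF
}

# Usage on a live system (as root):
#   install_execstartpre_dropin && systemctl daemon-reload && systemctl restart sshd
```

With the drop-in in place, a broken sshd_config makes the ExecStartPre step fail, and that failure is what systemctl reports in the transcript above.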
Comment 3 Jakub Jelen 2015-12-14 10:19:54 EST
(In reply to Petr Lautrbach from comment #2)
> It probably doesn't matter if a service is forking or simple in this case.
> 'Type=simple' is used with '/usr/sbin/sshd -D' as it's simpler for systemd
> to track a main sshd process which doesn't do fork() & exec().

I am not sure about that. Setting Type=forking does let systemctl report errors (unlike Type=simple), but it does not match the daemon behavior described in the manual pages, and changing the invocation is probably not something we would like to do.

> But we could change sshd.service to run '/usr/sbin/sshd -t' before the sshd
> daemon

This is actually a good idea. ExecStartPre makes the service fail hard if the problem is only in the config, but it means the config is checked twice on every start.

I was also thinking about the possibility of using a distinct exit status for a wrong configuration and letting the service fail hard. If I set

  RestartPreventExitStatus=255

"systemctl start sshd" is still not returning any failure, but the service is in failed state (instead of activating as before).

It looks to me like a systemd problem. I think some insight from the systemd maintainers would be useful.
Comment 4 Michal Sekletar 2015-12-16 04:17:03 EST
With Type=simple you always have a problem that basic startup (forking new process, setting up execution environment) succeeds but start of an *actual* daemon fails for some reason. However, failure happens "down the road" after systemd transitioned service to active-running state and systemctl command already returned no error to the user at the command line.
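
The timing issue described above can be worked around on the caller's side. The following is only a sketch, not something shipped with systemd or openssh: the function name and the default two-second delay are assumptions, and a daemon that fails later than the delay would still be missed.

```shell
#!/bin/sh
# start_and_verify UNIT [DELAY_SECONDS]
# Hypothetical helper: start the unit, wait briefly, then confirm that it
# is still active. "systemctl is-active --quiet" exits non-zero for any
# state other than active, including "activating (auto-restart)".
start_and_verify() {
    unit="$1"
    delay="${2:-2}"          # assumed default; tune for the service
    systemctl start "$unit" || return 1
    sleep "$delay"
    if systemctl is-active --quiet "$unit"; then
        return 0
    fi
    echo "$unit did not stay active after start" >&2
    return 1
}

# Usage:
#   start_and_verify sshd || systemctl status sshd
```

This does not change what "systemctl start" itself returns for a Type=simple unit; it only adds a second check after the fact.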

As for whether there is some bug in systemd, as comment #3 suggests: sure, that might be the case. Please come up with a simple reproducer which exhibits the problem. Frankly, I have a hard time understanding what the actual problem is and what behavior you expect.

At any rate, I think the best solution for sshd would be to have actual integration with systemd, i.e. make sshd a Type=notify service. See man 3 sd_notify for details. A patch for this should be very small, just a couple of lines of code, and it can easily be maintained downstream in case upstream doesn't care for this.
Comment 5 Jakub Jelen 2015-12-17 05:10:12 EST
Michal, thanks for the insight into the possibilities. The notify type should work for sure. I can try to implement a patch for openssh just to give it a try. As for the current state of systemd, I tested the reproducers once more on current RHEL7.2 and here are the results:

The forking variant, which works for me, can be put together simply by modifying the [Service] section of the sshd.service file:

Type=forking
PIDFile=/var/run/sshd.pid
ExecStart=/usr/sbin/sshd $OPTIONS

Issuing start/restart works just fine if the config is ok. If not, I am getting relevant error:

# systemctl start sshd
Job for sshd.service failed because the control process exited with error code. See "systemctl status sshd.service" and "journalctl -xe" for details.



> However, failure happens "down the road" after systemd transitioned service to active-running state and systemctl command already returned no error to the user at the command line.

I don't think this is the case. As mentioned in the original description, systemd knows about the exit status code (code=exited, status=255), but does not report it as a failure (the unit is left in the activating state for auto-restart). I thought the auto-restart would be the root of the problem, but getting rid of it didn't help either (commented out from the original configuration):

#Restart=on-failure
#RestartSec=42s

# systemctl daemon-reload 
# systemctl restart sshd
# systemctl status sshd
[..]
   Active: failed (Result: exit-code) since Thu 2015-12-17 10:53:11 CET; 1s ago

The service is now reported as failed, but the error is still not printed during start/restart (and we need that automatic restart anyway).
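
The state that "systemctl status" prints can also be read in script form. A minimal sketch follows, assuming the Result property that "systemctl show" exposes for service units (the same value shown as "Result: exit-code" above); both function names are hypothetical.

```shell
#!/bin/sh
# unit_property UNIT PROPERTY
# Print the value of one property from "systemctl show".
# The NAME= prefix is stripped by hand rather than relying on newer options.
unit_property() {
    line=$(systemctl show -p "$2" "$1") || return 1
    printf '%s\n' "${line#*=}"
}

# unit_start_failed UNIT
# Succeed if systemd recorded a result other than "success" for UNIT.
# Returns 2 if the property could not be read at all.
unit_start_failed() {
    result=$(unit_property "$1" Result) || return 2
    [ "$result" != "success" ]
}

# Usage:
#   systemctl restart sshd
#   unit_start_failed sshd && echo "sshd failed: $(unit_property sshd Result)" >&2
```

Note that immediately after the start command returns, the result may not yet reflect a failure that happens slightly later, so a short wait before the check may be needed.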
Comment 8 Tim Speetjens 2016-02-25 06:37:11 EST
My tests confirm that changing the unit file to 'forking' and dropping the '-D' option works as expected.


[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target sshd-keygen.service
Wants=sshd-keygen.service

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/sshd
ExecStart=/usr/sbin/sshd $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target
Comment 14 errata-xmlrpc 2016-11-03 16:18:45 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2588.html
