Bug 1291172
Field | Value |
---|---|
Summary: | systemctl restart/start sshd shows no error if start fails |
Product: | Red Hat Enterprise Linux 7 |
Component: | openssh |
Version: | 7.2 |
Status: | CLOSED ERRATA |
Severity: | medium |
Priority: | medium |
Reporter: | Thorsten Scherf <tscherf> |
Assignee: | Jakub Jelen <jjelen> |
QA Contact: | Stanislav Zidek <szidek> |
CC: | cww, msekleta, plautrba, pvrabec, systemd-maint, tmraz, tscherf, tspeetje |
Target Milestone: | rc |
Hardware: | All |
OS: | Linux |
Fixed In Version: | openssh-6.6.1p1-26.el7 |
Doc Type: | Bug Fix |
: | 1398360 |
Last Closed: | 2016-11-03 20:18:45 UTC |
Type: | Bug |
Bug Blocks: | 1203710, 1296594, 1313485, 1395836 |
Description
Thorsten Scherf
2015-12-14 08:49:07 UTC
(In reply to Thorsten Scherf from comment #0)
> I think the problem is the "Type=simple" parameter which is used when the
> type is not explicitly defined. I think "Type=forking" would be a better
> match here.

It probably doesn't matter whether the service is forking or simple in this case. 'Type=simple' is used with '/usr/sbin/sshd -D' because it is simpler for systemd to track a main sshd process which doesn't do fork() & exec(). But we could change sshd.service to run '/usr/sbin/sshd -t' before the sshd daemon:

# cat /etc/systemd/system/sshd.service.d/execstartpre.conf
[Service]
ExecStartPre=/usr/sbin/sshd -t $OPTIONS
# echo "NonSense 1" >> /etc/ssh/sshd_config
# systemctl restart sshd
Job for sshd.service failed. See 'systemctl status sshd.service' and 'journalctl -xn' for details.

(In reply to Petr Lautrbach from comment #2)
> It probably doesn't matter if a service is forking or simple in this case.
> 'Type=simple' is used with '/usr/sbin/sshd -D' as it's simpler for systemd
> to track a main sshd process which doesn't do fork() & exec().

I don't know, but setting Type=forking helps systemctl report errors (unlike Type=simple). However, it does not match the description of the daemon's behavior in the manual pages, and changing the invocation is probably not something we would like to do.

> But we could change sshd.service to run '/usr/sbin/sshd -t' before the sshd
> daemon

This is actually a good idea. ExecStartPre makes the service fail hard if there is a problem in the config only, but it means the config is checked twice on every start.

I was also thinking about the possibility of using a distinct exit status for a wrong configuration and letting the service fail hard. If I set RestartPreventExitStatus=255, "systemctl start sshd" still does not return any failure, but the service ends up in the failed state (instead of activating, as before). It looks to me like a systemd problem. I think some insight from the systemd maintainers would be useful.
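
For illustration, the pre-flight check behind the ExecStartPre idea above can also be done by hand before a restart. This is only a minimal sketch, assuming a POSIX shell; `safe_restart` is a hypothetical helper name and is not part of openssh or systemd. It runs a check command first and runs the restart command only if the check exits with status 0.

```shell
# Hypothetical helper (not part of openssh or systemd).
# $1 is the check command as a single string; the remaining arguments
# are the command to run when the check succeeds.
safe_restart() {
    check=$1
    shift
    if ! sh -c "$check"; then
        echo "configuration check failed, not restarting" >&2
        return 1
    fi
    "$@"
}

# Intended use on a host with sshd installed (run as root):
#   safe_restart '/usr/sbin/sshd -t' systemctl restart sshd
```

Unlike the ExecStartPre drop-in, this only protects restarts issued through the wrapper; starts triggered at boot or by systemd's auto-restart are not covered.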
With Type=simple you always have the problem that the basic startup (forking a new process, setting up the execution environment) succeeds but the start of the *actual* daemon fails for some reason. However, the failure happens "down the road", after systemd has transitioned the service to the active-running state and the systemctl command has already returned no error to the user at the command line.

As for whether there is some bug in systemd, as comment #3 suggests: sure, that might be the case. Please come up with a simple reproducer which exhibits the problem. Frankly, I have a hard time understanding what the actual problem you see is and what behavior you expect.

At any rate, I think the best solution for sshd would be actual integration with systemd, i.e. making sshd a Type=notify service. See man 3 sd_notify for details. The patch for this should be very small, just a couple of lines of code, and it can easily be maintained downstream in case upstream doesn't care for it.

Michal, thanks for the insight about the possibilities. The notify type should work for sure. I can try to implement a patch for openssh just to give it a try. But as for the current state of systemd, I tested the reproducers once more on current RHEL7.2 and here are the results:
The forking variant, which works for me, can be put together simply by modifying the [Service] section of the sshd.service file:
Type=forking
PIDFile=/var/run/sshd.pid
ExecStart=/usr/sbin/sshd $OPTIONS
Issuing start/restart works just fine if the config is OK. If not, I get the relevant error:
# systemctl start sshd
Job for sshd.service failed because the control process exited with error code. See "systemctl status sshd.service" and "journalctl -xe" for details.
> However, failure happens "down the road" after systemd transitioned service to active-running state and systemctl command already returned no error to the user at the command line.
I don't think this is the case. As mentioned in the original description, systemd knows about the exit status code (code=exited, status=255), but does not report it as a failure (the service is left in the activating state for auto-restart). I thought the auto-restart would be the root of the problem, but getting rid of it didn't help either (starting from the original configuration):
#Restart=on-failure
#RestartSec=42s
# systemctl daemon-reload
# systemctl restart sshd
# systemctl status sshd
[..]
Active: failed (Result: exit-code) since Thu 2015-12-17 10:53:11 CET; 1s ago
The service is now reported as failed, but no error is printed during start/restart (and we need that auto-restart setting anyway).
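
Since `systemctl restart` can return without an error while the unit still ends up failed, a script that depends on sshd can query the unit state afterwards. The following is a rough sketch, assuming a POSIX shell; `unit_ok` is a hypothetical helper name. It reads the KEY=VALUE lines printed by `systemctl show -p ActiveState -p Result` and succeeds only if the unit is active and its result is success.

```shell
# Hypothetical helper: reads KEY=VALUE lines on stdin and returns 0 only
# if ActiveState=active and Result=success were both seen.
unit_ok() {
    active=
    result=
    while IFS='=' read -r key value; do
        case $key in
            ActiveState) active=$value ;;
            Result) result=$value ;;
        esac
    done
    [ "$active" = "active" ] && [ "$result" = "success" ]
}

# Intended use on a systemd host (run as root):
#   systemctl restart sshd
#   systemctl show -p ActiveState -p Result sshd | unit_ok \
#       || echo "sshd did not start cleanly" >&2
```

With Type=simple the daemon may exit slightly after systemctl returns, so a check made immediately after the restart could still see the unit as active; a short delay before the query may be needed.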
See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778913 and https://lists.debian.org/debian-ssh/2015/12/msg00072.html

My tests confirm that changing the unit file to 'forking' and dropping the '-D' option does indeed work as expected.

[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target sshd-keygen.service
Wants=sshd-keygen.service

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/sshd
ExecStart=/usr/sbin/sshd $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2588.html