Bug 1851478
Summary: | openssh-server is enabled, but does not get started on boot | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | customercare |
Component: | systemd | Assignee: | systemd-maint |
Status: | CLOSED UPSTREAM | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | crypto-team, dtardon, dwalsh, jfch, jjelen, lkundrak, lnykryn, mattias.ellert, msekleta, plautrba, ssahani, s, systemd-maint, tmraz, zbyszek |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-08-12 08:20:37 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
customercare
2020-06-26 16:47:40 UTC
Hard to say from what you provided so far. It works on the systems I have all around. Complete systemd debug logs from the boot when the service is not started would be helpful. I also have some devices where it works as expected, thats whats makes it worrysome. I had to reboot: $ journalctl -b 0 |grep sshd Jun 26 22:10:06 XXXXXXX systemd[1]: Reached target sshd-keygen.target. it's enabled: [root# ls -la /etc/systemd/system/multi-user.target.wants/sshd.service lrwxrwxrwx. 1 root root 36 13. Jun 23:30 /etc/systemd/system/multi-user.target.wants/sshd.service -> /usr/lib/systemd/system/sshd.service [root]# ls -la /etc/systemd/system/sockets.target.wants/sshd.socket lrwxrwxrwx. 1 root root 35 1. Jul 2014 /etc/systemd/system/sockets.target.wants/sshd.socket -> /usr/lib/systemd/system/sshd.socket there is no output to debug, it just does not get started, it gets ignored: $ sudo systemd-analyze unit-files | grep sshd ids: sshd@.service → /usr/lib/systemd/system/sshd@.service ids: anaconda-sshd.service → /usr/lib/systemd/system/anaconda-sshd.service ids: sshd-keygen@.service → /usr/lib/systemd/system/sshd-keygen@.service ids: sshd.service → /usr/lib/systemd/system/sshd.service ids: sshd.socket → /usr/lib/systemd/system/sshd.socket ids: sshd-keygen.target → /usr/lib/systemd/system/sshd-keygen.target aliases: anaconda-sshd.service ← anaconda-sshd.service aliases: sshd.service ← sshd.service aliases: sshd.socket ← sshd.socket aliases: sshd-keygen.target ← sshd-keygen.target aliases: sshd@.service ← sshd@.service aliases: sshd-keygen@.service ← sshd-keygen@.service $ sudo systemd-analyze blame 39.099s dracut-initqueue.service 38.759s systemd-cryptsetup@luks\x2dffbd61f2\x2d4c1e\x2d4dc8\x2db12b\x2dc89e9f69c9fa.service 33.846s systemd-cryptsetup@luks\x2dca1f019d\x2d39ce\x2d4cf8\x2db522\x2df6d3e63ebe2a.service 7.638s plymouth-quit-wait.service 4.125s NetworkManager-wait-online.service 2.822s abrtd.service 2.746s systemd-fsck@dev-mapper-luks\x2d7881f602\x2d9462\x2d497d\x2d810a\x2d7d6111ad6085.service 2.546s systemd-udev-settle.service 2.013s systemd-cryptsetup@luks\x2d6d45b281\x2dc56c\x2d40f0\x2dacf7\x2dc3058d4d6913.service 1.944s systemd-cryptsetup@luks\x2d384d6b27\x2d6263\x2d4d53\x2dbfce\x2de7e5bcd221b9.service 1.545s systemd-tmpfiles-clean.service 1.442s systemd-logind.service 1.392s systemd-cryptsetup@luks\x2d7881f602\x2d9462\x2d497d\x2d810a\x2d7d6111ad6085.service 1.349s lvm2-monitor.service 1.265s akmods.service 1.218s udisks2.service 1.106s restorecond.service 890ms initrd-switch-root.service 722ms dmraid-activation.service 717ms upower.service 603ms systemd-machined.service 581ms systemd-udevd.service 530ms systemd-journald.service 520ms systemd-fsck-root.service 512ms accounts-daemon.service 412ms cups.service 388ms smartd.service 355ms dnf-makecache.service 353ms libvirtd.service 324ms fwupd.service 317ms smb.service 301ms user 295ms user 292ms ModemManager.service 231ms polkit.service 227ms avahi-daemon.service 225ms nscd.service 201ms rtkit-daemon.service 191ms systemd-journal-flush.service 189ms switcheroo-control.service 175ms logrotate.service 145ms initrd-parse-etc.service 123ms dbus-broker.service 119ms systemd-vconsole-setup.service 117ms nmb.service 110ms systemd-tmpfiles-setup.service 101ms dracut-cmdline.service 93ms systemd-udev-trigger.service 89ms vboxdrv.service 88ms sata_home.mount 85ms NetworkManager.service 76ms packagekit.service 70ms chronyd.service 67ms sysstat-summary.service 60ms readonly-root.service 58ms lm_sensors.service 52ms rsyslog.service 50ms systemd-fsck@dev-disk-by\x2duuid-221608f2\x2d5914\x2d4619\x2d9ef7\x2d6dfddf233fd4.service 47ms plymouth-switch-root.service 46ms systemd-tmpfiles-setup-dev.service 46ms akmods-shutdown.service 44ms sysstat.service 44ms colord.service 40ms systemd-binfmt.service 39ms netcf-transaction.service Try to boot with systemd debug on kernel commandline [1]. sshd depends on network.target and is invoked by multi-user.target. This is also something to look into whether these two are successfully started/enabled. [1] https://freedesktop.org/wiki/Software/systemd/Debugging/#ifyoucangetashell Everything else starts as it should, but not sshd: [root]# systemctl status network.target ● network.target - Network Loaded: loaded (/usr/lib/systemd/system/network.target; static; vendor preset: disabled) Active: active since Mon 2020-06-29 09:08:46 CEST; 1h 46min ago Docs: man:systemd.special(7) https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget Jun 29 09:08:46 XXXXXXX systemd[1]: Reached target Network. [root]# systemctl status multi-user.target ● multi-user.target - Multi-User System Loaded: loaded (/usr/lib/systemd/system/multi-user.target; static; vendor preset: disabled) Active: active since Mon 2020-06-29 09:08:54 CEST; 1h 46min ago Docs: man:systemd.special(7) Jun 29 09:08:54 XXXXXXX systemd[1]: Reached target Multi-User System. [root]# systemctl status sshd ● sshd.service - OpenSSH server daemon Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled) Active: inactive (dead) Docs: man:sshd(8) man:sshd_config(5) I booted the system with all output enabled and debug-shell enabled ( complete fail is you ask me ) and could not find any warning/error message while booting. I reinstalled openssh-server package, no change. I checked the boot.log for my last boot and ssh does not get mentioned at all. Systemd seems to ignore sshd at all. (out of interest: why does sshd-keygen gets started on each boot?) No sign of sshd at all : # cat /var/log/boot.log|grep ssh [ OK ] Created slice system-sshd\x2dkeygen.slice. [ OK ] Reached target sshd-keygen.target. [ OK ] Created slice system-sshd\x2dkeygen.slice. [ OK ] Reached target sshd-keygen.target. [ OK ] Created slice system-sshd\x2dkeygen.slice. [ OK ] Reached target sshd-keygen.target. drwxr-xr-x. 2 root root 4096 29. Jun 11:21 . drwxr-xr-x. 24 root root 4096 3. Jun 22:34 .. lrwxrwxrwx. 1 root root 37 18. Okt 2016 abrtd.service -> /usr/lib/systemd/system/abrtd.service lrwxrwxrwx. 1 root root 49 3. Dez 2017 abrt-journal-core.service -> /usr/lib/systemd/system/abrt-journal-core.service lrwxrwxrwx. 1 root root 41 18. Okt 2016 abrt-oops.service -> /usr/lib/systemd/system/abrt-oops.service lrwxrwxrwx. 1 root root 43 18. Okt 2016 abrt-vmcore.service -> /usr/lib/systemd/system/abrt-vmcore.service lrwxrwxrwx. 1 root root 41 18. Okt 2016 abrt-xorg.service -> /usr/lib/systemd/system/abrt-xorg.service lrwxrwxrwx. 1 root root 38 1. Jul 2014 akmods.service -> /usr/lib/systemd/system/akmods.service lrwxrwxrwx. 1 root root 47 1. Jul 2014 akmods-shutdown.service -> /usr/lib/systemd/system/akmods-shutdown.service lrwxrwxrwx. 1 root root 35 12. Dez 2013 atd.service -> /usr/lib/systemd/system/atd.service lrwxrwxrwx. 1 root root 38 12. Dez 2013 auditd.service -> /usr/lib/systemd/system/auditd.service lrwxrwxrwx. 1 root root 44 12. Dez 2013 avahi-daemon.service -> /usr/lib/systemd/system/avahi-daemon.service lrwxrwxrwx. 1 root root 39 12. Dez 2013 chronyd.service -> /usr/lib/systemd/system/chronyd.service lrwxrwxrwx. 1 root root 37 12. Dez 2013 crond.service -> /usr/lib/systemd/system/crond.service lrwxrwxrwx. 1 root root 33 12. Dez 2013 cups.path -> /usr/lib/systemd/system/cups.path lrwxrwxrwx. 1 root root 36 5. Mai 14:01 cups.service -> /usr/lib/systemd/system/cups.service lrwxrwxrwx. 1 root root 39 12. Jun 2018 dbxtool.service -> /usr/lib/systemd/system/dbxtool.service lrwxrwxrwx. 1 root root 40 28. Dez 2018 fail2ban.service -> /usr/lib/systemd/system/fail2ban.service lrwxrwxrwx. 1 root root 56 27. Nov 2019 flatpak-add-fedora-repos.service -> /usr/lib/systemd/system/flatpak-add-fedora-repos.service lrwxrwxrwx. 1 root root 42 12. Dez 2013 irqbalance.service -> /usr/lib/systemd/system/irqbalance.service lrwxrwxrwx. 1 root root 40 1. Jul 2014 libvirtd.service -> /usr/lib/systemd/system/libvirtd.service lrwxrwxrwx. 1 root root 42 1. Jul 2014 lm_sensors.service -> /usr/lib/systemd/system/lm_sensors.service lrwxrwxrwx. 1 root root 41 12. Dez 2013 mdmonitor.service -> /usr/lib/systemd/system/mdmonitor.service lrwxrwxrwx. 1 root root 44 12. Dez 2013 ModemManager.service -> /usr/lib/systemd/system/ModemManager.service lrwxrwxrwx. 1 root root 49 12. Mai 2015 netcf-transaction.service -> /usr/lib/systemd/system/netcf-transaction.service lrwxrwxrwx. 1 root root 46 12. Dez 2013 NetworkManager.service -> /usr/lib/systemd/system/NetworkManager.service ################################################################################################ lrwxrwxrwx. 1 root root 34 22. Nov 2014 nfs.target -> /usr/lib/systemd/system/nfs.target ################################################################################################ lrwxrwxrwx. 1 root root 35 4. Jun 13:12 nmb.service -> /usr/lib/systemd/system/nmb.service lrwxrwxrwx. 1 root root 36 18. Nov 2017 nscd.service -> /usr/lib/systemd/system/nscd.service lrwxrwxrwx. 1 root root 40 12. Dez 2013 remote-fs.target -> /usr/lib/systemd/system/remote-fs.target lrwxrwxrwx. 1 root root 43 1. Jul 2014 restorecond.service -> /usr/lib/systemd/system/restorecond.service lrwxrwxrwx. 1 root root 36 12. Dez 2013 rngd.service -> /usr/lib/systemd/system/rngd.service lrwxrwxrwx. 1 root root 39 1. Jul 2014 rsyslog.service -> /usr/lib/systemd/system/rsyslog.service lrwxrwxrwx. 1 root root 38 1. Jul 2014 smartd.service -> /usr/lib/systemd/system/smartd.service lrwxrwxrwx. 1 root root 35 4. Jun 13:12 smb.service -> /usr/lib/systemd/system/smb.service ###################### disabled/enabled sshd.service to remove odd entries: lrwxrwxrwx. 1 root root 36 29. Jun 11:21 sshd.service -> /usr/lib/systemd/system/sshd.service lrwxrwxrwx. 1 root root 39 30. Sep 2016 sysstat.service -> /usr/lib/systemd/system/sysstat.service lrwxrwxrwx. 1 root root 39 22. Jun 10:01 vboxdrv.service -> /usr/lib/systemd/system/vboxdrv.service [root]# systemctl disable nfs-server nfs.target is still there.. # rpm -qf /usr/lib/systemd/system/nfs.target Fehler: Datei /usr/lib/systemd/system/nfs.target: Datei oder Verzeichnis nicht gefunden Fail2ban does not get started too.. As sshd depends on sshd-keygen ( not sure why ): # systemctl status sshd-keygen.target ● sshd-keygen.target Loaded: loaded (/usr/lib/systemd/system/sshd-keygen.target; static; vendor preset: disabled) Active: active since Mon 2020-06-29 11:13:56 CEST; 18min ago Jun 29 11:13:56 XXXXXXXXXXXXX systemd[1]: Reached target sshd-keygen.target. Ok, one interessting observation: sshd seems to get started looooooooooong after the boot and something stops journald and rsyslogd from logging the events correctly. 4h after a boot on the 28th, sshd shows first signs of activity in the logs. Something is very odd here. > (out of interest: why does sshd-keygen gets started on each boot?)
It makes sure ssh host keys are generated before the start of sshd service. It is most important on first boot, but it makes sure they are regenerated in case they are removed for some reason.
Obvious follow-up question would be if you have some specific configuration in this host where you observe the issue, that would be different than in other hosts.
No, nothing special. I don't see this being caused by openssh. You pointed out that also fail2ban fails to start so I assume there is something wrong in systemd. If not, they will be hopefully able to point you to right direction how to debug this further. fail2ban has openssh as a start dependency, if openssh does not start, fail2ban will neither. But yes, a systemd debug intervention sounds good :D Update: Since 29th of June the server seems to get started at boottime again, which seems to get caused by a reinstall of openssh-server on the 29th : # grep ssh dnf.rpm.log 2020-06-03T20:34:25Z SUBDEBUG Upgraded: x11-ssh-askpass-1.2.4.1-26.fc30.x86_64 2020-06-03T20:38:28Z SUBDEBUG Upgraded: apache-sshd-1:2.2.0-2.fc30.noarch 2020-06-03T20:41:54Z SUBDEBUG Upgraded: qemu-block-ssh-2:3.1.1-2.fc30.x86_64 2020-06-03T20:43:16Z SUBDEBUG Upgraded: libssh2-1.9.0-3.fc30.x86_64 2020-06-03T20:46:14Z SUBDEBUG Upgraded: openssh-clients-8.0p1-5.fc30.x86_64 2020-06-03T20:46:14Z SUBDEBUG Upgraded: openssh-8.0p1-5.fc30.x86_64 2020-06-03T20:46:28Z SUBDEBUG Upgraded: libssh-0.9.3-1.fc30.x86_64 2020-06-03T20:46:58Z SUBDEBUG Upgraded: libssh-config-0.9.3-1.fc30.noarch 2020-06-29T09:07:41Z SUBDEBUG Reinstall: openssh-server-8.1p1-1.fc31.x86_64 2020-06-29T09:07:42Z SUBDEBUG Reinstalled: openssh-server-8.1p1-1.fc31.x86_64 -- Reboot -- Jun 29 11:59:24 XXXXXXX systemd[1]: Starting OpenSSH server daemon... Jun 29 11:59:24 XXXXXXX sshd[8366]: Server listening on 0.0.0.0 port 22. Jun 29 11:59:24 XXXXXXX sshd[8366]: Server listening on :: port 22. Jun 29 11:59:24 XXXXXXX systemd[1]: Started OpenSSH server daemon. -- Reboot -- Jun 30 14:30:08 XXXXXXX systemd[1]: Starting OpenSSH server daemon... Jun 30 14:30:08 XXXXXXX sshd[18743]: Server listening on 0.0.0.0 port 22. Jun 30 14:30:08 XXXXXXX sshd[18743]: Server listening on :: port 22. Jun 30 14:30:08 XXXXXXX systemd[1]: Started OpenSSH server daemon. -- Reboot -- Jul 01 08:46:53 XXXXXXX systemd[1]: Starting OpenSSH server daemon... Jul 01 08:46:53 XXXXXXX sshd[1742]: Server listening on 0.0.0.0 port 22. Jul 01 08:46:53 XXXXXXX sshd[1742]: Server listening on :: port 22. Jul 01 08:46:53 XXXXXXX systemd[1]: Started OpenSSH server daemon. Not sure how I missed it, but from comment #2, it looks like you had both sshd.socket and sshd.service enabled (do I read it right? Can you try to re-enable the socket and reproduce the bug?). This is something that is not handled well in systemd (these conflicts) and systemd during boot time removes both of them from the boot sequence for some weird reason. We were reporting this to systemd before but no luck in getting this resolved. See the RHEL bug #1327332. Reading again through this, sshd.socket conflicts with sshd.service, but not the other way round. Could you try if you add the Conflicts in the other direction too, if it will work better? sshd is running and sshd.socket is dead now. Not sure, if it was running, while sshd.service was not. $ systemctl status sshd.socket ● sshd.socket - OpenSSH Server Socket Loaded: loaded (/usr/lib/systemd/system/sshd.socket; enabled; vendor preset: disabled) Active: inactive (dead) Docs: man:sshd(8) man:sshd_config(5) Listen: [::]:22 (Stream) Accepted: 0; Connected: 0; If this will solve the problem, i will disable sshd.socket sshd.socket is disabled by default so I assume it was enabled for some reason. This command from comment #2 confirms that it was enabled: [root]# ls -la /etc/systemd/system/sockets.target.wants/sshd.socket lrwxrwxrwx. 1 root root 35 1. Jul 2014 /etc/systemd/system/sockets.target.wants/sshd.socket -> /usr/lib/systemd/system/sshd.socket Reinstall of openssh probably returned it back to normal and disabled it. It is disabled by distro, but enabled on this system: Loaded: loaded (/usr/lib/systemd/system/sshd.socket; enabled; vendor preset: disabled) I think, the reinstall moved it back "infront" of sshd.socket, which now fails to start because sshd is running, it's mutal exclusive, isn't it? $ systemctl start sshd.socket ... Jul 02 14:59:35 X systemd[26256]: sshd.socket: Failed to create listening socket ([::]:22): Address already in use Jul 02 14:59:35 X systemd[1]: sshd.socket: Failed to receive listening socket ([::]:22): Input/output error Jul 02 14:59:35 X systemd[1]: sshd.socket: Failed to listen on sockets: Input/output error Jul 02 14:59:35 X systemd[1]: sshd.socket: Failed with result 'resources'. Jul 02 14:59:35 X systemd[1]: Failed to listen on OpenSSH Server Socket. perfect. I will disable it permanently. Right. Good to hear it is resolved now. But from time to time somebody appears who enables both of these and then neither of them works. There is some description what Conflicts= keyword does with units but from it it sounds like removing both of the conflicting service and socket is expected behavior (srsly?). Lets keep it open so systemd developers can have a look into this. This message is a reminder that Fedora 31 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. This message is a reminder that Fedora 32 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '32'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 32 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. There's a few different angles to this issue. 1. It is possible to do 'systemctl enable sshd.socket sshd.service' even though they have Conflicts We don't do any kind of detection for this when enabling units. Only when an actual transaction is built, systemd will notice conflicting start jobs and report the issue and not start some of the units. In this particular case it is indeed unambiguous and could be "seen" fairly easily. But in general it's not possible to figure out, short of executing the transaction on a running system. Units can have Condition* settings, which means that they might not be started despite being enabled. Also, such a situation might actually be intentional, for example when one of the targets is intended to be started later. (For example, we could have a situation where sshd.service is started as part of multi-user.target, but when the user later on starts some special other target, they want sshd.service stopped and the sshd.socket started, to reduce resource usage. So it might be OK to have two units Conflict and without any Conditon*, and one Wanted by a target that also Wants the other unit.) And from yet another side, additional units might be created by generators, and create additional dependency problems, even though no such problems were visible in the static units on disk. So I think it's reasonable not do such verification at enable time: it would be fairly hard to do correctly, and would certainly yield false positives and not catch other cases anyway. Verification when building the transaction must suffice. 2. Systemctl kicks both sshd.socket and sshd.service from the transaction. This is definitely a bug. Upstream issue: https://github.com/systemd/systemd/issues/19650. 3. sshd.socket is stupid. sshd.socket that starts a separate sshd instance for each connection is something almost nobody wants. It's inefficient, and doesn't allow to switch to permanently running sshd very nicely, and it's confusing. The only reason this was added is because sshd does not support socket activation. Let's maybe also fix that: https://bugzilla.redhat.com/show_bug.cgi?id=1961785 (bumping to rawhide as you confirmed that this still reproduces in git master) |