Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Restarting sshd kills active connections|
|Product:||[Fedora] Fedora||Reporter:||Ben Webb <ben>|
|Component:||openssh||Assignee:||Jan F. Chadima <jchadima>|
|Status:||CLOSED NOTABUG||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||16||CC:||jchadima, mattias.ellert, mgrepl, michal, plautrba, tmraz|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2011-11-28 04:39:24 EST||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description Ben Webb 2011-11-23 14:25:42 EST
Description of problem:
If sshd is restarted with 'sudo systemctl restart sshd.service' not only is the sshd binary killed, but all children. This forcibly logs out anybody currently connected via ssh. Also, if sshd is being upgraded by yum over an ssh connection, cleanup of the old openssh-server package fails, because the script tries to restart sshd (and thus kills the session, including yum). The old package must be manually removed with 'rpm -e --noscripts'. This seems to be a problem with the systemd unit for sshd introduced in F16; restarts work OK on F15 or F14 systems via the old init scripts.

Version-Release number of selected component (if applicable):
openssh-server-5.8p2-21.fc16.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. ssh myserver
2. myserver$ sudo systemctl restart sshd.service

Actual results:
myserver$ sudo systemctl restart sshd.service
Connection to myserver closed by remote host.
Connection to myserver closed.

Expected results:
Main sshd process is restarted but active sessions are unaffected.

Additional info:
The main sshd process is at least successfully restarted, so we can log back in. But the cleanup is a nuisance.
The problem seems to be that in F16, everything (including connected ssh sessions) ends up in the sshd.service cgroup:

myserver$ systemctl status sshd.service
sshd.service - OpenSSH server daemon
          Loaded: loaded (/lib/systemd/system/sshd.service; enabled)
          Active: active (running) since Mon, 21 Nov 2011 07:02:32 -0800; 2 days ago
        Main PID: 12450 (sshd)
          CGroup: name=systemd:/system/sshd.service
                  ├ 11258 sshd: ben [priv]
                  ├ 11261 sshd: ben@pts/0
                  ├ 11262 -bash
                  ├ 11284 systemctl status sshd.service
                  └ 12450 /usr/sbin/sshd -D

Whereas on a F15 machine only the main sshd service is in there:

f15server$ systemctl status sshd.service
sshd.service - LSB: Start up the OpenSSH server daemon
          Loaded: loaded (/etc/rc.d/init.d/sshd)
          Active: active (running) since Wed, 23 Nov 2011 11:01:40 -0800; 19min ago
         Process: 10394 ExecStop=/etc/rc.d/init.d/sshd stop (code=exited, status=0/SUCCESS)
         Process: 10405 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
        Main PID: 10412 (sshd)
          CGroup: name=systemd:/system/sshd.service
                  └ 10412 /usr/sbin/sshd
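The cgroup membership shown above can also be checked per process, without systemctl, by reading /proc/<pid>/cgroup. The following is a minimal sketch that assumes the cgroup v1 line layout ("hierarchy-id:controllers:path") matching the name=systemd hierarchy in the status output. The function names and the SAMPLE text are hypothetical and not taken from this report.

```python
def systemd_cgroup_path(proc_cgroup_text):
    """Return the path in the name=systemd hierarchy, or None if absent."""
    for line in proc_cgroup_text.splitlines():
        # Each cgroup v1 line reads "hierarchy-id:controllers:path".
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and parts[1] == "name=systemd":
            return parts[2]
    return None


def is_in_sshd_service(proc_cgroup_text):
    """True if the process shares the sshd.service cgroup."""
    path = systemd_cgroup_path(proc_cgroup_text)
    if path is None:
        return False
    return path.rstrip("/").split("/")[-1] == "sshd.service"


# Hypothetical /proc/<pid>/cgroup contents for a login shell such as PID 11262.
SAMPLE = "2:cpu:/\n1:name=systemd:/system/sshd.service\n"
print(is_in_sshd_service(SAMPLE))
```

On a live system the text would come from open("/proc/<pid>/cgroup").read() for the shell's PID. A login shell that reports the sshd.service path is in the same cgroup as the daemon, which is the situation described in this report.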
Comment 1 Tomas Mraz 2011-11-23 15:06:20 EST
This happens only when there is no pam_systemd in the /etc/pam.d/password-auth. What's in your /etc/pam.d/password-auth?
Comment 2 Ben Webb 2011-11-23 15:32:43 EST
(In reply to comment #1)
> This happens only when there is no pam_systemd in the /etc/pam.d/password-auth.

Ah, that's it, thanks. Our configuration files were inherited from pre-systemd days. With pam_systemd added in, sshd restarts work successfully now.
Comment 3 Michal Jaegermann 2011-11-27 13:57:53 EST
(In reply to comment #1)
> This happens only when there is no pam_systemd in the /etc/pam.d/password-auth.
> What's in your /etc/pam.d/password-auth?

Apparently this is only part of the story. On a system I just switched from F14 to F16 I do have '-session optional pam_systemd.so' in /etc/pam.d/password-auth. Still '/bin/systemctl try-restart sshd.service' immediately drops all connections. Moreover this left me with the following after the last updates:

Nov 27 11:02:08 Updated: glibc-common-2.14.90-19.x86_64
Nov 27 11:02:13 Updated: glibc-2.14.90-19.x86_64
Nov 27 11:02:13 Updated: openssh-5.8p2-22.fc16.x86_64
Nov 27 11:02:15 Updated: glibc-headers-2.14.90-19.x86_64
Nov 27 11:02:16 Updated: glibc-devel-2.14.90-19.x86_64
Nov 27 11:02:17 Updated: openssh-server-5.8p2-22.fc16.x86_64
Nov 27 11:02:17 Updated: openssh-clients-5.8p2-22.fc16.x86_64

and no transaction cleanup, so all these are now duplicates, with the strange exception of glibc-devel. As an added attraction, an attempt to run yum-complete-transaction to clean up that mess ended up with:

Transaction size changed - this means we are not doing the
same transaction as we were before. Aborting and disabling
this transaction.

Very nice, indeed!

It does not matter if in /etc/pam.d/password-auth I have

-session optional pam_systemd.so

or

session optional pam_systemd.so

The effects of "try-restart" are exactly the same. BTW - I tried to find out in the pam documentation what "-session" may mean, as opposed to "session", and I am still in the dark.

Curiously enough my rawhide installation, continuously updated for a very long time, and with a similar password-auth, is NOT killing ssh connections on this "try-restart".
Comment 4 Michal Jaegermann 2011-11-27 14:41:25 EST
Hm, on rawhide openssh happens to be now openssh-server-5.9p1-13.fc17 while the one updated on F16 is openssh-5.8p2-22.fc16. OTOH I do not remember this problem on rawhide for a long, long time. To make it even more annoying it is also impossible to run 'package-cleanup --cleandupes' on a remote machine as this not only drops connections but also abandons a transaction so 'rpm -e --noscripts ...' is required. I do not see any real differences between password-auth from rawhide (no problems with sshd restarts) and F16.
Comment 5 Tomas Mraz 2011-11-28 02:29:30 EST
Michal, do I understand it right that you have password-auth in /etc/pam.d/sshd and pam_systemd in /etc/pam.d/password-auth, and still 'systemctl try-restart sshd.service' drops your ssh connection? That would be a bug in systemd or pam_systemd then.
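The condition asked about above (pam_systemd reachable from /etc/pam.d/sshd through password-auth) can be checked mechanically. This is only a sketch: it assumes the files are linked through the Linux-PAM "include" or "substack" controls, takes the file contents as a dict rather than reading /etc/pam.d, and uses a hypothetical function name and sample configuration that are not from this report.

```python
def stack_uses_module(service, configs, module="pam_systemd.so",
                      kind="session", _seen=None):
    """Return True if `module` appears in the `kind` stack of `service`.

    `configs` maps a PAM file name (e.g. "sshd") to its text content.
    """
    seen = _seen if _seen is not None else set()
    if service in seen or service not in configs:
        return False  # avoid include loops and unknown files
    seen.add(service)
    for raw in configs[service].splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) < 3:
            continue
        # A leading '-' on the type only silences "module missing" logging.
        line_type = fields[0][1:] if fields[0].startswith("-") else fields[0]
        if line_type != kind:
            continue
        if fields[1] in ("include", "substack"):
            if stack_uses_module(fields[2], configs, module, kind, seen):
                return True
        # Scan the remaining fields so bracketed controls with spaces still work.
        elif any(f.split("/")[-1] == module for f in fields[2:]):
            return True
    return False


# Hypothetical file contents, for illustration only.
configs = {
    "sshd": "auth include password-auth\nsession include password-auth\n",
    "password-auth": "session required pam_unix.so\n"
                     "-session optional pam_systemd.so\n",
}
print(stack_uses_module("sshd", configs))
```

If this returns True for the real files and the connection is still dropped on try-restart, the PAM configuration is probably not the cause.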
Comment 6 Tomas Mraz 2011-11-28 02:31:28 EST
Also the '-' before the pam entry means that if the module is missing on the system the pam library will not report it in the syslog. It is documented in the pam.conf(5) manpage.
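To illustrate the '-' prefix, here is a small sketch that splits a PAM rule into its fields and records whether the leading '-' is present. It assumes simple one-word controls such as "optional" (bracketed controls are not handled), and the PamRule type and parse_pam_rule name are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PamRule(NamedTuple):
    type: str
    control: str
    module: str
    args: List[str]
    quiet_if_missing: bool  # True when the type was written with a leading '-'


def parse_pam_rule(line: str) -> Optional[PamRule]:
    """Parse one /etc/pam.d rule with a simple (non-bracketed) control."""
    text = line.split("#", 1)[0].strip()  # ignore comments and blank lines
    if not text:
        return None
    quiet = text.startswith("-")
    if quiet:
        text = text[1:]
    fields = text.split()
    if len(fields) < 3:
        return None
    return PamRule(fields[0], fields[1], fields[2], fields[3:], quiet)


for rule in ("-session optional pam_systemd.so",
             "session optional pam_systemd.so"):
    print(parse_pam_rule(rule))
```

Both lines parse to the same type, control and module; only quiet_if_missing differs. This matches the observation in comment 3 that the two spellings behave the same when the module is installed.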
Comment 7 Tomas Mraz 2011-11-28 04:39:24 EST
Let's track this in bug 757545 as the original reporter of this bug did not have the pam_systemd in the configuration.