Description of problem:

In https://github.com/cockpit-project/cockpit/issues/18310 we got a report of leaked sss_ssh_knownhostsproxy processes, which eat up quite a lot of RAM and keep SSH connections to target hosts open even after the parent ssh client has gone away.

The user logs in to cockpit locally, then starts a remote cockpit session through SSH (cockpit-ssh in particular, which uses libssh), then logs out. Logging out sends SIGTERM to the cockpit-ssh process. That process goes away, but its sss_ssh_knownhostsproxy child does not exit; it gets reparented to pid 1 and keeps the SSH connection open.

Version-Release number of selected component (if applicable):

sssd-common-2.8.2-1.fc37.x86_64
libssh-0.10.4-2.fc37.x86_64
cockpit-bridge-289-1.fc37.x86_64

How reproducible: Always

Steps to Reproduce:

1. Join a machine to a FreeIPA domain, and log in as an IPA user. This should create /etc/ssh/ssh_config.d/04-ipa.conf with a ProxyCommand for sss_ssh_knownhostsproxy.
2. Set up an SSH key and add it to ~/.ssh/authorized_keys; you should be able to do "ssh `hostname`" *without* an "unknown host key" prompt (thanks to sss_ssh_knownhostsproxy) and *without* a password prompt (due to using key login).
3. dnf install cockpit-bridge
4. Run an SSH session through libssh, and kill it:

   (printf '\n\n\n\n\n\n'; sleep 20) | /usr/libexec/cockpit-ssh `hostname` & sleep 1 && pkill -e cockpit-ssh

Actual results:

The SSH logind session hangs on shutdown:

    Since: Tue 2023-04-11 05:22:06 UTC; 1min 36s ago
    Leader: 2935
    TTY: web console
    Remote: ::ffff:172.27.0.2
    Service: cockpit; type web; class user
    State: closing
    Unit: session-11.scope
          └─3025 /usr/bin/sss_ssh_knownhostsproxy -p 22 x0.cockpit.lan

The cockpit-ssh process is gone, but there are three leaked processes:

    admin@c+ 5572 0.0 0.8 16624  5632 pts/1 S  07:40 0:00 /usr/bin/sss_ssh_knownhostsproxy -p 22 x0.cockpit.lan
    root     5573 0.0 2.0 47060 13184 ?     Ss 07:40 0:00 sshd: admin [priv]
    admin@c+ 5594 0.0 1.1 47060  7320 ?     S  07:40 0:00 sshd: admin@notty

strace -p 5572 says

    restart_syscall(<... resuming interrupted read ...>

but it is not clear what it is trying to read from.

This does *not* reproduce with "ssh `hostname` sleep 20" and killing that ssh process. So this is some condition that only libssh triggers.

I know that this isn't an ideal reproducer for you. Do you have some idea how to debug that further? Enable some debug logging or so? (It's a user process, so it can't log to /var/log/sssd/.)

Thanks!
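One low-effort way to narrow down what a stuck process like this is reading from is to look at its open file descriptors under /proc. The following is only a sketch and not part of the original report: the function name list_fds is made up for illustration, and it assumes a Linux /proc filesystem and permission to inspect the target process (same user, or root).

```shell
#!/bin/sh
# list_fds PID: print each open file descriptor of PID and what it points to.
# Hypothetical helper; assumes Linux /proc and read access to /proc/PID/fd.
list_fds() {
    pid="$1"
    if [ ! -d "/proc/$pid/fd" ]; then
        echo "no such process (or no /proc): $pid" >&2
        return 1
    fi
    for fd in "/proc/$pid/fd"/*; do
        # readlink shows the target, e.g. "socket:[12345]" or "pipe:[678]";
        # skip entries that cannot be read (or an unmatched glob).
        target=$(readlink "$fd" 2>/dev/null) || continue
        printf '%s -> %s\n' "${fd##*/}" "$target"
    done
}
```

For the process above this would be called as "list_fds 5572". Socket inode numbers in the output can then be matched against "ss -xp" (Unix sockets) or "ss -tnp" (TCP) to see which peer, if any, is still holding the other end.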
The man page for ssh_config says:

    ProxyCommand
        Specifies the command to use to connect to the server. The command string extends to the end of the line, and is executed using the user's shell ‘exec’ directive to avoid a lingering shell process.

        Arguments to ProxyCommand accept the tokens described in the TOKENS section. The command can be basically anything, and should read from its standard input and write to its standard output. It should eventually connect an sshd(8) server running on some machine, or execute sshd -i somewhere. Host key management will be done using the Hostname of the host being connected (defaulting to the name typed by the user). Setting the command to none disables this option entirely. Note that CheckHostIP is not available for connects with a proxy command.

        This directive is useful in conjunction with nc(1) and its proxy support. For example, the following directive would connect via an HTTP proxy at 192.0.2.0:

            ProxyCommand /usr/bin/nc -X connect -x 192.0.2.0:8080 %h %p

* https://man.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/ssh_config.5

I am not an expert, but maybe libssh should take care of "closing" the `ProxyCommand`.
Alexey: Good call -- but indeed RHEL 8 doesn't support it yet (we would really like to use it for Cockpit as well, but it's annoying that we can't yet).

Thanks Lukas for pointing that out! Indeed, I reproduced this completely independently of sssd. New reproducer:

1. dnf install cockpit-bridge netcat
2. Set up an SSH key and add it to ~/.ssh/authorized_keys; you should be able to do "ssh localhost" *without* an "unknown host key" prompt (i.e. accept it for the first time) and *without* a password prompt (due to using key login).
3. Set up a dummy ProxyCommand config (make sure to do this with a test user account):

   printf 'Host dummyproxy\nHostname localhost\nProxyCommand nc %%h %%p\n' > ~/.ssh/config

4. Run a proxied SSH session through ssh(1), and ensure that it works:

   ssh dummyproxy

   Check that `pgrep -a nc` shows the `nc localhost 22` proxy command launched by ssh.
5. Run an SSH session through libssh (using cockpit-ssh as client), and kill it:

   (printf '\n\n\n\n\n\n'; sleep 20) | /usr/libexec/cockpit-ssh dummyproxy & sleep 1 && pkill -e cockpit-ssh

After step 5, `pgrep -a nc` shows the leaked `nc` process.

But now I realize that libssh probably shouldn't install a SIGTERM signal handler to clean this up, as that's awkward in libraries. I suppose this should happen in cockpit-ssh, which should shut down the SSH connection properly.
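The idea that the application, rather than the library, should react to SIGTERM and tear down what it started can be illustrated with a small shell analogy. This is not the actual cockpit-ssh change, just a sketch of the pattern: the function name run_with_cleanup is hypothetical, and the wrapped command stands in for a ProxyCommand child such as nc.

```shell
#!/bin/sh
# run_with_cleanup CMD...: start CMD as a background child and make sure it
# is terminated when this shell receives SIGTERM or SIGINT.
# Hypothetical helper for illustration only.
run_with_cleanup() {
    "$@" &
    child=$!
    # On SIGTERM/SIGINT, forward a SIGTERM to the child instead of leaking it.
    trap 'kill "$child" 2>/dev/null' TERM INT
    # wait returns early (status > 128) if a trapped signal arrives.
    status=0
    wait "$child" || status=$?
    trap - TERM INT
    # If wait was interrupted, the trap has killed the child; reap it now.
    wait "$child" 2>/dev/null || true
    return "$status"
}
```

A call such as "run_with_cleanup sleep 60" behaves like plain "sleep 60" when left alone, but sending SIGTERM to the wrapping shell also terminates the sleep instead of leaving it to be reparented to pid 1.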
https://github.com/cockpit-project/cockpit/pull/18632
FEDORA-2023-bc7e3718bc has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-bc7e3718bc
FEDORA-2023-363cf1cea2 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-363cf1cea2
FEDORA-2023-bc7e3718bc has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-bc7e3718bc` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-bc7e3718bc See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-363cf1cea2 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-363cf1cea2` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-363cf1cea2 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-bc7e3718bc has been pushed to the Fedora 38 stable repository. If the problem still persists, please make note of it in this bug report.