Bug 2185785 - sss_ssh_knownhostsproxy does not exit after disconnect from libssh, leaks memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: cockpit
Version: 37
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Martin Pitt
QA Contact: Fedora Extras Quality Assurance
Reported: 2023-04-11 07:46 UTC by Martin Pitt
Modified: 2023-04-28 02:35 UTC

Fixed In Version: cockpit-290-1.fc38
Last Closed: 2023-04-28 02:35:41 UTC
Type: Bug

Links
System: Github
ID: cockpit-project/cockpit/issues/18310
Private: 0
Priority: None
Status: closed
Summary: cockpit-bridge leaks sss_ssh_knownhostsproxy and memory
Last Updated: 2023-04-13 15:24:42 UTC

Description Martin Pitt 2023-04-11 07:46:36 UTC
Description of problem: In https://github.com/cockpit-project/cockpit/issues/18310 we got a report of leaked sss_ssh_knownhostsproxy processes, which eat up quite a lot of RAM and keep SSH connections to the target hosts open even after the parent ssh client has gone away.

The user logs in to cockpit locally, then starts a remote cockpit session through SSH (cockpit-ssh in particular, which uses libssh), then logs out. Logging out SIGTERMs the cockpit-ssh process. That process then goes away, but its sss_ssh_knownhostsproxy child does not exit; instead it gets reparented to pid 1 and keeps the SSH connection open.

Version-Release number of selected component (if applicable):

sssd-common-2.8.2-1.fc37.x86_64
libssh-0.10.4-2.fc37.x86_64
cockpit-bridge-289-1.fc37.x86_64

How reproducible: Always

Steps to Reproduce:
1. Join a machine to a FreeIPA domain, and log in as an IPA user. This should create /etc/ssh/ssh_config.d/04-ipa.conf with a ProxyCommand for sss_ssh_knownhostsproxy.
2. Set up an SSH key and add it to ~/.ssh/authorized_keys; you should be able to do "ssh `hostname`" *without* an "unknown host key" prompt (thanks to sss_ssh_knownhostsproxy) and *without* a password prompt (due to using key login).
3. dnf install cockpit-bridge
4. Run an SSH session through libssh, and kill it:
   (printf '\n\n\n\n\n\n'; sleep 20) | /usr/libexec/cockpit-ssh `hostname` & sleep 1 && pkill -e cockpit-ssh

Actual results:

The SSH logind session hangs on shutdown:

           Since: Tue 2023-04-11 05:22:06 UTC; 1min 36s ago
          Leader: 2935
             TTY: web console
          Remote: ::ffff:172.27.0.2
         Service: cockpit; type web; class user
           State: closing
            Unit: session-11.scope
                  └─3025 /usr/bin/sss_ssh_knownhostsproxy -p 22 x0.cockpit.lan

The cockpit-ssh process is gone, but there are three leaked processes:

admin@c+    5572  0.0  0.8  16624  5632 pts/1    S    07:40   0:00 /usr/bin/sss_ssh_knownhostsproxy -p 22 x0.cockpit.lan
root        5573  0.0  2.0  47060 13184 ?        Ss   07:40   0:00 sshd: admin [priv]
admin@c+    5594  0.0  1.1  47060  7320 ?        S    07:40   0:00 sshd: admin@notty
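
The leaked processes listed above can also be found programmatically. The following is only a sketch, assuming Linux with /proc mounted and assuming that orphans end up under pid 1 as observed here (a subreaper would change that). `parse_stat` and `orphans_named` are hypothetical helper names, not part of sssd, libssh or cockpit.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, ppid) parsed from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split at the *last* closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition(" (")
    fields = tail.split()
    # fields[0] is the process state, fields[1] is the parent pid.
    return int(pid_text), comm, int(fields[1])


def orphans_named(name, parent=1):
    """List pids whose parent is `parent` and whose comm matches `name`."""
    wanted = name[:15]  # the kernel truncates comm to 15 characters
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", errors="replace") as stat_file:
                pid, comm, ppid = parse_stat(stat_file.read())
        except OSError:
            continue  # the process exited while we were scanning
        if ppid == parent and comm == wanted:
            found.append(pid)
    return sorted(found)
```

For example, `orphans_named("sss_ssh_knownhostsproxy")` should be empty before the reproducer runs and list the leaked pid afterwards.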

strace -p 5572 says

    restart_syscall(<... resuming interrupted read ...>

but it is not clear what it is trying to read from.
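
One way to narrow down what the process is blocked on is to look at its open file descriptors. This is a sketch under the assumption that the system is Linux and that the caller is the same user as the target process (or root); `open_fds` is a hypothetical helper name.

```python
import os


def open_fds(pid):
    """Map each open fd number of `pid` to its link target via /proc (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
    return targets
```

Calling `open_fds(5572)` on the leaked process would show entries such as `socket:[inode]` or `pipe:[inode]`; the inode can then be matched against the output of `ss -p` or `lsof` to see which peer is still holding the other end.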

This does *not* reproduce with "ssh `hostname` sleep 20" and killing that ssh process. So this is some condition that only libssh triggers.

I know that this isn't an ideal reproducer for you. Do you have any idea how to debug this further? Enable some debug logging, perhaps? (It's a user process, so it can't log to /var/log/sssd/.)

Thanks!

Comment 1 Lukas Slebodnik 2023-04-11 14:29:33 UTC
The man page for ssh_config says:


ProxyCommand
    Specifies the command to use to connect to the server. The command string extends to the end of the line, and is executed using the user's shell ‘exec’ directive to avoid a lingering shell process.

    Arguments to ProxyCommand accept the tokens described in the TOKENS section. The command can be basically anything, and should read from its standard input and write to its standard output. It should eventually connect an sshd(8) server running on some machine, or execute sshd -i somewhere. Host key management will be done using the Hostname of the host being connected (defaulting to the name typed by the user). Setting the command to none disables this option entirely. Note that CheckHostIP is not available for connects with a proxy command.

    This directive is useful in conjunction with nc(1) and its proxy support. For example, the following directive would connect via an HTTP proxy at 192.0.2.0:

    ProxyCommand /usr/bin/nc -X connect -x 192.0.2.0:8080 %h %p

* https://man.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/ssh_config.5

I am not an expert, but maybe libssh should take care of "closing" the `ProxyCommand`.
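
To illustrate the contract quoted from the man page (read from standard input, write to standard output), here is a toy relay loop. It is purely illustrative and is not how nc or sss_ssh_knownhostsproxy are implemented; `relay` is a hypothetical name, and the choice to return as soon as either side reaches EOF is an assumption about what a well-behaved proxy might do.

```python
import os
import selectors
import socket


def relay(sock, in_fd, out_fd):
    """Copy in_fd -> sock and sock -> out_fd until either side reaches EOF."""
    sel = selectors.DefaultSelector()
    sel.register(in_fd, selectors.EVENT_READ, "stdin")
    sel.register(sock, selectors.EVENT_READ, "socket")
    try:
        while True:
            for key, _events in sel.select():
                if key.data == "stdin":
                    data = os.read(in_fd, 65536)
                    if not data:
                        # The client closed its end: stop instead of lingering.
                        return "stdin closed"
                    sock.sendall(data)
                else:
                    data = sock.recv(65536)
                    if not data:
                        return "server closed"
                    os.write(out_fd, data)
    finally:
        sel.close()
```

A proxy built around such a loop exits once its parent closes the pipe. Whether that ever happens depends on the client actually closing its end, which is the question raised in this bug.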

Comment 4 Martin Pitt 2023-04-13 15:24:43 UTC
Alexey: Good call -- but indeed RHEL 8 doesn't support it yet (we would really like to use it for Cockpit as well, but it's annoying that we can't yet).

Thanks, Lukas, for pointing that out! Indeed, I reproduced this completely independently of sssd.

New reproducer:

1. dnf install cockpit-bridge netcat
2. Set up an SSH key and add it to ~/.ssh/authorized_keys; you should be able to do "ssh localhost" *without* an "unknown host key" prompt (i.e. accept it for the first time) and *without* a password prompt (due to using key login).
3. Set up a dummy ProxyCommand config (make sure to do this with a test user account, as it overwrites ~/.ssh/config):
   printf 'Host dummyproxy\nHostname localhost\nProxyCommand nc %%h %%p\n' > ~/.ssh/config

4. Run a proxied SSH session through ssh(1), and ensure that it works:
   ssh dummyproxy

   Check that `pgrep -a nc` shows the `nc localhost 22` proxy command launched by ssh.

5. Run an SSH session through libssh (using cockpit-ssh as client), and kill it:
   (printf '\n\n\n\n\n\n'; sleep 20) | /usr/libexec/cockpit-ssh dummyproxy & sleep 1 && pkill -e cockpit-ssh

After step 5, `pgrep -a nc` shows the leaked `nc` process.

But now I realize that libssh probably shouldn't install a SIGTERM signal handler to clean this up, as that's awkward in libraries. I suppose this should happen in cockpit-ssh, which should shut down the SSH connection properly.
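
A rough sketch of that idea, written in Python rather than cockpit-ssh's C and not necessarily how the eventual fix works: the client keeps a handle on its proxy child and stops it from its own SIGTERM handler before exiting. `spawn_proxy`, `stop_proxy` and `install_sigterm_cleanup` are hypothetical names, and the child here is just a stand-in for the real ProxyCommand.

```python
import signal
import subprocess
import sys


def spawn_proxy(argv):
    """Start a stand-in for the ProxyCommand child, connected via pipes."""
    return subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)


def stop_proxy(proxy, timeout=5.0):
    """Close our pipe ends, then SIGTERM the child, then SIGKILL as a last resort."""
    for stream in (proxy.stdin, proxy.stdout):
        if stream is not None:
            stream.close()
    if proxy.poll() is None:
        proxy.terminate()
    try:
        return proxy.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proxy.kill()
        return proxy.wait()


def install_sigterm_cleanup(proxy):
    """Make SIGTERM in the client also shut down the proxy child."""
    def handler(signum, _frame):
        stop_proxy(proxy)
        sys.exit(128 + signum)
    signal.signal(signal.SIGTERM, handler)
```

The handler lives in the application, not in the library, which matches the reasoning above: the program that owns the process decides how to react to SIGTERM.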

Comment 6 Fedora Update System 2023-04-19 12:45:39 UTC
FEDORA-2023-bc7e3718bc has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-bc7e3718bc

Comment 7 Fedora Update System 2023-04-19 12:46:46 UTC
FEDORA-2023-363cf1cea2 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-363cf1cea2

Comment 8 Fedora Update System 2023-04-20 04:29:50 UTC
FEDORA-2023-bc7e3718bc has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-bc7e3718bc`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-bc7e3718bc

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 9 Fedora Update System 2023-04-20 06:08:48 UTC
FEDORA-2023-363cf1cea2 has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-363cf1cea2`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-363cf1cea2

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Fedora Update System 2023-04-28 02:35:41 UTC
FEDORA-2023-bc7e3718bc has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.

