Bug 1972266 - ssh remote commands hang on CentOS8 commands if ForwardX11 is default
Summary: ssh remote commands hang on CentOS8 commands if ForwardX11 is default
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: systemd
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Jacek Migacz
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-15 14:34 UTC by Paul Raines
Modified: 2022-12-15 07:30 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-15 07:30:46 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Output of: ssh -vvv -E /tmp/sshbad.txt baldur "echo DONE" (16.70 KB, text/plain)
2021-06-16 14:16 UTC, Paul Raines
Output of: ssh -x -vvv -E /tmp/sshgood.txt baldur "echo DONE" (15.46 KB, text/plain)
2021-06-16 14:17 UTC, Paul Raines
Output of: ssh -vvv -E /tmp/looksgoodssh.txt bz1972266 "echo DONE" (29.11 KB, text/plain)
2021-07-12 19:40 UTC, Jacek Migacz

Description Paul Raines 2021-06-15 14:34:47 UTC
Description of problem:

Doing something as simple as

  ssh remotehost "echo DONE"

will hang forever if the remote host runs CentOS Stream 8, X11 forwarding (-Y) is on by default on the client, and X11Forwarding yes is set on the server.  Using 'ssh -x' works.  It also works fine if the remote host runs CentOS 7.

Version-Release number of selected component (if applicable):

openssh-8.0p1-9.el8.x86_64

How reproducible:

Always; it behaves this way on every CentOS 8 box I have tested.

Steps to Reproduce:
1. Set ForwardX11 yes in ssh_config on the client machine and X11Forwarding yes in sshd_config on the remote CentOS 8 box
2. Run from client: ssh remotehost 'echo DONE'
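
The two steps above can be wrapped so that a hang shows up as a timeout instead of a stuck terminal. This is only a sketch: it assumes GNU coreutils `timeout` on the client and a non-interactive (key-based) login. The helper name `run_with_deadline` and the 10-second limit are made up for illustration, and `remotehost` is the placeholder from the steps.

```shell
# Hypothetical helper: run a command under a deadline and report the outcome.
# GNU timeout exits with status 124 when it had to kill the command.
run_with_deadline() {
    local secs=$1
    shift
    timeout "$secs" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG after ${secs}s"
    else
        echo "exited with status $rc"
    fi
}

# Usage (not run here; needs a reachable host and a local X server):
#   run_with_deadline 10 ssh -Y remotehost 'echo DONE'   # the hanging case
#   run_with_deadline 10 ssh -x remotehost 'echo DONE'   # X11 forwarding off
```

Any output of the wrapped command is printed before the status line, so the hanging case would be expected to show DONE followed by the HUNG line.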


Actual results:

The command does echo DONE but then hangs, never returning to the prompt on the client.

Expected results:

Command should return to prompt on client

Additional Info:

This causes hangs where remote ssh commands are run from scripts in packages, such as the vglconnect script in VirtualGL.  So this is not a simple matter of telling users to use the -x option.

This is not a problem when the server is a CentOS 7 box configured the same way (or an Ubuntu 18.04 or 20.04 box).

I suspect this has something to do with all the extra crap started by systemd on an ssh login now.  This is what I see on the remote host for the user before running the ssh command:

[root@baldur ~]# pstree -u raines
No processes found.

and then right after while it is hung

[root@baldur ~]# pstree -u raines
dbus-daemon───{dbus-daemon}

dbus-launch

sshd

systemd─┬─(sd-pam)
        ├─dbus-daemon───{dbus-daemon}
        ├─gvfsd───3*[{gvfsd}]
        ├─gvfsd-fuse───5*[{gvfsd-fuse}]
        └─pulseaudio───{pulseaudio}

Something in that is probably using the X11 forward tunnel, causing ssh to think it cannot safely terminate the connection.  But weirdly, just doing 'ssh remotehost' to get an interactive shell on the remote host and then doing 'exit' will not hang (though it does take about 2-3 seconds to exit fully).
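
One rough way to test that suspicion is to look at which of the user's processes were started with DISPLAY set. The sketch below assumes a Linux /proc filesystem and root access on the remote host; the function names are hypothetical. For a session with X11 forwarding, sshd typically sets DISPLAY to a value such as localhost:10.0. Having DISPLAY in the environment does not prove that a process still holds the forwarded connection open; it only narrows down the candidates.

```shell
# Print the DISPLAY value found in a NUL-separated environment file
# such as /proc/PID/environ (prints nothing if DISPLAY is not set).
display_from_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^DISPLAY=//p'
}

# Hypothetical helper: list a user's processes that have DISPLAY set.
# Intended to be run as root on the remote host while the client is hung.
list_display_procs() {
    local user=$1 pid disp
    for pid in $(pgrep -u "$user"); do
        disp=$(display_from_environ "/proc/$pid/environ" 2>/dev/null)
        if [ -n "$disp" ]; then
            echo "$pid $(cat "/proc/$pid/comm" 2>/dev/null) DISPLAY=$disp"
        fi
    done
}

# Usage: list_display_procs raines
```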

Comment 1 Dmitry Belyavskiy 2021-06-16 11:36:37 UTC
Could you please provide Debug logs at level 3 for both good and bad cases from both ends?

Comment 2 Paul Raines 2021-06-16 14:16:04 UTC
Created attachment 1791558 [details]
Output of: ssh -vvv -E /tmp/sshbad.txt baldur "echo DONE"

Output of: ssh -vvv -E /tmp/sshbad.txt baldur "echo DONE"
followed by Control-C when it hung

Comment 3 Paul Raines 2021-06-16 14:17:21 UTC
Created attachment 1791559 [details]
Output of: ssh -x -vvv -E /tmp/sshgood.txt baldur "echo DONE"

Output of: ssh -x -vvv -E /tmp/sshgood.txt baldur "echo DONE"

which returns back to the client shell just fine

Comment 4 Jakub Jelen 2021-06-16 20:22:28 UTC
Interesting. The logs say the x11 channel is closed. Can you check if it works with older versions of CentOS 8, for example openssh-8.0p1-7.el8 or older?

Comment 5 Paul Raines 2021-06-16 21:08:05 UTC
I downgraded as far as openssh-8.0p1-5.el8 and still the same issue.

While it was hung, on the remote box as root I ran this

    [root@nymeria ~]# pstree -u raines -p
    dbus-daemon(125764)───{dbus-daemon}(125765)
    
    dbus-launch(125762)
    
    sshd(125675)
    
    systemd(125654)─┬─(sd-pam)(125660)
                    ├─dbus-daemon(125715)───{dbus-daemon}(125717)
                    ├─gvfsd(125720)─┬─{gvfsd}(125722)
                    │               ├─{gvfsd}(125723)
                    │               └─{gvfsd}(125724)
                    ├─gvfsd-fuse(125726)─┬─{gvfsd-fuse}(125731)
                    │                    ├─{gvfsd-fuse}(125733)
                    │                    ├─{gvfsd-fuse}(125734)
                    │                    ├─{gvfsd-fuse}(125738)
                    │                    └─{gvfsd-fuse}(125743)
                    └─pulseaudio(125673)───{pulseaudio}(125754)
    [root@nymeria ~]# kill 125764 125762 125654
    [root@nymeria ~]# pstree -u raines -p
    dbus-launch(125762)
    
    sshd(125675)
    [root@nymeria ~]# kill 125762
    -bash: kill: (125762) - No such process
    [root@nymeria ~]# pstree -u raines -p
    No processes found.

After that last dbus-launch died, the sshd finally closed and returned to the prompt on the client machine.

I then tried it again, killing just that solo dbus-launch without the dbus-daemon child procs, and that also made the client fully exit back to the shell prompt (and then those other processes died too).

    [root@nymeria ~]# pstree -u raines -p
    dbus-daemon(126031)───{dbus-daemon}(126032)

    dbus-launch(126030)

    sshd(125941)

    systemd(125918)─┬─(sd-pam)(125923)
                    ├─dbus-daemon(125983)───{dbus-daemon}(125985)
                    ├─gvfsd(125988)─┬─{gvfsd}(125990)
                    │               ├─{gvfsd}(125991)
                    │               └─{gvfsd}(125992)
                    ├─gvfsd-fuse(125994)─┬─{gvfsd-fuse}(125999)
                    │                    ├─{gvfsd-fuse}(126000)
                    │                    ├─{gvfsd-fuse}(126002)
                    │                    ├─{gvfsd-fuse}(126004)
                    │                    └─{gvfsd-fuse}(126011)
                    └─pulseaudio(125939)───{pulseaudio}(126020)
    [root@nymeria ~]# kill 126030
    [root@nymeria ~]# kill 126030
    [root@nymeria ~]# pstree -u raines -p
    No processes found.
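
Since killing dbus-launch alone was enough to release the client, it can help to list only those processes together with their parent PIDs before deciding what to kill. A small sketch, assuming procps `ps` and `awk` on the remote host; `only_dbus_launch` is a hypothetical helper name.

```shell
# Filter "PID PPID COMMAND" lines (as printed by ps -o pid=,ppid=,comm=)
# down to dbus-launch processes, printing "PID PPID" for each.
only_dbus_launch() {
    awk '$3 == "dbus-launch" { print $1, $2 }'
}

# Usage on the remote host (user name taken from the transcript above):
#   ps -u raines -o pid=,ppid=,comm= | only_dbus_launch
```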

Comment 6 Paul Raines 2021-06-16 21:17:23 UTC
I will add that this works to stop the hang but is obviously unsafe:

   sisu[0]:raines$ ssh nymeria "echo DONE; pkill -u raines dbus-launch"
   DONE
   sisu[0]:raines$

Comment 7 Dmitry Belyavskiy 2021-06-17 07:52:44 UTC
This looks like some sort of systemd issue, so I am currently inclined to change the component.

Comment 8 Jacek Migacz 2021-07-12 19:40:23 UTC
Created attachment 1800928 [details]
Output of: ssh -vvv -E /tmp/looksgoodssh.txt bz1972266 "echo DONE"

Comment 9 Jacek Migacz 2021-07-12 19:41:57 UTC
I can't reproduce it on CentOS Stream 8 (with xauth installed):

[centos8stream ~]$ grep PRETTY_NAME /etc/os-release
PRETTY_NAME="CentOS Stream 8"

[centos8stream ~]$ dnf info openssh | grep ^Source
Source       : openssh-8.0p1-9.el8.src.rpm

[centos8stream ~]$ dnf info systemd | grep -m 1 ^Source
Source       : systemd-239-48.el8.src.rpm

[centos8stream ~]$ sudo grep X11Forwarding /etc/ssh/sshd_config | grep -v ^#
X11Forwarding yes

Attaching output of `ssh -vvv -E /tmp/looksgoodssh.txt bz1972266 "echo DONE"`.

Comment 10 Paul Raines 2021-07-12 20:27:44 UTC
We have the same versions as you on two systems I just tested, but it still hangs for me.

Are you sure you have a working X server on your client?  'ssh -Y' will hang while 'ssh -x' will not.
Try running 'ssh -Y remotehost xdpyinfo' as an example.

There is no 'dbus-launch' process on the remote host when you 'ssh -x'.  There is with 'ssh -Y' and killing it on the remote side will result in releasing the hang.


Here are the user processes on the remote side with 'ssh -x'

[root@omega ~]# pstree -u raines -p
sshd(32281)───bash(32289)

systemd(32257)─┬─(sd-pam)(32260)
               ├─dbus-daemon(32331)───{dbus-daemon}(32333)
               ├─gvfsd(32335)─┬─{gvfsd}(32337)
               │              ├─{gvfsd}(32339)
               │              └─{gvfsd}(32340)
               ├─gvfsd-fuse(32342)─┬─{gvfsd-fuse}(32346)
               │                   ├─{gvfsd-fuse}(32347)
               │                   ├─{gvfsd-fuse}(32348)
               │                   ├─{gvfsd-fuse}(32349)
               │                   └─{gvfsd-fuse}(32356)
               └─pulseaudio(32280)───{pulseaudio}(32385)


Here are the user processes on the remote side with 'ssh -Y'

dbus-daemon(32597)───{dbus-daemon}(32598)

dbus-launch(32596)

sshd(32480)───bash(32490)

systemd(32457)─┬─(sd-pam)(32460)
               ├─dbus-daemon(32550)───{dbus-daemon}(32552)
               ├─gvfsd(32555)─┬─{gvfsd}(32557)
               │              ├─{gvfsd}(32558)
               │              └─{gvfsd}(32559)
               ├─gvfsd-fuse(32561)─┬─{gvfsd-fuse}(32566)
               │                   ├─{gvfsd-fuse}(32567)
               │                   ├─{gvfsd-fuse}(32568)
               │                   ├─{gvfsd-fuse}(32569)
               │                   └─{gvfsd-fuse}(32573)
               └─pulseaudio(32478)───{pulseaudio}(32529)

Not sure what else may be relevant.  We have systems configured to use sssd in PAM where sssd is using LDAP.  User home dirs are on NFS. However the problem even happens with the local root account between machines.

Just tried on my CentOS Stream 8 laptop at home with almost no post-install customizations just doing

   ssh -Y localhost xdpyinfo

and it hangs till I Ctrl-C or kill the dbus-launch process.

Comment 11 Jacek Migacz 2021-07-13 17:30:49 UTC
I was able to reproduce it with neither LDAP nor NFS, so these seem not to be relevant.

Comment 13 RHEL Program Management 2022-12-15 07:30:46 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

