Bug 1972266
| Summary: | ssh remote commands hang on CentOS8 commands if ForwardX11 is default | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Paul Raines <raines> |
| Component: | systemd | Assignee: | Jacek Migacz <jmigacz> |
| Status: | CLOSED WONTFIX | QA Contact: | Frantisek Sumsal <fsumsal> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | CentOS Stream | CC: | bstinson, jjelen, jwboyer, msekleta, ron, systemd-maint-list |
| Target Milestone: | beta | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-12-15 07:30:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Attachments: | |||
Could you please provide debug logs at level 3 for both good and bad cases from both ends?
Created attachment 1791558 [details]
Output of: ssh -vvv -E /tmp/sshbad.txt baldur "echo DONE"
followed by Control-C when it hung
Created attachment 1791559 [details]
Output of: ssh -x -vvv -E /tmp/sshgood.txt baldur "echo DONE"
which returns back to the client shell just fine
Interesting. The logs say the x11 channel is closed. Can you check if it works with older versions of CentOS 8, for example openssh-8.0p1-7.el8 or older?
I downgraded as far as openssh-8.0p1-5.el8 and still the same issue.
While it was hung, on the remote box as root I ran this
[root@nymeria ~]# pstree -u raines -p
dbus-daemon(125764)───{dbus-daemon}(125765)
dbus-launch(125762)
sshd(125675)
systemd(125654)─┬─(sd-pam)(125660)
├─dbus-daemon(125715)───{dbus-daemon}(125717)
├─gvfsd(125720)─┬─{gvfsd}(125722)
│ ├─{gvfsd}(125723)
│ └─{gvfsd}(125724)
├─gvfsd-fuse(125726)─┬─{gvfsd-fuse}(125731)
│ ├─{gvfsd-fuse}(125733)
│ ├─{gvfsd-fuse}(125734)
│ ├─{gvfsd-fuse}(125738)
│ └─{gvfsd-fuse}(125743)
└─pulseaudio(125673)───{pulseaudio}(125754)
[root@nymeria ~]# kill 125764 125762 125654
[root@nymeria ~]# pstree -u raines -p
dbus-launch(125762)
sshd(125675)
[root@nymeria ~]# kill 125762
-bash: kill: (125762) - No such process
[root@nymeria ~]# pstree -u raines -p
No processes found.
After that last dbus-launch died, the sshd finally closed and returned to the prompt on the client machine.
I then tried it again, killing just that solo dbus-launch without the dbus-daemon child procs, and that was also enough to make the client fully exit back to the shell prompt (and then those other processes died too).
[root@nymeria ~]# pstree -u raines -p
dbus-daemon(126031)───{dbus-daemon}(126032)
dbus-launch(126030)
sshd(125941)
systemd(125918)─┬─(sd-pam)(125923)
├─dbus-daemon(125983)───{dbus-daemon}(125985)
├─gvfsd(125988)─┬─{gvfsd}(125990)
│ ├─{gvfsd}(125991)
│ └─{gvfsd}(125992)
├─gvfsd-fuse(125994)─┬─{gvfsd-fuse}(125999)
│ ├─{gvfsd-fuse}(126000)
│ ├─{gvfsd-fuse}(126002)
│ ├─{gvfsd-fuse}(126004)
│ └─{gvfsd-fuse}(126011)
└─pulseaudio(125939)───{pulseaudio}(126020)
[root@nymeria ~]# kill 126030
[root@nymeria ~]# kill 126030
[root@nymeria ~]# pstree -u raines -p
No processes found.
I will add that this works to stop the hang but is obviously unsafe:
sisu[0]:raines$ ssh nymeria "echo DONE; pkill -u raines dbus-launch"
DONE
sisu[0]:raines$
Looks like some sort of systemd issue, so I currently tend to change the component.
Created attachment 1800928 [details]
Output of: ssh -vvv -E /tmp/looksgoodssh.txt bz1972266 "echo DONE"
I can't reproduce it on CentOS 8 Stream (with xauth installed)
[centos8stream ~]$ grep PRETTY_NAME /etc/os-release
PRETTY_NAME="CentOS Stream 8"
[centos8stream ~]$ dnf info openssh | grep ^Source
Source : openssh-8.0p1-9.el8.src.rpm
[centos8stream ~]$ dnf info systemd | grep -m 1 ^Source
Source : systemd-239-48.el8.src.rpm
[centos8stream ~]$ sudo grep X11Forwarding /etc/ssh/sshd_config | grep -v ^#
X11Forwarding yes
Attaching output of `ssh -vvv -E /tmp/looksgoodssh.txt bz1972266 "echo DONE"`.
We have the same versions as you on two systems I just tested but it still hangs for me
Are you sure you have a working X server on your client? 'ssh -Y' will hang while 'ssh -x' will not.
Try running 'ssh -Y remotehost xdpyinfo' as an example.
There is no 'dbus-launch' process on the remote host when you 'ssh -x'. There is with 'ssh -Y' and killing it on the remote side will result in releasing the hang.
Here are the user processes on the remote side with 'ssh -x'
[root@omega ~]# pstree -u raines -p
sshd(32281)───bash(32289)
systemd(32257)─┬─(sd-pam)(32260)
├─dbus-daemon(32331)───{dbus-daemon}(32333)
├─gvfsd(32335)─┬─{gvfsd}(32337)
│ ├─{gvfsd}(32339)
│ └─{gvfsd}(32340)
├─gvfsd-fuse(32342)─┬─{gvfsd-fuse}(32346)
│ ├─{gvfsd-fuse}(32347)
│ ├─{gvfsd-fuse}(32348)
│ ├─{gvfsd-fuse}(32349)
│ └─{gvfsd-fuse}(32356)
└─pulseaudio(32280)───{pulseaudio}(32385)
Here are the user processes on the remote side with 'ssh -Y'
dbus-daemon(32597)───{dbus-daemon}(32598)
dbus-launch(32596)
sshd(32480)───bash(32490)
systemd(32457)─┬─(sd-pam)(32460)
├─dbus-daemon(32550)───{dbus-daemon}(32552)
├─gvfsd(32555)─┬─{gvfsd}(32557)
│ ├─{gvfsd}(32558)
│ └─{gvfsd}(32559)
├─gvfsd-fuse(32561)─┬─{gvfsd-fuse}(32566)
│ ├─{gvfsd-fuse}(32567)
│ ├─{gvfsd-fuse}(32568)
│ ├─{gvfsd-fuse}(32569)
│ └─{gvfsd-fuse}(32573)
└─pulseaudio(32478)───{pulseaudio}(32529)
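The two listings above differ only in PIDs and in the extra dbus-launch and dbus-daemon entries. A minimal sketch of how such snapshots could be compared mechanically, assuming each `pstree -u raines -p` output has been saved to a file (the helper name `strip_pids` and the file names are hypothetical, not from this report):

```shell
# strip_pids: remove the "(PID)" suffixes that `pstree -p` appends, so that
# two snapshots taken at different times can be compared line by line.
# Only parenthesised runs of digits are removed; names such as "(sd-pam)"
# are left untouched.
strip_pids() {
    sed -E 's/\([0-9]+\)//g'
}

# Example use (hypothetical file names):
#   pstree -u raines -p | strip_pids > with_x11.txt      # during 'ssh -Y'
#   pstree -u raines -p | strip_pids > without_x11.txt   # during 'ssh -x'
#   diff without_x11.txt with_x11.txt
```

With the listings in this report, the diff would be expected to show the top-level dbus-daemon and dbus-launch entries as the lines present only in the 'ssh -Y' case.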
Not sure what else may be relevant. We have systems configured to use sssd in PAM where sssd is using LDAP. User home dirs are on NFS. However the problem even happens with the local root account between machines.
Just tried on my CentOS Stream 8 laptop at home with almost no post-install customizations just doing
ssh -Y localhost xdpyinfo
and it hangs till I Ctrl-C or kill the dbus-launch process.
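For reproducing this without having to press Ctrl-C by hand, the command can be wrapped in a deadline. This is only a sketch: it assumes GNU coreutils `timeout` is available on the client, and the function name `run_with_deadline` is made up for illustration.

```shell
# run_with_deadline SECONDS CMD [ARGS...]
# Runs CMD under GNU coreutils `timeout`. `timeout` exits with status 124
# when the deadline expires, which is used here as the "it hung" signal.
run_with_deadline() {
    deadline_secs=$1
    shift
    timeout "$deadline_secs" "$@"
    deadline_rc=$?
    if [ "$deadline_rc" -eq 124 ]; then
        echo "no return within ${deadline_secs}s: $*" >&2
    fi
    return "$deadline_rc"
}

# Example use (not executed here):
#   run_with_deadline 10 ssh -Y localhost xdpyinfo
#   run_with_deadline 10 ssh -x localhost 'echo DONE'
```

A status of 124 only says the command was still running at the deadline; it does not say why, so the process listing on the remote side is still needed to confirm that dbus-launch is involved.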
I was able to reproduce with neither LDAP nor NFS, so these seem not to be relevant.
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
Description of problem:
Doing something as simple as
ssh remotehost "echo DONE"
will hang forever if the remotehost is CentOS 8 Stream, X11 forwarding (-Y) is on by default on the client, and X11Forwarding yes is set on the server. Using 'ssh -x' will work. Works fine if remotehost is CentOS7.
Version-Release number of selected component (if applicable):
openssh-8.0p1-9.el8.x86_64
How reproducible:
Works this way on all CentOS8 boxes I have tested
Steps to Reproduce:
1. Set up ForwardX11 on the client machine in ssh_config and X11Forwarding yes on the remotehost CentOS8 box
2. Run from client: ssh remotehost 'echo DONE'
Actual results:
Command does echo DONE but hangs, never returning to the prompt on the client
Expected results:
Command should return to prompt on client
Additional Info:
This causes hangs where remote ssh commands are run from scripts in packages such as the vglconnect script in VirtualGL. So this is not a simple matter of telling users to use the -x option.
This is not a problem when the server is a CentOS7 box configured the same way (or an Ubuntu 18.04 or 20.04 box).
I suspect this has something to do with all the extra crap started by systemd on an ssh login now. This is what I see on remotehost for the user before running the ssh command
[root@baldur ~]# pstree -u raines
No processes found.
and then right after, while it is hung
[root@baldur ~]# pstree -u raines
dbus-daemon───{dbus-daemon}
dbus-launch
sshd
systemd─┬─(sd-pam)
├─dbus-daemon───{dbus-daemon}
├─gvfsd───3*[{gvfsd}]
├─gvfsd-fuse───5*[{gvfsd-fuse}]
└─pulseaudio───{pulseaudio}
Something in that is probably using the X11 forward tunnel, causing ssh to think it cannot safely terminate the connection.
But weirdly, just doing 'ssh remotehost' to get an interactive shell on the remotehost and then doing 'exit' will not hang (though it does take about 2-3 seconds to exit fully).
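Since the hang depends on ForwardX11 being on by default on the client, it may help to confirm what the client actually resolves for a given host. The sketch below assumes an OpenSSH client recent enough to support `ssh -G`, which prints the effective configuration as lowercase `keyword value` lines; the helper name `effective_forwardx11` is hypothetical.

```shell
# effective_forwardx11: read `ssh -G <host>` output on stdin and print the
# value of the forwardx11 keyword. The exact-match comparison on the first
# field keeps forwardx11trusted and forwardx11timeout from matching.
effective_forwardx11() {
    awk 'tolower($1) == "forwardx11" { print $2; exit }'
}

# Example use (not executed here):
#   ssh -G remotehost | effective_forwardx11
```

If this prints yes for the affected host, a plain `ssh remotehost 'echo DONE'` requests X11 forwarding just as `ssh -X` would, which matches the condition described in the steps to reproduce.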