RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2089362 - gio process invoked from /usr/libexec/dbus-1/dbus-kill-process-with-session never quits
Summary: gio process invoked from /usr/libexec/dbus-1/dbus-kill-process-with-session n...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: dbus
Version: 8.6
Hardware: Unspecified
OS: Linux
medium
high
Target Milestone: rc
: ---
Assignee: Ray Strode [halfline]
QA Contact: Petr Schindler
URL:
Whiteboard:
Depends On:
Blocks: 2097784 2136054
TreeView+ depends on / blocked
 
Reported: 2022-05-23 13:57 UTC by GV
Modified: 2022-12-08 19:18 UTC (History)
8 users (show)

Fixed In Version: dbus-1.12.8-23.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2097784 2136054 (view as bug list)
Environment:
Last Closed: 2022-11-08 10:53:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments (Terms of Use)
Proposed fix (5.26 KB, patch)
2022-06-16 15:12 UTC, Ray Strode [halfline]
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-122999 0 None None None 2022-05-23 14:16:02 UTC
Red Hat Product Errata RHBA-2022:7769 0 None None None 2022-11-08 10:54:09 UTC

Description GV 2022-05-23 13:57:10 UTC
Description of problem:
For each ssh session on a remote machine, an "gio monitor -f /run/systemd/sessions/X" is spawned that never quits.

Version-Release number of selected component (if applicable):
dbus-x11-1.12.8-18.el8.x86_64

How reproducible:


Steps to Reproduce:
On terminal 1:
$ ssh root@remote-host
root@remote-host ~]# ps auxww | grep remote-user
root         962  0.0  0.0  12140  1076 pts/0    S+   16:41   0:00 grep --color=auto remote-user
[root@remote-host ~]# who
root     pts/0        2022-05-23 16:41 (10.0.2.2)
[root@remote-host ~]# w
 16:41:43 up 1 min,  1 user,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.0.2.2         16:41    2.00s  0.00s  0.00s w
[root@remote-host ~]#

Switching to terminal 2
$ for ((i=1; i<=30; i++)); do echo $i; ssh remote-host "/usr/bin/true; sleep 1"; done1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

Switching back to terminal 1:
[root@remote-host ~]# ps auxww | grep remote-user
remote-user      971  0.0  0.0  89528  9372 ?        Ss   16:42   0:00 /usr/lib/systemd/systemd --user
remote-user      973  0.0  0.0 221720  4408 ?        S    16:42   0:00 (sd-pam)
remote-user     1025  0.0  0.0 143972  7416 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/3
remote-user     1085  0.0  0.0 143972  7284 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/5
remote-user     1146  0.0  0.0 143972  7264 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/6
remote-user     1207  0.0  0.0 143972  7332 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/7
remote-user     1265  0.0  0.0 143972  7348 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/8
remote-user     1328  0.0  0.0 143972  7428 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/9
remote-user     1386  0.0  0.0 143972  7232 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/10
remote-user     1445  0.0  0.0 143972  7356 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/11
remote-user     1508  0.0  0.0 143972  7396 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/12
remote-user     1566  0.0  0.0 143972  7368 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/13
remote-user     1626  0.0  0.0 143972  7300 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/14
remote-user     1686  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/15
remote-user     1746  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/16
remote-user     1806  0.0  0.0 143972  7296 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/17
remote-user     1867  0.0  0.0 143972  7296 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/18
remote-user     1926  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/19
remote-user     1986  0.0  0.0 143972  7512 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/20
remote-user     2045  0.0  0.0 143972  7336 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/21
remote-user     2108  0.0  0.0 143972  7228 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/22
remote-user     2165  0.0  0.0 143972  7412 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/23
remote-user     2225  0.0  0.0 143972  7356 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/24
remote-user     2285  0.0  0.0 143972  7328 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/25
remote-user     2345  0.0  0.0 143972  7288 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/26
remote-user     2406  0.0  0.0 143972  7336 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/27
remote-user     2466  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/28
remote-user     2526  0.0  0.0 143972  7288 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/29
remote-user     2586  0.0  0.0 143972  7308 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/30
remote-user     2645  0.0  0.0 143972  7244 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/31
remote-user     2705  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/32
remote-user     2765  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/33
root        2777  0.0  0.0  12140  1176 pts/0    R+   16:44   0:00 grep --color=auto remote-user
[root@remote-host ~]# who
root     pts/0        2022-05-23 16:41 (10.0.2.2)
[root@remote-host ~]# w
 16:44:22 up 4 min,  1 user,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.0.2.2         16:41    2.00s  0.00s  0.00s w
[root@remote-host ~]# sleep 60
[root@remote-host ~]# ps auxww | grep remote-user
remote-user      971  0.0  0.0  89528  9372 ?        Ss   16:42   0:00 /usr/lib/systemd/systemd --user
remote-user      973  0.0  0.0 221720  4408 ?        S    16:42   0:00 (sd-pam)
remote-user     1025  0.0  0.0 143972  7416 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/3
remote-user     1085  0.0  0.0 143972  7284 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/5
remote-user     1146  0.0  0.0 143972  7264 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/6
remote-user     1207  0.0  0.0 143972  7332 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/7
remote-user     1265  0.0  0.0 143972  7348 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/8
remote-user     1328  0.0  0.0 143972  7428 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/9
remote-user     1386  0.0  0.0 143972  7232 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/10
remote-user     1445  0.0  0.0 143972  7356 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/11
remote-user     1508  0.0  0.0 143972  7396 ?        Sl   16:42   0:00 gio monitor -f /run/systemd/sessions/12
remote-user     1566  0.0  0.0 143972  7368 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/13
remote-user     1626  0.0  0.0 143972  7300 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/14
remote-user     1686  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/15
remote-user     1746  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/16
remote-user     1806  0.0  0.0 143972  7296 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/17
remote-user     1867  0.0  0.0 143972  7296 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/18
remote-user     1926  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/19
remote-user     1986  0.0  0.0 143972  7512 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/20
remote-user     2045  0.0  0.0 143972  7336 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/21
remote-user     2108  0.0  0.0 143972  7228 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/22
remote-user     2165  0.0  0.0 143972  7412 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/23
remote-user     2225  0.0  0.0 143972  7356 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/24
remote-user     2285  0.0  0.0 143972  7328 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/25
remote-user     2345  0.0  0.0 143972  7288 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/26
remote-user     2406  0.0  0.0 143972  7336 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/27
remote-user     2466  0.0  0.0 143972  7360 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/28
remote-user     2526  0.0  0.0 143972  7288 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/29
remote-user     2586  0.0  0.0 143972  7308 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/30
remote-user     2645  0.0  0.0 143972  7244 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/31
remote-user     2705  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/32
remote-user     2765  0.0  0.0 143972  7280 ?        Sl   16:43   0:00 gio monitor -f /run/systemd/sessions/33
root        2785  0.0  0.0  12140  1068 pts/0    S+   16:46   0:00 grep --color=auto remote-user
[root@rl8 ~]# lsof | grep systemd/sessions
systemd-l  596                   root   18r     FIFO               0,23      0t0      19005 /run/systemd/sessions/1.ref
sshd       889                   root    7w     FIFO               0,23      0t0      19005 /run/systemd/sessions/1.ref
sshd       903                   root    7w     FIFO               0,23      0t0      19005 /run/systemd/sessions/1.ref



Actual results:


Expected results:


Additional info:

Comment 1 Derek Schrock 2022-06-08 22:17:35 UTC
Can https://issues.redhat.com/browse/RHELPLAN-122999 be opened for public viewing?  "You can't view this issue"  when I goto that URL.

Base off the comment here https://bugzilla.redhat.com/show_bug.cgi?id=1940067#c33 that approx. the script /usr/libexec/dbus-1/dbus-kill-process-with-session should the
   
   ... 
   trap "kill -TERM $DBUS_SESSION_BUS_PID" EXIT

   coproc SESSION_MONITOR (gio monitor -f "/run/systemd/sessions/${XDG_SESSION_ID}")
   ...

be after of the coproc including "$!" to ensure the gio process is terminated as well?

   ... 
   coproc SESSION_MONITOR (gio monitor -f "/run/systemd/sessions/${XDG_SESSION_ID}")

   trap "kill -TERM $DBUS_SESSION_BUS_PID $!" EXIT
   ...

That terminate both dbus session and the gio process when the script exits.

Comment 2 Tomas Popela 2022-06-15 11:41:42 UTC
(In reply to Derek Schrock from comment #1)
> Can https://issues.redhat.com/browse/RHELPLAN-122999 be opened for public
> viewing?  "You can't view this issue"  when I goto that URL.

That's just an internally mirrored bug into JIRA - the content is identical.

I'm adding Ray as he made the comment you've referenced.

Comment 3 Ray Strode [halfline] 2022-06-15 19:17:50 UTC
It's unfortunate that we're still having issues.

Thanks Derek for investigating. The actual script we shipped was:

---
╎❯ cat dbus-kill-process-with-session 
#!/bin/bash
# This script ensures the dbus-daemon is killed when the session closes.
# It's used by SSH sessions that have X forwarding (since the X display
# may outlive the session in those cases)
[ $# != 1 ] && exit 1

exec >& /dev/null

trap "kill -TERM $1" EXIT

export GVFS_DISABLE_FUSE=1
coproc SESSION_MONITOR (gio monitor -f "/run/systemd/sessions/${XDG_SESSION_ID}")

while grep -q ^State=active <(loginctl show-session $XDG_SESSION_ID)
do
    read -u ${SESSION_MONITOR[0]}
done
---

But you have indeed figured out the problem.  I think it's slightly better to keep the trap where it is, but change it to:

trap 'kill -TERM $1 $(jobs -p)' EXIT

That way the trap is in effect from the beginning

Comment 4 Derek Schrock 2022-06-15 19:39:46 UTC
LGTM. Thanks for the feedback.

Comment 8 Ray Strode [halfline] 2022-06-16 15:12:31 UTC
Created attachment 1890654 [details]
Proposed fix

Comment 11 Petr Schindler 2022-07-11 19:08:19 UTC
I couldn't reproduce the bug with version dbus-1.12.8-19.el8. No regression found.

Comment 12 GV 2022-08-09 09:31:55 UTC
The bug is still there if shell is tcsh.

$ rpm -q dbus-x11
dbus-x11-1.12.8-18.el8_6.1.x86_64

[ssh several times to remote-host...]

$ ps auxww | grep gio
remote-user  3432354  0.0  0.0 143972  7476 ?        Sl   12:28   0:00 gio monitor -f /run/systemd/sessions/249
remote-user  3432423  0.0  0.0 143972  7488 ?        Sl   12:28   0:00 gio monitor -f /run/systemd/sessions/250
remote-user  3432496  0.0  0.0 143972  7436 ?        Sl   12:28   0:00 gio monitor -f /run/systemd/sessions/251
remote-user  3432565  0.0  0.0 143972  7416 ?        Sl   12:28   0:00 gio monitor -f /run/systemd/sessions/252
remote-user  3432642  0.0  0.0 143972  7416 ?        Sl   12:28   0:00 gio monitor -f /run/systemd/sessions/253
remote-user  3432658  0.0  0.0  12140  1188 pts/4    S+   12:28   0:00 grep --color=auto gio

Comment 13 Ray Strode [halfline] 2022-08-09 15:28:20 UTC
Interesting bug.

tcsh apparently, oddly, instructs all child processes in startup scripts to ignore requests to be killed.  See:

https://access.redhat.com/solutions/74293

Some programs, like dbus-daemon will explicitly override that instruction, but others like the gio program just go with the flow.

The two ways to fix this that I can think of would be to either:

1. Change the gio program to explicitly die on termination signals (like dbus-daemon does)
2. Use a hang up signal instead of a termination signal, since tcsh instructs child processes in start up scripts to treat those as fatal.

I think 2 is the path of least resistance... Something like, 

--- /usr/libexec/dbus-1/dbus-kill-process-with-session
+++ /usr/libexec/dbus-1/dbus-kill-process-with-session
@@ -8,3 +8,3 @@ 
 
-trap 'kill -TERM $1 $(jobs -p)' EXIT
+trap 'kill -TERM $1; KILL -HUP $(jobs -p)' EXIT

Comment 18 Petr Schindler 2022-08-23 09:05:11 UTC
Our tests pass. No regression found. I wasn't able to reproduce original bug. Not even with original version of dbus. So it looks good for me.

Comment 25 Ray Strode [halfline] 2022-09-06 15:32:09 UTC
So this reproducer highlights a race condition.

1. trap 'kill -TERM $1; kill -HUP $(jobs -p)' EXIT
2.
3. export GVFS_DISABLE_FUSE=1
4. coproc SESSION_MONITOR (gio monitor -f "/run/systemd/sessions/${XDG_SESSION_ID}")
5. 
6. while grep -q ^State=active <(loginctl show-session $XDG_SESSION_ID)
7. do
8.    read -u ${SESSION_MONITOR[0]}
9. done

line 4 is asynchronous and the file monitor might not actually get set up until after the session closes. line 6 is supposed to catch that, but it's possible the session closes right after 6 runs.  If it does, the file monitor may get set up after the file has changed, and so won't see the change, leading to the read call to hang indefinitely.

Possible ideas:

a) Put a "sleep $TIMEOUT" on line 5. That would give enough time for the file monitor to get setup, so would practically speaking eliminate the race, though technically speaking still have the race in pathological situations

b) Add a loop to explicitly wait for the file monitor on line 5. This would require looking in fdinfo of the gio process for an inotify watch. This assumes implementation details of how gio works, so it's not a great idea.
  
c) put a "-t $TIMEOUT" on the read call to make the loop wake up every $TIMEOUT seconds. if we hit the above race, the read call will block for a bit and then see the session is closed. This has the downside that the loop will wake up every iteration even when the race doesn't happen.  We'd probably want a pretty big timeout, maybe 60 seconds or something, to minimize needless wake ups. 

d) only put "-t $TIMEOUT" the first time through the loop, and do a fully blocking read for every subsequent run.  This is pretty much equivalent to idea a), just written a little differently.

Comment 27 Ray Strode [halfline] 2022-09-06 18:03:30 UTC
So I thought about this at lunch and had another idea...We could do something like this:

```
MONITOR_READY_FILE=$(mktemp dbus-session-monitor.XXXXXX --tmpdir)

trap 'rm -f "${MONITOR_READY_FILE}"; kill -TERM $1; kill -HUP $(jobs -p)' EXIT

export GVFS_DISABLE_FUSE=1
coproc SESSION_MONITOR (gio monitor -f "/run/systemd/sessions/${XDG_SESSION_ID}" "${MONITOR_READY_FILE}")

# Poll until the gio monitor command is actively monitoring
until 
    touch "${MONITOR_READY_FILE}"
    read -t 0.25 -u ${SESSION_MONITOR[0]}
do
    continue
done

# Wait until the session is closed
while
    grep -q ^State=active <(loginctl show-session $XDG_SESSION_ID)
do
    read -u ${SESSION_MONITOR[0]}
done

```

In other words, at line 5 in the previous comment, poll the monitor pipe until it detects a change event (using a brief timeout that should normally need no more than 1 iteration). Ensure the change event will always happen by using a separate file that the script has full control over, and make the script explicitly touch that file.

Comment 30 Petr Schindler 2022-09-07 14:44:02 UTC
I tested versions 22 and 23 with code from comment 24. I was finally able to reproduce the bug but with lower frequency (it seems that 0.001 s sleep worked best for me and I got 71 sessions stuck out of 1000). With version dbus-1.12.8-23.el8 I've got 0/1000 so it looks (hopefully) fixed this time.
No regression found with our test suit.

Comment 33 errata-xmlrpc 2022-11-08 10:53:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (dbus bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7769


Note You need to log in before you can comment on or make changes to this bug.