Bug 1913734

Summary: Stop RHEL 8.3 / CentOS 8.3 systemd-nspawn container with timeout 120 seconds with message "A stop job is running for User Manager for UID 0"
Product: Red Hat Enterprise Linux 8 Reporter: Gena Makhomed <makhomed>
Component: systemdAssignee: David Tardon <dtardon>
Status: CLOSED CURRENTRELEASE QA Contact: Frantisek Sumsal <fsumsal>
Severity: high Docs Contact:
Priority: unspecified    
Version: 8.4CC: dtardon, igor, msekleta, rhbug, systemd-maint-list
Target Milestone: rc   
Target Release: 8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-19 07:17:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gena Makhomed 2021-01-07 13:46:03 UTC
Description of problem:

Stop RHEL 8.3 / CentOS 8.3 systemd-nspawn container with timeout 120 seconds with message "A stop job is running for User Manager for UID 0"

Version-Release number of selected component (if applicable):

systemd-239-41.el8_3.x86_64
systemd-container-239-41.el8_3.x86_64

How reproducible:

always

Steps to Reproduce:

1. install RHEL 8.3 / CentOS 8.3

1.1 disable SELinux

1.2 reboot

2. install systemd-container:

yum install systemd-container

3. create container test:

yum --installroot=/var/lib/machines/test --releasever=8 install systemd vim mc

4. start container test:

machinectl start test

5. enter inside container: // WARNING!!! this step is REQUIRED to reproduce this bug

machinectl shell test

6. logout from container test: 

logout

7. stop container test:

machinectl stop test

8. check in container is still running:

watch machinectl list

Actual results:

After executing command "machinectl stop test" container test will be runned for 120 seconds and will be shutdowned only after 120 seconds timeout.

Expected results:

After executing command "machinectl stop test" expected immediate container shutdown, without 120 seconds timeout.

Additional info:

Container system journal fragment:

Jan 07 14:55:16 test systemd[1]: Stopped D-Bus System Message Bus.
Jan 07 14:55:16 test systemd[1]: console-getty.service: Succeeded.
Jan 07 14:55:16 test systemd[1]: Stopped Console Getty.
Jan 07 14:56:55 test systemd[1]: user: Processes still around after final SIGKILL. Entering failed mode.
Jan 07 14:56:55 test systemd[1]: user: Failed with result 'timeout'.
Jan 07 14:56:55 test systemd[1]: Stopped User Manager for UID 0.
Jan 07 14:56:55 test systemd[1]: Stopping /run/user/0 mount wrapper...
Jan 07 14:56:55 test systemd[1]: Removed slice User Slice of UID 0.
Jan 07 14:56:55 test systemd[1]: Stopping Permit User Sessions...
Jan 07 14:56:55 test systemd[1]: run-user-0.mount: Succeeded.
Jan 07 14:56:55 test systemd[1]: Unmounted /run/user/0.
Jan 07 14:56:55 test systemd[1]: systemd-user-sessions.service: Succeeded.
Jan 07 14:56:55 test systemd[1]: Stopped Permit User Sessions.
Jan 07 14:56:55 test systemd[1]: Stopped target Remote File Systems.
Jan 07 14:56:55 test systemd[1]: user-runtime-dir: Succeeded.
Jan 07 14:56:55 test systemd[1]: Stopped /run/user/0 mount wrapper.

Host system journal fragment:

Jan  7 14:55:16 test-hosting systemd-nspawn[87722]: [  OK  ] Stopped D-Bus System Message Bus.
Jan  7 14:55:16 test-hosting systemd-nspawn[87722]: [  OK  ] Stopped Console Getty.
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  *** ] A stop job is running for User Manager for UID 0 (5s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (5s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (6s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (6s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (7s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (7s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (8s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (8s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (9s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (9s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (10s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (10s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (11s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (11s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (12s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (12s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (13s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (13s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (14s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (14s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (15s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (15s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (16s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (16s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (17s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (17s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (18s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (18s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (19s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (19s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (20s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (20s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (21s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (21s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (22s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (22s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (23s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (23s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (24s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (24s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (25s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (25s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (26s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (26s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (27s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (27s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (28s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (28s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (29s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (29s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (30s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (30s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (31s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (31s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (32s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (32s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (33s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (33s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (34s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (34s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (35s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (35s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (36s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (36s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (37s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (37s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (38s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (38s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (39s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (39s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (40s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (40s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (41s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (41s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (42s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (42s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (43s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (43s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (44s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (44s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (45s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (45s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (46s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (46s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (47s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (47s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (48s / 1min 39s)#015#033[K[     *] A stop job is running for User Manager for UID 0 (48s / 1min 39s)#015#033[K[    **] A stop job is running for User Manager for UID 0 (49s / 1min 39s)#015#033[K[   ***] A stop job is running for User Manager for UID 0 (49s / 1min 39s)#015#033[K[  *** ] A stop job is running for User Manager for UID 0 (50s / 1min 39s)#015#033[K[ ***  ] A stop job is running for User Manager for UID 0 (50s / 1min 39s)#015#033[K[***   ] A stop job is running for User Manager for UID 0 (51s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (51s / 1min 39s)#015#033[K[*     ] A stop job is running for User Manager for UID 0 (52s / 1min 39s)#015#033[K[**    ] A stop job is running for User Manager for UID 0 (52s / 1min 39s)#015#033[K[***   ] A stop job is running for User
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]:         Stopping /run/user/0 mount wrapper...
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  OK  ] Removed slice User Slice of UID 0.
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]:         Stopping Permit User Sessions...
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  OK  ] Unmounted /run/user/0.
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  OK  ] Stopped Permit User Sessions.
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  OK  ] Stopped target Remote File Systems.
Jan  7 14:56:55 test-hosting systemd-nspawn[87722]: [  OK  ] Stopped /run/user/0 mount wrapper.

See also bugreport https://bugzilla.redhat.com/show_bug.cgi?id=1866888 for more details of this bug.

Comment 1 Gena Makhomed 2021-09-11 18:02:44 UTC
This bug also reproduces with RHEL 8.4 / CentOS 8.4

Version-Release number of selected component:

systemd-239-45.el8_4.3.x86_64
systemd-container-239-45.el8_4.3.x86_64

Comment 2 David Tardon 2022-05-18 13:46:07 UTC
Does this still happen with 8.6? We've backported some fixes related to user session handling since 8.4.

Comment 3 rhbug 2022-05-18 19:07:00 UTC
This server is running a few systemd-nspawn instances with ssh-sessions to containers and hardware node.

120s timeout on 8.5 reboot:
May 18 19:46:42 server1 systemd[1]: dracut-shutdown.service: Succeeded.
May 18 19:48:41 server1 systemd[1]: user: State 'stop-sigterm' timed out. Killing.
May 18 19:48:41 server1 systemd[1]: user: Killing process 5500 (systemd) with signal SIGKILL.
May 18 19:48:41 server1 systemd[1]: user: Failed with result 'timeout'.
May 18 19:48:41 server1 systemd[1]: Stopped User Manager for UID 0.

No timeout on 8.6 reboot:
May 18 19:56:44 server1 systemd[1]: user: Succeeded

This is just a quick test but looks good so far. Thanks!

Comment 4 David Tardon 2022-05-19 07:17:27 UTC
Thanks! Let's assume the issue is gone then.