2047187 – [spec] user slice unit can fail on logout - invalid unit

This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.


Summary: [spec] user slice unit can fail on logout - invalid unit

Keywords:
Status:	CLOSED MIGRATED
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	systemd
Sub Component:
Version:	CentOS Stream
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	David Tardon
QA Contact:	Frantisek Sumsal
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	2086989
Depends On:
Blocks:

Reported:	2022-01-27 11:13 UTC by Steve Traylen
Modified:	2023-09-21 11:27 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-09-21 11:27:39 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	RHEL-5896	0	None	Migrated	None	2023-09-21 11:27:35 UTC
Red Hat Issue Tracker	RHELPLAN-110005	0	None	None	None	2022-01-27 11:59:45 UTC

Description Steve Traylen 2022-01-27 11:13:22 UTC

Description of problem:

A user slice unit can go into a failed state:

# systemctl status user
● user - User Manager for UID 12345
   Loaded: loaded (/usr/lib/systemd/system/user@.service; static; vendor preset: disabled)
   Active: failed (Result: timeout) since Thu 2022-01-27 11:43:46 CET; 16min ago
  Process: 3691600 ExecStart=/usr/lib/systemd/systemd --user (code=killed, signal=KILL)
 Main PID: 3691600 (code=killed, signal=KILL)
   Status: "Startup finished in 171ms."

Jan 27 09:38:20 host.example.org systemd[3691600]: Started Mark boot as successful.
Jan 27 11:41:46 host.example.org systemd[1]: Stopping User Manager for UID 12345...
Jan 27 11:41:46 host.example.org systemd[3691600]: /usr/lib/systemd/user/systemd-exit.service:16: Failed to parse failure action specifier, ignoring: exit-force
Jan 27 11:41:46 host.example.org systemd[3691600]: systemd-exit.service: Service lacks both ExecStart= and ExecStop= setting. Refusing.
Jan 27 11:41:46 host.example.org systemd[3691600]: Failed to enqueue exit.target job: Unit systemd-exit.service has a bad unit file setting.
Jan 27 11:43:46 host.example.org systemd[1]: user: State 'stop-sigterm' timed out. Killing.
Jan 27 11:43:46 host.example.org systemd[1]: user: Killing process 3691600 (systemd) with signal SIGKILL.
Jan 27 11:43:46 host.example.org systemd[1]: user: Killing process 3691623 (krenew) with signal SIGKILL.
Jan 27 11:43:46 host.example.org systemd[1]: user: Failed with result 'timeout'.
Jan 27 11:43:46 host.example.org systemd[1]: Stopped User Manager for UID 12345.



In particular, the "Failed to parse failure action specifier" line looks bad.

/usr/lib/systemd/user/systemd-exit.service

changed with this version of the package "systemd-239-55.el8.x86_64"

It was:
[Unit]
Description=Exit the Session
Documentation=man:systemd.special(7)
DefaultDependencies=no
Requires=shutdown.target
After=shutdown.target

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl --force exit

and is now

[Unit]
Description=Exit the Session
Documentation=man:systemd.special(7)
DefaultDependencies=no
Requires=shutdown.target
After=shutdown.target
SuccessAction=exit-force

This looks to be the following upstream change:

https://github.com/systemd/systemd/commit/a400bd8c2a6285576edf8e2147e1d17aab129501
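
To see which of the two variants above is installed on a given host, a small check along these lines may help. It is only a sketch: the function name `exit_mechanism` is made up for illustration, and the default path is the one quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd): report which exit mechanism
# a systemd-exit.service file uses, based on the two variants quoted above.
exit_mechanism() {
    # Default path taken from this report; pass another file as $1 to override.
    file=${1:-/usr/lib/systemd/user/systemd-exit.service}
    if grep -q '^SuccessAction=' "$file" 2>/dev/null; then
        echo "SuccessAction"   # newer variant: SuccessAction=exit-force
    elif grep -q '^ExecStart=' "$file" 2>/dev/null; then
        echo "ExecStart"       # older variant: ExecStart=/usr/bin/systemctl --force exit
    else
        echo "unknown"         # file missing, unreadable, or neither setting present
    fi
}
```

Calling `exit_mechanism` with no argument inspects the installed unit file. A result of "SuccessAction" only describes the file on disk, not what an already running user manager is able to parse.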

Version-Release number of selected component (if applicable):

systemd-239-55.el8.x86_64

How reproducible:

Tricky. I have not managed to recreate this on demand.

I do see it often for many users.

Steps to Reproduce:
1. User logs in.
2. User logs out - I don't know how, will try to find out.
3. User slice can go bad.

Actual results:

Bad user slice as above.


Expected results:

User slice should close cleanly.

Additional info:

Comment 1 Steve Traylen 2022-01-27 15:56:58 UTC

This is probably user sessions that were started before systemd was upgraded.

Can the existing systemd --user instances be respawned by calling `systemctl --user daemon-reexec` or
something similar on package upgrade?

Comment 2 David Tardon 2022-02-07 10:11:44 UTC

(In reply to Steve Traylen from comment #1)
> This is probably user sessions that were started before systemd was upgraded.

Yes.
 
> Can the existing systemd --user instances be respawned by calling `systemctl
> --user daemon-reexec` or
> something similar on package upgrade?

Doing it for all active user instances is not that simple. Unless I'm missing something, we'd have to do something like the following (which probably works, but it isn't pretty):

# 'user@*' is quoted so the shell passes the glob to systemctl unexpanded
for u in $(systemctl show -P User 'user@*'); do
    runuser "$(id -un "$u")" -c 'systemctl --user daemon-reexec'
done

Comment 3 David Tardon 2022-05-17 08:58:43 UTC

*** Bug 2086989 has been marked as a duplicate of this bug. ***

Comment 4 David Tardon 2022-05-19 14:14:05 UTC

Actually there is a simpler way to reexec all user managers than the one I proposed in comment 2:

systemctl kill -s SIGRTMIN+25 $(systemctl show -P Id 'user@*')
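
For anyone wanting to try this by hand, the one-liner could be wrapped with a dry-run switch as sketched below. The function name `reexec_user_managers` and the `DRY_RUN` variable are made up for illustration. Two details differ from the command above and are assumptions of this sketch: `-p Id --value` is used as the long form of `-P Id`, and `--kill-who=main` is added so that only the manager process of each unit receives the signal rather than every process in the unit.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped by systemd): ask every running user
# manager to re-execute itself by sending SIGRTMIN+25 to its user@ unit.
reexec_user_managers() {
    # Collect the unit names of all loaded user@ instances.
    # "-p Id --value" is the long form of "-P Id".
    units=$(systemctl show -p Id --value 'user@*' 2>/dev/null)
    [ -n "$units" ] || return 0   # nothing to do

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it ($units is left unquoted
        # on purpose so that each unit name becomes a separate word).
        echo "would run: systemctl kill --kill-who=main -s SIGRTMIN+25" $units
    else
        # --kill-who=main is an addition of this sketch: it restricts the
        # signal to the main process of each unit.
        systemctl kill --kill-who=main -s SIGRTMIN+25 $units
    fi
}
```

Running `DRY_RUN=1 reexec_user_managers` first shows which units would be signalled. The real invocation needs root.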

Comment 5 David Tardon 2022-05-23 13:44:25 UTC

The root cause is a change in how a user session exits, introduced between systemd-239-43 and 239-44. The new mechanism, SuccessAction=exit-force in systemd-exit.service, is only recognized by the updated systemd. Because we don't reexec user instances on update, any instance that was started before systemd was updated still runs the old systemd. Such a user session therefore fails to exit on its own and is eventually killed by the stop timeout. There is practically no harm from this: only the timeout and the user instance ending up in a failed state. It doesn't affect the system's operation in any way. The user can log in again without any problem and will then be running the updated systemd, so the issue won't happen again.
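
One way to spot user instances that are still running the pre-update binary is sketched below. It is a heuristic, not something systemd provides. It assumes that the package update unlinked the old executable, so that the `/proc/<pid>/exe` link of a stale manager ends in " (deleted)", and it assumes that `pgrep -f 'systemd --user'` matches only user managers. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: find "systemd --user" processes whose executable
# has been replaced on disk since they were started.

# Succeeds if a /proc/<pid>/exe link target is marked as deleted.
exe_is_replaced() {
    case "$1" in
        *" (deleted)") return 0 ;;
        *) return 1 ;;
    esac
}

# Print the PID of every user manager that looks stale.
# Reading /proc/<pid>/exe of other users' processes needs root.
list_stale_user_managers() {
    for pid in $(pgrep -f 'systemd --user' 2>/dev/null); do
        target=$(readlink "/proc/$pid/exe" 2>/dev/null) || continue
        if exe_is_replaced "$target"; then
            echo "$pid"
        fi
    done
}
```

Any PID printed by `list_stale_user_managers` belongs to a manager that, under the assumptions above, would hit the timeout on logout unless it is re-executed first.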

Comment 6 RHEL Program Management 2023-09-21 11:23:53 UTC

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 7 RHEL Program Management 2023-09-21 11:27:39 UTC

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.
