Bug 2068495 - System enters Emergency mode due to ostree-remount.service failing (hitting start limit)
Keywords:
Status: CLOSED DUPLICATE of bug 2065322
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: systemd
Version: 8.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: systemd-maint
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-25 14:05 UTC by Renaud Métrich
Modified: 2022-03-25 15:23 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-25 15:23:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Journal showing failing boot with systemd-239-51.el8_5.5 (228.12 KB, application/gzip)
2022-03-25 14:09 UTC, Renaud Métrich
Journal showing successful boot with systemd-239-51.el8_5.3 (211.41 KB, application/gzip)
2022-03-25 14:10 UTC, Renaud Métrich


Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHELPLAN-116836  0        None      None    None     2022-03-25 14:16:30 UTC

Description Renaud Métrich 2022-03-25 14:05:54 UTC
Description of problem:

On a RHEL 8.5 system with the ostree package installed, ostree-remount.service makes the system enter the Emergency target even though the service is a no-op (because "ostree" is not on the kernel command line).

This happens when GUI-related targets, e.g. bluetooth.target or sound.target (in case of multiple sound cards), are re-enqueued multiple times due to failures.

Each time, the targets pull in ostree-remount.service again, and after 6 or 7 retries ostree-remount.service hits its start limit and fires OnFailure, which leads to entering Emergency mode.

Example using a QEMU/KVM reproducer where I configured 6 sound cards (2 of each kind), booted with debugging enabled:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 148
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 347
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 422
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 514
systemd[1]: ostree-remount.service: Merged into installed job ostree-remount.service/start as 514
systemd[1]: ostree-remount.service: Merged into installed job ostree-remount.service/start as 514
systemd[1]: ostree-remount.service: Merged into installed job ostree-remount.service/start as 514
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 888
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 981
systemd[1]: ostree-remount.service: Start request repeated too quickly.
systemd[1]: ostree-remount.service: Failed with result 'start-limit-hit'.
systemd[1]: ostree-remount.service: Changed dead -> failed
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=failed
systemd[1]: ostree-remount.service: Unit entered failed state.
systemd[1]: ostree-remount.service: Triggering OnFailure= dependencies.

---> HERE OnFailure triggering

systemd[1]: ostree-remount.service: Changed dead -> failed
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 1228
systemd[1]: ostree-remount.service: ConditionKernelCommandLine=ostree failed.
systemd[1]: ostree-remount.service: Starting requested but condition failed. Not starting unit.
systemd[1]: ostree-remount.service: Job ostree-remount.service/start finished, result=done
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
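As a quick way to check the "6 or 7 retries" figure against a journal, here is a minimal Python sketch that counts the start jobs installed for a unit up to the first 'start-limit-hit'. The sample lines are copied from the excerpt above (one line per job); the function name and regular expressions are illustrative only and assume the debug-level message wording shown in this report.

```python
import re

# One "Installed new job" line per start job, copied from the excerpt above.
LOG = """\
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 148
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 347
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 422
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 514
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 888
systemd[1]: ostree-remount.service: Installed new job ostree-remount.service/start as 981
systemd[1]: ostree-remount.service: Failed with result 'start-limit-hit'.
"""

# Assumed message formats, taken from the debug output in this report.
JOB_RE = re.compile(r"(\S+): Installed new job \1/start as (\d+)$")
LIMIT_RE = re.compile(r"(\S+): Failed with result 'start-limit-hit'\.$")


def jobs_before_start_limit(lines, unit):
    """Return the job IDs installed for `unit` up to the first start-limit-hit.

    Returns None if the unit never hit its start limit.
    """
    job_ids = []
    for line in lines:
        m = JOB_RE.search(line)
        if m and m.group(1) == unit:
            job_ids.append(int(m.group(2)))
            continue
        m = LIMIT_RE.search(line)
        if m and m.group(1) == unit:
            return job_ids
    return None


if __name__ == "__main__":
    ids = jobs_before_start_limit(LOG.splitlines(), "ostree-remount.service")
    print(len(ids), ids)
```

On the excerpt above this reports six start jobs: five finished with result=done and the sixth (job 981) was refused.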

Attaching the full journal. Below is the excerpt showing the targets being re-enqueued due to sound cards initialization:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Mar 25 14:29:37.718829 vm-workstation84 systemd[1]: initrd.target: Trying to enqueue job initrd.target/start/isolate
Mar 25 14:29:37.836965 vm-workstation84 systemd[1]: sys-fs-fuse-connections.mount: Trying to enqueue job sys-fs-fuse-connections.mount/start/fail
Mar 25 14:29:37.861212 vm-workstation84 systemd[1]: sys-kernel-config.mount: Trying to enqueue job sys-kernel-config.mount/start/fail
Mar 25 14:29:38.084474 vm-workstation84 systemd[1]: lvm2-pvscan@252:3.service: Trying to enqueue job lvm2-pvscan@252:3.service/start/fail
Mar 25 14:29:39.070500 vm-workstation84 systemd[1]: initrd-fs.target: Trying to enqueue job initrd-fs.target/start/replace
Mar 25 14:29:39.075523 vm-workstation84 systemd[1]: initrd-cleanup.service: Trying to enqueue job initrd-cleanup.service/start/replace
Mar 25 14:29:39.129489 vm-workstation84 systemd[1]: initrd-switch-root.target: Trying to enqueue job initrd-switch-root.target/start/isolate
Mar 25 14:29:39.684256 vm-workstation84 systemd[1]: graphical.target: Trying to enqueue job graphical.target/start/isolate
Mar 25 14:29:39.692033 vm-workstation84 systemd[1]: systemd-journald.service: Trying to enqueue job systemd-journald.service/restart/replace
Mar 25 14:29:40.016500 vm-workstation84 systemd[1]: sys-kernel-config.mount: Trying to enqueue job sys-kernel-config.mount/start/fail
Mar 25 14:29:40.020613 vm-workstation84 systemd[1]: sys-fs-fuse-connections.mount: Trying to enqueue job sys-fs-fuse-connections.mount/start/fail
Mar 25 14:29:40.075700 vm-workstation84 systemd[1]: qemu-guest-agent.service: Trying to enqueue job qemu-guest-agent.service/start/fail
Mar 25 14:29:40.085866 vm-workstation84 systemd[1]: spice-vdagentd.socket: Trying to enqueue job spice-vdagentd.socket/start/fail
Mar 25 14:29:40.143828 vm-workstation84 systemd[1]: lvm2-pvscan@252:3.service: Trying to enqueue job lvm2-pvscan@252:3.service/start/fail
Mar 25 14:29:40.440335 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
Mar 25 14:29:40.441277 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
Mar 25 14:29:40.442296 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
Mar 25 14:29:40.444293 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
Mar 25 14:29:42.370219 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
Mar 25 14:29:42.383957 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
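To see at a glance which units are enqueued repeatedly, the "Trying to enqueue job" lines can be tallied per unit. This is a minimal Python sketch; the regular expression assumes the line format shown in the excerpt above, and the sample is a subset of those lines.

```python
import re
from collections import Counter

# Assumed format: "<timestamp> <host> systemd[1]: <unit>: Trying to enqueue job <job>"
ENQUEUE_RE = re.compile(r"systemd\[1\]: (\S+): Trying to enqueue job (\S+)$")


def count_enqueues(lines):
    """Count enqueue attempts per unit name."""
    counts = Counter()
    for line in lines:
        m = ENQUEUE_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Subset of the excerpt above.
SAMPLE = [
    "Mar 25 14:29:39.684256 vm-workstation84 systemd[1]: graphical.target: Trying to enqueue job graphical.target/start/isolate",
    "Mar 25 14:29:40.143828 vm-workstation84 systemd[1]: lvm2-pvscan@252:3.service: Trying to enqueue job lvm2-pvscan@252:3.service/start/fail",
    "Mar 25 14:29:40.440335 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail",
    "Mar 25 14:29:40.441277 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail",
    "Mar 25 14:29:42.370219 vm-workstation84 systemd[1]: sound.target: Trying to enqueue job sound.target/start/fail",
]

if __name__ == "__main__":
    for unit, n in count_enqueues(SAMPLE).most_common():
        print(f"{n:3d}  {unit}")
```

Run against the full excerpt, sound.target shows up six times, once per configured sound card.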

For some reason the issue doesn't happen with systemd-239-51.el8_5.3, which doesn't have the fix for BZ #2037395.
Probably with this older systemd, the rate limit was not checked at this point in the boot (0677-core-Check-unit-start-rate-limiting-earlier.patch seems to have introduced the regression).
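The suspected difference between the two builds can be illustrated with a toy model. This is not systemd's code: it is a minimal Python sketch that assumes the documented defaults of 5 starts per 10 seconds (DefaultStartLimitBurst / DefaultStartLimitIntervalSec) and simply compares two orderings, one where the rate limit is consulted before the unit's condition and one where it is consulted only once the condition has passed. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimit:
    """Toy start-rate limiter: at most `burst` starts per `interval` seconds."""
    interval: float = 10.0  # assumed default interval (seconds)
    burst: int = 5          # assumed default burst
    begin: Optional[float] = None
    count: int = 0

    def allow(self, now: float) -> bool:
        # Open a new window on first use or once the previous one has expired.
        if self.begin is None or now - self.begin > self.interval:
            self.begin, self.count = now, 0
        if self.count < self.burst:
            self.count += 1
            return True
        return False


def simulate(start_times, condition_met, check_limit_first):
    """Return the outcome of each start request under the chosen ordering."""
    limit = RateLimit()
    results = []
    for now in start_times:
        if check_limit_first:
            # Suspected newer ordering: every request consumes rate-limit budget.
            if not limit.allow(now):
                results.append("start-limit-hit")
            else:
                results.append("started" if condition_met else "skipped")
        else:
            # Suspected older ordering: a failed condition returns before the limit.
            if not condition_met:
                results.append("skipped")
            else:
                results.append("started" if limit.allow(now) else "start-limit-hit")
    return results


if __name__ == "__main__":
    times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]  # six requests within one window
    print(simulate(times, condition_met=False, check_limit_first=True))
    print(simulate(times, condition_met=False, check_limit_first=False))
```

With the limit checked first, the sixth request for a unit whose condition never passes is refused, which matches the journal from systemd-239-51.el8_5.5. With the condition evaluated first, all six requests are skipped harmlessly.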


Version-Release number of selected component (if applicable):

systemd-239-51.el8_5.5


How reproducible:

Always on my QEMU/KVM guest, but only in full debugging mode ("systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M printk.devkmsg=on").
The likely reason is a race condition that only shows up when debug logging slows the boot down.

Steps to Reproduce:
1. Install a QEMU/KVM system with Graphical User Interface
2. Add 6 sound cards (2 "ac97", 2 "ich6", 2 "ich9")
3. Configure 2 CPUs
4. Boot with "systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M printk.devkmsg=on"
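For anyone rebuilding this reproducer with libvirt, the guest hardware from the steps above could be generated as domain XML fragments. This is a minimal Python sketch under the assumption that the guest is managed by libvirt (the report only says QEMU/KVM); "ac97", "ich6" and "ich9" are used as libvirt sound model names, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Kernel arguments from step 4 of the reproducer.
KERNEL_ARGS = (
    "systemd.log_level=debug systemd.log_target=kmsg "
    "log_buf_len=15M printk.devkmsg=on"
)


def reproducer_fragments(models=("ac97", "ich6", "ich9"), per_model=2, vcpus=2):
    """Build <vcpu> and <sound> elements for a libvirt domain definition.

    Assumption: the fragments are pasted into an existing domain XML
    (e.g. via `virsh edit`); this does not create a complete domain.
    """
    vcpu = ET.Element("vcpu")
    vcpu.text = str(vcpus)
    sounds = []
    for model in models:
        for _ in range(per_model):
            sounds.append(ET.Element("sound", {"model": model}))
    return vcpu, sounds


if __name__ == "__main__":
    vcpu, sounds = reproducer_fragments()
    print(ET.tostring(vcpu, encoding="unicode"))
    for sound in sounds:
        print(ET.tostring(sound, encoding="unicode"))
    print("Kernel command line additions:", KERNEL_ARGS)
```

The `<sound>` elements belong inside the domain's `<devices>` section; the kernel arguments are added at the boot loader prompt or to the guest's boot entry.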


Actual results:

The system enters Emergency mode.

Expected results:

No failure: ostree-remount.service is a no-op here, so it should not hit the start limit at all.

Additional info:

This happens on a real customer system with 2 sound cards, 1 Bluetooth adapter and 1 software RAID.

Comment 1 Renaud Métrich 2022-03-25 14:09:46 UTC
Created attachment 1868338
Journal showing failing boot with systemd-239-51.el8_5.5

Comment 2 Renaud Métrich 2022-03-25 14:10:17 UTC
Created attachment 1868339
Journal showing successful boot with systemd-239-51.el8_5.3

Comment 3 David Tardon 2022-03-25 15:23:07 UTC

*** This bug has been marked as a duplicate of bug 2065322 ***

