Bug 2174645
| Summary: | Failed to start Flush Journal to Persistent Storage (systemd-journal-flush.service) | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | libhe | |
| Component: | systemd | Assignee: | David Tardon <dtardon> | |
| Status: | CLOSED ERRATA | QA Contact: | Frantisek Sumsal <fsumsal> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 8.8 | CC: | andavis, atodorov, bnerickson87, dreua, dtardon, hidenori.i, hongzliu, hshuai, jabia, jhughes, johannes.schischke, jpazdziora, juzhou, kdreyer, libhe, lijin, linl, litian, lizhu, lucasbaile14, matthew.lesieur, mdeng, meili, mhayden, minl, mosvald, mpitt, nmunoz, obudai, pvlasin, qzhang, roman.aleksic, scott, smitterl, systemd-maint-list, systemd-maint, tdawson, troels, tyan, tzheng, vkuznets, vogt, xiliang, xuli, xxiong, yacao, ymao, yoyang, yuxisun | |
| Target Milestone: | rc | Keywords: | CustomerScenariosInitiative, Regression, Triaged | |
| Target Release: | 8.8 | Flags: | pm-rhel: mirror+ | |
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | CockpitTest | |||
| Fixed In Version: | systemd-239-74.el8_8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2176892 | Environment: | ||
| Last Closed: | 2023-05-16 09:07:47 UTC | Type: | Bug | |
| Bug Blocks: | 2129764, 2176892 | |||
This is a regression in systemd-239-73.el8 compared to systemd-239-68.el8_7.4: `journalctl --flush` gets stuck if /var/log/journal doesn't exist. (A repeated call of `journalctl --relinquish-var` also gets stuck, but that's a smaller issue.)

*** Bug 2178393 has been marked as a duplicate of this bug. ***

*** Bug 2178897 has been marked as a duplicate of this bug. ***

This is effective as an interim workaround:

    sudo -s
    cd /var/log
    mkdir journal
    chown root.systemd-journal journal
    chmod 2755 journal

*** Bug 2179327 has been marked as a duplicate of this bug. ***

*** Bug 2182446 has been marked as a duplicate of this bug. ***

(In reply to Scott Brown from comment #19)
> This is effective as an interim workaround:
>
> sudo -s
> cd /var/log
> mkdir journal
> chown root.systemd-journal journal
> chmod 2755 journal

I don't know if I would call this a workaround. This switches to persistent logging, which is also described in https://access.redhat.com/solutions/696893

This causes a change of behaviour and writes the journal to the disk with all implications.

This isn't necessary, however: except for the failed systemd service unit and the delay during startup due to this bug, the system still works as intended and there is no need to switch to persistent logging. So, you can safely ignore this until it has been fixed, or you could run

    # systemctl reset-failed systemd-journal-flush.service

to reset the failed unit (until the next reboot).

That's more like a workaround to me than avoiding the error message by writing the journal to disk now...

(In reply to Gerald Vogt from comment #23)
> (In reply to Scott Brown from comment #19)
> > This is effective as an interim workaround:
> >
> > sudo -s
> > cd /var/log
> > mkdir journal
> > chown root.systemd-journal journal
> > chmod 2755 journal
>
> I don't know if I would call this a workaround. This switches to persistent
> logging, which is also described in https://access.redhat.com/solutions/696893
>
> This causes a change of behaviour and writes the journal to the disk with
> all implications.
>
> This isn't necessary, however: except for the failed systemd service unit
> and the delay during startup due to this bug, the system still works as
> intended and there is no need to switch to persistent logging. So, you can
> safely ignore this until it has been fixed, or you could run
>
> # systemctl reset-failed systemd-journal-flush.service
>
> to reset the failed unit (until the next reboot).
>
> That's more like a workaround to me than avoiding the error message by
> writing the journal to disk now...

Alright, thanks for the clarification. I would rather have this arrangement in the interim than the disruptive 90-second wait on boot, but once the underlying defect is fixed, I assume anyone who did this can return to the original non-persistent ring buffer config by running `sudo rm -rf /var/log/journal` (if that is what they want)?

I hit this on CentOS Stream today (CentOS-Stream-GenericCloud-8-20230404.0 with systemd-239-73.el8). Would you please fix future bugs in CentOS Stream first, before RHEL? This will save engineering resources when we all know where to look for the latest code.
Here's how I customized CentOS 8's image to have the latest systemd package (systemd-239-75.el8):

    virt-customize -v -x -a CentOS-Stream-GenericCloud-8-20230404.0.x86_64.qcow2 --run update-systemd.sh --selinux-relabel

And my update-systemd.sh script:

    #!/bin/bash
    # For https://bugzilla.redhat.com/show_bug.cgi?id=2174645
    #
    # Inject this with:
    # virt-customize -v -x -a CentOS-Stream-GenericCloud-8-20230404.0.x86_64.qcow2 --run update-systemd.sh --selinux-relabel

    set -eux

    # Temporarily enable the CentOS Stream 8 development BaseOS repo.
    cat >/etc/yum.repos.d/baseos-dev.repo <<EOL
    [baseos-development]
    name=BaseOS Development
    baseurl=https://composes.stream.centos.org/stream-8/development/latest-CentOS-Stream/compose/BaseOS/x86_64/os/
    gpgcheck=0
    enabled=1
    EOL

    # Update only systemd from that repo.
    dnf -y update systemd

    # Remove the temporary repo definition again.
    rm /etc/yum.repos.d/baseos-dev.repo

*** Bug 2189428 has been marked as a duplicate of this bug. ***

Just so people know, systemd-239-75.el8 has been built for CentOS Stream 8. It has this fix in it. I don't know why it was so long in gating (testing), but it got tagged into c8s-pending an hour or two after the weekly Stream 8 compose started. So it didn't make it into this week's CentOS Stream 8 release. It will be in next week's Stream 8 release.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (systemd bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2985
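To tell whether a host already carries the fix, one option is to compare the installed systemd build against the Fixed In Version listed for this bug (systemd-239-74.el8_8). The following is a minimal sketch, not an official check. The function names `systemd_release_number` and `has_journal_flush_fix` are hypothetical, and the sketch assumes a `systemd-239-<release>...` package string such as the one printed by `rpm -q systemd`, treating any release number of 74 or higher as containing the fix.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical, not part of systemd or rpm.

# Print the numeric release of a "systemd-239-<release>.<dist>..." string,
# e.g. "systemd-239-73.el8.x86_64" prints 73. Returns 1 for anything else.
systemd_release_number() {
    local nvr="$1"
    local rest="${nvr#systemd-239-}"
    # If the prefix was not stripped, this is not a systemd 239 build.
    [ "$rest" != "$nvr" ] || return 1
    local release="${rest%%.*}"
    case "$release" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric release
    esac
    printf '%s\n' "$release"
}

# Succeed if the build is at or past release 74.
# Assumption: the fix first shipped in systemd-239-74.el8_8, as stated in
# the "Fixed In Version" field of this report.
has_journal_flush_fix() {
    local release
    release=$(systemd_release_number "$1") || return 2
    [ "$release" -ge 74 ]
}

# Example on a real host (not run here):
#   has_journal_flush_fix "$(rpm -q systemd)" && echo "fix present"
```

The comparison only looks at the release number, so it says nothing about builds between 68 and 73 that are not mentioned in this report.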
Description of problem:
Failed to start Flush Journal to Persistent Storage. Below is the log from journalctl:

    Mar 02 03:19:44 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: Starting Flush Journal to Persistent Storage...
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: start operation timed out>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: Main process exited, code>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: Failed with result 'timeo>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: Failed to start Flush Journal to Persistent Storage.

RHEL Version:
RHEL 8.8 (4.18.0-477.el8), RHEL-8.8.0-20230301.1

How reproducible:
100%

Steps to Reproduce:
1. Launch a RHEL guest with the latest RHEL-8.8 build.
2. Run 'systemctl status systemd-journal-flush.service'.

Actual results:

    [ec2-user@ip-10-0-28-169 ~]$ systemctl status systemd-journal-flush.service
    ● systemd-journal-flush.service - Flush Journal to Persistent Storage
       Loaded: loaded (/usr/lib/systemd/system/systemd-journal-flush.service; static; vendor preset: disabled)
       Active: failed (Result: timeout) since Thu 2023-03-02 03:21:15 UTC; 14s ago
         Docs: man:systemd-journald.service(8)
               man:journald.conf(5)
      Process: 7229 ExecStart=/usr/bin/journalctl --flush (code=killed, signal=TERM)
     Main PID: 7229 (code=killed, signal=TERM)

    Mar 02 03:19:44 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: Starting Flush Journal to Persistent Storage...
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: start operation timed out>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: Main process exited, code>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: systemd-journal-flush.service: Failed with result 'timeo>
    Mar 02 03:21:15 ip-10-0-28-169.us-west-2.compute.internal systemd[1]: Failed to start Flush Journal to Persistent Storage.

Expected results:
The Flush Journal service should start successfully.

Additional info:
- This is observed on both x86_64 and aarch64.
- It seems to be a regression from build RHEL-8.8.0-20230301.1.
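The failure signature in this report (the flush unit ending with result 'timeout', combined with a missing /var/log/journal as noted in the regression analysis) can be checked mechanically. A minimal sketch under those assumptions follows. `matches_reported_failure` is a hypothetical helper, and reading the unit's `Result` property with `systemctl show` is an assumption about how one might query the unit, not something taken from the report.

```shell
#!/bin/bash
# Sketch only: matches_reported_failure is a hypothetical helper.

# Succeed if the unit result and journal directory match this report:
# the flush unit timed out and the persistent journal directory is absent.
#   $1 - unit result, e.g. from: systemctl show -p Result --value <unit>
#   $2 - journal directory to test (defaults to /var/log/journal)
matches_reported_failure() {
    local result="$1"
    local journal_dir="${2:-/var/log/journal}"
    [ "$result" = "timeout" ] && [ ! -d "$journal_dir" ]
}

# Example on a real host (not run here):
#   unit=systemd-journal-flush.service
#   if matches_reported_failure "$(systemctl show -p Result --value "$unit")"; then
#       echo "$unit matches the signature of bug 2174645"
#       # Clears the failed state until the next reboot, as suggested above.
#       systemctl reset-failed "$unit"
#   fi
```

Passing the result and directory as arguments keeps the check free of side effects, so it can be exercised without a running systemd.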