Bug 1825232
Summary: | System drops into emergency mode for no obvious reason after upgrading to latest systemd [rhel-7.9.z] | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> | |
Component: | systemd | Assignee: | Michal Sekletar <msekleta> | |
Status: | CLOSED ERRATA | QA Contact: | Frantisek Sumsal <fsumsal> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 7.8 | CC: | amarecek, asamir, bcao, fkrska, fsumsal, jreznik, kwalker, mhatanak, mschena, msekleta, myamazak, ovasik, pdwyer, qguo, rblakley, systemd-maint-list | |
Target Milestone: | rc | Keywords: | ZStream | |
Target Release: | --- | |||
Hardware: | All | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1889314 1889315 | Environment: | |
Last Closed: | 2020-11-10 12:58:04 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1889314, 1889315 |
Description
Renaud Métrich
2020-04-17 12:49:30 UTC
In order to reproduce easily, I perform the following hack:

1. Update the system to RHEL 7.6 Latest and reboot
2. Edit /usr/lib/systemd/system/initrd-cleanup.service to delay its end

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
ExecStart=/bin/bash -c '/usr/bin/systemctl --no-block isolate initrd-switch-root.target && sleep 5'
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

3. Update the system to RHEL 7.8 latest *except* systemd and reboot
4. Update systemd to latest

Doing so triggers the issue. I then get the following journal (with "debug"):

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
Trying to enqueue job initrd-switch-root.target/start/isolate
Installed new job systemd-udevd-control.socket/stop as 80
Installed new job timers.target/stop as 90
Installed new job initrd.target/stop as 85
Installed new job swap.target/stop as 81
Installed new job paths.target/stop as 100
Installed new job remote-fs.target/stop as 96
Installed new job systemd-udev-trigger.service/stop as 91
Installed new job local-fs.target/stop as 95
Installed new job sockets.target/stop as 99
Installed new job systemd-tmpfiles-setup-dev.service/stop as 102
HERE: job canceled
Job initrd-cleanup.service/start finished, result=canceled
Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=1 reply_cookie=0 error=n/a
Installed new job initrd-cleanup.service/stop as 94
Installed new job dracut-cmdline.service/stop as 82
Installed new job systemd-udevd-kernel.socket/stop as 78
Installed new job dracut-pre-udev.service/stop as 92
Installed new job dracut-initqueue.service/stop as 88
Installed new job remote-fs-pre.target/stop as 101
Installed new job initrd-switch-root.service/start as 55
Installed new job plymouth-switch-root.service/start as 58
Installed new job initrd-switch-root.target/start as 54
Installed new job slices.target/stop as 89
Installed new job basic.target/stop as 83
Installed new job initrd-udevadm-cleanup-db.service/start as 77
Installed new job sysinit.target/stop as 97
Installed new job dracut-pre-pivot.service/stop as 86
Installed new job systemd-sysctl.service/stop as 87
Installed new job systemd-udevd.service/stop as 79
Installed new job kmod-static-nodes.service/stop as 93
Enqueued job initrd-switch-root.target/start as 54
[...]
initrd-cleanup.service changed start -> stop-sigterm
Received SIGCHLD from PID 492 (bash).
Child 492 (bash) died (code=killed, status=15/TERM)
Child 492 belongs to initrd-cleanup.service
initrd-cleanup.service: main process exited, code=killed, status=15/TERM
initrd-cleanup.service changed stop-sigterm -> dead
Job initrd-cleanup.service/stop finished, result=done
Stopped Cleaning Up and Shutting Down Daemons.
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

The weird thing is that emergency.target is entered not because of initrd-cleanup.service, but because of initrd-switch-root.service, which doesn't print any suspicious log!

This may be due to BZ #1754053 but I would like to be really sure.

A customer reported he could see this while upgrading systemd from latest 7.7 to 7.8.

Folks from Alibaba are also running into the same problem and they proposed a solution upstream.
https://github.com/systemd-rhel/rhel-7/pull/117

Even though the proposed fix is a hack, we have decided to go ahead and merge it (after the issues pointed out in code review get fixed) due to the number of cases attached to the BZ.

Making the BZ public.

fix merged to github master branch -> https://github.com/systemd-rhel/rhel-7/pull/117

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (systemd bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:5007
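
The journal in the description marks the canceled job by hand ("HERE: job canceled"). For anyone checking whether a system hits the same pattern, the following is a minimal sketch that scans debug journal text for jobs that finished with a result other than "done". It is only an illustration: the function name `find_unsuccessful_jobs` is hypothetical, and the regular expression assumes the "Job <unit>/<type> finished, result=<result>" line format shown in the journal above.

```python
import re

# Matches systemd debug lines of the form seen in this report, e.g.:
#   Job initrd-cleanup.service/start finished, result=canceled
# Assumption: other systemd versions may word this line differently.
JOB_FINISHED = re.compile(
    r"Job (?P<unit>[^\s/]+)/(?P<job_type>[\w-]+) finished, "
    r"result=(?P<result>[\w-]+)"
)


def find_unsuccessful_jobs(lines):
    """Return (unit, job_type, result) for each finished job whose result is not 'done'."""
    hits = []
    for line in lines:
        match = JOB_FINISHED.search(line)
        if match and match.group("result") != "done":
            hits.append(
                (match.group("unit"), match.group("job_type"), match.group("result"))
            )
    return hits


if __name__ == "__main__":
    # Sample lines copied from the journal in this report.
    sample = [
        "Installed new job systemd-tmpfiles-setup-dev.service/stop as 102",
        "Job initrd-cleanup.service/start finished, result=canceled",
        "Installed new job initrd-cleanup.service/stop as 94",
        "Job initrd-cleanup.service/stop finished, result=done",
    ]
    for unit, job_type, result in find_unsuccessful_jobs(sample):
        print(f"{unit}/{job_type}: {result}")
```

To use it on a real boot, the lines could come from a saved copy of the debug journal, for example `find_unsuccessful_jobs(open("boot.log"))`, where `boot.log` is a hypothetical file name.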