Description of problem:
When booting a system with a degraded rootfs, systemd hangs indefinitely at basic.target instead of dropping to a shell and producing the sosreport for troubleshooting.

Version-Release number of selected component (if applicable):
systemd-212-4

How reproducible:
Always

Steps to Reproduce:
0. Any failure to mount /sysroot will do. In my case it's a Btrfs raid1 volume with one device removed, which makes it degraded; Btrfs currently does not automatically mount degraded volumes.

Actual results:
Indefinite hang, with the cylon eye, at basic.target.

Expected results:
Eventual timeout and shell prompt.

Additional info:
This works as expected on Fedora 20, systemd-209. If I use systemctl enable debug-shell.service and retry, I still cannot get to a shell on any tty. But even without debug-shell enabled, we really need to eventually fail on basic.target and drop to a shell. An indefinite hang prevents troubleshooting even basic causes of boot failures.
During the hang, the console shows the following four lines, repeated every 15s to 55s (the interval varies):

[ **] A start job is running for dev-disk-by\x2duuid-7b742…55s / no limit)
Got notification message of unit systemd-journald.service
systemd-journald.service: Got notification message from PID 105 (WATCHDOG=1…)
systemd-journald.service: got WATCHDOG=1
This also hangs on Fedora 20 after updating systemd-208-9 to systemd-208-16; however, there is no timer. It just says:

[ *** ] A start job is running for dev-disk-by\x2duuid-9ff63..b4fb6d66.device

After 1 hour it's still hung.
So far, the only way I have been able to fix a Btrfs raid1 volume with a missing device is to boot into rescue mode, add a new device with btrfs device add, and then remove the missing one with btrfs device delete missing.
*** Bug 1186908 has been marked as a duplicate of this bug. ***
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle. Changing version to '22'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
Re-opening, as we have the exact same problem in F34
Prior upstream discussion: https://lists.freedesktop.org/archives/systemd-devel/2021-February/045973.html

Going by that discussion, I don't actually know that this really belongs in dracut.

On the one hand, degraded arrays are the domain of dracut. mdadm doesn't handle this automatically. Basically, dracut runs a loop that waits for all devices to appear, and if they don't (after, I think, 300 seconds), it does the work to assemble the array in degraded mode. Equivalent code for Btrfs doesn't exist in dracut.

On the other hand, there is a udev rule in place that runs before a multiple-device Btrfs mount is even attempted, and udev has no concept of timers. If the volume is never ready because it's degraded, we simply never get to the next step. This implies we need a better udev rule. The upstream bug for that is https://github.com/kdave/btrfs-progs/issues/264 and maybe https://github.com/kdave/btrfs-progs/issues/302

It might also imply that udev needs work to better understand Btrfs multiple devices, so I'm just going to leave this on systemd for now, because of these other btrfs+udev issues related to multiple devices:
https://github.com/systemd/systemd/issues/19393
https://github.com/systemd/systemd/issues/14674
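To make the wait-then-degrade behaviour described above concrete, here is a minimal Python sketch of the control flow only. It is not dracut's code: the function name, the `all_present` callback, and the injectable `clock`/`sleep` parameters are all hypothetical, and the 300-second default is just the figure recalled in the comment above.

```python
import time
from typing import Callable


def wait_then_degrade(
    all_present: Callable[[], bool],
    timeout_s: float = 300.0,   # assumption: the "I think 300 seconds" figure
    poll_s: float = 1.0,        # assumption: arbitrary polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until every member device is present or the deadline passes.

    Returns "normal" if all devices showed up in time, otherwise
    "degraded" to signal that a degraded assembly/mount should be tried.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if all_present():
            return "normal"
        sleep(poll_s)
    # Deadline reached with devices still missing: fall back instead of
    # waiting forever, which is the behaviour this bug is asking for.
    return "degraded"
```

The point of the sketch is the deadline: a udev rule that only reports "not ready" has nothing equivalent to the final `return "degraded"`, so something above it has to own the timeout.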
Oops, setting back to systemd. And also the top of the systemd thread is in January, here: https://lists.freedesktop.org/archives/systemd-devel/2021-January/045918.html
Ha, OK. After re-reading all of that, I think we need someone who understands udev, libblkid, and dracut better than I do. Right now it may really be dracut's responsibility to do a timeout. But as I read this: https://lists.freedesktop.org/archives/systemd-devel/2021-January/045928.html I can't help but think that's a problem of its own. How can we not distinguish between a failed device and a user who has just wandered away? This bug isn't about cryptsetup, since it happens without LUKS being used, but it seems we need some way of determining whether all devices needed to boot are present and, if not, dropping to a shell. Hanging forever isn't great. A workaround, though, is to set x-systemd.device-timeout=300 in fstab for the / UUID.
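For anyone wanting to try the workaround, the following Python sketch shows what the fstab change amounts to: appending x-systemd.device-timeout=300 to the mount options of the / entry. The helper name and the sample entry (an all-zero placeholder UUID and a subvol=root option) are made up for illustration, and the sketch only transforms text; it does not write to /etc/fstab.

```python
def add_root_device_timeout(fstab_text: str, timeout_s: int = 300) -> str:
    """Return fstab text with a device timeout added to the / entry.

    Hypothetical helper: only the entry whose mount point is "/" is
    changed, and only if it has no x-systemd.device-timeout= option yet.
    """
    option = f"x-systemd.device-timeout={timeout_s}"
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 4 and not line.lstrip().startswith("#")
        if (
            is_entry
            and fields[1] == "/"
            and "x-systemd.device-timeout=" not in fields[3]
        ):
            fields[3] = f"{fields[3]},{option}"
            # Note: the rewritten entry is re-joined with single spaces.
            line = " ".join(fields)
        out.append(line)
    trailing = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing


if __name__ == "__main__":
    sample = (
        "# placeholder entry, not taken from a real system\n"
        "UUID=00000000-0000-0000-0000-000000000000 / btrfs subvol=root 0 0\n"
    )
    print(add_root_device_timeout(sample), end="")
```

With that option in place, the start job for the root device should give up after the stated time rather than showing "no limit".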
*** Bug 1878652 has been marked as a duplicate of this bug. ***
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle. Changing version to 35.
This message is a reminder that Fedora Linux 35 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 35 on 2022-12-13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '35'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 35 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 35 entered end-of-life (EOL) status on 2022-12-13. Fedora Linux 35 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.
Bumping to rawhide as this doesn't seem to have been fixed.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 38 development cycle. Changing version to 38.
This message is a reminder that Fedora Linux 38 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 38 on 2024-05-21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '38'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see it. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 38 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 38 entered end-of-life (EOL) status on 2024-05-21. Fedora Linux 38 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.