Description of problem:
In certain system installs, the booted system appears to freeze during boot. It doesn't actually freeze; the screen just stops refreshing. You see either a black screen, or the text that was present when grub exited (i.e. a text saying "Booting $grub-item-name"), or just a text cursor in the top left corner. Underneath, the system boots just fine and can even be operated blindly (I can log in and run commands blindly, and it works), but the screen never updates. If you remove "rhgb" from the grub command line, the system boots perfectly fine and the screen updates as expected.

These are the conditions under which the bug occurs:
1. The install must be performed from the Everything netinst image. (Server netinst or Server DVD work just fine.)
2. You must install either Fedora Custom Operating System (custom-environment) or Minimal Install (minimal-environment).
3. It only happens on bare metal. (I tested two bare metal machines, both exhibit it. It doesn't happen in virtual machines.)
4. It happens for both UEFI and BIOS installs.

It confuses me a lot that the same environment behaves differently when installed from Everything netinst vs Server netinst. I tried to compare the package sets and found a few extra packages when installed from Server netinst, but adding them to the Everything netinst installation doesn't resolve the problem. Even installing the whole server-product-environment group (a group which works just fine when installed from both Server and Everything netinst) on an affected system doesn't resolve the problem. So perhaps this is not about the package set but about some bootloader/grub files that differ between Server and Everything? I really don't know where to look.
Version-Release number of selected component (if applicable):
plymouth-24.004.60-3.fc40.x86_64
grub2-common-2.06-119.fc40.noarch
Fedora-Everything-netinst-x86_64-40_Beta-1.2.iso # broken
Fedora-Server-netinst-x86_64-40_Beta-1.2.iso # works

How reproducible:
always

Steps to Reproduce:
1. Use a bare metal machine.
2. Boot the Everything netinst image.
3. Install either Fedora Custom Operating System or Minimal Install.
4. Reboot after install.
5. See the grub menu, let it time out.
6. See that the default "Booting $grub-menu-item" text never disappears and the login prompt never appears.
7. You can use Ctrl+Alt+Del to reboot the machine. Or ssh in. Or log in blindly and reboot.
8. Boot again, this time removing "rhgb" from the grub command line.
9. See that it boots normally; you can see boot messages and the login prompt.
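For anyone scripting the comparison in steps 8 and 9, the edit made by hand in the grub menu amounts to dropping one token from the kernel command line. A minimal Python sketch of that transformation follows. The function name `remove_kernel_arg` and the example command line are made up for illustration; the sketch only manipulates a string and does not touch any bootloader configuration.

```python
def remove_kernel_arg(cmdline: str, arg: str = "rhgb") -> str:
    """Return cmdline with every standalone occurrence of arg removed.

    Only whole whitespace-separated tokens are dropped, so an argument that
    merely contains the same letters is left alone.
    """
    return " ".join(tok for tok in cmdline.split() if tok != arg)


# Hypothetical kernel command line, for illustration only.
example = "root=/dev/sda2 ro rhgb quiet"
print(remove_kernel_arg(example))
```

The result is what you would type on the "linux" line after pressing "e" in the grub menu; it only affects that one boot.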
Created attachment 2021429 [details]
journal (broken boot)

This is a journal from a broken boot (with rhgb). The system is operational, just the screen never updates. The system was rebooted with Ctrl+Alt+Del after a while.
Created attachment 2021430 [details]
journal (ok boot)

This is a journal from an OK boot (rhgb removed). Screen updates as expected.
Created attachment 2021431 [details] list of rpms installed
Everything netinst is a release blocking deliverable, proposing for a blocker discussion.
Oh jeez, I actually saw this on my test system when verifying the firmware RAID bug fix, but I was in a hurry and didn't think much of it, figured it was just a weird blip... Can you produce 'success' by doing an install from Everything boot iso, but going through custom partitioning and setting the filesystem to XFS (but otherwise letting it create the partitions for you)?
Uhhhh, very nice! The difference is really in the partition layout! It doesn't matter whether you use Everything or Server netinst (I checked), it only depends on the target layout:

WORKS:
/boot ext4    / lvm -> xfs
/boot xfs     / lvm -> xfs
/boot ext4    / lvm -> ext4

DOESN'T WORK:
/boot ext4    / btrfs
/boot xfs     / btrfs
/boot ext4    / ext4
/boot ext4    / xfs
/boot xfs     / xfs

Eh, it looks like it depends on the / partition, not /boot (as I assumed), and it only works if the / partition is inside LVM! Also, once you install the full Server package set, any partition layout (most probably, haven't checked everything) works. The bug only affects Custom and Minimal sets.
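The conclusion drawn from the layout list can be checked mechanically. The Python sketch below encodes the eight reported results and tests two candidate explanations: "it works exactly when / is inside LVM" and "the /boot filesystem alone decides the outcome". The data is copied from the comment; the function names are made up for illustration, and the check says nothing about layouts that were not tested.

```python
# Results as reported in the comment: (boot_fs, root_layout, boots_ok)
RESULTS = [
    ("ext4", "lvm -> xfs", True),
    ("xfs", "lvm -> xfs", True),
    ("ext4", "lvm -> ext4", True),
    ("ext4", "btrfs", False),
    ("xfs", "btrfs", False),
    ("ext4", "ext4", False),
    ("ext4", "xfs", False),
    ("xfs", "xfs", False),
]


def root_on_lvm(root_layout: str) -> bool:
    """True if the / layout string describes a filesystem inside LVM."""
    return root_layout.startswith("lvm")


def lvm_hypothesis_holds(results) -> bool:
    """True if 'boots OK' coincides with '/ is on LVM' in every row."""
    return all(root_on_lvm(root) == ok for _boot, root, ok in results)


def boot_fs_explains(results) -> bool:
    """True only if each /boot filesystem always gives the same outcome."""
    outcomes = {}
    for boot, _root, ok in results:
        outcomes.setdefault(boot, set()).add(ok)
    return all(len(seen) == 1 for seen in outcomes.values())


print("root-on-LVM explains all rows:", lvm_hypothesis_holds(RESULTS))
print("/boot filesystem explains all rows:", boot_fs_explains(RESULTS))
```

Both /boot filesystems appear in the working and the broken list, which is why the second check cannot hold for this data.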
I suspect that the file system is not a direct cause, but instead it just changes timing and exposes the issue in some other component.

One thing that clearly fails is this:

Mar 13 15:55:09 fedora systemd[1]: Starting systemd-vconsole-setup.service - Virtual Console Setup...
Mar 13 15:55:09 fedora systemd[1]: Mounted sys-kernel-config.mount - Kernel Configuration File System.
Mar 13 15:55:09 fedora systemd-vconsole-setup[537]: setfont: ERROR kdfontop.c:183 put_font_kdfontop: Unable to load such font with such kernel version
Mar 13 15:55:09 fedora systemd-vconsole-setup[534]: /usr/bin/setfont failed with a "system error" (EX_OSERR), ignoring.
Mar 13 15:55:09 fedora systemd-vconsole-setup[534]: Setting source virtual console failed, ignoring remaining ones.
Mar 13 15:55:09 fedora systemd[1]: Finished systemd-vconsole-setup.service - Virtual Console Setup.

But this should only cause the text console not to get the right fonts; it shouldn't interfere with getting a text console. There were a few patches in systemd after v255 to handle this better, and so far we haven't backported them because it didn't seem important enough. But if you don't figure out a different reason, we can try, at least to see if it makes a difference.
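To compare the broken and OK journals for this particular failure, a small filter over the exported text is enough. The Python sketch below assumes the short journal format seen in the excerpt (timestamp, host, unit[pid]: message). The regular expression, the function name `find_unit_errors` and the choice of "ERROR" and "failed" as search strings are assumptions made for illustration, not anything systemd defines.

```python
import re

# Assumed line shape: "<Mon DD HH:MM:SS> <host> <unit>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) "
    r"(?P<unit>[^\[\s:]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)

# Lines copied from the journal excerpt in the comment.
SAMPLE = [
    'Mar 13 15:55:09 fedora systemd[1]: Starting systemd-vconsole-setup.service - Virtual Console Setup...',
    'Mar 13 15:55:09 fedora systemd[1]: Mounted sys-kernel-config.mount - Kernel Configuration File System.',
    'Mar 13 15:55:09 fedora systemd-vconsole-setup[537]: setfont: ERROR kdfontop.c:183 put_font_kdfontop: Unable to load such font with such kernel version',
    'Mar 13 15:55:09 fedora systemd-vconsole-setup[534]: /usr/bin/setfont failed with a "system error" (EX_OSERR), ignoring.',
    'Mar 13 15:55:09 fedora systemd-vconsole-setup[534]: Setting source virtual console failed, ignoring remaining ones.',
    'Mar 13 15:55:09 fedora systemd[1]: Finished systemd-vconsole-setup.service - Virtual Console Setup.',
]


def find_unit_errors(lines, unit="systemd-vconsole-setup", needles=("ERROR", "failed")):
    """Return (timestamp, message) for lines logged by `unit` that contain a needle."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m["unit"] == unit and any(n in m["msg"] for n in needles):
            hits.append((m["ts"], m["msg"]))
    return hits


for ts, msg in find_unit_errors(SAMPLE):
    print(ts, "|", msg)
```

Running the same filter over both attached journals would show whether the setfont failure is specific to the broken boot or also present in the OK one.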
+4 in https://pagure.io/fedora-qa/blocker-review/issue/1521 , marking accepted.
I have a couple of interesting findings.

First, changing which plymouth theme is active doesn't have any effect.

Second, installing plymouth-graphics-libs resolves the problem: the bootsplash appears and the login prompt is usable.

Third, while this was true yesterday, it's not true today:
> Even installing the whole server-product-environment group on an affected system doesn't resolve the problem

Today, when I install the server group, the bootsplash appears and the login prompt works. When I uninstall it, it's back to the broken state. I tried to bisect the group to find a package that flips the behavior, and it looks like if I install nfs-utils and certain iwlwifi-*-firmware packages together, it flips to a working state. But it's weird and inconsistent.

From all these bits, I have a feeling that this is really a race condition, as Zbigniew suggested. Having different filesystems, or different processes/services running during boot, or boot files that are large enough to take longer to load, changes the timing, which changes whether the race condition occurs.
OK, this is really bugging me now because it sounds *super* familiar - I swear I remember the name plymouth-graphics-libs in the context of a very similar bug before. But I can't find it. I'll keep looking.
ooh, okay, so I kinda suspect the changes from https://src.fedoraproject.org/rpms/plymouth/c/e08eb228aef455106511b0eb6155e17e09aced29?branch=rawhide (they were rolled into the next major version release, so they no longer exist as patches in the package, but they are in the upstream). We should probably try reverting those selectively...
So, first I checked F39 Everything netinst, just to be sure - it works just fine, as expected.

Now, on F40, I tried downgrading plymouth. It changes things! So it really seems to be a regression in plymouth.

plymouth-22.02.122-6.fc40 [1] is the last plymouth that works.
plymouth-23.358.4-6.fc40 [2] is the first plymouth that doesn't work.

So it broke somewhere between those versions. I'll see if I can narrow it down more. But at this point I believe we need Ray to start looking into it.

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=2322964
[2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2337638
To be even more precise, this commit works:
https://src.fedoraproject.org/rpms/plymouth/c/6534ca93a154ef3c49bbfe7406a63aac5120d2cf?branch=f40
(that's plymouth-22.02.122-6.fc40)

And this commit doesn't:
https://src.fedoraproject.org/rpms/plymouth/c/9c15b6a28ab0a8ede11b24cfa8486e534a0aa492?branch=f40
(that's plymouth-23.356.9-4.fc40)

There are no further actionable commits between those two. So in order to dig further, I'd have to try git bisect on the upstream source code.
Yeah, that would be my next step, set up the spec file to build git snapshots then just bisect it. I will probably do this over the weekend or on Monday if nobody else gets to it first.
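The search that git bisect performs over the upstream history is a binary search between a known-good and a known-bad point. A minimal Python sketch of the idea follows. The snapshot labels and the `is_bad` check are hypothetical stand-ins: in practice each check means building a snapshot, installing it, regenerating the initramfs and rebooting the affected machine. The sketch also assumes a single good-to-bad transition, which a timing-dependent bug may not always give you.

```python
def first_bad(candidates, is_bad):
    """Return the first bad entry in an ordered list of candidates.

    Assumes candidates[0] is known good, candidates[-1] is known bad, and
    that everything after the first bad entry is also bad.
    """
    lo, hi = 0, len(candidates) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid
        else:
            lo = mid
    return candidates[hi]


# Hypothetical snapshot labels, oldest first; not real plymouth commits.
snapshots = [f"snap-{i:02d}" for i in range(10)]

tested = []


def is_bad(snapshot):
    """Stand-in for 'build, install, reboot, look at the screen'."""
    tested.append(snapshot)
    return snapshot >= "snap-06"  # pretend the regression landed here


print(first_bad(snapshots, is_bad), "found after", len(tested), "test boots")
```

The number of test boots grows with the logarithm of the number of commits in the range, which is what makes a weekend bisect of a full release cycle practical.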
Bisected to https://gitlab.freedesktop.org/plymouth/plymouth/-/commit/48881ba2ef3d25fd27fd150d4d5957d4df9868e0 . Will see if that reverts cleanly.
FEDORA-2024-adf0027989 (plymouth-24.004.60-4.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-adf0027989
FEDORA-2024-adf0027989 has been pushed to the Fedora 40 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-adf0027989` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-adf0027989 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
(In reply to Fedora Update System from comment #16)
> FEDORA-2024-adf0027989 (plymouth-24.004.60-4.fc40) has been submitted as an
> update to Fedora 40.
> https://bodhi.fedoraproject.org/updates/FEDORA-2024-adf0027989

This fixes the problem on my hardware.
Reported upstream: https://gitlab.freedesktop.org/plymouth/plymouth/-/issues/249
Yeah, works for me, too.
FEDORA-2024-adf0027989 (plymouth-24.004.60-4.fc40) has been pushed to the Fedora 40 stable repository. If the problem still persists, please make note of it in this bug report.
Still not working here
what is not working? did you test with a fresh image? if you only updated the system, you also need to run `dracut -f` and reboot to see the fix.
It boots to the login screen and then crashes to a blank blinking screen; from there, all you can do is get to the command line with Ctrl+Alt+F2. The system was updated, not installed from a fresh image. I will run `dracut -f` later to see if it fixes the issue.
That does not sound like this bug. With this bug, you saw *nothing at all* on the screen. No login prompt, no blinking cursor.
`dracut -f` didn't help. Maybe I need to file a new bug.
Yeah, from your description I would say so.
spaceboy60, please link to your new bug here, and also include information on whether downgrading the plymouth* packages to some older version (and running `sudo dracut -f`) resolves the problem for you. We'll discuss there. Thanks!