This causes Fedora Atomic Desktops that rebase from F40 to F41 to enter a boot loop, because on these ostree-based desktops grub in /boot is not automatically updated.

It's caused by these two snippets:

### BEGIN /etc/grub.d/25_bli ###
if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi
### END /etc/grub.d/25_bli ###

and

### BEGIN /etc/grub.d/30_uefi-firmware ###
if [ "$grub_platform" = "efi" ]; then
  fwsetup --is-supported
  if [ "$?" = 0 ]; then
    menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
      fwsetup
    }
  fi
fi
### END /etc/grub.d/30_uefi-firmware ###

Reproducible: Always

Steps to Reproduce:
1. Install Silverblue F40
2. Rebase to F41
3. Overlay some package
4. Reboot

Actual Results:
Boot loop.

Expected Results:
Able to boot.

I think these snippets can be fixed by wrapping the command in a test expression like

if [ ! insmod bli ]; then
  echo "bli not available"
fi

and

if [ fwsetup --is-supported ]; then
  menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
    fwsetup
  }
fi
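To check whether a generated grub.cfg carries the two snippets in their unguarded form, something like the following might help. This is only a sketch: the function name `check_grub_cfg` is hypothetical, and the default path /boot/grub2/grub.cfg is assumed from the path used later in this report. A match only shows that the config calls these commands directly; whether the system then fails to boot depends on the grub actually installed in /boot.

```shell
# Sketch: report unguarded uses of the two commands named in this report.
# check_grub_cfg is a hypothetical helper, not part of grub2 or rpm-ostree.
check_grub_cfg() {
  cfg="${1:-/boot/grub2/grub.cfg}"   # assumed default location
  found=0

  # "insmod bli" as a command on its own line (not inside a [ ... ] test)
  if grep -q '^[[:space:]]*insmod bli' "$cfg"; then
    echo "found: unguarded insmod bli"
    found=1
  fi

  # "fwsetup --is-supported" as a command on its own line
  if grep -q '^[[:space:]]*fwsetup --is-supported' "$cfg"; then
    echo "found: unguarded fwsetup --is-supported"
    found=1
  fi

  if [ "$found" -eq 0 ]; then
    echo "no unguarded snippets found"
  fi
  return "$found"
}
```

Calling `check_grub_cfg` with no argument reads the assumed default path; pass another file to inspect a copy of the config. The function returns non-zero when either snippet is found.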
The direct link to the original issue for extra visibility: https://github.com/fedora-silverblue/issue-tracker/issues/587
We'd also be automatically resilient to this if we moved the logic that lives in the config file:
```
if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi
```
into the actual grub binary, or at least had a "built in config".

Are there any situations in which we would truly not want to load that module? (hmm, what does it do? Upstream...oh, I see https://github.com/rhboot/grub2/commit/e0fa7dc84c03c7089b458137531a2913aa9e92b0 - nice, that is useful).

So I guess it'd definitely be helpful to load it unconditionally; would anyone actually object to that?
Discussed during the 2024-08-19 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker (Beta) was made:

"This is accepted on the assumption that it also affects IoT and CoreOS (our release-blocking atomic builds). If so it's clearly a blocker per "The upgraded system must meet all release criteria", since the upgraded system doesn't boot"

[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-19/f41-blocker-review.2024-08-19-15.59.log.html
> "This is accepted on the assumption that it also affects IoT and CoreOS (our release-blocking atomic builds) I am unsure about IoT (which is also in a transition to rebase on bootc). CoreOS though is not affected as it has always used "static" grub configs which we formalized recently; we never run grub2-mkconfig on the client by default. Basically we don't touch the bootloader stack on updates by default, deferring all of that to an explicit `bootupctl update`.
I tested with IoT, rebasing from stable (40) to devel (41) and it was not affected.
To validate that IoT is not affected, you need to boot Fedora IoT 40, rebase to Fedora IoT 41, reboot, then overlay a random package and reboot.
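The sequence above could be driven with a small dry-run helper like the one below. This is a sketch under assumptions: the `run` and `repro_stage` names are hypothetical, the refspec `fedora-iot:fedora/devel/x86_64/iot` is assumed from the stable (40) to devel (41) rebase mentioned earlier, and `htop` only stands in for "a random package". By default it prints the commands instead of executing them; a reboot is needed between stages.

```shell
# Sketch: print (or, with DRY_RUN=0, execute) one stage of the reproduction.
# All names and defaults here are illustrative assumptions.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"          # dry run: show the command only
  else
    "$@"
  fi
}

repro_stage() {
  ref="${IOT_REF:-fedora-iot:fedora/devel/x86_64/iot}"  # assumed refspec
  pkg="${OVERLAY_PKG:-htop}"                            # any package works
  case "${1:-}" in
    rebase)  run rpm-ostree rebase "$ref" ;;    # then reboot
    overlay) run rpm-ostree install "$pkg" ;;   # then reboot again
    reboot)  run systemctl reboot ;;
    *) echo "usage: repro_stage rebase|overlay|reboot" >&2; return 2 ;;
  esac
}
```

On a Fedora IoT 40 system the order would be `repro_stage rebase`, `repro_stage reboot`, `repro_stage overlay`, `repro_stage reboot`; the second reboot is where the boot loop would show up on an affected install.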
Verified this affects IoT installations with Anaconda, not the pre-generated disk images.
OK, so it definitely looks blocker-y, then. Re your idea of loading it unconditionally, Colin - the upstream commit notes "This module is available on EFI platforms only." What happens in grub if you try to load a module which is just not available? Does it just ignore the issue, or does it blow up? Seems like we kinda need the answer to that question. I guess I can test it.
oh, hmm, I guess you weren't quite suggesting what I thought you were, you were essentially suggesting to move the logic for loading it into grub core somehow, where it'd presumably only get loaded if it was appropriate?
I have a naive question: why does the bootloader not get updated as well? We have a situation in RHEL now where, on aarch64, the older bootloader cannot even load the newer kernel during the upgrade because the compression type changed.
Workaround PR for Atomic Desktops: https://pagure.io/workstation-ostree-config/pull-request/561

What's needed to update the bootloader:
- https://gitlab.com/fedora/ostree/sig/-/issues/1
- https://fedoraproject.org/wiki/Changes/FedoraSilverblueBootupd
Fedora IoT included the same workaround: https://pagure.io/fedora-iot/ostree/c/0aa593407dfa3e8ef60729efa97af978d051f6ae?branch=f41
Workaround verified in today's F41 IoT compose.
OK, so can we say the blocker component of this is addressed, and remove the blocker metadata?
(In reply to Adam Williamson from comment #14)
> OK, so can we say the blocker component of this is addressed, and remove the
> blocker metadata?

Can probably just close it as resolved, I think.
Well, I wasn't sure if folks maybe want to use it to work on a 'better' fix. For now let's drop the metadata.
(In reply to Kan-Ru Chen from comment #0)
> This causes Fedora Atomic Desktops that rebase from F40 to F41 to enter a
> boot loop, because on these ostree-based desktops grub in /boot is not
> automatically updated.
>
> It's caused by these two snippets:
>
> ### BEGIN /etc/grub.d/25_bli ###
> if [ "$grub_platform" = "efi" ]; then
>   insmod bli
> fi
> ### END /etc/grub.d/25_bli ###
>
> and
>
> ### BEGIN /etc/grub.d/30_uefi-firmware ###
> if [ "$grub_platform" = "efi" ]; then
>   fwsetup --is-supported
>   if [ "$?" = 0 ]; then
>     menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
>       fwsetup
>     }
>   fi
> fi
> ### END /etc/grub.d/30_uefi-firmware ###
>
> Reproducible: Always
>
> Steps to Reproduce:
> 1. Install Silverblue F40
> 2. Rebase to F41
> 3. Overlay some package
> 4. Reboot
> Actual Results:
> Boot loop.
>
> Expected Results:
> Able to boot.
>
> I think these snippets can be fixed by wrapping the command in a test
> expression like
>
> if [ ! insmod bli ]; then
>   echo "bli not available"
> fi
>
> and
>
> if [ fwsetup --is-supported ]; then
>   menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
>     fwsetup
>   }
> fi

I have created the patch below, which seems to be working fine: I booted a rawhide system with this patch included in grub2, then copied /etc/grub.d/25_bli into an F40 VM, regenerated /boot/grub2/grub.cfg there, and finally rebooted; F40 boots fine. You suggested another change to /etc/grub.d/30_uefi-firmware, but I do not think it is needed.

---------------------
25_bli.in: load the bli module on a test expression

Loading this module inside a test expression allows those systems to boot even if the bli module is not present.

Suggested-by: Kan-Ru Chen: <kanru>
Signed-off-by: Leo Sandoval <lsandova>

1 file changed, 3 insertions(+), 1 deletion(-)
util/grub.d/25_bli.in | 4 +++-

modified util/grub.d/25_bli.in
@@ -19,6 +19,8 @@ set -e
 cat << EOF
 if [ "\$grub_platform" = "efi" ]; then
-  insmod bli
+  if [ ! insmod bli ]; then
+    echo "bli module not available"
+  fi
 fi
 EOF
---------------------
(In reply to Leo Sandoval from comment #17)
> I have create the below patch which seems to be working fine: I have booted
> a rawhide system with this patch included on grub2, then copy the
> /etc/grub.d/25_bli into a f40 VM, and in the latter recreate the
> /boot/grub2/grub.cfg, and finally reboots, f40 boots fine. You have
> suggested another change on /etc/grub.d/30_uefi-firmware but I do not think
> this is needed.

I tried copying /etc/grub.d/30_uefi-firmware to a freshly installed F40 VM, regenerating /boot/grub2/grub.cfg, and rebooting. It does not boot afterwards.
I think the change to 30_uefi-firmware is very much needed.
(In reply to Kan-Ru Chen from comment #18)
> (In reply to Leo Sandoval from comment #17)
> > I have create the below patch which seems to be working fine: I have booted
> > a rawhide system with this patch included on grub2, then copy the
> > /etc/grub.d/25_bli into a f40 VM, and in the latter recreate the
> > /boot/grub2/grub.cfg, and finally reboots, f40 boots fine. You have
> > suggested another change on /etc/grub.d/30_uefi-firmware but I do not think
> > this is needed.
>
> I tried to copy /etc/grub.d/30_uefi-firmware to a freshly installed F40 VM
> and regenerate /boot/grub2/grub.cfg then reboot. It does not boot after.
> I think the change on 30_uefi-firmware is very much needed.

You are right, Kan. MR [1] has been merged, but an official build has not been launched yet, just a scratch one [2] in case you want to test it.

BTW, Marta L. has asked this before, but why is grub2 not upgraded automatically?

[1] https://src.fedoraproject.org/rpms/grub2/pull-request/105
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=123166239
(In reply to Leo Sandoval from comment #19)
> BTW, Marta L. has asked it before but why grub2 is not upgraded
> automatically?

On the IoT variant and the Atomic Desktop variants, grub2 is not updated automatically yet.
(In reply to Leo Sandoval from comment #19)
> BTW, Marta L. has asked it before but why grub2 is not upgraded
> automatically?

There are a lot of details about this in https://fedoraproject.org/wiki/Changes/FedoraSilverblueBootupd.
FEDORA-2024-2e76a9ff56 (grub2-2.12-9.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-2e76a9ff56
FEDORA-2024-2e76a9ff56 has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-2e76a9ff56` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-2e76a9ff56 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-2e76a9ff56 (grub2-2.12-9.fc41) has been pushed to the Fedora 41 stable repository. If problem still persists, please make note of it in this bug report.