Bug 2305291 - GRUB 2.12 generated grub.cfg does not work with GRUB 2.06
Summary: GRUB 2.12 generated grub.cfg does not work with GRUB 2.06
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 41
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Leo Sandoval
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/fedora-silverblue/...
Whiteboard:
Depends On:
Blocks:
Reported: 2024-08-16 09:40 UTC by Kan-Ru Chen
Modified: 2024-10-12 00:19 UTC (History)

Fixed In Version: grub2-2.12-9.fc41
Clone Of:
Environment:
Last Closed: 2024-10-12 00:19:46 UTC
Type: ---
Embargoed:


Description Kan-Ru Chen 2024-08-16 09:40:31 UTC
This causes Fedora Atomic Desktops that rebase from F40 to F41 to enter a boot loop, because on these ostree-based desktops the GRUB in /boot is not automatically updated.

It's caused by these two snippets in the generated grub.cfg:

### BEGIN /etc/grub.d/25_bli ###
if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi
### END /etc/grub.d/25_bli ###

and

### BEGIN /etc/grub.d/30_uefi-firmware ###
if [ "$grub_platform" = "efi" ]; then
	fwsetup --is-supported
	if [ "$?" = 0 ]; then
		menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
			fwsetup
		}
	fi
fi
### END /etc/grub.d/30_uefi-firmware ###
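
For anyone who wants to check whether an existing grub.cfg already contains these two constructs, here is a minimal sketch. `check_grub_cfg` is a hypothetical helper, not part of grub2; it only greps for the bare `insmod bli` and `fwsetup --is-supported` lines shown above and assumes the file is laid out the way grub2-mkconfig writes it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grub2): report whether a generated
# grub.cfg contains the two bare commands named in this report.
# Returns 0 if neither is found, 1 if at least one is found.
check_grub_cfg() {
    cfg="$1"
    status=0

    # A line consisting only of "insmod bli" (the 25_bli snippet).
    if grep -Eq '^[[:space:]]*insmod[[:space:]]+bli[[:space:]]*$' "$cfg"; then
        echo "found: insmod bli"
        status=1
    fi

    # A line consisting only of "fwsetup --is-supported"
    # (the 30_uefi-firmware snippet).
    if grep -Eq '^[[:space:]]*fwsetup[[:space:]]+--is-supported[[:space:]]*$' "$cfg"; then
        echo "found: fwsetup --is-supported"
        status=1
    fi

    return "$status"
}
```

On the systems discussed here the file would typically be /boot/grub2/grub.cfg, e.g. `check_grub_cfg /boot/grub2/grub.cfg`; a non-zero return means at least one of the two lines was found.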

Reproducible: Always

Steps to Reproduce:
1. Install Silverblue F40
2. Rebase to F41
3. Overlay some package
4. Reboot
Actual Results:  
Boot loop.

Expected Results:  
Able to boot.

I think these snippets can be fixed by wrapping the commands in a test expression, like

if [ ! insmod bli ]; then
   echo "bli not available"
fi

and


	if [ fwsetup --is-supported ]; then
		menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
			fwsetup
		}
	fi

Comment 1 Colin Walters 2024-08-16 21:41:13 UTC
The direct link to the original issue for extra visibility: https://github.com/fedora-silverblue/issue-tracker/issues/587

Comment 2 Colin Walters 2024-08-16 21:55:15 UTC
We'd also be automatically resilient to this if we moved the logic that lives in the config file:

```
if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi
```

into the actual grub binary, or at least had "built in config".

Are there any situations in which we would truly not want to load that module? (hmm, what does it do? Upstream...oh, I see https://github.com/rhboot/grub2/commit/e0fa7dc84c03c7089b458137531a2913aa9e92b0 - nice, that is useful).
So I guess it'd definitely be helpful to load it unconditionally; would anyone actually object to that?

Comment 3 František Zatloukal 2024-08-19 18:29:09 UTC
Discussed during the 2024-08-19 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker (Beta) was made:

"This is accepted on the assumption that it also affects IoT and CoreOS (our release-blocking atomic builds). If so it's clearly a blocker per "The upgraded system must meet all release criteria", since the upgraded system doesn't boot"

[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-19/f41-blocker-review.2024-08-19-15.59.log.html

Comment 4 Colin Walters 2024-08-22 15:45:06 UTC
> "This is accepted on the assumption that it also affects IoT and CoreOS (our release-blocking atomic builds)

I am unsure about IoT (which is also in a transition to rebase on bootc). CoreOS though is not affected as it has always used "static" grub configs which we formalized recently; we never run grub2-mkconfig on the client by default.
Basically we don't touch the bootloader stack on updates by default, deferring all of that to an explicit `bootupctl update`.

Comment 5 Kan-Ru Chen 2024-08-23 01:55:18 UTC
I tested with IoT, rebasing from stable (40) to devel (41) and it was not affected.

Comment 6 Timothée Ravier 2024-08-26 15:36:01 UTC
To validate that IoT is not affected, you need to boot Fedora IoT 40, rebase to Fedora IoT 41, reboot, then overlay a random package and reboot.

Comment 7 Paul Whalen 2024-08-26 19:34:19 UTC
Verified this affects IoT installations with Anaconda, not the pre-generated disk images.

Comment 8 Adam Williamson 2024-08-28 17:32:15 UTC
OK, so it definitely looks blocker-y, then.

Re your idea of loading it unconditionally, Colin - the upstream commit notes "This module is available on EFI platforms only." What happens in grub if you try to load a module which is just not available? Does it just ignore the issue, or does it blow up? Seems like we kinda need the answer to that question. I guess I can test it.

Comment 9 Adam Williamson 2024-08-28 17:35:56 UTC
oh, hmm, I guess you weren't quite suggesting what I thought you were, you were essentially suggesting to move the logic for loading it into grub core somehow, where it'd presumably only get loaded if it was appropriate?

Comment 10 Marta Lewandowska 2024-08-29 14:45:52 UTC
I have a naive question: why does the bootloader not get updated as well?

We have a situation in RHEL now where, on aarch64, the older bootloader cannot even load the newer kernel during the upgrade because the compression type changed.

Comment 11 Timothée Ravier 2024-08-29 16:09:14 UTC
Workaround PR for Atomic Desktops: https://pagure.io/workstation-ostree-config/pull-request/561
What's needed to update the bootloader:
- https://gitlab.com/fedora/ostree/sig/-/issues/1
- https://fedoraproject.org/wiki/Changes/FedoraSilverblueBootupd

Comment 12 Timothée Ravier 2024-09-03 17:11:52 UTC
Fedora IoT included the same workaround: https://pagure.io/fedora-iot/ostree/c/0aa593407dfa3e8ef60729efa97af978d051f6ae?branch=f41

Comment 13 Paul Whalen 2024-09-03 20:41:10 UTC
Workaround verified in today's F41 IoT compose.

Comment 14 Adam Williamson 2024-09-04 16:14:32 UTC
OK, so can we say the blocker component of this is addressed, and remove the blocker metadata?

Comment 15 Peter Robinson 2024-09-04 16:19:27 UTC
(In reply to Adam Williamson from comment #14)
> OK, so can we say the blocker component of this is addressed, and remove the
> blocker metadata?

Can probably just close it as resolved I think

Comment 16 Adam Williamson 2024-09-04 16:32:22 UTC
Well, I wasn't sure if folks maybe want to use it to work on a 'better' fix. For now let's drop the metadata.

Comment 17 Leo Sandoval 2024-09-04 19:42:32 UTC
(In reply to Kan-Ru Chen from comment #0)
> This causes Fedora Atomic Desktops that rebases from F40 to F41 to enter
> boot loop, because on these ostree based desktops grub in /boot is not
> automatically updated.
> 
> It's caused by these two snippets
> 
> ### BEGIN /etc/grub.d/25_bli ###
> if [ "$grub_platform" = "efi" ]; then
>   insmod bli
> fi
> ### END /etc/grub.d/25_bli ###
> 
> and
> 
> ### BEGIN /etc/grub.d/30_uefi-firmware ###
> if [ "$grub_platform" = "efi" ]; then
> 	fwsetup --is-supported
> 	if [ "$?" = 0 ]; then
> 		menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
> 			fwsetup
> 		}
> 	fi
> fi
> ### END /etc/grub.d/30_uefi-firmware ###
> 
> Reproducible: Always
> 
> Steps to Reproduce:
> 1. Install Silverblue F40
> 2. Rebase to F41
> 3. Overlay some package
> 4. Reboot
> Actual Results:  
> Boot loop.
> 
> Expected Results:  
> Able to boot.
> 
> I think these snippt can be fixed by wrapping the command in a test
> expression like
> 
> if [ ! insmod bli ]; then
>    echo "bli not available"
> fi
> 
> and
> 
> 
> 	if [ fwsetup --is-supported ]; then
> 		menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
> 			fwsetup
> 		}
> 	fi


I have created the patch below, which seems to work fine. I booted a rawhide system with this patch included in grub2, copied its /etc/grub.d/25_bli into an F40 VM, regenerated /boot/grub2/grub.cfg there, and rebooted; F40 boots fine. You suggested another change to /etc/grub.d/30_uefi-firmware, but I do not think it is needed.


---------------------
25_bli.in: load the bli module on a test expression

Loading this module inside a test expression allows those systems to
boot even if the bli module is not present.

Suggested-by: Kan-Ru Chen: <kanru>
Signed-off-by: Leo Sandoval <lsandova>

1 file changed, 3 insertions(+), 1 deletion(-)
util/grub.d/25_bli.in | 4 +++-

modified   util/grub.d/25_bli.in
@@ -19,6 +19,8 @@ set -e
 
 cat << EOF
 if [ "\$grub_platform" = "efi" ]; then
-  insmod bli
+        if [ ! insmod bli ]; then
+                echo "bli module not available"
+        fi
 fi
 EOF
---------------------

Comment 18 Kan-Ru Chen 2024-09-04 23:48:55 UTC
(In reply to Leo Sandoval from comment #17)
> I have create the below patch which seems to be working fine: I have booted
> a rawhide system with this patch included on grub2, then copy the
> /etc/grub.d/25_bli into a f40 VM, and in the latter recreate the
> /boot/grub2/grub.cfg, and finally reboots, f40 boots fine. You have
> suggested another change on /etc/grub.d/30_uefi-firmware but I do not think
> this is needed.

I tried copying /etc/grub.d/30_uefi-firmware to a freshly installed F40 VM,
regenerating /boot/grub2/grub.cfg, and rebooting. It does not boot afterwards.
I think the change to 30_uefi-firmware is very much needed.

Comment 19 Leo Sandoval 2024-09-09 18:20:25 UTC
(In reply to Kan-Ru Chen from comment #18)
> (In reply to Leo Sandoval from comment #17)
> > I have create the below patch which seems to be working fine: I have booted
> > a rawhide system with this patch included on grub2, then copy the
> > /etc/grub.d/25_bli into a f40 VM, and in the latter recreate the
> > /boot/grub2/grub.cfg, and finally reboots, f40 boots fine. You have
> > suggested another change on /etc/grub.d/30_uefi-firmware but I do not think
> > this is needed.
> 
> I tried to copy /etc/grub.d/30_uefi-firmware to a freshly installed F40 VM
> and regenerate /boot/grub2/grub.cfg then reboot. It does not boot after.
> I think the change on 30_uefi-firmware is very much needed.

You are right, Kan. MR [1] has been merged, but an official build has not been launched yet; there is only a scratch build [2] in case you want to test it.

BTW, Marta L. asked this before: why is grub2 not upgraded automatically?

[1] https://src.fedoraproject.org/rpms/grub2/pull-request/105
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=123166239

Comment 20 Kan-Ru Chen 2024-09-09 23:42:30 UTC
(In reply to Leo Sandoval from comment #19)
> BTW, Marta L. has asked it before but why grub2 is not upgraded
> automatically?

On the IoT variant and the Atomic Desktop variants, grub2 is not updated automatically yet.

Comment 21 Timothée Ravier 2024-09-16 11:11:48 UTC
(In reply to Leo Sandoval from comment #19)

> BTW, Marta L. has asked it before but why grub2 is not upgraded
> automatically?

There are a lot of details about this in https://fedoraproject.org/wiki/Changes/FedoraSilverblueBootupd.

Comment 22 Fedora Update System 2024-10-10 15:17:29 UTC
FEDORA-2024-2e76a9ff56 (grub2-2.12-9.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-2e76a9ff56

Comment 23 Fedora Update System 2024-10-11 01:36:21 UTC
FEDORA-2024-2e76a9ff56 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-2e76a9ff56`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-2e76a9ff56

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Fedora Update System 2024-10-12 00:19:46 UTC
FEDORA-2024-2e76a9ff56 (grub2-2.12-9.fc41) has been pushed to the Fedora 41 stable repository.
If the problem still persists, please make a note of it in this bug report.

