Bug 2235692

Summary: posttrans script clobbers GRUB config with nonstandard EFI stub configs, leading to boot failure
Product: [Fedora] Fedora
Reporter: Hector Martin <marcan>
Component: grub2
Assignee: Nicolas Frayer <nfrayer>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: unspecified
Version: 38
CC: asahi-sig, davide, fmartine, janne-fdr, lkundrak, mlewando, nfrayer, ngompa13, pjones, raravind, teohhanhui
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: aarch64
OS: Linux
Fixed In Version: grub2-2.06-98
Last Closed: 2023-09-04 14:04:20 UTC

Description Hector Martin 2023-08-29 13:26:20 UTC
The posttrans script does:

```
if grep -q "configfile" ${EFI_HOME}/grub.cfg; then
    exit 0 # already unified, nothing to do
fi

[...]

cp -a ${EFI_HOME}/grub.cfg ${EFI_HOME}/grub.cfg.rpmsave
cp -a ${EFI_HOME}/grub.cfg ${GRUB_HOME}/
mv ${EFI_HOME}/grub.cfg.stb ${EFI_HOME}/grub.cfg
```

That means that if the EFI grub.cfg does NOT contain "configfile", it copies it to /boot/grub2/grub.cfg under the assumption that it is a full-blown GRUB config (if there is none at all, it generates one correctly).

The EFI stub grub.cfg generated by Kiwi uses "source" instead of "configfile". That means it is not recognized, so posttrans clobbers the real grub.cfg with it on the next upgrade. This leads to an infinite loop on boot, as the main /boot/grub2/grub.cfg is now a stub script that sources itself.
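
A minimal sketch of the misclassification, run in a scratch directory so nothing under /boot is touched. The stub contents here are an assumption made for illustration, not Kiwi's exact output; only the `grep` test is taken from the posttrans excerpt above.

```shell
#!/bin/sh
# Scratch layout standing in for /boot/efi/EFI/fedora (hypothetical paths).
tmp=$(mktemp -d)
EFI_HOME="$tmp/efi"
mkdir -p "$EFI_HOME"

# Illustrative stub that hands off with "source" rather than "configfile".
cat > "$EFI_HOME/grub.cfg" <<'EOF'
search --no-floppy --set=root --label BOOT
set prefix=($root)/grub2
source ($prefix)/grub.cfg
EOF

# Same test the posttrans script uses to decide "already unified".
if grep -q "configfile" "$EFI_HOME/grub.cfg"; then
    verdict="stub"
else
    verdict="full config"
fi
echo "posttrans check classifies the stub as: $verdict"

rm -rf "$tmp"
```

Because the check falls through to the "full config" branch, the script goes on to copy this stub into ${GRUB_HOME}, where it ends up sourcing itself.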

The script should either never clobber the main /boot/grub2/grub.cfg at all, or it should unconditionally regenerate it with grub2-mkconfig (as already happens indirectly when there is no EFI config at all). Blindly copying /boot/efi/EFI/fedora/grub.cfg over /boot/grub2/grub.cfg is a recipe for disaster.
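
One way to avoid the blind copy could be sketched as follows. This is not the fix that shipped in grub2-2.06-98; `looks_like_full_cfg` and `migrate_cfg` are hypothetical helpers, and the heuristic assumes that a generated config carries the `### BEGIN /etc/grub.d/... ###` section markers that grub2-mkconfig writes.

```shell
#!/bin/sh
# Hypothetical helper, not part of the grub2 package: accept a file as a
# full config only if it carries a grub2-mkconfig style section marker.
looks_like_full_cfg() {
    grep -q '^### BEGIN /etc/grub.d/' "$1"
}

# Hypothetical wrapper: copy the EFI config into the GRUB directory only
# when it is positively identified as a full config; otherwise leave the
# existing main config alone and report failure to the caller.
migrate_cfg() {
    efi_cfg=$1
    grub_home=$2
    if looks_like_full_cfg "$efi_cfg"; then
        cp -a "$efi_cfg" "$grub_home/"
    else
        echo "not a generated config, leaving $grub_home/grub.cfg untouched" >&2
        return 1
    fi
}
```

The point of the design is to require positive evidence of a full config before overwriting anything, instead of inferring it from the absence of "configfile". A caller could fall back to grub2-mkconfig when `migrate_cfg` returns non-zero.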

Right now, every Fedora Asahi user who upgrades their system is at risk of this failure mode and an unbootable system (depending on other conditions, which could cause grub.cfg to be regenerated and fixed). Marking urgent as we're getting more and more users making their systems unbootable with a simple upgrade as a result of this bug.

Reproducible: Sometimes

Steps to Reproduce:
1. Install Fedora Asahi prior to a GRUB upgrade
2. Upgrade
3. Be unlucky that nothing else regenerated grub.cfg to fix the mess as part of the upgrade.

Actual Results:
System hangs on boot. Recovery is a major pain in the ass, involving typing a pile of u-boot/grub commands to avoid hitting the infinite loop and getting a kernel to load.

Expected Results:  
System continues to boot.

Comment 1 Janne Grunau 2023-08-29 13:48:39 UTC
This broke recently due to the following kiwi commit: https://github.com/OSInside/kiwi/commit/97ac758de8154f37b9e64b850b21356c6e252d9d

Comment 2 raravind 2023-08-30 15:23:07 UTC
Hector,
Thank you for reporting the issue. We understand the critical nature of this bug and its impact on Fedora, Asahi, and other users. Our engineering team is currently reviewing the bug, and we will provide you with an update as soon as possible. Many thanks for your diligence.

Comment 4 Marta Lewandowska 2023-09-04 10:58:06 UTC
According to our testing, this should work, but I should also wait for feedback. :)

Comment 5 Hector Martin 2023-09-04 13:58:58 UTC
Reverted the EFI config file to the problematic one and installed grub2-2.06-98.fc38, no config file explosions now :)

Thanks!

Comment 6 Marta Lewandowska 2023-09-04 14:04:20 UTC
Awesome. Thanks for the feedback, Hector. :)