grub package updates fail to update the modules in `/boot/grub2/arm64-efi` (at least on arm64/Fedora Asahi Remix), even though they update the core image in /boot/efi/EFI/fedora/grubaa64.efi. This silently works most of the time, until a version incompatibility arises and the boot blows up because the wrong combination of core image and module set is installed.

This happens on at least F40 and F41. All of my Fedora Asahi Remix systems (and those of everyone else I asked) had ancient modules in /boot/grub2/arm64-efi, so it seems this has always been broken. Haven't tested F42 yet.

After an update on two F40 systems, GRUB crashed on boot like this:

    "Synchronous Abort" handler, esr 0x96000005, far 0x0
    elr: ffffffff68634760 lr : ffffffff68634ce0 (reloc)
    elr: 000001073c8bd760 lr : 000001073c8bdce0
    x0 : 0000000000000000 x1 : 000001073dc93250
    x2 : 000001073dc83854 x3 : 0000000000000000
    x4 : 000001073c8bdcc0 x5 : 000001073c8b3000
    x6 : 000001073c8c4000 x7 : 000001073dc91ec8
    x8 : 000001073c904869 x9 : 0000000000001000
    x10: 0000000000000ff8 x11: 0000000001392288
    x12: 000001073dc94f80 x13: 00000107d4289000
    x14: 000001073d703a20 x15: 0000000000000000
    x16: 00000107d42bdadc x17: 0000000000000000
    x18: 0000000000000011 x19: 000001073e518000
    x20: 0000000000000000 x21: 00000107d02f57a0
    x22: 000001073e52c038 x23: 00000107d02f57a0
    x24: 00000107d0246570 x25: 000001073e52c038
    x26: 000001073e52c100 x27: 000001073e52c108
    x28: 000001073e52c110 x29: 00000107d02459a0

    Code: f9001bff 12800020 b90047e0 f9400fe0 (f9400000)
    UEFI image [0x000001073e7a6000:0x000001073e7bdfff] '/\EFI\BOOT\fbaa64.efi'
    UEFI image [0x000001073dc72000:0x000001073e072fff] '/\EFI\fedora\grubaa64.efi'

Updating the modules fixed it.

Reproducible: Always

Steps to Reproduce:
1. Install Fedora Asahi Remix
2. Upgrade grub

Actual Results:
/boot/grub2/arm64-efi has install time contents, and is never updated

Expected Results:
/boot/grub2/arm64-efi is updated

Additional Information:
Manual workaround to fix a broken system (from some kind of rescue boot):

    rsync -av --delete /usr/lib/grub/arm64-efi/ /boot/grub2/arm64-efi/
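Before running the workaround, it can help to confirm that the two module directories have actually diverged. The following is a minimal, read-only sketch of such a check. The function name `list_stale_modules` is made up for illustration, and the check only compares file contents; it makes no claim about which modules GRUB would actually load.

```shell
#!/bin/sh
# list_stale_modules SRC DST   (hypothetical helper, not part of any package)
# Prints the name of every regular file in SRC that is missing from DST
# or whose contents differ from the copy in DST. Nothing is modified.
list_stale_modules() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        name=${f##*/}
        # cmp exits non-zero when the files differ or DST lacks the file.
        if ! cmp -s "$f" "$dst/$name" 2>/dev/null; then
            printf '%s\n' "$name"
        fi
    done
}
```

With the paths from this report, the call would be `list_stale_modules /usr/lib/grub/arm64-efi /boot/grub2/arm64-efi`. Empty output means the copy in /boot matches the packaged modules.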
This is probably an issue for all architectures, but it's particularly bad for AArch64 systems. I can see this on Fedora KDE AArch64 on a Raspberry Pi 400 too.
Hector, which version of GRUB are we talking about? You're installing the released version and then updating? Do you install -modules and then update them? thanks.
From a semi-affected system (modules in /boot/grub2/arm64-efi with timestamps from the initial installation (2023-08-22) but no boot failures):

> 2023-08-22T15:33:52+0000 SUBDEBUG Installed: grub2-efi-aa64-1:2.06-95.fc38.aarch64
> 2023-08-22T15:35:50+0000 SUBDEBUG Installed: grub2-efi-aa64-modules-1:2.06-95.fc38.noarch
> 2023-10-11T21:10:29+0000 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-95.fc38.aarch64
> 2023-10-11T21:10:29+0000 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-95.fc38.noarch
> 2023-11-20T19:48:41+0000 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-102.fc38.aarch64
> 2023-11-20T19:48:47+0000 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-102.fc38.noarch
> 2024-03-24T20:38:08+0000 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-110.fc39.aarch64
> 2024-03-24T20:38:08+0000 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-110.fc39.noarch
> 2024-04-29T17:21:32+0200 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-118.fc39.aarch64
> 2024-04-29T17:21:32+0200 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-118.fc39.noarch
> 2024-05-05T13:40:48+0200 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-120.fc39.aarch64
> 2024-05-05T13:41:32+0200 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-120.fc39.noarch
> 2024-06-01T15:07:44+0200 SUBDEBUG Upgraded: grub2-efi-aa64-1:2.06-121.fc40.aarch64
> 2024-06-01T15:07:45+0200 SUBDEBUG Upgraded: grub2-efi-aa64-modules-1:2.06-121.fc40.noarch

The modules in /boot/grub2/arm64-efi/ are dated 2023-04-12, which is probably the date of the files in grub2-efi-aa64-modules-1:2.06-95.fc38.noarch.
From the same system:

> 2024-12-17T13:55:18+01:00 pk-offline-update[1305]: package updating grub2-efi-aa64-1:2.12-15.fc41.aarch64 (updates)
> 2024-12-17T13:55:23+01:00 pk-offline-update[1305]: package updating grub2-efi-aa64-modules-1:2.12-15.fc41.noarch (updates)
> 2024-12-17T13:55:29+01:00 pk-offline-update[1305]: package cleanup grub2-efi-aa64-1:2.06-123.fc40.aarch64 (installed)
> 2024-12-17T13:55:32+01:00 pk-offline-update[1305]: package cleanup grub2-efi-aa64-modules-1:2.06-123.fc40.noarch (installed)
> 2025-03-14T21:55:16+01:00 pk-offline-update[1323]: package updating grub2-efi-aa64-1:2.12-20.fc41.aarch64 (updates)
> 2025-03-14T21:55:21+01:00 pk-offline-update[1323]: package updating grub2-efi-aa64-modules-1:2.12-20.fc41.noarch (updates)
> 2025-03-14T21:55:30+01:00 pk-offline-update[1323]: package cleanup grub2-efi-aa64-1:2.12-15.fc41.aarch64 (installed)
> 2025-03-14T21:55:33+01:00 pk-offline-update[1323]: package cleanup grub2-efi-aa64-modules-1:2.12-15.fc41.noarch (installed)
As Janne mentioned, this affects all versions of GRUB going back to at least what was current in F38. Simply doing normal system upgrades never updates the modules.
The "right" way to do this is to run grub2-install, which for a while was not permitted on EFI at all because of SB, but works now if you --force. The problem is that this creates a new GRUB image, which is incompatible with SB on systems that care about that. On aarch, you might not care yet, but it's coming. I'm guessing that the Synchronous Abort that Hector reported, which I can't reproduce on a vanilla f40 aarch VM, is happening because you need additional modules to boot Asahi that are not built-in to the binary... is that right? if so, which ones?
Booting works without grub modules. I removed the modules on an Apple Silicon notebook without connected USB devices and it still boots. That probably explains why we haven't seen many issues so far.

Hector, are the affected systems special in a way noticeable by grub? I can only think of connected USB devices (I guess this should be abstracted away by u-boot's EFI implementation), filesystem types on storage devices, or special grub configuration.

Why are the modules present at all if they can't be used with secure boot and are possibly/likely broken after grub EFI image updates?
Nothing special other than some USB drives containing LVM PVs. The volumes themselves are Ceph OSDs so nothing GRUB could interpret, and are not used for booting. In addition, two machines have identical peripherals connected and only one failed, with the same exact installed GRUB package version (the only difference was the version of the ancient modules). I'm also pretty sure I tried disconnecting the drives and the problem persisted. So I don't think it has anything to do with USB devices.

GRUB config is also boring:

    GRUB_DEFAULT=saved
    GRUB_DISABLE_SUBMENU=true
    GRUB_DISABLE_RECOVERY=true
    GRUB_CMDLINE_LINUX_DEFAULT="rootflags=subvol=root"
    GRUB_DISTRIBUTOR="Fedora Linux Asahi Remix"
    GRUB_ENABLE_BLSCFG=true
    GRUB_GFXMODE=auto
    GRUB_TERMINAL=""
    GRUB_TIMEOUT=5
    GRUB_TIMEOUT_STYLE=menu

The internal NVMe partition layout is also bog standard for Asahi Linux.

There is one partitioning difference: The two machines that failed have a separate /home btrfs subvolume, while the one that survived does not. They also have slightly different /boot partition sizes. I believe this is a change that happened at some point. The machines that broke have an install date of Dec 19 2023 and I believe the image is from the same date, while the one that survived has an install date of May 16 2023 and I believe the image was built on May 9. So in this case, the machine with the *older* GRUB modules survived, and the ones with the *newer* (but still wildly out of date) modules broke.

There *is* a difference in GRUB config. The machine that survived has this:

    GRUB_DEFAULT=saved
    GRUB_DISABLE_SUBMENU=true
    GRUB_DISABLE_RECOVERY=true
    GRUB_CMDLINE_LINUX_DEFAULT="scsi_mod.use_blk_mq=1 multipath=off"
    GRUB_DISTRIBUTOR="Fedora Linux Asahi Remix"
    GRUB_ENABLE_BLSCFG=true
    GRUB_GFXMODE=auto
    GRUB_TERMINAL="console"
    GRUB_TIMEOUT=5
    GRUB_TIMEOUT_STYLE=hidden

The default for GRUB_TERMINAL is `gfxterm`.
So that means the machines that failed were trying to use a graphical terminal, while the one that survived was trying to use the native UEFI I/O. In gfxterm mode, a bunch of extra things happen that wouldn't otherwise, including loading the all_video module, which then loads all video modules as dependencies. I'm guessing not all of those are bundled into the EFI image. So that's quite likely what made it explode. If the intent is that only built-in modules are (can be) used, then there should not be any modules in /boot at all. Heck, if this is supposed to be a secure-boot-capable image, it shouldn't even *have* the ability to load extra modules from /boot.
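To check which of the two cases a given machine falls into, the terminal setting can be read out of the GRUB defaults file. The sketch below is an illustration only: `effective_grub_terminal` is a made-up helper, it handles only simple `GRUB_TERMINAL=value` or `GRUB_TERMINAL="value"` lines, and it takes the `gfxterm` default described in this comment as an assumption rather than verifying it against grub2-mkconfig.

```shell
#!/bin/sh
# effective_grub_terminal FILE   (hypothetical helper)
# Prints the GRUB_TERMINAL value set in FILE. When the variable is unset
# or empty, prints "gfxterm", the default assumed in this thread.
effective_grub_terminal() {
    # Take the last assignment; GRUB_TERMINAL_OUTPUT= and similar do not match.
    value=$(sed -n 's/^GRUB_TERMINAL=//p' "$1" | tail -n 1)
    # Strip one pair of surrounding double quotes, if present.
    value=${value#\"}
    value=${value%\"}
    if [ -z "$value" ]; then
        printf 'gfxterm\n'
    else
        printf '%s\n' "$value"
    fi
}
```

On Fedora the file to pass would normally be /etc/default/grub. For the two configurations quoted above, this prints `gfxterm` for the machines that failed and `console` for the one that survived.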
Just switching to gfxterm and menu timeout style does not reproduce the SError on an M2 MacBook Pro without USB devices. The grub modules are from:

> 2023-07-31T17:11:13+0000 SUBDEBUG Installed: grub2-efi-aa64-modules-1:2.06-95.fc38.noarch

2023-07-31 should be the image date.

"efi_gop" is not explicitly included in the image in https://src.fedoraproject.org/rpms/grub2/blob/rawhide/f/grub.macros but it seems that grub-mkimage includes it via "all_video". The installed grub EFI image seems to include it.

I found a way to produce an SError with that system. Change to the grub command line, type "ls (hd0," and press tab for completion. This happens independently of gfxterm/console. On a system with the modules removed this prints a list of all partitions on the internal NVMe. Since that includes file system information, I suspect grub2 could try to load all file system modules for the unknown APFS partitions. I suspect the easiest way to reproduce this on other platforms would be adding an exfat (module should be added to the image) or an ntfs (no module, like apfs) partition.

That's not helpful to determine why the affected systems tried to load a module.

I think we only have two options:
1. the modules must not be present in /boot/grub2
2. the modules in /boot/grub2 must be updated with the grub2 (efi) image
This is a problem created by kiwi. Kiwi copies the grub2 modules "helpfully" to the boot partition: https://github.com/OSInside/kiwi/blob/eeea9e6405ceb9d0aa346523b0d8232139ab50c8/kiwi/bootloader/config/grub2.py#L559

This was explicitly added in https://github.com/OSInside/kiwi/commit/c6e80a13c95a7f61b0df7334204d2ef0d19a5cbd for EFI secure boot in 2016. That copies the modules explicitly from /usr/lib/grub/arm64-efi/ to /boot/grub2 with rsync.

I'll open a kiwi issue and will forcefully delete the files in the meantime.

This still leaves open the question of how to deal with the broken existing installations, which are ticking time bombs.
Sounds like it's time for a postinstall script in one of the platform metapackages to wipe it?
changing component to kiwi
Upstream pull request: https://github.com/OSInside/kiwi/pull/2791
FEDORA-EPEL-2025-abc2389dd4 (kiwi-10.2.18-1.el9) has been submitted as an update to Fedora EPEL 9. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2025-abc2389dd4
FEDORA-2025-b9ae42c8d7 (kiwi-10.2.18-1.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2025-b9ae42c8d7
FEDORA-EPEL-2025-1516ba47ea (kiwi-10.2.18-1.el10_1) has been submitted as an update to Fedora EPEL 10.1. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2025-1516ba47ea
FEDORA-EPEL-2025-a6bd816644 (kiwi-10.2.17-1.el10_0) has been submitted as an update to Fedora EPEL 10.0. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2025-a6bd816644
FEDORA-2025-caba97efbd (kiwi-10.2.18-1.fc42) has been submitted as an update to Fedora 42. https://bodhi.fedoraproject.org/updates/FEDORA-2025-caba97efbd
FEDORA-2025-7cf125b833 (kiwi-10.2.18-1.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2025-7cf125b833
FEDORA-2025-caba97efbd (kiwi-10.2.18-1.fc42) has been pushed to the Fedora 42 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2025-7cf125b833 (kiwi-10.2.18-1.fc41) has been pushed to the Fedora 41 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-EPEL-2025-1516ba47ea (kiwi-10.2.18-1.el10_1) has been pushed to the Fedora EPEL 10.1 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2025-b9ae42c8d7 (kiwi-10.2.18-1.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-EPEL-2025-a6bd816644 (kiwi-10.2.18-1.el10_0) has been pushed to the Fedora EPEL 10.0 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-EPEL-2025-abc2389dd4 (kiwi-10.2.18-1.el9) has been pushed to the Fedora EPEL 9 stable repository. If problem still persists, please make note of it in this bug report.
asahi-platform-metapackage-core-0-23 was released with a %posttrans script to remove stale grub modules from /boot/grub2/arm64-efi/ of existing installs: https://pagure.io/fedora-asahi/asahi-platform-metapackage/c/ffe251c9a0c51d13a61a7aa60a22825d6981e9aa?branch=main
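For systems that do not receive that metapackage, the same kind of cleanup can be sketched by hand. The following is NOT the scriptlet from the linked commit, which remains the authoritative version; `purge_module_dir` is a hypothetical helper that, as a simplifying assumption, treats every regular file directly inside the given directory as a stale module. It defaults to a dry run and deletes only when `--apply` is passed.

```shell
#!/bin/sh
# purge_module_dir DIR [--apply]   (hypothetical helper, not the real scriptlet)
# Without --apply, prints the regular files directly inside DIR that would
# be removed. With --apply, removes them. Subdirectories are left alone.
purge_module_dir() {
    dir=$1
    mode=${2:-}
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if [ "$mode" = "--apply" ]; then
            rm -f -- "$f"
        else
            printf 'would remove %s\n' "$f"
        fi
    done
    return 0
}
```

Running `purge_module_dir /boot/grub2/arm64-efi` first shows what would go; adding `--apply` performs the removal. This relies on the earlier observation in this thread that booting works without the modules, so it is worth confirming on one machine before applying it widely.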