Upon updating grub2 to the latest version (2.06-116), grub no longer boots when /boot is on md software RAID. The --root-dev-only flag was added to the search command in /boot/efi/EFI/fedora/grub.cfg, via grub2.spec in 2.06-116. With this flag present, the search does not find the software RAID boot device, so $dev stays empty.
Reproducible: Always
Steps to Reproduce:
1. Upgrade grub2 to 2.06-116
2. Reboot
3. End up at the grub prompt instead of booting
4. echo $dev and note that it's empty instead of containing the boot device
Actual Results: System hangs at the grub prompt and echo $dev displays an empty value
Expected Results: grub should have found the boot device on software RAID, populated $dev and then loaded /boot/grub2/grub.cfg to boot a kernel
This problem also affects RHEL 9 (starting with 2.06-61.el9_2.2)
Hi, could you please give us some more info, such as your partitioning scheme / RAID layout, the version of the RAID metadata, etc.? Thanks!
Hello, /boot is a software raid1 device with raid metadata 1.2
/dev/md126:
           Version : 1.2
     Creation Time : Wed Dec 6 20:41:42 2023
        Raid Level : raid1
        Array Size : 974848 (952.00 MiB 998.24 MB)
     Used Dev Size : 974848 (952.00 MiB 998.24 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent
     Intent Bitmap : Internal
       Update Time : Mon Feb 12 14:52:59 2024
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0
Consistency Policy : bitmap
              Name : localhost.localdomain:boot
              UUID : 2173aaba:d4b33683:cd6ca589:204fdf6e
            Events : 79
    Number   Major   Minor   RaidDevice State
       0     259        7        0      active sync   /dev/nvme1n1p3
       1     259        3        1      active sync   /dev/nvme0n1p3
Thank you. I can easily reproduce this. You have already discovered the culprit and know how to remedy it for yourself. (: Adding the option to search only the root device is a mitigation for a CVE (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-4001) in which the attacker needs physical access, and which affects only certain UEFI implementations. I think in this case, you will just have to either leave the vulnerability in place or transition to a setup in which /boot is no longer a software raid device. Alternately, you could move files around between /boot and /boot/efi, but I would discourage that because it can break other stuff like setting the default kernel, etc.
I know how to work around the problem by removing --root-dev-only from the search command, which makes a non-bootable system bootable again, but I'm unsure whether this is a temporary fix or a permanent solution. Are there circumstances where /boot/efi/EFI/fedora/grub.cfg would get rewritten, perhaps when grub or another package is updated?
That file should only get regenerated on an update of grub2-common, and only unless it already exists and contains the word "source" or "configfile". I cannot promise that this will never change, but as long as you don't remove the grub.cfg in the ESP, this should be a "permanent" solution.
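The condition described in this comment can be written down as a small check. This is only a sketch of the rule as stated here, not the actual grub2-common scriptlet; `stub_would_be_regenerated` is a hypothetical helper name.

```shell
# Sketch only: mirrors the rule described in the comment above,
# not the real RPM scriptlet shipped with grub2-common.
# stub_would_be_regenerated is a hypothetical helper name.
stub_would_be_regenerated() {
    cfg=$1
    # An existing stub that mentions "source" or "configfile" is left alone.
    if [ -f "$cfg" ] && grep -Eq 'source|configfile' "$cfg"; then
        return 1
    fi
    # Missing file, or a file without either word: expect regeneration.
    return 0
}

# Possible use:
#   stub_would_be_regenerated /boot/efi/EFI/fedora/grub.cfg \
#       && echo "stub may be rewritten on the next grub2-common update"
```

If the function returns non-zero for your stub, a hand-edited grub.cfg in the ESP should survive package updates under the rule as described.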
I have been experiencing this on my Fedora 39 server with md1 raid. I do a dnf update and a new kernel gets installed. I reboot and I get a grub prompt. I boot from a flash drive, chroot, run grub2-mkconfig, and it fixes it. Yet on the next kernel upgrade it breaks again.
hi Nathan, could you please cat your /boot/efi/EFI/redhat/grub.cfg ? Kernel updates should not be affected. Do you get any errors from grub? or from anything else? Just to be sure, which grub version do you have installed?
Not only RAID systems seem to be affected. I recently reordered my dual boot system (Fedora 39, Windows 11) to have all Fedora partitions on one drive (all filesystems on /dev/sda are ext4):
sda      8:0    0 232,9G  0 disk
├─sda1   8:1    0   1,1G  0 part /boot
├─sda2   8:2    0  49,1G  0 part /
└─sda3   8:3    0  31,7G  0 part /home
The EFI partition mounted as /boot/efi has stayed on another drive where Windows resides. I reinstalled the grub related rpms:
# dnf reinstall shim-* grub2-efi-* grub2-common
to version 2.06-118. Rebooting got me to the grub prompt as described above. After a lot of searching and help from the Fedora users list, I finally arrived at the workaround mentioned in Comment 4: removing the flag --root-dev-only from /boot/efi/EFI/fedora/grub.cfg
We are working to fix this, at least to get RAID working, and hopefully more. (: Klaus-Peter, could I ask you for a little more info about your setup? If you would rename your /boot/efi/EFI/fedora/grub.cfg and then get dropped to the grub prompt, could you:
ls
echo $root
cat ($root)/efi/fedora/<grub.cfg.name>
execute the first search command (without the root-dev flag, which I guess you removed)
echo $dev
thanks.
Marta, it's a fully updated F39 system (KDE), dual boot with Windows 11
Kernel 6.8.7-200.fc39.x86_64
grub2-common-2.06-118.fc39.noarch
output of lsblk:
sda      8:0    0 232,9G  0 disk
├─sda1   8:1    0   1,1G  0 part /boot
├─sda2   8:2    0  49,1G  0 part /
└─sda3   8:3    0  31,7G  0 part /home
sdb      8:16   0 931,5G  0 disk
└─sdb2   8:18   0 899,8G  0 part /mnt/Daten
sdc      8:32   0 931,5G  0 disk
├─sdc1   8:33   0 293,4G  0 part
├─sdc2   8:34   0   803M  0 part
├─sdc3   8:35   0   100M  0 part /boot/efi
├─sdc6   8:38   0   3,7G  0 part
└─sdc7   8:39   0 573,1G  0 part /mnt/Daten2
As you can see, the Fedora partitions reside on /dev/sda (all ext4), whereas the EFI partition is on the Windows disk /dev/sdc. Here is the output of the mentioned grub commands (I did a character recognition of the screen photo, no other method found, so there might be typos):
grub> ls
(memdisk) (proc) (hd0) (hd0,gpt3) (hd0,gpt2) (hd0,gpt1) (hd1) (hd1,msdos2) (hd2) (hd2,gpt7) (hd2,gpt6) (hd2,gpt3) (hd2,gpt2) (hd2,gpt1) (hd3)
grub> echo $root
hd2,gpt3
grub> cat ($root)/efi/fedora/grub.cfg
search --no-floppy --root-dev-only --fs-uuid --set=dev 206c0b5f-eddf-42e8-96f1-e666c5635cd0
set prefix=($dev)/grub2
export $prefix
configfile $prefix/grub.cfg
grub> search --no-floppy --fs-uuid --set=dev 206c0b5f-eddf-42e8-96f1-e666c5635cd0
grub> echo $dev
hd0,gpt1
grub>
Any other info needed?
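For anyone who wants to run the same comparison from a booted or rescue system rather than from the grub prompt, here is a rough sketch. `stub_uuid` is a hypothetical helper, and it assumes the UUID is the last word of the search line, as in the stub quoted in this comment.

```shell
# stub_uuid is a hypothetical helper name, not part of grub2.
# Assumption: the UUID is the last word of the "search ... --fs-uuid" line,
# as it is in the stub quoted in this report.
stub_uuid() {
    awk '$1 == "search" && /--fs-uuid/ { print $NF; exit }' "$1"
}

# Possible use (as root):
#   uuid=$(stub_uuid /boot/efi/EFI/fedora/grub.cfg)
#   blkid -U "$uuid"           # device carrying that filesystem UUID
#   findmnt -no SOURCE /boot   # device currently mounted on /boot
```

If the two devices match, the stub points at the right filesystem and the failure is in how the search is restricted, not in the UUID itself.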
Thank you. that's all I was after.
Seeing this on Nobara (FC39 based) after an update (and there are several other reports across the net). I doubt this is (solely) related to a RAID setup, as this box here is a bog standard 1 NVMe Desktop (with LUKS and LVM). The common denominator could be device-mapper, though. I wonder why there aren't masses of reports like this, as neither RAID nor LUKS+LVM are uncommon setups. Whatever causes this, the option "--root-dev-only" is simply not available on this installed Grub 2.06-116.fc39 I tried to dig up a man page for the GRUB shell on this installation, but only found those for individual grub2-* binaries, so it is hard to tell whether there is a similar switch that would achieve the same. I found https://www.gnu.org/software/grub/manual/grub/html_node/search.html#search (which is for v2.12 however) and it also does not mention this switch. If I try to enter the search command in GRUB's bash-like env, it tries to interpret "--root-dev-only" as a device (i.e. parameter, not switch), so this definitely does not exist on this version. Wherever that comes from, it is either outdated or simply wrong and has never existed. Searching for this switch only ever turns up RPM-based distro pages, so I really think this switch is just non-existent. Removing it makes this system boot flawlessly again. If this mitigates security issues by not booting the system at all, then it does indeed work ;-) If this switch should be patched in (as it is not available upstream), then this is some kind of packaging error.
You're right that this was not really documented, and it was not meant to break so many setups either. :\ The root-dev-only flag stuff was a "fix" in fedora and not in the upstream grub due to the way that grub configs are laid out in fedora, but not necessarily in other distros. A newer, better fix is on its way. Sorry about this. :(
Just in case that it matters: I have also been affected by this, although my setup is pretty simple: /boot is ext4, no lvm, no raid, no luks.
Removing --root-dev-only from /boot/efi/EFI/fedora/grub.cfg did fix it for me. FYI, I still saw this on my system after having upgraded from 39 to 40.
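For reference, the workaround used throughout this report can be sketched as a small script. `strip_root_dev_only` is a hypothetical helper name; the sketch assumes GNU sed (for `-i`) and that the flag appears exactly as `--root-dev-only` in the stub.

```shell
# Sketch: remove the --root-dev-only flag from a GRUB stub config.
# strip_root_dev_only is a hypothetical helper name, not a grub2 tool.
# Assumes GNU sed (-i edits the file in place).
strip_root_dev_only() {
    cfg=$1
    # Keep a copy of the original next to the file before editing.
    cp -- "$cfg" "$cfg.bak" || return 1
    # Delete the flag together with the whitespace in front of it.
    sed -i 's/[[:space:]]\{1,\}--root-dev-only//' "$cfg"
}

# Possible use (as root):
#   strip_root_dev_only /boot/efi/EFI/fedora/grub.cfg
```

Note that this leaves the CVE-2023-4001 mitigation mentioned earlier in this report disabled, so it is a trade-off rather than a fix.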
I found it isn't hard to make your own copy of grub2 without --root-dev-only.
1. Remove the reference to --root-dev-only in grub2.spec
2. Comment out "Patch0348: 0348-add-flag-to-only-search-root-dev.patch" in grub.patches
3. Increase the Release, like "Release: 121%{?dist}" to "Release: 121.11%{?dist}"
4. Install dependencies, which you can learn with the build command below
5. rpmbuild -bb grub2.spec
Side note, holy patches Batman! There are 360 patches for grub2. It might be a sign of too many patches if you need a grub.patches as a separate file from grub2.spec just to list all the patches. Red Hat should consider either dropping patches or upstreaming them.
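The rebuild steps in this comment could be scripted roughly as follows. This is a sketch under assumptions: `drop_root_dev_only_patch` is a hypothetical helper, the sed patterns guess at how the lines look in grub2.spec and grub.patches, and GNU sed is assumed. Steps 4 and 5 are left as comments because they need the full package sources.

```shell
# Sketch of steps 1-3; drop_root_dev_only_patch is a hypothetical helper.
# $1 is a directory holding grub2.spec and grub.patches (assumed layout).
drop_root_dev_only_patch() {
    dir=$1
    # Step 1: drop the flag wherever the spec file mentions it.
    sed -i 's/[[:space:]]\{1,\}--root-dev-only//g' "$dir/grub2.spec" || return 1
    # Step 2: comment out the patch that adds the flag.
    sed -i 's/^Patch0348:/#&/' "$dir/grub.patches" || return 1
    # Step 3: append a local suffix to the release number (121 -> 121.11).
    sed -i 's/^\(Release:[[:space:]]*[0-9]\{1,\}\)/\1.11/' "$dir/grub2.spec"
}

# Steps 4 and 5, run by hand in the same directory:
#   dnf builddep grub2.spec   # one way to pull in build dependencies
#                             # (assumes dnf-plugins-core is installed)
#   rpmbuild -bb grub2.spec
```

Using `dnf builddep` for step 4 is my assumption; the comment above only says the dependencies can be learned from the rpmbuild output.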
(In reply to Thomas Köller from comment #14)
> Just in case that it matters: I have also been affected by this, although my
> setup is pretty simple: /boot is ext4, no lvm, no raid, no luks.
Hi Thomas,
Although the new approach that Nicolas is working on is quite different from the current one, I am curious about your setup so that we can test it as well. Do you have several disks, or how does it look? The output of `lsblk` is probably enough.
thank you!
(In reply to Nathan G. Grennan from comment #16)
> Side note, holy patches Batman! There are 360 patches for grub2. It might be
> a sign of too many patches if you need a grub.patches as a separate file
> from grub2.spec to just list all the patches. RedHat consider either
> dropping patches or upstreaming them.
Upstreaming patches is an ongoing effort, which unfortunately takes a long time and a lot of resources on both ends. :( But you are right.
(In reply to Marta Lewandowska from comment #17)
[root@sarkovy ~]# lsblk -f
NAME   FSTYPE      FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda    btrfs             HdStorage  87aedf83-c280-4fe0-8417-30b65363d16d
sdb
├─sdb1 vfat        FAT32 EFISYS     9C97-AAEC                             499,5M     2% /boot/efi
└─sdb2 btrfs             SysStorage f827adea-4448-493c-9ae9-3154458bc814  639,4G    31% /export/home
                                                                                        /var
                                                                                        /home
                                                                                        /
sdc
├─sdc1 swap        1     Swap       f9b3291b-1e1f-4384-ae84-664078a1127b                [SWAP]
├─sdc2 btrfs             SysStorage f827adea-4448-493c-9ae9-3154458bc814
└─sdc3 ext4        1.0   BOOT       5681f737-5b42-4470-b103-5d88624bc29d  102,5M    73% /boot
sdd    btrfs             HdStorage  87aedf83-c280-4fe0-8417-30b65363d16d  909,2G    51% /export/media
                                                                                        /workspace
sde    btrfs             HdStorage  87aedf83-c280-4fe0-8417-30b65363d16d
sdf
└─sdf1 crypto_LUKS 2     CnMemory   ecc53bb9-af52-4a3a-b8cc-bf88f30ff756
sr0
> Do you have several disks or how does does it look? The output of `lsblk` is
> probably enough.
> thank you!
Thank you, Thomas. Yeah, it is failing for you because /boot and the ESP are on different disks. We will definitely test for this scenario and make sure that it works in the future.
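A quick way to check whether a system is in this "different disks" situation is to compare the parent devices that lsblk reports for the two mount points. This is a sketch only: `parent_disk_of` is a hypothetical helper that parses `lsblk -Pno PKNAME,MOUNTPOINT` output from stdin, and for md RAID, LVM or LUKS the PKNAME is the underlying member or mapping rather than the physical disk, so the result needs a closer look there.

```shell
# parent_disk_of is a hypothetical helper, not an lsblk feature.
# It reads `lsblk -Pno PKNAME,MOUNTPOINT` output on stdin and prints the
# parent kernel device name of whatever is mounted at the given path.
parent_disk_of() {
    # Lines look like: PKNAME="sdc" MOUNTPOINT="/boot"
    # Splitting on double quotes puts PKNAME in $2 and MOUNTPOINT in $4.
    awk -F'"' -v mp="$1" '$4 == mp { print $2 }'
}

# Possible use:
#   layout=$(lsblk -Pno PKNAME,MOUNTPOINT)
#   boot=$(printf '%s\n' "$layout" | parent_disk_of /boot)
#   esp=$(printf '%s\n' "$layout" | parent_disk_of /boot/efi)
#   [ "$boot" = "$esp" ] || echo "/boot ($boot) and the ESP ($esp) differ"
```

With the layout posted in the previous comment, this would report sdc for /boot and sdb for /boot/efi.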
I can confirm `--root-dev-only` also affects an (encrypted) btrfs /boot partition on a different partition of the same device. Grub immediately drops to the rescue shell (after the passphrase has been entered), and indeed `$dev` is empty (as confirmed with `echo $dev` from the rescue shell). Re-entering each line of the stub file `/boot/efi/EFI/fedora/grub.cfg` from the rescue shell as is throws a device not found error. Without the `--root-dev-only` flag, however, it correctly finds the `crypto0` device, and re-entering each line of the stub file, this time without the flag, successfully boots to the kernel. I am on Fedora 40 KDE and my fs is set up like this:
```
$ lsblk
NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                    8:0    0 232.9G  0 disk
├─sda1                 8:1    0    70G  0 part
│ ├─vg0-group1lv1    253:1    0    60G  0 lvm
│ │ └─group1p1       253:4    0    60G  0 crypt /
│ └─vg0-group1lv2    253:2    0   702M  0 lvm
│   └─group1p2       253:3    0   700M  0 crypt /boot
. . . .
├─sda6                 8:6    0   500M  0 part  /boot/efi
. . . .
```
FEDORA-2024-e62a99b9d7 (grub2-2.06-121.fc39) has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2024-e62a99b9d7
FEDORA-2024-a7983d1f0a (grub2-2.06-123.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7983d1f0a
FEDORA-2024-e62a99b9d7 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-e62a99b9d7` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-e62a99b9d7 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-a7983d1f0a has been pushed to the Fedora 40 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-a7983d1f0a` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7983d1f0a See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-a7983d1f0a (grub2-2.06-123.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2024-e62a99b9d7 (grub2-2.06-121.fc39) has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
Today there was a grub update that changed grub.cfg back so that it contains the --root-dev-only flag again. I removed the flag, as usual, and rebooted, without success. Unfortunately I am not able to provide actual console output, as this is a hosted server and I don't have access to the hardware console, but I could provide more information from a rescue system mounting the file systems. /boot here is on raid1, just for completeness. OS is Rocky 9.4.
(In reply to Stefan Haslinger from comment #28)
> Today there has been a grub update changing back the grub.cfg containing the
> --root-dev-only flag again.
> Removed the flag, as usual, reboot, and no success.
>
> Unfortunately I am not able to provide actual console output, as this is a
> hosted server and I don't have access to the hardware console, but I could
> provide moreinformation from a rescue system mounting the file systems.
>
> /boot here is on raid1, just for completeness. OS is rocky 9.4.
Whooha! I am happy to report that a second reboot fixed the issue. No clue why...
Hi Stefan, I don't know what version of grub you have in rocky 9.4, but latest versions in fedora (2.06-121.fc39 and 2.06-123.fc40) should work fine also on raid1 with the root-dev-only flag.
(In reply to Marta Lewandowska from comment #30)
> Hi Stefan,
> I don't know what version of grub you have in rocky 9.4, but latest versions
> in fedora (2.06-121.fc39 and 2.06-123.fc40) should work fine also on raid1
> with the root-dev-only flag.
Hi Marta, I currently see 2.06-80.el9_4 for grub2-common. Maybe that's still affected?