Description of problem:
While updating several Raspberry Pi V3 systems running aarch64 Fedora 31 (the kernel update was to 5.4.19), dnf appeared to hang. The process list showed grubby stuck. The end result was a totally unbootable system: booting the RPi ended up at the U-Boot prompt with no kernel menu. Diagnosis indicated a corrupted /boot/efi/EFI/fedora/grub.cfg file with a corrupt entry for the default kernel opts. The result was a single line of several hundred KB (~800 KB) of repeating root UUID entries. Replacing that line with the one from the stock image fixed the problem and allowed a boot.

Version-Release number of selected component (if applicable):
grubby-8.40-36.fc31.aarch64

How reproducible:
100%

Steps to Reproduce:
1. Install Fedora 31 aarch64 image to an RPi V3.
2. Update system to include kernel 5.4.19.
3. Reboot.

Actual results:
U-Boot>

Expected results:
Kernel menu and booting

Additional info:
Fixing the grub.cfg file before rebooting, or manually editing the SD card in another system, restores the ability to boot. Tested on 6 Raspberry Pi V3 cards. Totally reproducible. Caught 3 of them before I ended up stripping out the cards to repair. On the others, I was able to kill grubby to finish the update and then fix the corrupted grub.cfg file.
Original Fedora image was the Xfce image, if it matters.
Another point on the curve. This seems to NOT be recent. Two of the six systems I was recovering had crashed completely during the dnf update at the kernel-core update script. After editing the grub.cfg file, one of them got past U-Boot but crashed loading the latest kernel, due to no initramfs. Fine. That's a known headache. I rebooted to the 5.4.18 kernel, it came back up, and I did a reinstall of the 5.4.19 kernel-core package to fix the initramfs. While doing that, I monitored the grub.cfg file, from the good file that worked through to the end of the reinstall. This is what I saw...

This:
--
set default_kernelopts="root=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40 ro cma=192MB"
--

Became this during the install:
--
set default_kernelopts="root=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40 ro cma=192MB"
--

And became this after the reinstall was complete:
--
set default_kernelopts="root=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40=92567a3e-b3d2-494a-b4ec-ebb4687d6b40 ro cma=192MB"
--

It duplicated the root UUID on the default_kernelopts line several times. And rebooting with that bad line still worked. It seems to be the size of that duplicated root UUID that finally borks it. Each time the kernel gets updated, another string gets attached until it blows chunks. This may have been in there for a long time; I'm just real aggressive about doing kernel updates.
It indicates a couple of problems, including a potential "option too long" somewhere and a possible buffer overrun, but it doesn't blow chunks until that line reaches around 800K or so. I tried to upload the bad cfg file but the upload gave me an error. It doesn't appear to affect x86_64 systems, though. But it's catching all of my aarch64 systems in the same way.
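One way to spot the growth described above before it becomes fatal is to count the UUID-shaped strings on the "set default_kernelopts=" line. The following is a minimal sketch, not part of grubby or grub2: count_kernelopts_uuids is a hypothetical helper name, and it assumes the line looks like the examples in this report. A healthy file should report 1.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a grubby/grub2 tool).
# Prints how many UUID-shaped strings appear on the
# "set default_kernelopts=" line(s) of the given grub.cfg.
count_kernelopts_uuids() {
    cfg="$1"
    # Select only the default_kernelopts line(s), then emit one match per line
    # for every 8-4-4-4-12 hex UUID found, and count those lines.
    grep '^[[:space:]]*set default_kernelopts=' "$cfg" \
        | grep -Eo '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' \
        | wc -l | tr -d ' '
}
```

For example, `count_kernelopts_uuids /boot/efi/EFI/fedora/grub.cfg` run after each kernel update would show whether the count has crept above 1.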
Could you please attach the following files:

/etc/default/grub
/etc/grub2-efi.cfg
/boot/grub2/grubenv
/boot/loader/entries/*
Created attachment 1664060 [details]
/etc/grub2.cfg -> ../boot/efi/EFI/fedora/grub.cfg

/etc/grub2.cfg is a symlink to ../boot/efi/EFI/fedora/grub.cfg, so I have attached the latter. The attachments are taken from the same system I mentioned when tracking the changes to the grub.cfg file.
Created attachment 1664062 [details] /boot/grub2/grub.env
Created attachment 1664066 [details] /etc/default/grub
Created attachment 1664069 [details] /boot/loader/entries/* #1 First of the boot/loader/entries/* files. 4 in total.
Created attachment 1664070 [details] /boot/loader/entries/* #2
Created attachment 1664071 [details] /boot/loader/entries/* #3
Created attachment 1664072 [details] /boot/loader/entries/* #4
Created attachment 1664073 [details] /boot/loader/entries/* #5 #5. My original count was off.
On those attachments: the /boot/loader/entries/* files were as follows:

32db7df466bc45f9b1c0f514329fb96e-5.3.7-301.fc31.aarch64.conf
cebb70635a3d4a669cddb17ac389fc78-0-rescue.conf
cebb70635a3d4a669cddb17ac389fc78-5.4.17-200.fc31.aarch64.conf
cebb70635a3d4a669cddb17ac389fc78-5.4.18-200.fc31.aarch64.conf
cebb70635a3d4a669cddb17ac389fc78-5.4.19-200.fc31.aarch64.conf
On the /boot/efi/EFI/fedora/grub.cfg file: this is one of the lightly corrupted files, with only a handful of dups. I did not fix it, but it did work. I have copies of the heavily corrupted files, but they're all just additional dups on the "set default_kernelopts=" line.
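Since the corruption is confined to repeated "=UUID=<uuid>" and "=<uuid>" fragments appended after the first root UUID, the damaged line could in principle be collapsed back to a single UUID with a filter like the one below. This is only a sketch under that assumption: collapse_root_uuid is a hypothetical helper name, it relies on `sed -E` (available in GNU and BSD sed), and it has only been reasoned through against the patterns shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a grubby/grub2 tool).
# Reads grub.cfg text on stdin and writes it to stdout, collapsing
#   root=UUID=<uuid>=UUID=<uuid>=<uuid>...
# back to
#   root=UUID=<uuid>
# Lines without the duplicated pattern pass through unchanged.
collapse_root_uuid() {
    sed -E 's/(root=UUID=[0-9a-fA-F-]{36})(=(UUID=)?[0-9a-fA-F-]{36})+/\1/'
}
```

A cautious way to use it would be to write the result to a new file, for example `collapse_root_uuid < grub.cfg > grub.cfg.fixed`, and diff the two before replacing anything.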
Another point on the curve. Possibly related. Reinstalled kernel-core-5.4.19-200.fc31.aarch64 using dnf and saw this result, which I had not noticed in the original updates (but could have easily missed it):

--
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                              1/1
  Reinstalling     : kernel-core-5.4.19-200.fc31.aarch64          1/2
  Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          1/2
  Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          2/2
  Cleanup          : kernel-core-5.4.19-200.fc31.aarch64          2/2
  Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          2/2
grubby fatal error: unable to find a suitable template

  Verifying        : kernel-core-5.4.19-200.fc31.aarch64          1/2
  Verifying        : kernel-core-5.4.19-200.fc31.aarch64          2/2
Completion plugin: Generating completion cache...

Reinstalled:
  kernel-core-5.4.19-200.fc31.aarch64

Complete!
--

I checked /boot/efi/EFI/fedora/grub.cfg BEFORE rebooting and it was now 841768 bytes in size, in one go. Fortunately I had saved a working copy of grub.cfg and replaced the bad one with it.
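The manual check described in this comment (look at the size of grub.cfg before rebooting, and put back a saved working copy if it has ballooned) could be scripted roughly as follows. This is a minimal sketch: restore_if_bloated is a hypothetical helper name, and the byte threshold is an assumption you would have to pick yourself (the corrupted file above was 841768 bytes).

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a grubby/grub2 tool).
# Usage: restore_if_bloated CFG BACKUP MAX_BYTES
# If CFG is larger than MAX_BYTES, overwrite it with BACKUP and print "restored";
# otherwise leave it alone and print "ok".
restore_if_bloated() {
    cfg="$1"
    backup="$2"
    max="$3"
    # wc -c on redirected input prints only the byte count (possibly padded).
    size=$(wc -c < "$cfg" | tr -d ' ')
    if [ "$size" -gt "$max" ]; then
        cp "$backup" "$cfg"
        echo "restored"
    else
        echo "ok"
    fi
}
```

The backup would need to be taken while the file is known to be good, for example right after a clean boot and before the next kernel update.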
(In reply to Michael H. Warfield from comment #15)
> Another point on the curve. Possibly related. Reinstalled
> kernel-core-5.4.19-200.fc31.aarch64 using dnf and saw this result I had not
> noticed in the original updates (but could have easily missed it):
>
> --
> Running transaction check
> Transaction check succeeded.
> Running transaction test
> Transaction test succeeded.
> Running transaction
>   Preparing        :                                              1/1
>   Reinstalling     : kernel-core-5.4.19-200.fc31.aarch64          1/2
>   Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          1/2
>   Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          2/2
>   Cleanup          : kernel-core-5.4.19-200.fc31.aarch64          2/2
>   Running scriptlet: kernel-core-5.4.19-200.fc31.aarch64          2/2
> grubby fatal error: unable to find a suitable template
>

Hmm, that error seems to come from the grubby tool that's in the grubby-deprecated package. Do you have that package installed? If so, could you please remove it and reinstall the kernel to see if the issue is still present?
(In reply to Michael H. Warfield from comment #5)
> Created attachment 1664060 [details]
> /etc/grub2.cfg -> ../boot/efi/EFI/fedora/grub.cfg
>
> /etc/grub2.cfg is a symlink to ../boot/efi/EFI/fedora/grub.cfg
>
> I have attached the latter. Attachments are taken from the same system I
> mentioned when tracking the changes to the grub.cfg file.

So all the files look correct... (besides the grub.cfg, of course).
Yes, the grubby-deprecated package was installed. Removing it also removed extlinux-bootloader.

--
-> Starting dependency resolution
--> Finding unneeded leftover dependencies
---> Package extlinux-bootloader.aarch64 1.2-10.fc31 will be erased
---> Package grubby-deprecated.aarch64 8.40-36.fc31 will be erased
--> Finished dependency resolution
Dependencies resolved.
================================================================================
 Package                 Architecture   Version          Repository        Size
================================================================================
Removing:
 grubby-deprecated       aarch64        8.40-36.fc31     @fedora          127 k
Removing dependent packages:
 extlinux-bootloader     aarch64        1.2-10.fc31      @fedora          2.0 k

Transaction Summary
================================================================================
Remove  2 Packages

Freed space: 129 k
--

Completing that and reinstalling 5.4.19 works fine. That seems to be at the heart of the problem.

Checking both the downloaded aarch64 and armhfp images for Xfce in the download directories, they BOTH seem to have grubby-deprecated in the builds. But it doesn't seem to impact my armhfp (RPi V2-B+) systems. In fact, there's nothing at all in the /boot/efi/EFI/fedora directories on those. Have not checked any other spins.
Have tested it on all six of the affected systems. Removing grubby-deprecated seems to have resolved the issue. The real issue might now be: why was grubby-deprecated packaged in those stock images, and are any other images affected?
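For anyone checking a fleet of systems for the same condition, a small filter over a list of installed package names can flag the offending package. This is a sketch only: has_package is a hypothetical helper name, and it assumes its input is one package name per line, such as the output of `rpm -qa --qf '%{NAME}\n'`.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration).
# Reads package names on stdin, one per line, and succeeds (exit 0) only if
# one of them is exactly the name given as $1.
has_package() {
    # -F: fixed string, -x: match the whole line, -q: no output, status only
    grep -Fxq -- "$1"
}
```

On an affected system, `rpm -qa --qf '%{NAME}\n' | has_package grubby-deprecated && echo "grubby-deprecated is installed"` would print the warning; matching the whole line keeps the plain grubby package from triggering it.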
(In reply to Michael H. Warfield from comment #18)
> Yes, the grubby-deprecated package was installed. Removing it also removed
> extlinux-bootloader.
>

The extlinux-bootloader is only needed for armv7, not for aarch64. And extlinux-bootloader has the grubby-deprecated package as a dependency because extlinux doesn't have BLS support like grub2.

[snip]

>
> Checking both the downloaded aarch64 and armhfp images for Xfce in the
> download directories, they BOTH seem to have grubby-deprecated in the

That doesn't seem correct. The grubby-deprecated package should only be present for armv7 and not for aarch64.

But still, installing grubby-deprecated should be a no-op if you have GRUB_ENABLE_BLSCFG=true in /etc/default/grub, which is your case.

> builds. But it doesn't seem to impact my armhfp (RPi V2-B+) systems. In
> fact, there's nothing at all in the /boot/efi/EFI/fedora directories at all.
> Have not checked any other spins.

Right, for armv7 that directory would be empty since it doesn't use the u-boot EFI stub to chain-load grub2.

For some reason I was not able to reproduce your issue, but could you please test the following on the systems where you had the problem:

$ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg // so your grub.cfg is correct
$ dnf install -y extlinux-bootloader // to pull grubby-deprecated
$ dnf reinstall -y kernel-core // this should corrupt your grub.cfg

And then try the following:

$ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg // so your grub.cfg is correct again
$ chmod -x /usr/lib/kernel/install.d/20-grubby.install // so the /usr/lib/kernel/install.d/20-grub2.install plugin is executed instead
$ dnf reinstall -y kernel-core // I think your grub.cfg should be correct this time
(In reply to Javier Martinez Canillas from comment #20)
> (In reply to Michael H. Warfield from comment #18)
> > Yes, the grubby-deprecated package was installed. Removing it also removed
> > extlinux-bootloader.
> >
>
> The extlinux-bootloader is only needed for armv7, not for aarch64. And
> extlinux-bootloader has as a dependency the grubby-deprecated package
> because extlinux doesn't have BLS support like grub2.
>
> [snip]
>
> >
> > Checking both the downloaded aarch64 and armhfp images for Xfce in the
> > download directories, they BOTH seem to have grubby-deprecated in the

> That doesn't seem correct. The grubby-deprecated package should only be
> present for armv7 and not for aarch64.

That proved to be an error at my end. I checked a fresh arm-installer install of that package and it was not present. Checking logs, I found it had snuck in somehow during what should have been a routine update. I think my central management was to blame. A thread on the Fedora ARM mailing list helped me track some of that down.

> But still installing grubby-deprecated should be a no-op if you have
> GRUB_ENABLE_BLSCFG=true in /etc/default/grub which is your case.
>
> > builds. But it doesn't seem to impact my armhfp (RPi V2-B+) systems. In
> > fact, there's nothing at all in the /boot/efi/EFI/fedora directories at all.
> > Have not checked any other spins.

> Right, for armv7 that directory would be empty since it doesn't use the
> u-boot EFI stub to chain-load grub2.

> For some reason I was not able to reproduce your issue, but could you please
> test the following in the systems where you had the problem:
>
> $ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg // so your grub.cfg is
> correct

I first made a reference copy of /boot/efi/EFI/fedora/grub.cfg (which was now good after the uninstalls, fixes, and reinstalls) so that I could do diffs.
The grub2-mkconfig command resulted in a few minor cosmetic differences like this:

91c91
< set default=1
---
> set default="1"
124c124
< set default_kernelopts="root=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40 ro cma=192MB "
---
> set default_kernelopts="root=UUID=92567a3e-b3d2-494a-b4ec-ebb4687d6b40 ro cma=192MB"

Par for the course.

> $ dnf install -y extlinux-bootloader // to pull grubby-deprecated

Done.

> $ dnf reinstall -y kernel-core // this should corrupt your grub.cfg

Hahahahaha... This one got funny. They say timing is everything, and it is... I got the error that none of my kernel-cores were available for reinstall. What the? Oh, kernel 5.4.20 had popped out of the queue overnight. So I did an update instead. Some days. :-)

Well, the grubby error is back:

--
  Cleanup          : ibus-setup-1.5.21-7.fc31.noarch            64/76
  Running scriptlet: kernel-core-5.4.17-200.fc31.aarch64        65/76
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
  Erasing          : kernel-core-5.4.17-200.fc31.aarch64        65/76
--

This didn't end up with a corrupted grub.cfg. But it did generate the earlier grubby error plus a bit.

Did a reinstall of kernel-core-5.4.20:

Transaction test succeeded.
Running transaction
  Preparing        :                                              1/1
  Reinstalling     : kernel-core-5.4.20-200.fc31.aarch64          1/2
  Running scriptlet: kernel-core-5.4.20-200.fc31.aarch64          1/2
  Running scriptlet: kernel-core-5.4.20-200.fc31.aarch64          2/2
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
  Cleanup          : kernel-core-5.4.20-200.fc31.aarch64          2/2
  Running scriptlet: kernel-core-5.4.20-200.fc31.aarch64          2/2
grubby fatal error: unable to find a suitable template
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.

  Verifying        : kernel-core-5.4.20-200.fc31.aarch64          1/2
  Verifying        : kernel-core-5.4.20-200.fc31.aarch64          2/2

Reinstalled:
  kernel-core-5.4.20-200.fc31.aarch64

Still... No corruption.

I'm baffled. Removing the errant packages "fixed" the problem, but reinstalling them caused the "grubby errors" to return without seeming to reintroduce the problem. The only difference I see now is that second error line, "grubby: doing this would leave no kernel entries. Not writing out new config." That line wasn't there before. Something changed. I checked the version numbers of grubby-deprecated and extlinux-bootloader and they match the earlier versions. I'm baffled.

> And then try the following:

> $ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg // so your grub.cfg is
> correct again
> $ chmod -x /usr/lib/kernel/install.d/20-grubby.install // so the
> /usr/lib/kernel/install.d/20-grub2.install plugin is executed instead
> $ dnf reinstall -y kernel-core // I think your grub.cfg should be correct
> this time
As mentioned, you should remove the grubby-deprecated package since it shouldn't be used on aarch64, and then re-generate your grub.cfg file with grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg.
This message is a reminder that Fedora 31 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.