Bug 2071034
Summary: | kernel-install misdetects /boot/efi over /boot as install location on systems | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | tstruk |
Component: | systemd | Assignee: | systemd-maint |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 36 | CC: | acaringi, adscvr, airlied, alain.vigne.14, alciregi, augenauf, bob.bogo, bskeggs, fedoraproject, filbranden, flepied, gbcox, hcamp, hdegoede, hpa, hugh, jan.public, jarodwilson, jforbes, jglisse, jonathan, josef, kernel-maint, kothandan.sathiyamoorthy, lgoncalv, linville, lnykryn, martineau, masami256, matthias.andree, mchehab, msekleta, ptalbert, ryncsn, ssahani, s, stanley.king, steved, suchandra.spam+fedora, swt, systemd-maint, unixi, yuwatana, zbyszek |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | systemd-250.6-1.fc36 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-05-27 01:10:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
tstruk
2022-04-01 16:12:46 UTC
This has nothing to do with kernel unfortunately, it seems to be systemd with kernel-install, which is placing kernels in /boot/efi/<machine-id>. The same thing hit me on upgrade to F36 on Tuesday. A quick fix seems to be an rm -rf of that directory in /boot/efi and then reinstall the kernel or call kernel-install by hand. For the systemd people, when it happened to me, I first tried to create an /etc/kernel/install.conf which explicitly passed KERNEL_INSTALL_BOOT_ROOT=/boot, but running kernel-install with -x showed that it was still getting changed to /boot/efi.

(In reply to Justin M. Forbes from comment #1)
> This has nothing to do with kernel unfortunately,

true, I didn't know exactly what to log it against.

> it seems to be systemd
> with kernel-install, which is placing kernels in /boot/efi/<machine-id>.

I thought that it might be the case and checked /boot/efi/, but
it doesn't install it there.
# find /boot/efi/ -name "vmlinuz*" finds nothing.

Oh, hmm. Are you doing that as root? For everyone else this has hit, the kernel is installed in /boot/efi/<machine-id> and the entries do not show up in grub on reboot. Perhaps this is a different bug; either way, systemd kernel-install is what puts things in /boot.

(In reply to Justin M. Forbes from comment #3)
> Oh, hmm. Are you doing that as root? for everyone else this has hit, the
> kernel is installed in /boot/efi/<machine-id> and the entries do not show up
> in grub on reboot. Perhaps this is a different bug, either way, systemd
> kernel-install is what puts things in /boot.

What I did was just `sudo dnf update` and it pulled and installed the new kernel. I was surprised that it didn't boot into 5.17 after reboot, so I started digging around. As I said, there is nothing new installed in my /boot/efi/

I think this may be related to an issue I have stumbled across when upgrading from F34 to F36 Beta.
DNF reports that the kernel has been updated, but if you go back to the logs, you will find that there wasn't enough space for dracut to create the relevant boot images, etc. So, rpm -qa reports that kernel-core is successfully installed, but the relevant files are not there. Even as I'm writing this, I'm running a Fedora 34 kernel, though I went through the 'dnf system-upgrade' process.

$ uname -a
Linux diziet.reple.at 5.17.4-100.fc34.x86_64 #1 SMP PREEMPT Wed Apr 20 14:41:56 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/redhat-release
Fedora release 36 (Thirty Six)

### excerpt from /var/log/dnf.rpm.log:
2022-05-02T14:57:09-0400 INFO ERROR: src/skipcpio/skipcpio.c:191:main(): fwrite
ERROR: src/skipcpio/skipcpio.c:191:main(): fwrite
/usr/lib/kernel/install.d/51-dracut-rescue.install: line 76: /boot/efi/loader/entries/3ce13bc91cea438ba1644997b3fa4412-0-rescue.conf: No such file or directory
kdump: kernel 5.17.3-302.fc36.x86_64 doesn't exist
warning: %posttrans(kernel-core-5.17.3-302.fc36.x86_64) scriptlet failed, exit status 1
2022-05-02T14:57:09-0400 ERROR Error in POSTTRANS scriptlet in rpm package kernel-core
2022-05-02T14:57:33-0400 INFO yes: standard output: Broken pipe

### tail end of most recent dnf -y upgrade output:
Upgraded:
  libipa_hbac-2.7.0-1.fc36.x86_64
  libsss_autofs-2.7.0-1.fc36.x86_64
  libsss_certmap-2.7.0-1.fc36.x86_64
  libsss_idmap-2.7.0-1.fc36.x86_64
  libsss_nss_idmap-2.7.0-1.fc36.x86_64
  libsss_sudo-2.7.0-1.fc36.x86_64
  lpf-spotify-client-1.1.84.716-1.fc36.x86_64
  python3-sssdconfig-2.7.0-1.fc36.noarch
  rpmfusion-free-release-36-1.noarch
  rpmfusion-free-release-tainted-36-1.noarch
  rpmfusion-nonfree-release-36-1.noarch
  selinux-policy-36.8-1.fc36.noarch
  selinux-policy-targeted-36.8-1.fc36.noarch
  sssd-2.7.0-1.fc36.x86_64
  sssd-ad-2.7.0-1.fc36.x86_64
  sssd-client-2.7.0-1.fc36.x86_64
  sssd-common-2.7.0-1.fc36.x86_64
  sssd-common-pac-2.7.0-1.fc36.x86_64
  sssd-ipa-2.7.0-1.fc36.x86_64
  sssd-krb5-2.7.0-1.fc36.x86_64
  sssd-krb5-common-2.7.0-1.fc36.x86_64
  sssd-ldap-2.7.0-1.fc36.x86_64
  sssd-nfs-idmap-2.7.0-1.fc36.x86_64
  sssd-proxy-2.7.0-1.fc36.x86_64
  xdg-desktop-portal-1.12.4-1.fc36.x86_64
  xz-5.2.5-9.fc36.x86_64
  xz-devel-5.2.5-9.fc36.x86_64
  xz-libs-5.2.5-9.fc36.i686
  xz-libs-5.2.5-9.fc36.x86_64
Installed:
  kernel-5.17.5-300.fc36.x86_64
  kernel-core-5.17.5-300.fc36.x86_64
  kernel-devel-5.17.5-300.fc36.x86_64
  kernel-modules-5.17.5-300.fc36.x86_64
  kernel-modules-extra-5.17.5-300.fc36.x86_64
  libjose-11-5.fc36.x86_64
  sssd-idp-2.7.0-1.fc36.x86_64
Removed:
  kernel-5.16.19-100.fc34.x86_64
  kernel-core-5.16.19-100.fc34.x86_64
  kernel-devel-5.16.19-100.fc34.x86_64
  kernel-modules-5.16.19-100.fc34.x86_64
  kernel-modules-extra-5.16.19-100.fc34.x86_64

Complete!
# rpm -qa |grep kernel-core |sort
kernel-core-5.16.20-100.fc34.x86_64
kernel-core-5.17.3-302.fc36.x86_64
kernel-core-5.17.4-100.fc34.x86_64
kernel-core-5.17.5-300.fc36.x86_64
# ls /boot
config-5.17.4-100.fc34.x86_64  grub2                                                    lost+found                         vmlinuz-0-rescue-3ce13bc91cea438ba1644997b3fa4412
efi                            initramfs-0-rescue-3ce13bc91cea438ba1644997b3fa4412.img  memtest86+-5.31                    vmlinuz-5.17.4-100.fc34.x86_64
elf-memtest86+-5.31            initramfs-5.17.4-100.fc34.x86_64.img                     symvers-5.17.4-100.fc34.x86_64.gz
extlinux                       loader                                                   System.map-5.17.4-100.fc34.x86_64
#

### Even after deleting as much as possible (only keeping files related to the currently running kernel),
### removing and reinstalling the new kernel still fails to actually install the new fc36 kernel,
### but DNF happily reports that it is installed and fine

# df -h
Filesystem             Size  Used Avail Use% Mounted on
devtmpfs               4.0M     0  4.0M   0% /dev
tmpfs                   16G   12K   16G   1% /dev/shm
tmpfs                  6.3G  2.2M  6.3G   1% /run
/dev/dm-3              489G  446G   18G  97% /
tmpfs                   16G   48K   16G   1% /tmp
/dev/sda2              339M   68M  255M  21% /boot
/dev/sda1              250M  6.2M  244M   3% /boot/efi
/dev/mapper/b510-data  664G  609G   22G  97% /data
/dev/mapper/b510-vms   251G  114G  125G  48% /vms
tmpfs                  3.2G  168K  3.2G   1% /run/user/1000

# rpm -qa |grep kernel |grep '5\.17\..-...\.fc36'
kernel-modules-5.17.3-302.fc36.x86_64
kernel-devel-5.17.3-302.fc36.x86_64
kernel-5.17.3-302.fc36.x86_64
kernel-modules-extra-5.17.3-302.fc36.x86_64
kernel-tools-libs-5.17.0-300.fc36.x86_64
kernel-headers-5.17.0-300.fc36.x86_64
kernel-core-5.17.3-302.fc36.x86_64
kernel-core-5.17.5-300.fc36.x86_64
kernel-modules-5.17.5-300.fc36.x86_64
kernel-5.17.5-300.fc36.x86_64
kernel-modules-extra-5.17.5-300.fc36.x86_64
kernel-devel-5.17.5-300.fc36.x86_64
# ls /boot
config-5.17.4-100.fc34.x86_64  extlinux                              loader           symvers-5.17.4-100.fc34.x86_64.gz                  vmlinuz-5.17.4-100.fc34.x86_64
efi                            grub2                                 lost+found       System.map-5.17.4-100.fc34.x86_64
elf-memtest86+-5.31            initramfs-5.17.4-100.fc34.x86_64.img  memtest86+-5.31  vmlinuz-0-rescue-3ce13bc91cea438ba1644997b3fa4412
# rpm -qa |grep kernel |grep '5\.17\..-...\.fc36' |grep -v header |grep -v tools |xargs -d '\n' rpm -ev
Preparing packages...
kernel-modules-extra-5.17.5-300.fc36.x86_64
kernel-5.17.5-300.fc36.x86_64
kernel-modules-extra-5.17.3-302.fc36.x86_64
kernel-5.17.3-302.fc36.x86_64
kernel-modules-5.17.3-302.fc36.x86_64
kernel-modules-5.17.5-300.fc36.x86_64
kernel-core-5.17.5-300.fc36.x86_64
kernel-core-5.17.3-302.fc36.x86_64
kernel-devel-5.17.5-300.fc36.x86_64
kernel-devel-5.17.3-302.fc36.x86_64
# rpm -qa |grep 'kernel-core.*fc36'
#
================================================================================
 Package                 Architecture  Version          Repository   Size
================================================================================
Installing:
 kernel                  x86_64        5.17.5-300.fc36  fedora      163 k
 kernel-core             x86_64        5.17.5-300.fc36  fedora       46 M
 kernel-devel            x86_64        5.17.5-300.fc36  fedora       15 M
 kernel-modules          x86_64        5.17.5-300.fc36  fedora       53 M
 kernel-modules-extra    x86_64        5.17.5-300.fc36  fedora      3.4 M

Transaction Summary
================================================================================
Install  5 Packages

Total download size: 118 M
Installed size: 207 M
Is this ok [y/N]: y
Downloading Packages:
(1/5): kernel-5.17.5-300.fc36.x86_64.rpm                141 kB/s | 163 kB     00:01
(2/5): kernel-devel-5.17.5-300.fc36.x86_64.rpm          1.7 MB/s |  15 MB     00:08
(3/5): kernel-core-5.17.5-300.fc36.x86_64.rpm           3.0 MB/s |  46 MB     00:15
(4/5): kernel-modules-5.17.5-300.fc36.x86_64.rpm        3.4 MB/s |  53 MB     00:15
(5/5): kernel-modules-extra-5.17.5-300.fc36.x86_64.rpm  441 kB/s | 3.4 MB     00:07
--------------------------------------------------------------------------------
Total                                                   6.9 MB/s | 118 MB     00:17
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                              1/1
  Installing       : kernel-core-5.17.5-300.fc36.x86_64           1/5
  Running scriptlet: kernel-core-5.17.5-300.fc36.x86_64           1/5
  Installing       : kernel-modules-5.17.5-300.fc36.x86_64        2/5
  Running scriptlet: kernel-modules-5.17.5-300.fc36.x86_64        2/5
  Installing       : kernel-5.17.5-300.fc36.x86_64                3/5
  Installing       : kernel-modules-extra-5.17.5-300.fc36.x86_64  4/5
  Running scriptlet: kernel-modules-extra-5.17.5-300.fc36.x86_64  4/5
  Installing       : kernel-devel-5.17.5-300.fc36.x86_64          5/5
  Running scriptlet: kernel-devel-5.17.5-300.fc36.x86_64          5/5
  Running scriptlet: kernel-core-5.17.5-300.fc36.x86_64           5/5
kdump: kernel 5.17.5-300.fc36.x86_64 doesn't exist
  Running scriptlet: kernel-modules-5.17.5-300.fc36.x86_64        5/5
  Running scriptlet: kernel-devel-5.17.5-300.fc36.x86_64          5/5
  Verifying        : kernel-5.17.5-300.fc36.x86_64                1/5
  Verifying        : kernel-core-5.17.5-300.fc36.x86_64           2/5
  Verifying        : kernel-devel-5.17.5-300.fc36.x86_64          3/5
  Verifying        : kernel-modules-5.17.5-300.fc36.x86_64        4/5
  Verifying        : kernel-modules-extra-5.17.5-300.fc36.x86_64  5/5
Installed products updated.

Installed:
  kernel-5.17.5-300.fc36.x86_64
  kernel-core-5.17.5-300.fc36.x86_64
  kernel-devel-5.17.5-300.fc36.x86_64
  kernel-modules-5.17.5-300.fc36.x86_64
  kernel-modules-extra-5.17.5-300.fc36.x86_64

Complete!
# rpm -qa |grep 5.17.5-300
kernel-core-5.17.5-300.fc36.x86_64
kernel-modules-5.17.5-300.fc36.x86_64
kernel-5.17.5-300.fc36.x86_64
kernel-modules-extra-5.17.5-300.fc36.x86_64
kernel-devel-5.17.5-300.fc36.x86_64
# ls /boot
config-5.17.4-100.fc34.x86_64  extlinux                              loader           symvers-5.17.4-100.fc34.x86_64.gz                  vmlinuz-5.17.4-100.fc34.x86_64
efi                            grub2                                 lost+found       System.map-5.17.4-100.fc34.x86_64
elf-memtest86+-5.31            initramfs-5.17.4-100.fc34.x86_64.img  memtest86+-5.31  vmlinuz-0-rescue-3ce13bc91cea438ba1644997b3fa4412
# ls /boot/efi/3ce13bc91cea438ba1644997b3fa4412/5.17.5-300.fc36.x86_64/
initrd  linux
# ls -al /boot/efi/3ce13bc91cea438ba1644997b3fa4412/5.17.5-300.fc36.x86_64/
total 50648
drwx------.  2 root root     4096 May  5 11:49 .
drwx------. 14 root root     4096 May  5 11:48 ..
-rwx------.  1 root root 40048796 May  5 11:48 initrd
-rwx------.  1 root root 11802352 May  5 11:49 linux
[root@diziet boot]# df -h
Filesystem             Size  Used Avail Use% Mounted on
devtmpfs               4.0M     0  4.0M   0% /dev
tmpfs                   16G   12K   16G   1% /dev/shm
tmpfs                  6.3G  2.3M  6.3G   1% /run
/dev/dm-3              489G  446G   18G  97% /
tmpfs                   16G   48K   16G   1% /tmp
/dev/sda2              339M   68M  255M  21% /boot
/dev/sda1              250M  173M   78M  69% /boot/efi
/dev/mapper/b510-data  664G  609G   22G  97% /data
/dev/mapper/b510-vms   251G  114G  125G  48% /vms
tmpfs                  3.2G  168K  3.2G   1% /run/user/1000

### /boot/efi's usage went from about 6MB to 173MB, but clearly things went wrong.
### I'll also note that after the upgrade completed, there was no indication that anything went wrong at all; the system booted and everything seemed like it was working. Only when I went to check what kernel version was running did I notice that the new fc36 kernel was not running and not properly installed.

My next step was going to be to back up all my files to another disk and repartition the system disk with very large /boot and /boot/efi partitions. I'm happy to try other suggestions.

I think there are two issues:
[a] DNF failed to recognize that the install didn't complete successfully and didn't remove the remnants of the failed installation
[b] the dnf system-upgrade plugin failed to warn me that I have insufficient space before attempting the upgrade, if that is the case

Well, the issue occurs, seemingly only on systems that were installed quite some time ago originally. The main problem is kernels aren't supposed to be installed in /boot/efi, something about the F36 systemd kernel-install is now putting them there, but only on a small percentage of systems. If you delete the /boot/efi/<machine-id> directory and reinstall the kernel, it will end up in /boot/ like older ones did, but if everyone does that, there will not be a system to debug the issue with.
Of the 6 systems I have upgraded here, only 1 displayed the issue, and I know adamwill had it on exactly 1 system as well.

*** Bug 2073734 has been marked as a duplicate of this bug. ***

(In reply to tstruk from comment #2)
> (In reply to Justin M. Forbes from comment #1)
> > This has nothing to do with kernel unfortunately,
>
> true, I didn't know exactly what to log it against.
>
> > it seems to be systemd
> > with kernel-install, which is placing kernels in /boot/efi/<machine-id>.
>
> I thought that it might be the case and checked /boot/efi/, but
> it doesn't install it there.
> # find /boot/efi/ -name "vmlinuz*" finds nothing.

It shows up as linux, not vmlinuz.

[root@kibble efi]# find /boot/efi/ -name "linux*"
/boot/efi/<blah>/5.17.3-302.fc36.x86_64/linux
/boot/efi/<blah>/0-rescue/linux
/boot/efi/<blah>/5.17.5-300.fc36.x86_64/linux
[root@kibble efi]# file /boot/efi/<blah>/5.17.5-300.fc36.x86_64/linux
/boot/efi/<blah>/5.17.5-300.fc36.x86_64/linux: Linux kernel x86 boot executable bzImage, version 5.17.5-300.fc36.x86_64 (mockbuild.fedoraproject.org) #1 SMP PREEMPT Thu Apr 28 15:51:30 UTC 2022, RO-rootFS, swap_dev 0XB, Normal VGA

(In reply to Justin M. Forbes from comment #6)
> Well, the issue occurs, seemingly only on systems that were installed quite
> some time ago originally. The main problem is kernels aren't supposed to be
> installed in /boot/efi, something about the F36 systemd kernel-install is
> now putting them there, but only on a small percentage of systems. If you
> delete the /boot/efi/<machine-id> directory and reinstall the kernel, it
> will end up in /boot/ like older ones did, but if everyone does that, there
> will not be a system to debug the issue with. Of the 6 systems I have
> upgraded here, only 1 displayed the issue, and I know adamwill had it on
> exactly 1 system as well.

I've got this issue on 2 of 2 systems I upgraded from F34 --> F36.
I started the second upgrade after the first appeared to have been successful, but before I noticed that I was still running the F34 kernel. You're right in that both systems were originally installed quite a while ago, maybe around F27, but I couldn't say for sure.

Let me know if there's information I can gather that would help; otherwise I'll try deleting the /boot/efi/<machine-id> directory and reinstalling so that I don't get too far behind in kernel versions.

Thanks!

I removed /boot/efi/<machine-id> on one of the machines; another still has the issue. If there isn't any reason to leave the un-upgraded machine in its current state, or no requests for information from that machine, I'll perform the workaround on Monday and move on.

Thank you,
--James

*** Bug 2084362 has been marked as a duplicate of this bug. ***

If anyone is still debugging this, you may find my comments on bug 2084362 helpful. It was declared a duplicate about an hour after I left my observations there.

FYI, I also encountered this bug when upgrading from F35 to F36. This is a dual-boot machine, which was initially set up quite a while ago. I have to look up the date (not near that machine right now); I'd say it was around F25. My reason for posting here: I am happy to help with debugging.

+1
Similar experience to Flo's. Running rm -rf /boot/efi/xxxxxx and then reinstalling the kernel is a working workaround for me.
Any idea whether this workaround will need to be repeated for the next kernel update?

OK, I think I understand. Those old machines have the pre-BLS grub2 layout. kernel-install looks for /efi/<machine-id>/, /boot/<machine-id>/, /boot/efi/<machine-id>/, and in the affected machines, only the last directory exists. This causes the installation procedure to use /boot/efi/<machine-id>/<kernel-version>/ as the destination. I'll need to figure out what changed, or rather why it was working before.

https://github.com/systemd/systemd/pull/23439 should fix the issue.
The last commit is important, the earlier ones are preparation.

*** Bug 2085288 has been marked as a duplicate of this bug. ***

I upgraded to F36 weeks ago, and today (May 25th), when I did a dnf offline-upgrade and then rebooted, this problem occurred out of the blue. Changed the title and removed the reference to F34 because I had upgraded from F35 weeks ago and then just experienced the issue today.

FEDORA-2022-3ca356cd2e has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-3ca356cd2e

FEDORA-2022-3ca356cd2e has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-3ca356cd2e`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-3ca356cd2e

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

1. Install systemd-250.6-1.fc36.x86_64
2. Reinstall kernel-core
3. Profit!

The new systemd has resolved the problem for me.

FEDORA-2022-3ca356cd2e has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

I had this problem when I did a DNF upgrade F35->F36. No diagnostics appeared on the console. This is terrible!

After the update, new kernels were on the system but grub could not see them or boot them. I could not fix it by re-installing kernels. I finally figured out that the ESP was full and that this mattered. Not by diagnostic!

I fixed the problem by doing a clean install of F36 from the .iso.

After that clean install, /boot/efi contains a rescue kernel and initrd and a normal kernel (without initrd). Updating adds new kernels but only one (the latest??) is in /boot/esp.

This is horrible:
- no diagnostic. Not during the upgrade process, not during subsequent kernel installations.
- silent change to requirement for space on ESP
- silent change to how booting works

When is the /boot/efi copy of the kernel used? Why only one and not all the installed kernels?

(I can see an advantage of the kernel being in the ESP: it means that grub might not need to understand the filesystem or even device of /boot. Or grub could be eliminated. This seems really good!)
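
For readers following the analysis earlier in the thread: the lookup order described there (kernel-install checks /efi/<machine-id>/, /boot/<machine-id>/ and /boot/efi/<machine-id>/, and uses the first directory that exists) can be sketched roughly as below. This is a simplified illustration in Python, not the kernel-install script itself, and it does not reproduce the fix in the linked pull request. The names detect_boot_root and CANDIDATES and the root parameter are hypothetical, introduced only so the logic can be tried against a scratch directory.

```python
from pathlib import Path
from typing import Optional

# Candidate locations, in the order given in the analysis comment.
# Assumption: this tuple is illustrative, not copied from systemd.
CANDIDATES = ("efi", "boot", "boot/efi")


def detect_boot_root(machine_id: str, root: Path = Path("/")) -> Optional[Path]:
    """Return the first candidate that contains a <machine-id> directory.

    On a pre-BLS layout where only boot/efi/<machine-id> exists, this
    returns boot/efi, which matches the misdetection reported in this bug.
    Returns None when no candidate has such a directory.
    """
    for rel in CANDIDATES:
        candidate = root / rel
        if (candidate / machine_id).is_dir():
            return candidate
    return None
```

Calling detect_boot_root with the contents of /etc/machine-id on an affected machine would be expected to return /boot/efi under these assumptions, which is also why removing the stray /boot/efi/<machine-id> directory worked as a workaround for several reporters.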