Bug 2239008 - Unable install vanilla upstream kernel after update grubby to version 8.40-72.fc40
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: grub2
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assignee: Nicolas Frayer
QA Contact: Fedora Extras Quality Assurance
Duplicates: 2242007
Reported: 2023-09-14 18:37 UTC by Mikhail
Modified: 2024-06-05 14:08 UTC (History)

Fixed In Version: systemd-254.2-7.fc40 grub2-2.06-104.fc39
Last Closed: 2023-11-03 18:35:13 UTC
Type: Bug


Attachments:
kernel boot error photo (620.54 KB, image/jpeg)
2023-09-20 14:02 UTC, Mikhail

Description Mikhail 2023-09-14 18:37:49 UTC
Description of problem:
Unable to install a vanilla upstream kernel after updating grubby to version 8.40-72.fc40.
The last good version is 8.40-71.fc39.

Version-Release number of selected component (if applicable):


How reproducible:
Always


Steps to Reproduce:
Use the guide on how to build and install a vanilla upstream kernel from here:
https://fedoraproject.org/wiki/Building_a_custom_kernel


Actual results:
❯ sudo make install
[sudo] password for mikhail: 
  INSTALL /boot
Cannot find LILO.


Expected results:
❯ sudo make install
[sudo] password for mikhail: 
  INSTALL /boot
grep: warning: stray \ before /
kdump: For kernel=/boot/vmlinuz-6.4.12-check-kasan+, crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M now. Please reboot the system for the change to take effet. Note if you don't want kexec-tools to manage the crashkernel kernel parameter, please set auto_reset_crashkernel=no in /etc/kdump.conf.



Additional info:

Comment 1 Fedora Blocker Bugs Application 2023-09-14 18:45:33 UTC
Proposed as a Blocker and Freeze Exception for 39-beta by Fedora user mikhail using the blocker tracking app because:

 Please revert this change https://src.fedoraproject.org/rpms/grubby/c/aa9b1b454fd0c792226cefb65d744fadfaa8acda?branch=rawhide or update the instructions at https://fedoraproject.org/wiki/Building_a_custom_kernel ("Building Vanilla upstream kernel"),
because this is a regression for me. With this change I can't test custom kernels and I can't bisect the kernel to report bugs. Fedora can't be my primary distribution if I am unable to hunt bugs.

Comment 2 Adam Williamson 2023-09-14 18:50:50 UTC
We have already signed off the Beta, there can't be any new Beta blockers or FEs now. (this would not be one, anyway, as the offending version of grubby is not in the Beta compose, and this doesn't violate any release criteria anyway).

Comment 3 Fedora Update System 2023-09-18 09:00:28 UTC
FEDORA-2023-2561021d71 has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2023-2561021d71

Comment 4 Fedora Update System 2023-09-18 11:02:10 UTC
FEDORA-2023-2561021d71 has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 5 Marc Dionne 2023-09-18 15:39:27 UTC
The change in systemd-udev-254.2-6 to provide a /usr/sbin/installkernel link is broken, at least on f39.  With systemd-udev-254.2-6.fc39.x86_64, the link is created as:

lrwxrwxrwx 1 root root 53 Sep 14 21:00 /usr/sbin/installkernel -> ../../../../BUILD/systemd-stable-254.2/kernel-install

which of course doesn't exist outside of the build environment, and results in a failure of "make install" for the kernel, as described in this BZ.
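
The broken link described above can be checked with a small sketch like the following. It only uses the standard `test` operators and `readlink`; the helper name `link_status` is hypothetical and not part of any Fedora package.

```shell
# Report whether a path is a working symlink, a dangling symlink,
# a regular (non-symlink) file, or missing entirely.
# link_status is a hypothetical helper name used for illustration.
link_status() {
    path=$1
    if [ -L "$path" ]; then
        # -e follows the link, so it is false when the target does not exist
        if [ -e "$path" ]; then
            echo "ok: $path -> $(readlink "$path")"
        else
            echo "dangling: $path -> $(readlink "$path")"
        fi
    elif [ -e "$path" ]; then
        echo "not a symlink: $path"
    else
        echo "missing: $path"
    fi
}
```

Calling `link_status /usr/sbin/installkernel` on a system with the link shown above should report the dangling case, since the BUILD directory exists only in the build environment.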

Comment 7 Zbigniew Jędrzejewski-Szmek 2023-09-18 20:20:53 UTC
Should be fixed now.

Comment 8 Marc Dionne 2023-09-20 13:35:37 UTC
I mistakenly thought this was fixed, but that was only because I was reinstalling a kernel that already existed in /boot.  If the kernel is not already there, installing it fails, for instance:

  grub2-mkrelpath: error: failed to get canonical path of `/boot/vmlinuz-6.6.0-rc2.marco+'.
  dirname: missing operand
  Try 'dirname --help' for more information.

I'm confused as to how this could possibly work.  The updated systemd-254.2-7 package does provide a link at /usr/sbin/installkernel that points to /usr/bin/kernel-install, but these are not equivalent and take different arguments.  The now gone /usr/sbin/installkernel script from the grubby package used to expect "<kernel_version> <bootimage> <mapfile>", copy the files over to /boot (making .old backups if pre-existing), then would do:

  kernel-install add $KERNEL_VERSION $INSTALL_PATH/$KERNEL_NAME-$KERNEL_VERSION

So if systemd-udev intends to replace this functionality, it should perhaps just copy over the installkernel script that used to live in grubby, rather than trying to do it with a symlink.
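
The workflow described above can be sketched as follows. This is a reconstruction from the description in this comment, not the original grubby script: the function name `install_kernel_sketch`, the optional fourth argument for the install path, the handling of the map file, and the `KERNEL_INSTALL` override (for dry runs) are all assumptions made for illustration.

```shell
# Sketch of the described installkernel workflow; NOT the grubby script.
# Arguments: <kernel_version> <bootimage> <mapfile> [install_path]
# install_kernel_sketch and KERNEL_INSTALL are hypothetical names.
install_kernel_sketch() {
    kernel_version=$1
    boot_image=$2
    map_file=$3
    install_path=${4:-/boot}
    kernel_name=vmlinuz

    # Keep a .old backup of any file that is about to be overwritten.
    for f in "$kernel_name-$kernel_version" "System.map-$kernel_version"; do
        if [ -f "$install_path/$f" ]; then
            mv "$install_path/$f" "$install_path/$f.old"
        fi
    done

    # Copy the image and the map file out of the build tree.
    cp "$boot_image" "$install_path/$kernel_name-$kernel_version"
    cp "$map_file" "$install_path/System.map-$kernel_version"

    # Hand over to kernel-install; set KERNEL_INSTALL=echo for a dry run.
    ${KERNEL_INSTALL:-kernel-install} add "$kernel_version" \
        "$install_path/$kernel_name-$kernel_version"
}
```

The point of the sketch is the ordering: the image is already in place under the install path before `kernel-install add` is called, which is the step a bare symlink to kernel-install does not provide.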

Comment 9 Mikhail 2023-09-20 14:02:57 UTC
Created attachment 1989703 [details]
kernel boot error photo

Yeah, I confirm that the newly installed custom kernel does not boot:

❯ sudo make install
[sudo] password for mikhail: 
  INSTALL /boot
grub2-mkrelpath: error: failed to get canonical path of `/boot/vmlinuz-6.5.4'.
dirname: missing operand
Try 'dirname --help' for more information.
grep: warning: stray \ before /
kdump: For kernel=/vmlinuz-6.5.4, crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M now. Please reboot the system for the change to take effet. Note if you don't want kexec-tools to manage the crashkernel kernel parameter, please set auto_reset_crashkernel=no in /etc/kdump.conf.

Comment 10 Adam Williamson 2023-09-20 17:43:46 UTC
The initial change to grubby claims "systemd has kernel-install which supports being invoked as installkernel" - that implies that kernel-install is *supposed* to have a special mode where it recognizes when it was called under the name 'installkernel' and acts the way installkernel did. Obviously something is broken with that.

Comment 11 Adam Williamson 2023-09-20 17:47:17 UTC
OK, so I see that code here:

https://github.com/systemd/systemd/blob/d66ad6ff854fd1f587ba686e5ef2025c5c3a72dc/src/kernel-install/kernel-install.c#L1045-L1053

that looks like it handles the difference in expected arguments, but I don't see that kernel-install has anything to handle the part Marc describes as "copy the files over to /boot (making .old backups if pre-existing)". That sounds like the missing link here. Either kernel-install needs to grow the ability to do that very quickly, or we need to revert back to using installkernel in grubby until this can be rethought.
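
To illustrate the gap, here is a rough shell sketch of dispatching on the invoked name. The real logic is C code in the linked kernel-install.c; the function name `translate_invocation` is hypothetical, and the argument mapping is an assumption based on the argument lists quoted in comment 8.

```shell
# Illustration only: map an installkernel-style call onto kernel-install
# syntax based on the name the program was invoked under.
# translate_invocation is a hypothetical name.
translate_invocation() {
    invoked_as=$(basename "$1")
    shift
    case "$invoked_as" in
        installkernel)
            # installkernel-style: <kernel_version> <bootimage> <mapfile>
            # Only the version and the image path are carried over.
            echo "kernel-install add $1 $2"
            ;;
        *)
            # Already kernel-install syntax: pass through unchanged.
            echo "kernel-install $*"
            ;;
    esac
}
```

For example, `translate_invocation /usr/sbin/installkernel 6.5.4 arch/x86/boot/bzImage System.map` prints `kernel-install add 6.5.4 arch/x86/boot/bzImage`. The arguments are remapped, but the image path still points into the build tree and nothing has copied it to /boot.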

Comment 12 Marc Dionne 2023-09-20 18:25:55 UTC
Yes, when kernel-install is invoked from an rpm installation of kernel-core, the files have already been dropped into /boot by rpm, so no copy is necessary.  But when /usr/sbin/installkernel is invoked from a kernel build tree's "make install", something is needed to copy System.map and vmlinuz into /boot.

Comment 13 Ronald Warsow 2023-09-21 11:45:37 UTC
info:

sudo make install runs into the above error
AND
places a file called "bzImage-*" under /boot, not a "vmlinuz-*"

Comment 14 Emory Taylor 2023-09-24 15:39:04 UTC
> info:
> 
> sudo make install runs into the above error
> AND
> places a file called "bzImage-*" under /boot, not a "vmlinuz-*"

I manually renamed bzImage-* to vmlinuz-*, re-ran make install, and was able to boot after that, so I can confirm that this is the problem as of right now.
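
The manual rename described here could be scripted roughly as below. This is only a sketch of a stopgap for systems without the fixed packages; the helper name `rename_bzimage` is hypothetical, and the boot directory is passed explicitly so the function can be tried on a scratch directory first. Per this comment, `make install` was re-run after the rename.

```shell
# Rename every bzImage-<version> in a directory to vmlinuz-<version>.
# rename_bzimage is a hypothetical helper; existing vmlinuz files are
# never overwritten.
rename_bzimage() {
    boot_dir=$1
    for img in "$boot_dir"/bzImage-*; do
        [ -e "$img" ] || continue          # the glob matched nothing
        version=${img##*/bzImage-}         # strip directory and prefix
        if [ -e "$boot_dir/vmlinuz-$version" ]; then
            echo "skipping $img: vmlinuz-$version already exists" >&2
            continue
        fi
        mv "$img" "$boot_dir/vmlinuz-$version"
        echo "renamed bzImage-$version to vmlinuz-$version"
    done
}
```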

Comment 15 Zbigniew Jędrzejewski-Szmek 2023-09-27 15:16:15 UTC
https://src.fedoraproject.org/rpms/grub2/pull-request/29

Comment 16 Marc Dionne 2023-09-27 19:39:42 UTC
The change from comment 15 does allow the installed kernel to boot.

I would note however several differences as compared to the old grubby installkernel workflow, which may or may not be important depending on the user:

- Old matching files in /boot used to be preserved with a .old extension, rather than just removed
- Because the source path is arch/x86/boot (for x86_64) and not the top level, System.map is not found and is not copied to /boot
- The /boot/System.map and /boot/vmlinuz symlinks are not updated
- The kernel image is installed with the uid:gid from the kernel build tree, rather than the user doing the install (root)

Comment 17 Zbigniew Jędrzejewski-Szmek 2023-09-27 20:25:49 UTC
(In reply to Marc Dionne from comment #16)
> I would note however several differences as compared to the old grubby
> installkernel workflow, which may or may not be important depending on the
> user:
> 
> - Old matching files in /boot used to be preserved with a .old extension,
> rather than just removed
I think that keeping the files is not a good idea, because space in /boot is
severely limited. If you compile a different kernel, it is not really useful
without the modules, and we don't do this kind of backup for modules, so 
we might just as well just replace the kernel.

> - Because the source path is arch/x86/boot (for x86_64) and not the top
> level, System.map is not found and is not copied to /boot
Yes. I don't think System.map in /boot is useful for anything, because
the boot loader is not going to use it for anything. For packaged kernels
we place it in /usr/lib/modules/$kver/, where it is accessible from
userspace.

> - The /boot/System.map and /boot/vmlinuz symlinks are not updated
We generally don't update those when installing kernels. Are they used
by anything?

> - The kernel image is installed with the uid:gid from the kernel build tree,
> rather than the user doing the install (root)
Indeed. That seems to be an idiosyncrasy of 20-grub.install. It uses 'cp -a'
to explicitly preserve ownership. Other kernel-install plugins don't do this…

Comment 18 Zbigniew Jędrzejewski-Szmek 2023-09-28 10:23:21 UTC
I updated the pull request to not drop preservation of ownership and xattrs.

Comment 19 Zbigniew Jędrzejewski-Szmek 2023-09-28 10:30:57 UTC
*to drop preservation

Comment 20 Marc Dionne 2023-09-28 11:08:17 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #17)

I agree with most of your points; I was just pointing out the changes in behaviour.

> (In reply to Marc Dionne from comment #16)
> > I would note however several differences as compared to the old grubby
> > installkernel workflow, which may or may not be important depending on the
> > user:
> > 
> > - Old matching files in /boot used to be preserved with a .old extension,
> > rather than just removed
> I think that keeping the files is not a good idea, because space in /boot is
> severely limited. If you compile a different kernel, it is not really useful
> without the modules, and we don't do this kind of backup for modules, so 
> we might just as well just replace the kernel.

Yeah I won't miss those, I periodically clean them up so they don't hog space.

> > - Because the source path is arch/x86/boot (for x86_64) and not the top
> > level, System.map is not found and is not copied to /boot
> Yes. I don't think System.map in /boot is useful for anything, because
> the boot loader is not going to use it for anything. For packaged kernels
> we place it in /usr/lib/modules/$kver/, where it is accessible from
> userspace.
> 
> > - The /boot/System.map and /boot/vmlinuz symlinks are not updated
> We generally don't update those when installing kernels. Are they used
> by anything?

I'm thinking that at some point you might have had a generic boot loader entry to load the latest kernel, making use of those symlinks.

> > - The kernel image is installed with the uid:gid from the kernel build tree,
> > rather than the user doing the install (root)
> Indeed. That seems to be an idiosyncrasy of 20-grub.install. It uses 'cp -a'
> to explicitly preserve ownership. Other kernel-install plugins don't do this…

That one's definitely an issue, thanks for fixing.

Comment 21 Thorsten Leemhuis 2023-10-04 05:02:07 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #15)
> https://src.fedoraproject.org/rpms/grub2/pull-request/29

Thanks, Zbigniew, for fixing this; the latest grub from koji (grub2-2.06-103.fc39) fixed things for me (afaics it contains the changes from that pull request or something similar).

Comment 22 Adam Williamson 2023-10-04 15:41:06 UTC
So this is fixed as far as Rawhide is concerned...Nicolas, could you also send updates for other affected releases? Not sure if that's only F39?

Comment 23 Nicolas Frayer 2023-10-04 16:18:56 UTC
This PR has only been merged in rawhide and F39.

Comment 24 Fedora Update System 2023-10-04 16:23:43 UTC
FEDORA-2023-144afbd4c3 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-144afbd4c3

Comment 25 Fedora Update System 2023-10-05 01:49:52 UTC
FEDORA-2023-144afbd4c3 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-144afbd4c3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-144afbd4c3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 26 Ronald Warsow 2023-10-05 13:54:58 UTC
confirming bug is fixed with grub2-2.06-103.fc39

Thanks !

Comment 27 Fedora Update System 2023-10-11 17:01:06 UTC
FEDORA-2023-f46483e5b5 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-f46483e5b5

Comment 28 Fedora Update System 2023-10-12 02:24:11 UTC
FEDORA-2023-f46483e5b5 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-f46483e5b5`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-f46483e5b5

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 29 Fedora Update System 2023-11-03 18:35:13 UTC
FEDORA-2023-f46483e5b5 has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 30 Jon Masters 2024-03-21 17:21:45 UTC
The grubby fix got unpushed due to Marcan reporting problems with devicetree systems...is there a plan to push an updated fix? It seems like upstream kernels cannot be built on fully updated F39 systems.

Comment 31 Thorsten Leemhuis 2024-05-24 09:02:23 UTC
Federico, out of curiosity: why did you CC yourself on this resolved bug? Are you still seeing this or a similar problem?

The comment from Jon also makes it sound like this is not resolved. But I installed a kernel manually some weeks ago and did not encounter any problem.

Guess if anyone is still affected it might be wise to open a new bug and drop a link to it here.

Comment 32 Federico Pedemonte 2024-05-24 09:56:16 UTC
(In reply to Thorsten Leemhuis from comment #31)
> Federico, out of curiosity: why did you CC yourself on this resolved bug? Are
> you still seeing this or a similar problem?
> 
> The comment from Jon also makes it sound like this is not resolved. But I
> installed a kernel manually some weeks ago and did not encounter any problem.
> 
> Guess if anyone is still affected it might be wise to open a new bug and
> drop a link to it here.

Yes Thorsten, it happened to me today on F39 while trying to install a vanilla 6.6.31 package built via `make binrpm-pkg`.

  Preparing        :                               1/1
  Reinstalling     : kernel-6.6.31-4.x86_64        1/2
  Running scriptlet: kernel-6.6.31-4.x86_64        1/2
grub2-mkrelpath: error: failed to get canonical path of `/boot/vmlinuz-6.6.31'.
dirname: missing operand

I have no issues at all compiling and installing fresh mainline kernels in the same way.

thank you,
F.

Comment 33 Marta Lewandowska 2024-05-28 09:25:47 UTC
Hi Jon and Federico (or anyone else still seeing this bug),

What versions of grub2 and systemd are you using? The released f39 version of grub2 does not mitigate the issue, but updates (grub2-2.06.120.fc39) should.

The patch from https://bodhi.fedoraproject.org/updates/FEDORA-2023-144afbd4c3 (and additions to it) is still included, as is the fix https://bodhi.fedoraproject.org/updates/FEDORA-2023-f46483e5b5 which did not receive negative karma.

thanks.

Comment 34 Federico Pedemonte 2024-05-28 11:49:52 UTC
Hi Marta,

I confirm I have the updated grub2 version, and this version of systemd:

systemd-254.12-1.fc39.x86_64

if there are tests or any other information I could provide, I'll be happy to help.

thanks,
F.

Comment 35 Thorsten Leemhuis 2024-05-28 11:58:16 UTC
@federico: but the thing is: this bug was afaics about installing the kernel manually using "make install"; you afaics (correct me if I'm wrong!) seem to be using `make binrpm-pkg` and have trouble when installing the resulting rpm, so it likely is a different bug (one where the root of the problem might or might not be upstream).

Comment 36 Federico Pedemonte 2024-05-28 13:06:04 UTC
(In reply to Thorsten Leemhuis from comment #35)
> @federico: but the thing is: this bug was afaics about installing the kernel
> manually using "make install"; you afaics (correct me if I'm wrong!) seem to
> be using `make binrpm-pkg` and have trouble when installing the resulting
> rpm, so it likely is a different bug (one where the root of the problem
> might or might not be upstream).

yes Thorsten, your understanding is 100% correct.

sorry if my comment here was misplaced: I thought it was the same grub issue.

Comment 37 Thorsten Leemhuis 2024-05-28 13:20:57 UTC
(In reply to Federico Pedemonte from comment #36)
> (In reply to Thorsten Leemhuis from comment #35)
> > @federico: but the thing is: this bug was afaics about installing the kernel
> > manually using "make install"; you afaics (correct me if I'm wrong!) seem to
> > be using `make binrpm-pkg` and have trouble when installing the resulting
> > rpm, so it likely

s/likely/might/

> > is a different bug (one where the root of the problem
> > might or might not be upstream).
> 
> sorry if my comment here was misplaced:

no worries, no need to say

> I thought it was the same grub issue.

It might be (or might be related), but as mentioned a few days ago: I'd say it's best to open a separate bug (and mention it here afterwards) to avoid confusion.

Comment 38 Marta Lewandowska 2024-05-28 14:07:52 UTC
Hi Federico, no worries, thanks for letting us know, and please do open another bug for the problem you're seeing.

Thorsten, thanks for your comments.

We'll keep this bug closed then. (:

Comment 39 Jeremy Linton 2024-06-05 14:08:27 UTC
*** Bug 2242007 has been marked as a duplicate of this bug. ***

