Bug 2032680 - grub2-mkconfig does not apply settings to BLS entries
Summary: grub2-mkconfig does not apply settings to BLS entries
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: grub2
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: rc
: ---
Assignee: Bootloader engineering team
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-12-15 00:51 UTC by Ian Wienand
Modified: 2023-04-28 08:28 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Do not check the machine-id when updating BLS entries (575 bytes, patch)
2021-12-15 00:51 UTC, Ian Wienand
no flags Details | Diff
Do not check the machine-id when updating BLS entries (1.31 KB, patch)
2022-01-20 22:13 UTC, Ian Wienand
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-105817 0 None None None 2021-12-15 00:54:16 UTC

Description Ian Wienand 2021-12-15 00:51:52 UTC
Created attachment 1846303 [details]
Do not check the machine-id when updating BLS entries

If you take a current image, say [1] and boot it, you will have a BLS entry like

$ /boot/loader/entries/<machine-id>-<kernel>.conf

The <machine-id> will not actually match what you have in /etc/machine-id, as this will have been uniquely updated at boot.

If you do something like add "GRUB_DEVICE=/dev/foo" to /etc/default/grub and run "grub2-mkconfig -o /boot/grub/grub.cfg" this will have no effect, because 10_linux only looks at BLS entries with the current machine-id to update.  This happens in [2].

I believe this used to work because all the kernel options were put into the common grubenv file that each BLS entry would refer to.  Thus grub2-mkconfig would update grubenv and, even though it wasn't touching any BLS entries, they would all effectively get the new kernel options.

However, this changed with [3] to that kernel options are kept directly in each BLS entry (I can not quite follow the history of that patch, it appears to have been squashed in [4] to "fix a bunch of CVE's", but the behaviour still seems to apply)

This creates for very confusing behaviour.

* If you boot fresh, "grub2-mkconfig" will not alter the kernel command-line options, as the machine-id doesn't match
* If you have installed a new kernel after booting (or, re-installed your existing one for some reason), "grub2-mkconfig" WILL update the kernel options for one BLS entry that matches the machine-id, but the old one will be left alone.

Since I imagine we do not want to revert to setting kernel options in grubenv, I think the solution here is to just match and update ALL BLS entries when grub2-mkconfig runs.  This restores the prior behaviour of all kernels being populated with updates to the /etc/default/grub by grub2-mkconfig.

I have attached a small patch, with I have tested locally on [1].  With it, updating /etc/default/grub and running "grub2-mkconfig -o /boot/grub/grub.cfg" correctly updates the BLS entry with the new kernel settings.

[1] https://cloud.centos.org/centos/9-stream/x86_64/images/CentOS-Stream-GenericCloud-9-20211214.3.x86_64.qcow2
[2] https://gitlab.com/redhat/centos-stream/rpms/grub2/-/blob/c9s/0074-Add-BLS-support-to-grub-mkconfig.patch#L230
[3] https://src.fedoraproject.org/rpms/grub2/c/4a742183a39f344a7685bccdc76d5e64dea3766a?branch=master
[4] https://src.fedoraproject.org/rpms/grub2/c/46968b6e6323925be6f29c2770a04a8a1a858082?branch=rawhide

Comment 3 Robbie Harwood 2022-01-04 21:50:21 UTC
Hi Ian, thank you for the patch.  However, upstream grub requires the use of a Signed-off-by line on commits to assert compliance with https://developercertificate.org/ - can you confirm that you're in compliance with the DCO?  Otherwise we probably can't consider taking it.

Comment 4 Ian Wienand 2022-01-20 22:07:19 UTC
I have re-uploaded this with a description and signed-off-by

Comment 5 Ian Wienand 2022-01-20 22:13:54 UTC
Created attachment 1852352 [details]
Do not check the machine-id when updating BLS entries

Comment 6 Robbie Harwood 2022-01-20 22:54:34 UTC
Thanks.  Your changes have been applied in grub2-2.06-15.fc36.

Comment 7 Adam Williamson 2022-02-02 17:20:32 UTC
This sounds a lot like https://bugzilla.redhat.com/show_bug.cgi?id=2036199 , which we fixed by having anaconda not include /etc/machine-info when installing from a live image: https://github.com/rhinstaller/anaconda/pull/3770 . The upstream change that triggered the problem is https://github.com/systemd/systemd/pull/21757 . Note that Lennart doesn't like that change and has filed a ticket advocating for it to be re-considered: https://github.com/systemd/systemd/issues/22376 .

I considered sending a patch to change this code in the grub2 downstream patch, but decided against it for now. I would have had it read /etc/machine-info as well as /etc/machine-id and check both prefixes, rather than ditch the prefix check altogether.

The purpose of the prefix is outlined in the upstream boot loader specification document:
https://systemd.io/BOOT_LOADER_SPECIFICATION/
section "Type #1 Boot Loader Specification Entries":

"Note: $BOOT should be considered shared among all OS installations of a system. Instead of maintaining one $BOOT per installed OS (as /boot/ was traditionally handled), all installed OS share the same place to drop in their boot-time configuration.

...

Inside the $BOOT/loader/entries/ directory each OS vendor may drop one or more configuration snippets with the suffix “.conf”, one for each boot menu item. The file name of the file is used for identification of the boot item but shall never be presented to the user in the UI. The file name may be chosen freely but should be unique enough to avoid clashes between OS installations. More specifically it is suggested to include the machine ID (/etc/machine-id or the D-Bus machine ID for OSes that lack /etc/machine-id), the kernel version (as returned by uname -r) and an OS identifier (The ID field of /etc/os-release). Example: $BOOT/loader/entries/6a9857a393724b7a981ebb5b8495b9ea-3.8.0-2.fc19.x86_64.conf."

So the concept, at least, is that other OSes (or other installs of Red Hat-family OSes) may also be storing snippets in that directory, and the purpose of the UUID prefix is to identify which OS each snippet belongs to. At least in theory, by dropping the UUID prefix check, we could wind up editing files that "belong" to a different OS.

Things are complicated a bit by the fact that Red Hat-family OSes do not really follow the upstream boot loader specification fully, our implementation of it diverges considerably. Javier stated (in IRC discussion of this issue) that he thinks the whole "multiple OSes sharing $BOOT" thing doesn't really work in practice and thus it's fine to do this and we should just be explicit that RH-family OSes expect to have exclusive ownership of their $BOOT location and do not support sharing it in this way, and that would probably be a supportable position. But I figured it'd be a good idea to explain the likely background to this issue and raise the possibility that we might want to re-consider ditching the UUID prefix check.

Comment 8 Adam Williamson 2022-02-02 17:45:14 UTC
Chris Murphy points out that anaconda actually explicitly wipes all snippets it finds in `/boot/loader/entries` on installation:
https://github.com/rhinstaller/anaconda/blob/master/pyanaconda/modules/storage/bootloader/utils.py#L229-L233

So we're certainly not aligning with the original idea there either. That's been around since https://github.com/rhinstaller/anaconda/commit/6791e7320e6c222b92746e8f228026695fa0e2a0 in March 2019 (which was fixing other problems with live installs. You know, every month I'm more convinced that live installs are a mistake...)

Comment 9 Chris Murphy 2022-02-02 20:40:33 UTC
That'd be bug 1874724.

Comment 10 Ian Wienand 2022-02-02 21:10:42 UTC
One other thing that I didn't mention originally to try and keep this report simple is that this behaviour emerged after diskimage-builder merged changes to use grubby [1] where it was found that 

 grubby --update-kernel=ALL --args="root=LABEL=${DIB_ROOT_LABEL}"

applied to all installed kernels (irrespective of machine-id).  

[2] says "For systems that use the GRUB2 bootloader, the command updates the /boot/grub2/grubenv file by adding a new kernel parameter to the kernelopts variable in that file." -- and that's what it seems to do @ [4] -- but I'm not even sure how that works since kernel entries aren't looking at the grub environment file per the aforementioned [3]?  To add to the confusion, grubby *also* updates GRUB_CMDLINE_LINUX in /etc/grub/default so without the change to ignore the machine-id it's not quite "idempotent" (for want of a better word) -- grubby writes in config to /etc/grub/default that then doesn't get applied by grub2-mkconfig.

So that adds another rather confusing few entries to the "what the heck is going on" matrix for your average admin trying to update a kernel parameter...

[1] https://review.opendev.org/c/openstack/diskimage-builder/+/804002
[2] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9-beta/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel
[3] https://src.fedoraproject.org/rpms/grub2/c/4a742183a39f344a7685bccdc76d5e64dea3766a?branch=master
[4] https://src.fedoraproject.org/rpms/grubby/blob/f34/f/grubby-bls

Comment 11 Chris Murphy 2022-02-02 22:33:58 UTC
>[2] says "For systems that use the GRUB2 bootloader, the command updates the /boot/grub2/grubenv file by adding a new kernel parameter to the kernelopts variable in that file."

That's stale info. It hasn't been the case since Fedora 34. Pretty sure it was part of this change https://fedoraproject.org/wiki/Changes/UnifyGrubConfig

Currently, the BLS snippet contains the kernel options explicitly, no macro.

There are grub-mkconfig generated boot parameters, you get them even if /etc/default/grub has an empty or missing GRUB_CMDLINE_LINUX= line - `ro root=UUID=$rootfsuuid` and additionally on Btrfs grub-mkconfig discovers and inserts `rootflags=subvol=$pathtorootfssubvol` which is typically just "root".

I've pretty much settled on `grubby` exclusively for this task, because it's the same command no matter Fedora release version, the firmware type, or arch. It should just work, and this wiki has been updated accordingly: https://fedoraproject.org/wiki/GRUB_2?rd=Grub2

Comment 13 Adam Williamson 2022-02-15 17:21:32 UTC
Further update on the upstream side of this: Lennart has sent a PR that changes it again - https://github.com/systemd/systemd/pull/22463

Comment 15 Adam Williamson 2022-03-01 23:27:42 UTC
I think the change that was made in Fedora for this may have broken ostree installers: https://bugzilla.redhat.com/show_bug.cgi?id=2059776

Comment 20 Chris Murphy 2022-09-07 19:42:59 UTC
See also bug 2120845.


Note You need to log in before you can comment on or make changes to this bug.