Bug 2032680
| Summary: | grub2-mkconfig does not apply settings to BLS entries | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Ian Wienand <iwienand> |
| Component: | grub2 | Assignee: | Bootloader engineering team <bootloader-eng-team> |
| Status: | CLOSED MIGRATED | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | medium | Priority: | medium |
| Version: | CentOS Stream | CC: | apevec, awilliam, bstinson, bugzilla, jaredz, javierm, jwboyer, ngompa13, owalsh, sbarcomb, zbyszek |
| Target Milestone: | rc | Keywords: | MigratedToJIRA, Patch, Triaged |
| Target Release: | --- | Story Points: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Last Closed: | 2023-09-16 13:24:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | Category: | --- |
| oVirt Team: | --- | Cloudforms Team: | --- |

Description
Ian Wienand
2021-12-15 00:51:52 UTC
Hi Ian, thank you for the patch. However, upstream grub requires the use of a Signed-off-by line on commits to assert compliance with https://developercertificate.org/ - can you confirm that you're in compliance with the DCO? Otherwise we probably can't consider taking it.

I have re-uploaded this with a description and signed-off-by.

Created attachment 1852352 [details]
Do not check the machine-id when updating BLS entries

Thanks. Your changes have been applied in grub2-2.06-15.fc36.

This sounds a lot like https://bugzilla.redhat.com/show_bug.cgi?id=2036199 , which we fixed by having anaconda not include /etc/machine-info when installing from a live image: https://github.com/rhinstaller/anaconda/pull/3770 . The upstream change that triggered the problem is https://github.com/systemd/systemd/pull/21757 . Note that Lennart doesn't like that change and has filed a ticket advocating for it to be re-considered: https://github.com/systemd/systemd/issues/22376 .

I considered sending a patch to change this code in the grub2 downstream patch, but decided against it for now. I would have had it read /etc/machine-info as well as /etc/machine-id and check both prefixes, rather than ditch the prefix check altogether.

The purpose of the prefix is outlined in the upstream boot loader specification document, https://systemd.io/BOOT_LOADER_SPECIFICATION/ , section "Type #1 Boot Loader Specification Entries":

"Note: $BOOT should be considered shared among all OS installations of a system. Instead of maintaining one $BOOT per installed OS (as /boot/ was traditionally handled), all installed OS share the same place to drop in their boot-time configuration. ... Inside the $BOOT/loader/entries/ directory each OS vendor may drop one or more configuration snippets with the suffix “.conf”, one for each boot menu item. The file name of the file is used for identification of the boot item but shall never be presented to the user in the UI. The file name may be chosen freely but should be unique enough to avoid clashes between OS installations. More specifically it is suggested to include the machine ID (/etc/machine-id or the D-Bus machine ID for OSes that lack /etc/machine-id), the kernel version (as returned by uname -r) and an OS identifier (The ID field of /etc/os-release). Example: $BOOT/loader/entries/6a9857a393724b7a981ebb5b8495b9ea-3.8.0-2.fc19.x86_64.conf."
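
To make the file-name convention and the "check both prefixes" idea concrete, here is a minimal Python sketch. It is not the grub2 code (the downstream patch is shell); `entry_belongs_to` and `other_id` are hypothetical names introduced for illustration, and how the candidate IDs are obtained (for example from /etc/machine-id and /etc/machine-info) is deliberately left to the caller.

```python
from typing import Iterable


def entry_belongs_to(entry_filename: str, candidate_ids: Iterable[str]) -> bool:
    """Return True if a BLS entry file name starts with one of the given IDs.

    Hypothetical helper: it only looks at the "<id>-" prefix suggested by the
    boot loader specification and ignores everything after it.
    """
    if not entry_filename.endswith(".conf"):
        return False
    return any(
        # Skip empty IDs so that a missing ID never matches every file.
        bool(machine_id) and entry_filename.startswith(machine_id + "-")
        for machine_id in candidate_ids
    )


machine_id = "6a9857a393724b7a981ebb5b8495b9ea"  # example ID from the spec text
other_id = "0123456789abcdef0123456789abcdef"    # hypothetical second ID
name = f"{machine_id}-3.8.0-2.fc19.x86_64.conf"

print(entry_belongs_to(name, [machine_id]))            # matches the prefix
print(entry_belongs_to(name, [other_id]))              # different install
print(entry_belongs_to(name, [other_id, machine_id]))  # either ID is accepted
```

Accepting a list of IDs rather than a single one is what distinguishes "check both prefixes" from dropping the check: entries carrying an unrelated prefix are still left alone.
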
So the concept, at least, is that other OSes (or other installs of Red Hat-family OSes) may also be storing snippets in that directory, and the purpose of the UUID prefix is to identify which OS each snippet belongs to. At least in theory, by dropping the UUID prefix check, we could wind up editing files that "belong" to a different OS.

Things are complicated a bit by the fact that Red Hat-family OSes do not really follow the upstream boot loader specification fully; our implementation of it diverges considerably. Javier stated (in IRC discussion of this issue) that he thinks the whole "multiple OSes sharing $BOOT" thing doesn't really work in practice, so it's fine to do this, and we should just be explicit that RH-family OSes expect to have exclusive ownership of their $BOOT location and do not support sharing it in this way. That would probably be a supportable position. But I figured it'd be a good idea to explain the likely background to this issue and raise the possibility that we might want to re-consider ditching the UUID prefix check.

Chris Murphy points out that anaconda actually explicitly wipes all snippets it finds in `/boot/loader/entries` on installation: https://github.com/rhinstaller/anaconda/blob/master/pyanaconda/modules/storage/bootloader/utils.py#L229-L233 So we're certainly not aligning with the original idea there either. That's been around since https://github.com/rhinstaller/anaconda/commit/6791e7320e6c222b92746e8f228026695fa0e2a0 in March 2019 (which was fixing other problems with live installs. You know, every month I'm more convinced that live installs are a mistake...)

That'd be bug 1874724.

One other thing that I didn't mention originally, to try and keep this report simple, is that this behaviour emerged after diskimage-builder merged changes to use grubby [1], where it was found that grubby --update-kernel=ALL --args="root=LABEL=${DIB_ROOT_LABEL}" applied to all installed kernels (irrespective of machine-id).
[2] says "For systems that use the GRUB2 bootloader, the command updates the /boot/grub2/grubenv file by adding a new kernel parameter to the kernelopts variable in that file." That is what it seems to do at [4], but I'm not even sure how that works, since kernel entries aren't looking at the grub environment file per the aforementioned [3]?

To add to the confusion, grubby *also* updates GRUB_CMDLINE_LINUX in /etc/default/grub, so without the change to ignore the machine-id it's not quite "idempotent" (for want of a better word): grubby writes config to /etc/default/grub that then doesn't get applied by grub2-mkconfig. So that adds another rather confusing few entries to the "what the heck is going on" matrix for your average admin trying to update a kernel parameter...

[1] https://review.opendev.org/c/openstack/diskimage-builder/+/804002
[2] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9-beta/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel
[3] https://src.fedoraproject.org/rpms/grub2/c/4a742183a39f344a7685bccdc76d5e64dea3766a?branch=master
[4] https://src.fedoraproject.org/rpms/grubby/blob/f34/f/grubby-bls

> [2] says "For systems that use the GRUB2 bootloader, the command updates the /boot/grub2/grubenv file by adding a new kernel parameter to the kernelopts variable in that file."

That's stale info. It hasn't been the case since Fedora 34. Pretty sure it was part of this change: https://fedoraproject.org/wiki/Changes/UnifyGrubConfig

Currently, the BLS snippet contains the kernel options explicitly, no macro. There are grub-mkconfig generated boot parameters; you get them even if /etc/default/grub has an empty or missing GRUB_CMDLINE_LINUX= line: `ro root=UUID=$rootfsuuid`, and additionally on Btrfs grub-mkconfig discovers and inserts `rootflags=subvol=$pathtorootfssubvol`, which is typically just "root".
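
As an illustration of what "the BLS snippet contains the kernel options explicitly" means for someone editing a kernel parameter, the following Python sketch appends an argument to the `options` line of a snippet held in a string. It is not how grubby or grub2-mkconfig are implemented; `add_kernel_arg` is a hypothetical helper, and the snippet contents (title, kernel path, UUID) are made-up example values.

```python
def add_kernel_arg(snippet: str, arg: str) -> str:
    """Append `arg` to the `options` line of a BLS snippet unless present.

    Hypothetical helper: if the snippet has no `options` line, it is
    returned unchanged apart from newline normalisation.
    """
    out = []
    for line in snippet.splitlines():
        parts = line.split()
        if parts and parts[0] == "options" and arg not in parts[1:]:
            line = f"{line.rstrip()} {arg}"
        out.append(line)
    return "\n".join(out) + "\n"


# Made-up snippet; only the key names follow the boot loader specification.
snippet = (
    "title Example Linux\n"
    "linux /vmlinuz-example\n"
    "options ro root=UUID=1234-abcd\n"
)

updated = add_kernel_arg(snippet, "console=ttyS0")
print(updated)

# Running the edit a second time changes nothing.
print(add_kernel_arg(updated, "console=ttyS0") == updated)
```

Because the options live in each snippet, an edit has to touch every entry it is meant to affect, which is why a machine-id filter on entry file names decides whether a setting is actually applied.
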
I've pretty much settled on `grubby` exclusively for this task, because it's the same command no matter Fedora release version, the firmware type, or arch. It should just work, and this wiki has been updated accordingly: https://fedoraproject.org/wiki/GRUB_2?rd=Grub2

Further update on the upstream side of this: Lennart has sent a PR that changes it again - https://github.com/systemd/systemd/pull/22463

I think the change that was made in Fedora for this may have broken ostree installers: https://bugzilla.redhat.com/show_bug.cgi?id=2059776

See also bug 2120845.

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like: "Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.