Bug 1598523 - Make BootLoaderSpec-style configuration files the default
Summary: Make BootLoaderSpec-style configuration files the default
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: Changes Tracking
Version: 30
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Javier Martinez Canillas
QA Contact:
URL:
Whiteboard: RejectedFreezeException
Depends On: 1654841
Blocks:
Reported: 2018-07-05 17:54 UTC by Ben Cotton
Modified: 2019-07-15 08:16 UTC
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-30 15:49:38 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1653434 None CLOSED grub.cfg default points to wrong entry 2019-09-19 08:03:05 UTC

Internal Links: 1653434

Description Ben Cotton 2018-07-05 17:54:27 UTC
This is a tracking bug for Change: Make BootLoaderSpec-style configuration files the default
For more details, see: https://fedoraproject.org/wiki/Changes/BootLoaderSpecByDefault

This change enables the use of per-boot-entry configuration files, similar to those described in Boot Loader Specification (BLS), to populate the bootloader's menu entries.

Comment 1 Ben Cotton 2018-07-05 18:05:07 UTC
Change not approved by FESCo yet. Expect discussion 9 July.

Comment 2 Jan Kurik 2018-08-14 09:55:32 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.

Comment 3 Ben Cotton 2018-08-14 13:13:00 UTC
According to the Fedora 29 schedule[1], today is the deadline for changes to be in a testable state. If your change is ready to be tested, please set the status to ON_QA. A list of incomplete changes will be sent to FESCo tomorrow for evaluation. If you know your change will not be ready for Fedora 29, you can set the version to rawhide and notify bcotton@fedoraproject.org.

[1] https://fedoraproject.org/wiki/Releases/29/Schedule

Comment 4 Fedora Blocker Bugs Application 2018-09-04 12:41:58 UTC
Proposed as a Freeze Exception for 29-beta by Fedora user javierm using the blocker tracking app because:

 The Anaconda changes to install with a BLS configuration by default were missing. These have been recently approved upstream but the package needs to be updated.

Comment 5 Geoffrey Marr 2018-09-04 20:44:26 UTC
Discussed during the 2018-09-04 blocker review meeting: [1]

The decision to classify this bug as a "RejectedFreezeException" was made as
this is a major functional change which should have landed much earlier; it is not appropriate and potentially highly destabilizing for it to go in as a freeze exception now.

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2018-09-04/f29-blocker-review.2018-09-04-16.01.txt

Comment 6 Ben Cotton 2018-09-11 19:37:06 UTC
FESCo has voted to defer this change to Fedora 30. https://pagure.io/fesco/issue/1985#comment-530134

Comment 7 Adam Williamson 2018-11-29 19:45:55 UTC
So this Change is now active in F30, with some rather significant differences in implementation compared to the current state of the page:

1. Now, the package called 'grubby' contains the BLS implementation, and the new package called 'grubby-deprecated' contains the old non-BLS implementation. The only way to get grubby-deprecated installed is manually; it does not obsolete or provide anything else and it is not pulled in by any default install package sets (except on 32-bit ARM).

2. Contrary to what the Change page states, on upgrade to F30 *systems are automatically converted to BLS*. There is a call to grub2-switch-to-blscfg / zipl-switch-to-blscfg in grubby %post.

I'd recommend the Change page be updated to reflect the current reality.

There is an issue we identified today with 32-bit ARM systems on upgrade. I have just filed that as https://bugzilla.redhat.com/show_bug.cgi?id=1654841 .

Comment 8 Javier Martinez Canillas 2018-12-03 11:51:16 UTC
Thanks, Adam, for pointing these out. I've updated the Change page to reflect the latest reality with regard to F30 upgrades and also added a section explaining that BLS won't be available for ARMv7.

Comment 9 Jan Pokorný [poki] 2018-12-03 22:59:37 UTC
To set the context, I've just found this bug when suffering from
a breakage in Fedora.

And my problem is that /etc/sysconfig/grub file ceased to be
considered (GRUB_CMDLINE_LINUX variable in particular) as authoritative,
as I've discovered the setting from here is not picked up when
installing new kernel.

I intentionally speak about /etc/sysconfig/grub and not
/etc/default/grub, the only file mentioned in the discussion,
even if the former is a symlink to the latter.  But PLEASE,
consider that whatever is in the /etc/sysconfig subtree is perceived,
by both RHEL-aligned developers and users/administrators, as
a safe space purposely built over the years and actively cared
for by the distribution(!).

Therefore, I find it rather rude that an update in Fedora
(in Rawhide, but anyway) will just silently drop a

  GRUB_ENABLE_BLSCFG=true

line into it (I hadn't noticed until I carefully read said Fedora
change), de facto deprecating that file completely, without:

  - caring to put some indicative header in there
    ("hey you, forget about changing anything in this
     file, e.g., use 'grubby --args' instead of configuring
     GRUB_CMDLINE_LINUX variable here" -- is that correct,
     btw.? -- "or, if you intend to use this original
     configuration schema, please drop GRUB_ENABLE_BLSCFG
     setting lower in this file and [...]");
    to repeat it, /etc/sysconfig is a "comfort zone" of system
    configuration, please play nicely, especially when
    botched boot process may be an outcome here

  - emitting a single line about this happening during
    grubby (I suppose) update -- I do update the packages
    from command-line so I guess I'd notice -- also I vaguely
    remember that "dnf history info" would show any stderr
    lines emitted during update, and I don't see any such


Thanks for considering this feedback to make the experience of those
eventually updating to F30 substantially more pleasant.

* * *

One more fallout I now observe is a broken kernel boot order -- it will
regularly attempt to boot the oldest one -- which I only accidentally
noticed (thanks to Hans's effort to make the boot process more
streamlined incl. some visually observable changes in kernel during
early boot; consider this verges on a security issue when it happens
to be a vulnerable kernel!, FWIW. I keep a set of around 6 past
kernels around should any regression occur).

This happens with (if not started with earlier packages and never
fixed up later):

> grubby-8.40-22.fc30.x86_64
> grub2-common-2.02-64.fc30.noarch
> dracut-049-11.git20181024.fc30.x86_64

and said silent "GRUB_ENABLE_BLSCFG=true".

Thanks.

Comment 10 Javier Martinez Canillas 2018-12-05 23:04:44 UTC
(In reply to Jan Pokorný [poki] from comment #9)
> To set the context, I've just found this bug when suffering from
> a breakage in Fedora.
> 
> And my problem is that /etc/sysconfig/grub file ceased to be
> considered (GRUB_CMDLINE_LINUX variable in particular) as authoritative,
> as I've discovered the setting from here is not picked up when
> installing new kernel.
> 

On a BLS configuration the menu entries are not defined in the grub.cfg file but instead as BLS snippets in /boot/loader/entries. So re-generating the grub.cfg has no effect on the menu entries or their kernel cmdline parameters.
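
A minimal sketch of what such a snippet looks like and how its fields can be read, assuming the usual BLS layout of one `key value` pair per line. The sample entry (kernel version, paths, and the `options $kernelopts` line) is illustrative rather than copied from a real system, and `parse_bls_entry` is a hypothetical helper, not part of grubby or GRUB.

```python
# Illustrative BLS snippet; version string and paths are made up.
SAMPLE_ENTRY = """\
title Fedora (5.0.9-301.fc30.x86_64) 30 (Thirty)
version 5.0.9-301.fc30.x86_64
linux /vmlinuz-5.0.9-301.fc30.x86_64
initrd /initramfs-5.0.9-301.fc30.x86_64.img
options $kernelopts
"""


def parse_bls_entry(text):
    """Parse 'key value' lines of a BLS snippet into a dict.

    Blank lines and '#' comments are skipped; a repeated key keeps
    the last value seen.
    """
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        entry[parts[0]] = parts[1] if len(parts) > 1 else ""
    return entry


entry = parse_bls_entry(SAMPLE_ENTRY)
print(entry["linux"])    # kernel image path from the snippet
print(entry["options"])  # kernel command line (here an unexpanded variable)
```

Because each entry lives in its own file under /boot/loader/entries, regenerating grub.cfg leaves these fields untouched.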

> 
> line in it (I hadn't noticed until I did carefully read said Fedora
> change), de facto deprecating that file completely, without:
> 
>   - caring to put some indicative header in there
>     ("hey you, forget about changing anything in this
>      file, e.g., use 'grubby --args' instead of configuring
>      GRUB_CMDLINE_LINUX variable here" -- is that correct,
>      btw.? -- "or, if you intend to use this original
>      configuration schema, please drop GRUB_ENABLE_BLSCFG
>      setting lower in this file and [...]");
>     to repeat it, /etc/sysconfig is a "comfort zone" of system
>     configuration, please play nicely, especially when
>     botched boot process may be an outcome here
>

Agreed, I've proposed the following change to Anaconda (which generates /etc/default/grub); it adds a comment explaining that GRUB_CMDLINE_LINUX can't be used to modify the cmdline on a BLS configuration, and that the grub2-editenv tool should be used instead to modify the kernelopts environment variable defined in /boot/grub2/grubenv.

https://github.com/rhinstaller/anaconda/pull/1717
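
To illustrate the mechanism, here is a minimal sketch, assuming a grubenv made of `name=value` lines padded with `#` characters and a BLS entry whose options line is `$kernelopts`. The sample values and the helper names `parse_grubenv` and `expand_options` are hypothetical; GRUB itself performs the real expansion at boot.

```python
import re

# Illustrative grubenv contents; the real file is padded with '#' to a fixed size.
SAMPLE_GRUBENV = (
    "# GRUB Environment Block\n"
    "saved_entry=example-entry-id\n"
    "kernelopts=root=/dev/mapper/fedora-root ro rhgb quiet\n"
    "##########\n"
)


def parse_grubenv(text):
    """Return the name=value pairs of an environment block as a dict."""
    env = {}
    for line in text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue  # header, padding, or malformed line
        name, _, value = line.partition("=")  # the value may itself contain '='
        env[name] = value
    return env


def expand_options(options, env):
    """Replace $name or ${name} with values from env; unknown names become ''."""
    return re.sub(r"\$\{?(\w+)\}?", lambda m: env.get(m.group(1), ""), options)


env = parse_grubenv(SAMPLE_GRUBENV)
print(expand_options("$kernelopts", env))
```

On a real system, `grub2-editenv - list` should show the actual variables; the sketch only mirrors why editing GRUB_CMDLINE_LINUX alone does not change an entry whose options come from kernelopts.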

> 
> One more fallout I now observe is a broken kernel boot order -- it will
> regularly attempt to boot the oldest one -- which I only accidentally
> noticed (thanks to Hans's effort to make the boot process more
> streamlined incl. some visually observable changes in kernel during
> early boot; consider this verges on a security issue when it happens
> to be a vulnerable kernel!, FWIW. I keep a set of around 6 past
> kernels around should any regression occur).
> 

Can you please provide the information I asked for in bug 1653434, comment 2? We can move the discussion of this issue to bug 1653434.

Comment 11 Jan Pokorný [poki] 2018-12-06 14:27:04 UTC
> Agreed, I've proposed the following change to Anaconda 

Thanks!  But the system upgrade scenario should also receive
a similar treatment, since for upgraded systems an explanatory
comment added by Anaconda won't help.  And since
"GRUB_ENABLE_BLSCFG=true" is already an active modification of
the file, would it be too intrusive to also prepend the finally
agreed-on header there if it is not already present?

> Can you please provide the information I asked in
> bug 1653434, comment 2 ? We can move the discussion of this issue
> to bug 1653434.

Sure, I will follow up there; I just wasn't sure I was hitting the same
problem as the OP.

Comment 12 Ben Cotton 2019-02-19 17:12:18 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 30 development cycle.
Changing version to '30'.

Comment 13 Ben Cotton 2019-03-05 21:50:00 UTC
We have reached the Code Complete (100%) milestone in the Fedora 30 development cycle. At this point, all Changes should be fully code complete and ready for testing during the beta freeze. If your Change has reached this milestone, please set the status to ON_QA. If it has not, this Change will be submitted to FESCo to evaluate the contingency plan and decide if the Change will continue in the Fedora 30 cycle.

Comment 14 Daniel Miranda 2019-03-06 09:31:15 UTC
Here's a report from testing this change in my F29 system. Admittedly, I have a bit of an unusual setup, but it seems it makes for a good test subject.

I have my EFI system partition in one device (SATA SSD) and /boot and / in another (PCIe SSD). I'm forced to do this since my motherboard (ASUS P8P67 PRO) cannot boot directly from PCIe devices. GRUB also does not recognize the PCIe SSD by default, requiring me to add a call to `nativedisk ahci` early in the config to fix it; but that should be inconsequential to my observations.

After running `grub2-switch-to-blscfg` everything seemed to work, with the exception of my custom kernel arguments. I initially assumed `/etc/default/grub` no longer worked for setting them, but that is not the case; I can see my value from there correctly written to `/boot/grub2/grubenv` in the `kernelopts` variable. But that environment file is never loaded by GRUB.

Looking at `/etc/grub.d/10_linux`, I believe the logic for loading the aforementioned env-file is wrong.

a) The dirname of the kernel image is consulted to figure out which device to prepare for access. GRUB_DEVICE is used if the dirname is `/`, and GRUB_DEVICE_BOOT otherwise. But when booting from a complex root filesystem, GRUB_DEVICE points to the device containing the initramfs, and GRUB_DEVICE_BOOT possibly to something else (in my case, the EFI partition). Since my kernel images reside in `/boot`, the wrong device is chosen and GRUB fails to find the env-file.
b) Even if the correct device was chosen, the env-file is loaded with a path of `${prefix}/grubenv`, but `prefix` is not set by the device search commands above. The correct way to access the env file is to use the `make_path_relative_to_its_root` function and avoid `prefix` altogether.
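
The device choice described in a) can be modelled roughly as follows. This is a sketch of the logic as read from the script, not the script itself; the function name and the device labels are hypothetical.

```python
import posixpath


def device_for_envfile(kernel_path, grub_device, grub_device_boot):
    """Pick a device the way point a) describes.

    GRUB_DEVICE is used only when the kernel image sits directly in '/';
    otherwise GRUB_DEVICE_BOOT is used.
    """
    if posixpath.dirname(kernel_path) == "/":
        return grub_device
    return grub_device_boot


# Hypothetical labels for the setup above: /boot and / on the PCIe SSD,
# the EFI system partition on the SATA SSD.
GRUB_DEVICE = "pcie-ssd-root"
GRUB_DEVICE_BOOT = "sata-ssd-esp"

chosen = device_for_envfile("/boot/vmlinuz-example", GRUB_DEVICE, GRUB_DEVICE_BOOT)
print(chosen)  # sata-ssd-esp: the EFI partition, which does not hold grubenv
```

Under these assumptions, any kernel stored in `/boot` selects the EFI partition, which matches the failure to find the env-file.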

Given all these issues, I'm tempted to ask: why store BLS parameters in the GRUB environment? Wouldn't just adding a snippet to `grub.cfg` that directly sets `kernelopts` and `blsdir` achieve the same results with less hassle?

Comment 15 Daniel Miranda 2019-03-06 09:37:58 UTC
I just realized my issues might be caused by lacking the most recent updates to the BLS support from F30. I'll try to install the GRUB packages from F30 and see if the issues persist.

Comment 16 Javier Martinez Canillas 2019-03-11 13:50:14 UTC
Hello Daniel,

(In reply to Daniel Miranda from comment #15)
> I just realized my issues might be caused by lacking the most recent updates
> to the BLS support from F30. I'll try to install the GRUB packages from F30
> and see if the issues persist.

Did it work for you on F30? There were a lot of BLS fixes for F30.

Comment 17 Steven Haigh 2019-04-29 05:02:58 UTC
As a note, the new BLSCFG type entries kill Fedora 30 running as a Xen DomU with pygrub as the bootloader.

https://bugzilla.redhat.com/show_bug.cgi?id=1703700

In a nutshell, pygrub parses a grub.cfg, which it looks for in multiple path locations, in order to start the guest kernel. With this not present, most Xen DomU installs will fail upon upgrading to F30.

I hit this issue with *every* F29 -> F30 upgrade.

There should be some provision to set GRUB_ENABLE_BLSCFG=false on any package script that can run if it is detected running as a Xen DomU. This may require using virt-what to figure out (or some other method).
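
A minimal sketch of such a provision, assuming that virt-what prints a `xen-domU` line on these guests and that the setting lives in /etc/default/grub. Both helper names are hypothetical, and the sketch works on strings only; it does not run virt-what or write any file.

```python
import re


def is_xen_domu(virt_what_output):
    """True if any line of virt-what style output is exactly 'xen-domU'."""
    return any(line.strip() == "xen-domU"
               for line in virt_what_output.splitlines())


def disable_blscfg(grub_defaults):
    """Return the text with GRUB_ENABLE_BLSCFG=false, replacing or appending it."""
    pattern = re.compile(r"^GRUB_ENABLE_BLSCFG=.*$", re.MULTILINE)
    new_line = "GRUB_ENABLE_BLSCFG=false"
    if pattern.search(grub_defaults):
        return pattern.sub(new_line, grub_defaults)
    if grub_defaults and not grub_defaults.endswith("\n"):
        grub_defaults += "\n"
    return grub_defaults + new_line + "\n"


# Illustrative inputs, not captured from a real guest.
sample_output = "xen\nxen-domU\n"
sample_defaults = "GRUB_TIMEOUT=5\nGRUB_ENABLE_BLSCFG=true\n"

if is_xen_domu(sample_output):
    print(disable_blscfg(sample_defaults), end="")
```

The grub.cfg would presumably still need to be regenerated afterwards so that pygrub finds menu entries in it.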

There is a related issue that the GRUB_DEFAULT=0 option does not seem to apply correctly when generating a grub.cfg now. Both are covered in the above bug report.

Comment 18 John 2019-07-14 00:57:35 UTC
Are there plans to upstream this patch to GRUB, or will it remain a Fedora/RHEL specific feature?

Comment 19 Javier Martinez Canillas 2019-07-15 08:16:17 UTC
(In reply to John from comment #18)
> Are there plans to upstream this patch to GRUB, or will it remain a
> Fedora/RHEL specific feature?

Yes, there are plans to upstream this patch and the others that we are carrying. I'll update the package to the recently released grub-2.04 and then will focus on upstreaming our patches.

