Bug 1652806 - after move to bls grub can't find blscfg
Summary: after move to bls grub can't find blscfg
Keywords:
Status: MODIFIED
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 30
Hardware: i686
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedFreezeException
Duplicates: 1678445 1686059 1695701 1698227 1705927 1706094 1706177 1706447 1707199 1708377 1720911
Depends On:
Blocks: F30BetaBlocker F30BetaFreezeException
Reported: 2018-11-23 06:03 UTC by Thorsten Leemhuis
Modified: 2019-11-09 13:08 UTC
CC List: 45 users

Fixed In Version: grub2-2.02-76.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-27 21:25:02 UTC


Attachments
wasa /boot/grub2/grubenv (1.00 KB, text/plain)
2019-03-24 09:59 UTC, Juha Tuomala
wasa /boot/grub2/grub.cfg (5.52 KB, text/plain)
2019-03-24 10:01 UTC, Juha Tuomala
wasa /boot/loader/entries/* (379 bytes, text/plain)
2019-03-24 10:04 UTC, Juha Tuomala
wasa /boot/loader/entries/* (258 bytes, text/plain)
2019-03-24 10:04 UTC, Juha Tuomala
wasa /boot/loader/entries/* (245 bytes, text/plain)
2019-03-24 10:04 UTC, Juha Tuomala
wasa /boot/loader/entries/* (245 bytes, text/plain)
2019-03-24 10:04 UTC, Juha Tuomala
the generated grub.cfg in /tmp/ which is still defective (5.88 KB, text/plain)
2019-05-17 18:25 UTC, Claude Frantz
The grub.cfg finally generated after the mentioned steps (5.88 KB, text/plain)
2019-05-18 14:08 UTC, Claude Frantz


Description Thorsten Leemhuis 2018-11-23 06:03:08 UTC
In an x86-32 VM the recent move to BLS failed: GRUB shows me just a command prompt on boot. When I load the config file manually with

configfile /grub2/grub.cfg

I get to see three error messages:

error: file '/grub2/grubenv' not found
error: file '/grub2/grubenv' not found
error: can't find command 'blscfg'.

I have no idea why it can't find those; the env file and the GRUB module (i386-pc/blscfg.mod) are there. I can make the machine boot by loading the old config file using the configfile command.

The system boots the classical way (legacy BIOS, so no EFI) and yes, it really has a separate /boot/.

Additional info:
I have a similarly configured x86-64 VM and there the bls move worked fine.

Comment 1 Thorsten Leemhuis 2019-02-17 15:23:20 UTC
This happened to me again today when updating to grub2-1:2.02-70.fc30.noarch. When typing "configfile /grub2/grub.cfg" in the GRUB shell I now see four error messages:
"""
error: file '/grub2/grubenv' not found
error: file '/grub2/i386-pc/increment.mod' not found
error: file '/grub2/grubenv' not found
error: can't find command 'blscfg'.
"""

P.S.: Added BLS driver javierm@redhat.com to CC

Comment 2 Javier Martinez Canillas 2019-02-18 12:49:24 UTC
Hello Thorsten,

I'm not able to reproduce this issue. A couple of questions:

1) Was this when upgrading from a previous Fedora release like mentioned in the bug description or just when upgrading grub2 to a new version already in Rawhide?

2) Does increment.mod exist in /boot/grub2/i386-pc/ ?

3) Does configfile /grub2/grub.cfg.rpmsave work?

4) Could you please check whether grub2-install /dev/sda (or whatever block device grub2 is installed on) fixes the issue?

Comment 3 Thorsten Leemhuis 2019-02-21 19:44:52 UTC
(In reply to Javier Martinez Canillas from comment #2)
> 1) Was this when upgrading from a previous Fedora release like mentioned in
> the bug description or just when upgrading grub2 to a new version already in
> Rawhide?

This is a Rawhide test VM, and it happened during a normal update in which grub2 and grubby got updated.
 
> 2) Does increment.mod exist in /boot/grub2/i386-pc/ ?

Nope

> 3) Does configfile /grub2/grub.cfg.rpmsave works?

Yes.

> 4) Could you please try if grub2-install /dev/sda (or whatever is the block
> device where grub2 is installed) fixes the issue.

That fixes it for me. 

Side note: This is a VM and I have a snapshot, so I can reproduce the issue.

Comment 4 Javier Martinez Canillas 2019-02-25 14:05:37 UTC
(In reply to Thorsten Leemhuis from comment #3)
> (In reply to Javier Martinez Canillas from comment #2)
> > 1) Was this when upgrading from a previous Fedora release like mentioned in
> > the bug description or just when upgrading grub2 to a new version already in
> > Rawhide?
> 
> This is a rawhide test vm and that happend during a a normal update where
> grub2 and grubby got updated
>  
> > 2) Does increment.mod exist in /boot/grub2/i386-pc/ ?
> 
> Nope
>

What was the Fedora version originally installed in this VM? I think the problem is that for legacy BIOS neither GRUB nor its modules are updated when the package is upgraded. That's what is causing the following error messages:

error: file '/grub2/i386-pc/increment.mod' not found
error: can't find command 'blscfg'

The increment.mod was added in F29 and the blscfg command in F28.

What I don't understand is what's causing the following:

error: file '/grub2/grubenv' not found

I assume that there's a /boot/grub2/grubenv file but grub2 just fails to find it?

Can you please check what the value of the $prefix GRUB environment variable is?

> > 3) Does configfile /grub2/grub.cfg.rpmsave works?
> 
> Yes.
> 
> > 4) Could you please try if grub2-install /dev/sda (or whatever is the block
> > device where grub2 is installed) fixes the issue.
> 
> That fixes it for me. 
> 
> Side note: This is a VM and I have a snapshot, so I can reproduce the issue.

Thanks, could you please boot using the grub.cfg.rpmsave config file and then do:

$ cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
$ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
$ grub2-mkconfig -o /boot/grub2/grub.cfg

And check if that's enough to make it work?
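
The two cp commands above can be generalized into a small helper. This is only a minimal sketch: the function name sync_grub_modules is hypothetical, and the source and destination directories are passed as arguments (on the systems discussed here they would be /usr/lib/grub/i386-pc and /boot/grub2/i386-pc). It copies a module only when the installed copy is missing or differs from the packaged one.

```shell
#!/bin/sh
# Hypothetical helper: copy the named GRUB modules from SRC to DST
# when the copy in DST is missing or differs from the one in SRC.
sync_grub_modules() {
    src=$1
    dst=$2
    shift 2
    for mod in "$@"; do
        if [ ! -f "$src/$mod.mod" ]; then
            echo "missing in source: $mod.mod" >&2
            return 1
        fi
        # cmp -s returns non-zero if the files differ or the DST file is absent
        if ! cmp -s "$src/$mod.mod" "$dst/$mod.mod" 2>/dev/null; then
            cp "$src/$mod.mod" "$dst/$mod.mod" && echo "copied $mod.mod"
        fi
    done
}

# Example invocation (as root), using the paths from this comment:
#   sync_grub_modules /usr/lib/grub/i386-pc /boot/grub2/i386-pc increment blscfg
```

This only refreshes individual modules; it does not update the GRUB core, which according to this thread still requires running grub2-install.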

Comment 5 Thorsten Leemhuis 2019-02-25 19:14:46 UTC
(In reply to Javier Martinez Canillas from comment #4)
>
> What was the Fedora version originally installed in this VM?

It's a few years old, can't remember for sure, but the files in /boot/grub2/i386-pc/ are from September 2015... #timeflys

> I think the
> problem is that for legacy bios neither GRUB nor its modules are updated
> when a the package is upgraded.

Yeah, seems likely afaics.

> I assume that there's an /boot/grub2/grubenv file 

Yes, there is one.

> but grub2 just fails to find it?

Looks like it. Or is it possible that the error message is displayed wrongly (e.g. in situations where there was in fact a different error)?

> Can you please share check what's the value of the $prefix grub env var?

(hd0,msdos1)/grub2/ (which is correct, it's /dev/vda1 in the running system and it's the only disk)

> $ cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
> $ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> $ grub2-mkconfig -o /boot/grub2/grub.cfg
> 
> And check if that's enough to make it work?

Hmmm. It's enough to make the kernel load, but it doesn't boot as dracut fails to find the root fs (didn't investigate further why) :-/

Side note, just in case it matters (I find it a bit strange): I have a similar x86-64 test VM that should be about the same age, but there the conversion to BLS went fine.

Comment 6 Javier Martinez Canillas 2019-02-26 12:38:29 UTC
(In reply to Thorsten Leemhuis from comment #5)
> (In reply to Javier Martinez Canillas from comment #4)
> >
> > What was the Fedora version originally installed in this VM?
> 
> It's a few years old, can't remember for sure, but the files in
> /boot/grub2/i386-pc/ are from September 2015... #timeflys
> 
> > I think the
> > problem is that for legacy bios neither GRUB nor its modules are updated
> > when a the package is upgraded.
> 
> Yeah, seems likely afaics.
> 
> > I assume that there's an /boot/grub2/grubenv file 
> 
> Yes, there is one.
> 
> > but grub2 just fails to find it?
> 
> Looks like it. Or is it possible the error messages is displayed wrongly (
> e.g. in situations when in fact there was a different error)?
> 
> > Can you please share check what's the value of the $prefix grub env var?
> 
> (hd0,msdos1)/grub2/ (which is correct, it's /dev/vda1 in the running system
> and it's the only disk)
> 
> > $ cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
> > $ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> > $ grub2-mkconfig -o /boot/grub2/grub.cfg
> > 
> > And check if that's enough to make it work?
> 
> Hmmm. It's enough to make the kernel load, but it doesn't boot as dracut

Thanks for testing, I'll make sure that the increment.mod module is copied when switching to a BLS configuration.

> fails to find the root fs (didn't investigate further why) :-/
> 

I think that fails because the root param isn't set correctly (due to $kernelopts being defined in grubenv, which your GRUB fails to load).

I'm assuming that copying these modules didn't fix the following error:

error: file '/grub2/grubenv' not found

Can you please check your kernel command line args on the failed boot?

I'll make this more robust so that there's a default cmdline even without a grubenv.

> Side note, just in case it matters (I find it a bit strange): I have a
> similar x86-64 test vm that should be about the same age, but there the
> conversation to bls went fine.

Yes, I also tried to reproduce your issue but was not able to do it.

Comment 7 Thorsten Leemhuis 2019-02-26 18:26:00 UTC
(In reply to Javier Martinez Canillas from comment #6)
>
> Thanks for testing, I'll make sure that the increment.mod module is copied
> when switching to a BLS configuration.

thx!
 
> I'm assuming that copying these modules didn't make fix the following error:
> error: file '/grub2/grubenv' not found

Found the reason for that: for some reason (maybe I fiddled with it, but I doubt that) it was an absolute link instead of a relative link:

# ls -l /boot/grub2/grubenv 
lrwxrwxrwx. 1 root root 28  5. Sep 2015  /boot/grub2/grubenv -> /boot/efi/EFI/fedora/grubenv
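
A check along the following lines would flag this kind of link. It is a minimal sketch with a hypothetical function name (check_grubenv_link). The comment about how GRUB resolves the target is an assumption about GRUB's behavior that would explain the "not found" error; it is not stated in this report.

```shell
#!/bin/sh
# Hypothetical helper: report whether the given path is a symlink and
# whether its target is absolute or relative.
# Assumption: GRUB looks up an absolute target from the root of the
# filesystem it is reading (here the separate /boot partition), so a
# target such as /boot/efi/EFI/fedora/grubenv would not resolve there.
check_grubenv_link() {
    path=$1
    if [ ! -L "$path" ]; then
        echo "not a symlink"
        return 0
    fi
    target=$(readlink "$path")
    case $target in
        /*) echo "absolute symlink -> $target" ;;
        *)  echo "relative symlink -> $target" ;;
    esac
}

# Example invocation:
#   check_grubenv_link /boot/grub2/grubenv
```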

> Yes, I also tried to reproduce your issue but was not able to do it.

thx, don't invest too much time, if it's something that only I'm seeing, I'm fine with the outcome as it is.

Comment 8 Javier Martinez Canillas 2019-02-27 14:44:33 UTC
(In reply to Thorsten Leemhuis from comment #7)

[snip]

> 
> Found the reason for that, it for some reasons (maybe I fiddled with it, but
> I doubt that) it was a full link instead of a relative link:
> 
> # ls -l /boot/grub2/grubenv 
> lrwxrwxrwx. 1 root root 28  5. Sep 2015  /boot/grub2/grubenv ->
> /boot/efi/EFI/fedora/grubenv
>

I see. As mentioned in the previous comment, I'll make grub2-mkconfig set a default kernelopts in the grub.cfg so the machine can boot even without a grubenv.

Comment 9 Laurent Wandrebeck 2019-03-03 09:11:59 UTC
It looks like I’ve hit something similar when updating (yesterday) from F29 to F30 using gnome-software on bare metal.

3) Does configfile /grub2/grub.cfg.rpmsave work?

yup

4) Could you please try if grub2-install /dev/sda (or whatever is the block device where grub2 is installed) fixes the issue.

indeed.

HTH,

Comment 10 Javier Martinez Canillas 2019-03-11 10:55:15 UTC
*** Bug 1678445 has been marked as a duplicate of this bug. ***

Comment 11 Fedora Update System 2019-03-11 12:51:23 UTC
grub2-2.02-72.fc30 grubby-8.40-29.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-80aeacc4a7

Comment 12 Fedora Blocker Bugs Application 2019-03-11 13:58:46 UTC
Proposed as a Blocker and Freeze Exception for 30-beta by Fedora user pwalter using the blocker tracking app because:

 Breaks distro upgrades to F30.

Comment 13 Fedora Update System 2019-03-11 14:42:09 UTC
grub2-2.02-72.fc30, grubby-8.40-29.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-80aeacc4a7

Comment 14 Adam Williamson 2019-03-11 16:24:07 UTC
Javier, in the blocker review meeting, we found it pretty hard to follow what's going on here. The bug report talks about issues with files unexpectedly in different locations and stuff, and seems to be possibly related to x86-32 or being an older install that's been upgraded over time. There are also a bunch of 'duplicates' of this bug. However, there's no clear explanation (unless I missed it) about what precisely you think "the bug" here is, and if all the things closed as dupes are actually the exact same problem.

Then, if we look at the commit that claims to fix this bug, it says this:

"
    Switch to BLS in tools package %post scriptlet
    
    The switch to a BLS configuration was made before in the grubby package
    %post scriptlet, but this is wrong since it means that a not up-do-date
    grub2-switch-to-blscfg script could be used to do the switch.
    
    Resolves: rhbz#1652806
"

It's not very clear to us how that change is definitely related to the initial bug reported here, or all/any of the dupes.

Could you join the dots a bit more clearly for us to understand what's going on here? Thanks!

Comment 15 Javier Martinez Canillas 2019-03-11 17:03:34 UTC
Hello Adam,

Sorry for not being clear, I'll try to explain in detail what's going on here.

For EFI installs, the grubx64.efi binary is updated in the ESP, and since we build the GRUB modules into the EFI binary, everything is updated when the grub2-efi package is upgraded.

But that's not the case for legacy BIOS. In this case GRUB is installed in the MBR (first stage GRUB) and in the gap between the MBR and the first partition (second stage GRUB), and these are never touched when the grub2-pc package is installed. The user needs to run grub2-install explicitly in order to update the first and second stage GRUB. My understanding is that we used to update GRUB for legacy BIOS in the past but it was just too fragile.

The modules are not updated for legacy BIOS installs either: the grub2-pc-modules package installs the modules in /usr/lib/grub/i386-pc, but grub2-install has to be executed to install the modules in /boot/grub2/i386-pc where GRUB can find them.

So for legacy BIOS installs, the /usr/lib/grub/i386-pc/blscfg.mod (that contains the BLS support) needs to be copied to /boot/grub2/i386-pc to allow GRUB to use the blscfg command that populates the menu entries from the BLS snippets. The latest version of the grub2-switch-to-blscfg script does this (and also copies the increment.mod needed for the recent default entry fallback support). That's why the problem was only reported for legacy BIOS; EFI doesn't have this problem because GRUB is always updated as mentioned above.

The bug was that the switch to BLS was done in the grubby %post scriptlet, but that was wrong and had to be done in the grub2-tools %post scriptlet. Otherwise if the grubby package was installed before the grub2-tools package (that contains the grub2-switch-to-blscfg script), the switch is done using an older version of the grub2-switch-to-blscfg script that doesn't copy the blscfg and increment GRUB modules.

That's why the errors in Comment 1 are:

error: file '/grub2/i386-pc/increment.mod' not found
error: can't find command 'blscfg'

The grubenv error was a different one, as Thorsten mentioned in Comment 7.

The reason why it worked for some people and failed for others is because dnf package installation order is non-deterministic. So if grub2-tools happened to be installed before the grubby package, things would work correctly.

I hope things are clearer now; please let me know if something is still confusing in this explanation.
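
To get a rough idea of which of the two packages was installed first on a given system, one option is to sort them by RPM install time. This is a sketch under assumptions: the helper name install_order is hypothetical, and INSTALLTIME has one-second resolution and reflects only the most recent install of each package, so it is a hint rather than proof of the scriptlet order during a particular upgrade.

```shell
#!/bin/sh
# Hypothetical helper: read lines of "<epoch-seconds> <package-name>" on
# stdin and print the package names in ascending install-time order.
install_order() {
    sort -n -k1,1 | awk '{ print $2 }'
}

# Example invocation on an RPM-based system:
#   rpm -q --qf '%{INSTALLTIME} %{NAME}\n' grubby grub2-tools | install_order
```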

Comment 16 Geoffrey Marr 2019-03-11 20:21:22 UTC
Discussed during the 2019-03-11 blocker review meeting: [1]

The decision to classify this bug as an "AcceptedFreezeException" and delay the classification of this bug as a blocker was made as there's some uncertainty around precisely what bug(s) the reporter here has, and in what configuration they appear, versus what the claimed fix in the package actually does. But we at least agree that it seems worth taking as an FE, and will ask Javier for more details in the bug.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-03-11/f30-blocker-review.2019-03-11-16.04.txt

Comment 17 Fedora Update System 2019-03-12 20:00:50 UTC
grub2-2.02-72.fc30, grubby-8.40-29.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 18 Juha Tuomala 2019-03-23 19:39:03 UTC
(In reply to Fedora Update System from comment #17)
> grub2-2.02-72.fc30, grubby-8.40-29.fc30 has been pushed to the Fedora 30
> stable repository. If problems still persist, please make note of it in this
> bug report.

Last night I upgraded from F29 to F30. The system has grub2-pc-2.02-72.fc30.x86_64. When I came back to check on the upgrade, the screen said something like "it's finalizing and then rebooting the machine". It had probably been like that for hours. There was no disk activity, so I restarted the box, and it drops to the grub> prompt.

After figuring out how to boot manually from the GRUB shell, I got the system back up and running. When I run grub2-mkconfig, it generates a config without Fedora entries, as others have reported here before.

Something is still broken.

# rpm -qa|grep grub
grub2-pc-modules-2.02-72.fc30.noarch
grub2-tools-2.02-72.fc30.x86_64
grub2-tools-extra-2.02-72.fc30.x86_64
grub2-pc-2.02-72.fc30.x86_64
grub2-common-2.02-72.fc30.noarch
grub2-tools-efi-2.02-72.fc30.x86_64
grubby-8.40-30.fc30.x86_64
grub2-tools-minimal-2.02-72.fc30.x86_64

Comment 19 Javier Martinez Canillas 2019-03-23 20:16:57 UTC
The GRUB config not having the menu entries anymore is expected, getting into a GRUB prompt on boot is not.

Is this an EFI or legacy BIOS install? Can you please share your grubenv and grub.cfg content, and the files in /boot/loader/entries?
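
For anyone unsure how to answer the EFI versus legacy BIOS question, a common check is whether the kernel exposes /sys/firmware/efi. The sketch below wraps that check in a hypothetical function (boot_mode); the optional argument is there only so the check can be tried against a scratch directory. Note that it reports how the running system was booted, which may differ from how GRUB was originally installed.

```shell
#!/bin/sh
# Hypothetical helper: print "efi" if EFI firmware data is exposed under
# the given sysfs root (default /sys), otherwise print "bios".
boot_mode() {
    sysfs=${1:-/sys}
    if [ -d "$sysfs/firmware/efi" ]; then
        echo efi
    else
        echo bios
    fi
}

# Example invocation:
#   boot_mode
```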

Comment 20 Juha Tuomala 2019-03-24 09:59:47 UTC
(In reply to Javier Martinez Canillas from comment #19)
> The GRUB config not having the menu entries anymore is expected, getting
> into a GRUB prompt on boot is not.
> 
> Is this an EFI or legacy BIOS install? 

BIOS. Very common Dell box:

    description: Desktop Computer
    product: OptiPlex 980
    vendor: Dell Inc.


> Can you please share your grubenv, grub.cfg content and filed in /boot/loader/entries ?

Sure.

Comment 21 Juha Tuomala 2019-03-24 09:59:54 UTC
Created attachment 1547416 [details]
wasa /boot/grub2/grubenv

Comment 22 Juha Tuomala 2019-03-24 10:01:12 UTC
Created attachment 1547417 [details]
wasa /boot/grub2/grub.cfg

Comment 23 Juha Tuomala 2019-03-24 10:04:04 UTC
Created attachment 1547418 [details]
wasa /boot/loader/entries/*

Comment 24 Juha Tuomala 2019-03-24 10:04:23 UTC
Created attachment 1547419 [details]
wasa /boot/loader/entries/*

Comment 25 Juha Tuomala 2019-03-24 10:04:40 UTC
Created attachment 1547420 [details]
wasa /boot/loader/entries/*

Comment 26 Juha Tuomala 2019-03-24 10:04:59 UTC
Created attachment 1547421 [details]
wasa /boot/loader/entries/*

Comment 27 Juha Tuomala 2019-03-24 10:09:01 UTC
Note that when I boot this manually from grub> now, I need to type:

insmod lvm
linux16 (hd0,msdos1)/vmlinuz-5.0.3-300.fc30.x86_64 root=/dev/mapper/fedora_wasa-root ro rd.lvm.lv=fedora_wasa/root
initrd16 (hd0,msdos1)/initramfs-5.0.3-300.fc30.x86_64.img
set root=(lvm,fedora_wasa-root)/
boot

Comment 28 Javier Martinez Canillas 2019-03-24 10:14:19 UTC
What happens when you execute the following from the GRUB prompt?

grub> blscfg

You should also be able to use your old GRUB config file with:

grub> configfile /grub2/grub.cfg.rpmsave

If the blscfg command doesn't work, can you please set debug with

grub> set debug=blscfg 

and then execute blscfg again.

If blscfg doesn't work, can you check whether updating the module to the latest version makes it work:

$ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/

If that also fails, can you please also try updating the GRUB core:

$ grub2-install /dev/sda (or whatever is your block device where GRUB is installed).

Comment 29 Juha Tuomala 2019-03-24 10:28:19 UTC
(In reply to Javier Martinez Canillas from comment #28)
> what happen when you execute from the GRUB prompt?
> 
> grub> blscfg

Nothing, I just get a new, empty grub>.

grub> blscfg<enter>
grub>


> You should also be able to use your old GRUB config file with:
> 
> grub> configfile /grub2/grub.cfg.rpmsave
> 
> If the blscfg command doesn't work, can you please set debug with
> 
> grub> set debug=blscfg 
> 
> and then execute blscfg again.
> 
> If blscfg doesn't work, 

I did, it says:
- opens disk
- probes fs
- scans blsdir 'entries'
  * finds all config files
- bls_create_entries: Creating entries from bls

And we're back in grub> shell.

> can you try if updating to the latest version makes
> it work:
> 
> $ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> 
> If that also fail, can you please try updating also the GRUB core:
> 
> $ grub2-install /dev/sda (or whatever is your block device where GRUB is
> installed).

- If it now has the configuration snippets, how does the menu that we always see get started?
- What does BLS stand for? Boot Loader ?

Something positive: this box was installed with F20 in 2014, and at some point the runlevel tools broke so that it never rebooted cleanly but hung while going down. Over the past couple of weeks I updated first to F29 and now to F30, and that problem is gone; it cycles cleanly.

Comment 30 Juha Tuomala 2019-03-24 10:31:25 UTC
Also note that after we executed blscfg, ls only lists

  (hd0) (hd0,msdos2) (hd0,msdos1) (fd0)

and after giving 'insmod lvm', I can see the lvm volumes too. That's where the root and swap are.

Comment 31 Juha Tuomala 2019-03-24 10:35:53 UTC
(In reply to Juha Tuomala from comment #29)
> - if it now has the configuration snipplets, how does the menu that we
> always see, start?

With the command 'normal', apparently. It prints the blscfg output *very quickly*, clears it from the screen, and returns to the grub> shell. So apparently something is wrong in the configs. Running insmod lvm manually doesn't make a difference.

Comment 32 Javier Martinez Canillas 2019-03-24 10:38:12 UTC
(In reply to Juha Tuomala from comment #29)
> (In reply to Javier Martinez Canillas from comment #28)
> > what happen when you execute from the GRUB prompt?
> > 
> > grub> blscfg
> 
> Nothing, I just get a new, empty grub>.
> 
> grub> blscfg<enter>
> grub>
>

Yes, you should press Esc after running that command to check if the GRUB menu was correctly populated from the BLS snippets.
 
> 
> > You should also be able to use your old GRUB config file with:
> > 
> > grub> configfile /grub2/grub.cfg.rpmsave
> > 
> > If the blscfg command doesn't work, can you please set debug with
> > 
> > grub> set debug=blscfg 
> > 
> > and then execute blscfg again.
> > 
> > If blscfg doesn't work, 
> 
> I did, it says:
> - opens disk
> - probes fs
> - scans blsdir 'entries'
>   * finds all config files
> - bls_create_entries: Creating entries from bls
>

It looks like it should work, then.
 
> And we're back in grub> shell.
> 
> > can you try if updating to the latest version makes
> > it work:
> > 
> > $ cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> > 
> > If that also fail, can you please try updating also the GRUB core:
> > 
> > $ grub2-install /dev/sda (or whatever is your block device where GRUB is
> > installed).
> 
> - if it now has the configuration snipplets, how does the menu that we
> always see, start?
> - what the bls stand for? Boot Loader ?
>

Boot Loader Spec, because we are using the BLS file format to define the boot entries.
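
For readers who have not seen one, a BLS snippet is a small text file of "key value" lines under /boot/loader/entries. The sketch below reads a single field from such a file. The helper name bls_field and the sample entry are hypothetical; the sample is modeled on the kernel version mentioned in Comment 27 and is not copied from the attachments.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY ($2) from the BLS snippet
# FILE ($1). The first word of each line is the key, the rest is the value.
bls_field() {
    while read -r k v; do
        if [ "$k" = "$2" ]; then
            printf '%s\n' "$v"
            return 0
        fi
    done < "$1"
    return 1
}

# Illustrative entry written to a scratch file (contents are an assumption)
entry=$(mktemp)
cat > "$entry" <<'EOF'
title Fedora (5.0.3-300.fc30.x86_64) 30 (Thirty)
version 5.0.3-300.fc30.x86_64
linux /vmlinuz-5.0.3-300.fc30.x86_64
initrd /initramfs-5.0.3-300.fc30.x86_64.img
options $kernelopts
EOF

bls_field "$entry" linux    # prints /vmlinuz-5.0.3-300.fc30.x86_64
rm -f "$entry"
```

The options line refers to $kernelopts, which ties in with the earlier comments: when grubenv cannot be loaded, that variable is not set and the kernel gets no root= argument.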
 
> Something positive. This box has been installed as F20 at 2014 and at some
> point the runlevel tools broke so that it never rebooted cleanly, but hang
> up when it was going down. Past couple weeks when I updated first f29 and
> now f30, that problem is now gone, it cycles cleanly.

Comment 33 Juha Tuomala 2019-03-24 10:38:37 UTC

(In reply to Javier Martinez Canillas from comment #28)
> You should also be able to use your old GRUB config file with:
> 
> grub> configfile /grub2/grub.cfg.rpmsave

Yes, that brings up the menu. However, those entries don't work anymore, as those kernels are no longer on the system.

Comment 34 Juha Tuomala 2019-03-24 10:41:41 UTC
(In reply to Javier Martinez Canillas from comment #32)
> > > grub> blscfg
> > 
> > Nothing, I just get a new, empty grub>.
> > 
> > grub> blscfg<enter>
> > grub>
> >
> 
> Yes, you should press Esc after running that command to check if the GRUB
> menu was correctly populated from the BLS snippets.

Hitting Esc just gives a new, empty grub> shell prompt.

Comment 35 Juha Tuomala 2019-03-24 10:52:05 UTC
Can I somehow run these routines and get the menu up on the running system? Booting this manually is really tedious and error-prone (it requires a power cycle).

Comment 36 Javier Martinez Canillas 2019-03-24 10:57:43 UTC
(In reply to Juha Tuomala from comment #34)
> (In reply to Javier Martinez Canillas from comment #32)
> > > > grub> blscfg
> > > 
> > > Nothing, I just get a new, empty grub>.
> > > 
> > > grub> blscfg<enter>
> > > grub>
> > >
> > 
> > Yes, you should press Esc after running that command to check if the GRUB
> > menu was correctly populated from the BLS snippets.
> 
> Hitting Esc just gives a new, empty grub> shell prompt.

OK, this is a different issue from the one originally reported in this bug, then. That one was about the blscfg module not being installed, whereas you have the latest version of it (judging by the debug log) but for some reason the menu is not being populated.

You can disable BLS support by setting GRUB_ENABLE_BLSCFG=false in /etc/default/grub and re-generating your GRUB config file with:

$ grub2-mkconfig -o /boot/grub2/grub.cfg

You will need the grubby-deprecated package to be able to install new kernels.
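
The edit to /etc/default/grub described above can be scripted. This is a minimal sketch with a hypothetical function name (disable_blscfg); it takes the file path as an argument so it can be tried on a copy first, and it either rewrites an existing GRUB_ENABLE_BLSCFG line or appends one.

```shell
#!/bin/sh
# Hypothetical helper: set GRUB_ENABLE_BLSCFG=false in the given defaults
# file, replacing an existing assignment or appending a new one.
disable_blscfg() {
    file=$1
    if grep -q '^GRUB_ENABLE_BLSCFG=' "$file"; then
        # Edit via a temp file and write back, keeping the original inode
        _tmp=$(mktemp) &&
        sed 's/^GRUB_ENABLE_BLSCFG=.*/GRUB_ENABLE_BLSCFG=false/' "$file" > "$_tmp" &&
        cat "$_tmp" > "$file" &&
        rm -f "$_tmp"
    else
        echo 'GRUB_ENABLE_BLSCFG=false' >> "$file"
    fi
}

# Example invocation (as root), followed by the command from this comment:
#   disable_blscfg /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```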

Comment 37 Juha Tuomala 2019-03-24 11:02:08 UTC

(In reply to Javier Martinez Canillas from comment #36)
> Ok, this is a different issue than the one originally reported in this
> bugzilla then. Since it was about the blscfg module not being installed, but
> you have the latest version of it (from what the debug log says) but for
> some reason the menu is not being populated.

Yes.

> You can disable BLS support by setting GRUB_ENABLE_BLSCFG=false in
> /etc/default/grub and re-generating your GRUB config file with:
> 
> $ grub2-mkconfig -o /boot/grub2/grub.cfg

That at least generated entries into grub.cfg. 

> You will need the grubby-deprecated package to be able to install new
> kernels.

Installed, thanks. I learned a lot from grub.

Comment 38 Berend De Schouwer 2019-04-10 12:02:34 UTC
(In reply to Javier Martinez Canillas from comment #28)
> what happen when you execute from the GRUB prompt?
> 
> grub> blscfg

For me (not the original reporter), currently (up-to-date), F29 -> F30 upgrade, BIOS (well, broken EFI, so legacy boot):

grub> set debug=all
grub> blscfg

script/lexer.c:321: token 288 text [blscfg]
script/lexer.c:321: token 259 text [
]
script/lexer.c:321: token 0 text []
error: ../../grub-core/commands/blscfg.c:959:variable 'boot' isn't set.

Comment 39 Javier Martinez Canillas 2019-04-10 12:16:14 UTC
(In reply to Berend De Schouwer from comment #38)
> (In reply to Javier Martinez Canillas from comment #28)
> > what happen when you execute from the GRUB prompt?
> > 
> > grub> blscfg
> 
> For me (not the original reporter), currently (up-to-date), F29 -> F30
> upgrade, BIOS (well, broken EFI, so legacy boot):
> 
> grub> set debug=all
> grub> blscfg
> 
> script/lexer.c:321: token 288 text [blscfg]
> script/lexer.c:321: token 259 text [
> ]
> script/lexer.c:321: token 0 text []
> error: ../../grub-core/commands/blscfg.c:959:variable 'boot' isn't set.

Can you please share your grub.cfg file? The boot variable should be set there.

Comment 40 Berend De Schouwer 2019-04-10 12:26:23 UTC
(In reply to Javier Martinez Canillas from comment #39)
> (In reply to Berend De Schouwer from comment #38)
> > (In reply to Javier Martinez Canillas from comment #28)
> > > what happen when you execute from the GRUB prompt?
> > > 
> > > grub> blscfg
> > 
> > For me (not the original reporter), currently (up-to-date), F29 -> F30
> > upgrade, BIOS (well, broken EFI, so legacy boot):
> > 
> > grub> set debug=all
> > grub> blscfg
> > 
> > script/lexer.c:321: token 288 text [blscfg]
> > script/lexer.c:321: token 259 text [
> > ]
> > script/lexer.c:321: token 0 text []
> > error: ../../grub-core/commands/blscfg.c:959:variable 'boot' isn't set.
> 
> Can you please share your grub.cfg file? The boot variable should be set
> there.

Sorry, will have to redo.

If I set BLS, I get "set boot='hd0,msdos3'", but I get an empty grub.cfg, so I need to boot from a rescue USB stick and fix everything.

If I don't set BLS, I don't get "set boot=..."

I ran the debug with a non-BLS grub.cfg, and manually loaded the module to save time.

Comment 41 Javier Martinez Canillas 2019-04-10 12:32:07 UTC
(In reply to Berend De Schouwer from comment #40)
> (In reply to Javier Martinez Canillas from comment #39)
> > (In reply to Berend De Schouwer from comment #38)
> > > (In reply to Javier Martinez Canillas from comment #28)
> > > > what happen when you execute from the GRUB prompt?
> > > > 
> > > > grub> blscfg
> > > 
> > > For me (not the original reporter), currently (up-to-date), F29 -> F30
> > > upgrade, BIOS (well, broken EFI, so legacy boot):
> > > 
> > > grub> set debug=all
> > > grub> blscfg
> > > 
> > > script/lexer.c:321: token 288 text [blscfg]
> > > script/lexer.c:321: token 259 text [
> > > ]
> > > script/lexer.c:321: token 0 text []
> > > error: ../../grub-core/commands/blscfg.c:959:variable 'boot' isn't set.
> > 
> > Can you please share your grub.cfg file? The boot variable should be set
> > there.
> 
> Sorry, will have to redo.
> 

OK, that's not necessary. I was mostly interested in the boot variable there.

> If I set BLS, I get "set boot='hd0,msdos3'", but I get an empty grub.cfg, so
> I need to boot from a rescue USB stick and fix everything.
>

A grub.cfg without menu entries is not a bug; the blscfg command failing to populate the GRUB menu from the BLS files is, though.
 
> If I don't set BLS, I don't get "set boot=..."
>
> I ran the debug with a non-BLS grub.cfg, and manually loaded the module to
> save time.

Right, if you generated your grub.cfg with BLS disabled then I think that the blscfg command won't work, unless you set the boot variable.
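
A quick way to tell which kind of grub.cfg is in place is to look for the two pieces discussed here. The sketch below uses a hypothetical function name (cfg_bls_status) and assumes that a BLS-enabled config contains a "set boot=..." line and a bare "blscfg" command line; the exact layout of the generated file may differ between grub2 versions.

```shell
#!/bin/sh
# Hypothetical helper: report whether a grub.cfg sets the boot variable
# and whether it calls the blscfg command.
cfg_bls_status() {
    cfg=$1
    if grep -Eq '^[[:space:]]*set boot=' "$cfg"; then
        echo "boot variable: set"
    else
        echo "boot variable: not set"
    fi
    # "insmod blscfg" alone does not count; look for the command itself
    if grep -Eq '^[[:space:]]*blscfg([[:space:]]|$)' "$cfg"; then
        echo "blscfg call: present"
    else
        echo "blscfg call: absent"
    fi
}

# Example invocation:
#   cfg_bls_status /boot/grub2/grub.cfg
```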

Comment 42 Berend De Schouwer 2019-04-10 12:43:55 UTC
(In reply to Javier Martinez Canillas from comment #41)
> (In reply to Berend De Schouwer from comment #40)
> > If I set BLS, I get "set boot='hd0,msdos3'", but I get an empty grub.cfg, so
> > I need to boot from a rescue USB stick and fix everything.
> >
> 
> A grub.cfg without menu entries is not a bug, the blscfg command failing to
> populate the grub menu from the BLS files is though.

That is what's happening, which is why I need to redo the test and get proper debug output.

F30 upgrade switched to BLS, and grub.cfg and the grub menu are empty even though /boot/loader/entries is populated.

Switched BLS off using a USB stick to be able to boot.


Can I just 'grub> set boot...' or do I need to wipe my grub.cfg, get debug, and then recover from a USB stick?

Comment 43 Javier Martinez Canillas 2019-04-10 12:47:19 UTC
(In reply to Berend De Schouwer from comment #42)
> (In reply to Javier Martinez Canillas from comment #41)
> > (In reply to Berend De Schouwer from comment #40)
> > > If I set BLS, I get "set boot='hd0,msdos3'", but I get an empty grub.cfg, so
> > > I need to boot from a rescue USB stick and fix everything.
> > >
> > 
> > A grub.cfg without menu entries is not a bug, the blscfg command failing to
> > populate the grub menu from the BLS files is though.
> 
> That is what's happening, which is why I need to redo the test and get
> proper debug output.
>

OK, you shouldn't need to change your grub.cfg to test this.
 
> F30 upgrade switched to BLS, and grub.cfg and the grub menu are empty even
> though /boot/loader/entries is populated.
> 
> Switched BLS off using a USB stick to be able to boot.
> 
> 
> Can I just 'grub> set boot...' or do I need to wipe my grub.cfg, get debug,
> and then recover from a USB stick?

Yes, setting it from the grub prompt should work. Something like the following:

grub> set boot='hd0,msdos3'
grub> set debug=blscfg
grub> insmod blscfg
grub> blscfg

By the way, what was the Fedora release originally installed on this machine?

Comment 44 Berend De Schouwer 2019-04-10 13:00:39 UTC
(In reply to Javier Martinez Canillas from comment #43)
> By the way, what was the Fedora release originally installed on this machine?

Can't remember.  Guessing F22 (educated guess; 'rpm -qa | grep fc22' has results)
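
The educated guess above can be made a little more systematic. A minimal sketch, assuming leftover packages still carry the dist tag of the release they came from (the function name oldest_fc_tag is made up for illustration; leftover packages only hint at the original release, they don't prove it):

```shell
#!/bin/sh
# Print the lowest Fedora dist tag number (the NN in ".fcNN") found in a
# list of package names read from stdin. Prints nothing if no tag is found.
# oldest_fc_tag is a hypothetical helper, not part of any Fedora package.
oldest_fc_tag() {
    grep -o '\.fc[0-9][0-9]*' | sed 's/^\.fc//' | sort -n | head -n 1
}

# Usage on a live system:
#   rpm -qa | oldest_fc_tag
```

A result of 22 would match the guess above, but packages can also survive from a release that was never the install base.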

Comment 45 Berend De Schouwer 2019-04-12 08:57:01 UTC
I can't replicate it anymore.  Running with /etc/default/grub set to BLS all works as expected now, loading all the kernels from /boot/loader/entries.

So, for completeness, I'm adding what I think happened:
- have F22, upgrade, upgrade, upgrade...
- have F29
- have xen installed, but not used (I sometimes run vms for dev)
- upgrade F30
- realise I only have xen kernels
- xen doesn't properly boot (it's working, but no GUI login)
- systemctl change target to text login
- remove xen
- now I have no menuentries in grub, and can't boot
- boot from USB stick
- re-install grub
- re-run grub-install
- re-run grub-mkconfig
- see BLS, google it
- find /boot/loader/entries, so that's not the problem
- disable BLS
- re-run grub-mkconfig
- reboot
- have kernels, boot to text
- systemctl change target to gui login
- everything works

Comment 46 Javier Martinez Canillas 2019-04-12 12:59:29 UTC
(In reply to Berend De Schouwer from comment #45)
> I can't replicate it anymore.  Running with /etc/default/grub set to BLS all
> works as expected now, loading all the kernels from /boot/loader/entries.
> 
> So, for completion, I'm adding what I think happened:
> - have F22, upgrade, upgrade, upgrade...
> - have F29
> - have xen installed, but not used (I sometimes run vms for dev)
> - upgrade F30
> - realise I only have xen kernels
> - xen doesn't properly boot (it's working, but no GUI login)
> - systemctl change target to text login
> - remove xen
> - now I have no menuentries in grub, and can't boot
> - boot from USB stick
> - re-install grub
> - re-run grub-install

I think this is what fixed your issue, since the problem seems to be that the blscfg module isn't compatible with very old versions of the GRUB core image, and the core image is never updated on a grub2 package update.

When you ran grub2-install, the GRUB core and the modules under /boot/grub2/i386-pc/ were correctly updated, so BLS works in that case.

Comment 47 Adam Williamson 2019-04-12 14:55:15 UTC
Javier: so basically, should we be advising people who upgrade systems that were initially installed as older Fedora releases to run 'grub-install' before upgrading? And write up some steps they can follow if they don't do that and wind up with a non-booting system?

Would it be possible to write some sort of test program to see if the installed grub core is compatible with blscfg?

Comment 48 Javier Martinez Canillas 2019-04-12 16:17:38 UTC
(In reply to Adam Williamson from comment #47)
> Javier: so basically, should we be advising people who upgrade systems that
> were initially installed as older Fedora releases to run 'grub-install'
> before upgrading? And write up some steps they can follow if they don't do
> that and wind up with a non-booting system?
>

Yes, the easiest way to get the system working again is to use the older GRUB config file that was saved during the grub2 package upgrade:

grub> configfile /grub2/grub.cfg.rpmsave
 
> Would it be possible to write some sort of test program to see if the
> installed grub core is compatible with blscfg?

I don't think that there's information about the GRUB version, and even if there were, there hasn't been a GRUB upstream release for at least two years, so it's hard to tell. We could try to run grub2-install on upgrade, something like the following:

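ARCH="$(uname -m)"
# Only consider legacy BIOS installs (no EFI firmware directory) on x86_64.
if [ ! -d /sys/firmware/efi ] && [ "$ARCH" = "x86_64" ]; then
    # Strip the partition number to get the disk that holds /boot
    # (assumes sdX-style names; e.g. nvme0n1p1 would be mangled).
    DISK="$(grub2-probe --target=device /boot/ | sed -e 's/[0-9].*//g')"
    # Reinstall only if the first 512 bytes (MBR) already contain GRUB.
    if dd status=none if="$DISK" bs=1 count=512 | grep -qao GRUB; then
        grub2-install "$DISK"
    fi
fi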

But I don't know if that would cause more harm than good, since we have never updated the GRUB core on upgrade. It means we could replace the GRUB installed by another OS, if the user has a multi-boot setup and uses the GRUB from another distro to boot.
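
The detection half of the snippet above can be separated from the reinstall, so that it can be run without changing anything. A minimal sketch (the function name mbr_has_grub is hypothetical). It only tells whether the first sector carries the GRUB string. It says nothing about whether that GRUB belongs to Fedora or is new enough for blscfg:

```shell
#!/bin/sh
# Return 0 if the first 512 bytes of the given device or image file
# contain the string "GRUB", 1 otherwise. Read-only: nothing is written.
# mbr_has_grub is a hypothetical helper name used for illustration.
mbr_has_grub() {
    dd if="$1" bs=512 count=1 2>/dev/null | grep -qa GRUB
}

# Usage as root, with a disk name taken from this report:
#   mbr_has_grub /dev/sda && echo "GRUB found in the MBR of /dev/sda"
```

Keeping the check read-only avoids the multi-boot risk described above, at the cost of not fixing anything by itself.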

Comment 49 Adam Williamson 2019-04-12 16:53:04 UTC
"I don't think that there's information about the GRUB version, and also even if it were the wasn't been a GRUB upstream release for at least two years so it's hard to tell."

I was thinking more along the lines of a program that actually looks for or tries to use the functionality of grub that blscfg needs (but doesn't actually change anything). If that's not possible, then never mind :)

Comment 50 Loïc Yhuel 2019-04-13 22:44:07 UTC
In my case, the errors were :
error: symbol 'grub_qsort' not found.
error: can't find command 'blscfg'
error: invalid version

So the installed grub needs at least 0110-Add-quicksort-implementation.patch, added in 2.02-26 (Feb 28 2018).
That means for legacy boot :
 - no upstream grub would work
 - everyone who originally installed F27 or older, and didn't run grub-install since will have the issue

If there is a way to locate the grub core on the disk, perhaps the upgrade script could check for the symbol (a grep would probably be enough).

Comment 51 Javier Martinez Canillas 2019-04-13 23:24:34 UTC
(In reply to Loïc Yhuel from comment #50)
> In my case, the errors were :
> error: symbol 'grub_qsort' not found.
> error: can't find command 'blscfg'
> error: invalid version
> 
> So the installed grub needs at least
> 0110-Add-quicksort-implementation.patch, added in 2.02-26 (Feb 28 2018).

Do you still have the blscfg module that caused that issue? If so, could you please check whether it is the same as the one installed by the latest grub2-pc-modules package in /usr/lib/grub/i386-pc?

Because I found the problem with that dependency before so I moved the quicksort implementation from the core to the blscfg module (that was the only user) in patch:

0246-Move-quicksort-function-from-kernel.exec-to-the-blsc.patch

And then the grub_qsort function was removed and instead the entries are stored in a sorted linked list, since patch:

0276-blscfg-store-the-BLS-entries-in-a-sorted-linked-list.patch

So definitely the latest version of the blscfg doesn't use grub_qsort and your error seems to imply that the module wasn't correctly updated on the f30 upgrade. A bug I noticed after thinking about this is that the switch to BLS is done in the grub2-tools %post scriptlet (to make sure that the latest version of the grub2-switch-to-blscfg script is used, which is part of the grub2-tools package).

But the blscfg module is part of the grub2-pc-modules package. So the blscfg module should be copied not only by the grub2-switch-to-blscfg script (to make sure that the latest one is used if a user explicitly switches to BLS), but also in the %post scriptlet of the grub2-pc-modules package, to make sure that the latest blscfg module is really used. Currently that won't be the case if the grub2-tools package is installed before the grub2-pc-modules package.

> That means for legacy boot :
>  - no upstream grub would work

Yes, but that's already the case with other changes in the Fedora grub2.

>  - everyone who originally installed F27 or older, and didn't run
> grub-install since will have the issue
>

That shouldn't be the case, I've been testing older Fedora releases and it seems to work at least up until F24. It did fail when I tested on F20 though, I need to test the releases between F24 and F20 to see how far we can go.
 
> If there is a way to locate the grub core on the disk, perhaps the upgrade

Yes, it's copied to /boot/grub2/core.img, even when it's stored in the sectors between the MBR and the start of the first partition.

> script could check for the symbol (a grep would probably be enough).

As mentioned, the grub_qsort symbol isn't used anymore by the blscfg module. An old GRUB may have that symbol if it was installed from a release where it was used (e.g. F27).

Comment 52 Matt Fagnani 2019-04-13 23:42:06 UTC
I ran dnf system-upgrade from F29 to F30 on a computer with a BIOS using the i686 rpms starting April 6. After the system upgrade completed and the system was restarted, I briefly saw an error ending increment.mod not found before grub started. grub had no Fedora menu entries. I ran most of the commands from 5.0.6 kernel entry for F29 in grub2/grub.cfg.rpmsave with the version switched to F30, and the kernel and system booted. When I first ran sudo grub2-mkconfig -o /boot/grub2/grub.cfg in konsole in Plasma, no Fedora kernel entries were added. /grub2/i386-pc/increment.mod didn't exist. /boot/grub2/i386-pc/blscfg.mod and other files in that directory had last modified dates of Sept 2016 which was around when I first installed F24 on the drive. grub2-common-1:2.02-75.fc30.noarch and grubby-8.40-30.fc30.i686 were the versions during the system upgrade.

I changed GRUB_ENABLE_BLSCFG to false in /etc/default/grub as suggested in comment 36. sudo grub2-mkconfig -o /boot/grub2/grub.cfg added the Fedora kernel entries correctly. Thanks for the workaround.
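
The workaround above can be sketched as a small helper. This assumes GNU sed (for -i), as shipped in Fedora. The function name set_blscfg is made up for illustration:

```shell
#!/bin/sh
# Set GRUB_ENABLE_BLSCFG to the given value ("true" or "false") in a
# defaults file, replacing an existing assignment or appending one.
# set_blscfg is a hypothetical helper, not a tool shipped by grub2.
set_blscfg() {
    file=$1 value=$2
    if grep -q '^GRUB_ENABLE_BLSCFG=' "$file"; then
        sed -i "s/^GRUB_ENABLE_BLSCFG=.*/GRUB_ENABLE_BLSCFG=$value/" "$file"
    else
        printf 'GRUB_ENABLE_BLSCFG=%s\n' "$value" >> "$file"
    fi
}

# Usage as root on an affected BIOS system (paths as given in this report):
#   set_blscfg /etc/default/grub false
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

This only switches the config generation back to classic menu entries. It does not update the GRUB core or the modules under /boot.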

After the F30 kernel was selected in grub by pressing enter or automatically after 5 seconds, my system has rebooted after 1-2 seconds at least 10 times out of more than 50 boots in the last week. No messages are usually shown even when I removed rhgb quiet from the kernel command line in grub2 before booting. journalctl doesn't show anything on the boots that reboot immediately. I'm unsure if these reboots are related to this grub issue, but I hadn't seen them in F29 and earlier.

Comment 53 Loïc Yhuel 2019-04-14 00:39:38 UTC
(In reply to Javier Martinez Canillas from comment #51)
> (In reply to Loïc Yhuel from comment #50)
> > In my case, the errors were :
> > error: symbol 'grub_qsort' not found.
> > error: can't find command 'blscfg'
> > error: invalid version
> > 
> > So the installed grub needs at least
> > 0110-Add-quicksort-implementation.patch, added in 2.02-26 (Feb 28 2018).
> 
> Do you still have the blscfg module that caused that issue? If that's the
> case, could you please check if is the same than the one installed by the
> latest grub2-pc-modules package in /usr/lib/grub/i386-pc?
No, I did a grub2-install to fix the issue, so now it's the same as in /usr/lib/grub/i386-pc.

> So definitely the latest version of the blscfg doesn't use grub_qsort and
> your error seems to imply that the module wasn't correctly updated on the
> f30 upgrade. A bug I noticed after thinking about this is that the switch to
> BLS is done in the grub2-tools %post scriptlet (to make sure that the latest
> version of the grub2-switch-to-blscfg script is used, which is part of the
> grub2-tools package).
> 
> But the blscfg module is part of the grub2-pc-modules package, so we should
> not only copy the blscfg module in the grub2-switch-to-blscfg to make sure
> that it's using the latest one if a user explicitly switch to BLS. But also
> in the %post scriptlet of the grub2-pc-modules package, to make sure that
> the latest blscfg module is really used. This won't be the case if the
> grub2-tools package is installed before the grub2-pc-modules package.
> 
It seems you found the root cause : from rpm install dates, grub2-pc-modules was installed after grub2-tools during the F30 upgrade, so grub2-switch-to-blscfg would have copied the blscfg.mod from the previously installed grub2-pc-modules-1:2.02-62.fc29.noarch.
The 0246-Move-quicksort-function-from-kernel.exec-to-the-blsc.patch is in 2.02-60.fc30 and later, but not in 2.02-62.fc29.
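
The comparison asked for in comment 51 can be sketched as follows, for anyone who still has the old module around. The function name module_is_stale is hypothetical. The paths in the usage line are the ones named in this report:

```shell
#!/bin/sh
# Return 0 if the module copy that GRUB loads at boot differs from (or is
# missing relative to) the copy shipped by the installed package.
# Return 2 if the packaged copy does not exist, 1 if both are identical.
# module_is_stale is a hypothetical helper name used for illustration.
module_is_stale() {
    installed=$1 packaged=$2
    [ -f "$packaged" ] || return 2    # nothing to compare against
    ! cmp -s "$installed" "$packaged"
}

# Usage:
#   module_is_stale /boot/grub2/i386-pc/blscfg.mod /usr/lib/grub/i386-pc/blscfg.mod \
#       && echo "blscfg.mod under /boot is out of date"
```

A byte comparison is enough here because the copy under /boot is expected to be identical to the packaged file once it has been refreshed.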

Comment 54 Matt Fagnani 2019-04-14 22:24:05 UTC
When I selected the 5.0.7 kernel in grub once today, I saw "fatal error: token too large, exceeds YYLMAX" 20+ times with More at the bottom of the screen. More of the same errors were shown when I scrolled down. I restarted with control-alt-del, and the system started normally the next boot. I reported the reboots and errors I saw at https://bugzilla.redhat.com/show_bug.cgi?id=1699681 since they might be a different problem.

Comment 55 Javier Martinez Canillas 2019-04-15 16:28:51 UTC
*** Bug 1695701 has been marked as a duplicate of this bug. ***

Comment 56 Javier Martinez Canillas 2019-04-15 16:31:37 UTC
*** Bug 1698227 has been marked as a duplicate of this bug. ***

Comment 57 Fedora Update System 2019-04-15 18:31:10 UTC
grub2-2.02-76.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-8c2f69d44b

Comment 58 Fedora Update System 2019-04-16 01:35:38 UTC
grub2-2.02-76.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-8c2f69d44b

Comment 59 Steven Haigh 2019-04-27 07:37:35 UTC
I've just done a heap of updates from F29 -> F30 as Xen DomU guests.

If GRUB_ENABLE_BLSCFG=true is set in /etc/default/grub, then none of the systems can boot after the upgrade.

Setting GRUB_ENABLE_BLSCFG=false restores the boot menu - however GRUB_DEFAULT=0 does not seem to correctly apply when generating the grub.cfg.

If there are two boot entries - 1) the kernel, 2) the rescue image, even with GRUB_DEFAULT=0, the second entry (the rescue image) becomes the default boot target.

This seems somewhat related to this issue.

Comment 60 Steven Haigh 2019-04-27 07:44:50 UTC
Also, for the record, these Xen DomU's use pygrub as a bootloader.

As such, BLSCFG will probably never work.

There should be a way to detect if the running system is a Xen guest - and if so, to not activate BLSCFG as a default.
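
One possible detection, as a rough sketch: the kernel exposes the hypervisor type in /sys/hypervisor/type, which reads "xen" when running on Xen. The function name is_xen_guest is made up, and this check alone does not tell a DomU apart from Dom0, nor does it tell whether pygrub is the bootloader in use:

```shell
#!/bin/sh
# Return 0 if the hypervisor type file reports "xen".
# The path is a parameter so the check can be exercised off-target;
# it defaults to the sysfs file exposed by the kernel.
# is_xen_guest is a hypothetical helper name used for illustration.
is_xen_guest() {
    type_file=${1:-/sys/hypervisor/type}
    [ -r "$type_file" ] && [ "$(cat "$type_file")" = "xen" ]
}

# Usage:
#   is_xen_guest && echo "running on Xen, BLS may not be usable with pygrub"
```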

Comment 61 Adam Williamson 2019-04-27 16:57:59 UTC
Thanks for the report, Steven! Can you please file a new bug for that? As it has a fairly clear scope and is not the same as the initial bug reported here, a new report would be best. Thanks again.

Comment 62 Steven Haigh 2019-04-27 17:14:39 UTC
Thanks Adam,

I wasn't sure if the root cause was the same - logged as #1703700

Comment 63 Fedora Update System 2019-04-27 21:25:02 UTC
grub2-2.02-76.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 64 Peter Bieringer 2019-05-01 08:45:49 UTC
I ran into same issue upgrading an older Dell Laptop (i386) from 29 to 30. No Linux entries in grub after upgrade.

installed: grub2-pc-1:2.02-78.fc30

Booting from DVD in rescue mode worked fine, but grub2-mkconfig is not showing Linux entries.

"blscfg" on grub commandline is not working ("command not found")

"configfile" hint worked to boot at least the system

switching in /etc/default/grub to
GRUB_ENABLE_BLSCFG=false (the setting looks to have been added during the upgrade 29 -> 30)

will reenable list of Linux entries

=> looks like the issue is not really fixed completely

Comment 65 Adam Williamson 2019-05-01 18:13:32 UTC
Did you fully update your F29 install before you upgraded to F30?

Comment 66 Peter Bieringer 2019-05-01 20:34:02 UTC
Yes, F29 update -> reboot -> start system-upgrade to F30 -> finally system booted into (dual-boot) Windows (and missed all the Linux entries) - but it also turned out that the XFS of the root partition crashed in addition or before (I assume in more or less last steps of the system-upgrade), had to repair in advance to start rescue mode successful.

Nevertheless, XFS repaired, rpm -Va is not showing unusual things, but ..BLSCFG=false is required to get Linux entries back in grub.conf.

Just note on a similar laptop having 64-bit installed system-upgrade worked fine, ..BLSCFG=true is not causing issues.

Comment 67 Javier Martinez Canillas 2019-05-02 07:21:53 UTC
(In reply to Peter Bieringer from comment #66)
> Yes, F29 update -> reboot -> start system-upgrade to F30 -> finally system
> booted into (dual-boot) Windows (and missed all the Linux entries) - but it
> also turned out that the XFS of the root partition crashed in addition or
> before (I assume in more or less last steps of the system-upgrade), had to
> repair in advance to start rescue mode successful.
> 
> Nevertheless, XFS repaired, rpm -Va is not showing unusual things, but
> ..BLSCFG=false is required to get Linux entries back in grub.conf.
> 
> Just note on a similar laptop having 64-bit installed system-upgrade worked
> fine, ..BLSCFG=true is not causing issues.

What was the Fedora version originally installed on the machine? Also, did you try what's mentioned in https://fedoraproject.org/wiki/Common_F30_bugs#GRUB_boot_menu_is_not_populated_after_an_upgrade ?

Comment 68 Matt Prahl 2019-05-03 00:55:53 UTC
Hello,
I just upgraded from a fully updated Fedora 29 to Fedora 30 today, and had the same issue described here. No Fedora entries were populated in the grub menu. Reverting to using grub.cfg.rpmsave did allow me to boot with the latest Fedora 29 kernel. For the time being, I manually added a grub entry to use the Fedora 30 kernel.

My original Fedora installation was Fedora 26. The one difference between my setup and a standard install is that my UEFI boot entry is called "Fedora Work" and located in /boot/efi/EFI/fedora-work. My `/etc/grub2-efi.cfg` symlink properly points to `../boot/efi/EFI/fedora-work/grub.cfg` though.

Any suggestions on what to do to get BLS working? I've read to not run `grub2-install` if UEFI is used, but I'd like to get confirmation here before I try anything.

Thank you for the help.

Comment 69 Peter Bieringer 2019-05-03 05:38:53 UTC
First time Fedora was installed on that particular system is around February 2016 (using find / -ctime +1000), so I assume F23.

Updated grub in /dev/sda and reenable ...BLSCFG=true -> grub2-mkconfig is still showing Linux entries.

So the root cause can be, as described in the FAQ, a missing update of the grub core => potentially such issues should be caught by the upgrade plugin before the first reboot (e.g. at the latest in the step where a new grub config is created but no Linux entries are visible).

Comment 70 Artem S. Tashkinov 2019-05-03 19:14:28 UTC
*** Bug 1706177 has been marked as a duplicate of this bug. ***

Comment 71 Artem S. Tashkinov 2019-05-03 19:17:37 UTC
I also ran into this issue after upgrading from F29.

How on Earth did this bug slip through your QA/QC and it STILL NOT resolved?

Hundreds of people upgrading from F29 have been and will be affected and they will have no way of knowing why their kernels are missing from grub2-mkconfig output. There's nothing in F30 release notes either.

It's a gigantic f*** up to say the least.

Comment 72 Artem S. Tashkinov 2019-05-03 19:32:51 UTC
While we are at it, where's up to date documentation for grub2?

Even https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/ barely lists any /etc/default/grub options, has major omissions and contains outdated information.

For instance:

> Setting Up Users and Password Protection, Specifying Menu Entries
> To specify a superuser, add the following lines in the /etc/grub.d/01_users file, where john is the name of the user designated as the superuser, and johnspassword is the superuser’s password:

This is wrong!

Fedora uses ${prefix}/user.cfg which is placed God knows where in God knows what format.

Seriously, guys, does anyone in RedHat/Fedora maintain grub2 package? It all looks like a giant clusterf*ck.

Comment 73 Artem S. Tashkinov 2019-05-03 19:41:14 UTC
Oh, there's this completely useless /etc/grub.d/README file.

Then there's this useless /usr/share/doc/grub2-common/grub.html file which describes how grub2 works in general (of course, without tons of Fedora patches).

Could anyone please tell me how to configure grub2 in Fedora?

I want to know everything about:

/etc/default/grub

/etc/grub.d/*

Also, which grub2 packages are required for which systems (BIOS, EFI, etc).

Also, what's the proper layout for installing Grub2 in Fedora - I mean which directories and files are essential and how they can be populated.

Also, how default entries can be set in a running system.

Also, how default entries can be set on boot (and if there's a shortcut for that).

Also, how Grub2 in Fedora can be themed.

Comment 74 Panu Matilainen 2019-05-06 08:04:43 UTC
/me too got into a "fun" situation with this, my home desktop wouldn't even get to a grub prompt, it just got into an endless and fast reboot loop at that point. Never seen anything quite like that. I had to grab a rescue image and boot from usb, only the F30 rescue image seems to be broken and would not find any of the Linux filesystems on the box. Fortunately F29 rescue image worked without a hitch, and then found this bug and the workaround of GRUB_ENABLE_BLSCFG=false to get it back up and running.

To others running into this, https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/#sec-Reinstalling_GRUB_2 documents the steps of reinstalling grub on BIOS and UEFI systems. It'd seem to me that such differences should be hidden away in the tooling, and at least grub2-install should hint at the proper sequence for UEFI instead of failing out with mysterious missing-files error.

After a good deal of googling and fiddling around, reinstalling grub and running grub2-switch-to-blscfg the thing boots again, now based on those BLS entries, I guess. But what a mess.

Comment 75 Mike Gahagan 2019-05-06 16:17:00 UTC
I ran into this issue as well with a slightly different system. All the needed menu entries were present however choosing any entry would just cause the system to reboot again. I was able to recover the system by running grub2-install from the F30 live image (https://docs.pagure.org/docs-fedora/the-grub2-bootloader.html See "Additional Scenerios") 

I don't recall what Fedora release I installed originally on my boot disk, but it was configured as BIOS/CSM. I replaced the motherboard in the system due to a hardware failure several months ago, the new mb supports UEFI and I had to make sure the live image booted in BIOS mode rather than UEFI in order for grub2-install to work properly.

Comment 76 Alan Hamilton 2019-05-07 02:36:53 UTC
This one bit me too on a T500 Thinkpad (BIOS). It's a dual boot with Windows, but the boot menu only showed the Windows entry after upgrading from 29->30. I had to manually use the grub shell to select the kernel and ramdisk to get it to boot.

The fix, as suggested above, was to run grub2-install /dev/sda

Comment 77 Miroslav Suchý 2019-05-07 08:04:33 UTC
*** Bug 1706094 has been marked as a duplicate of this bug. ***

Comment 78 Dimitrios Apostolou 2019-05-08 21:45:02 UTC
#metoo

But the fix was not as straightforward, because I have GRUB2 installed on the *boot sector* of the /boot partition, instead of the MBR of the disk, and the windows boot loader chain loads it.

The fix involved 
- running grub2-install /dev/sdaX to re-install GRUB2 to the boot sector and copy the modules to /boot
- copying the boot sector to a file `dd if=/dev/sdaX of=grub2-bls.img bs=512 count=1` and copying that file to Windows C:\grub2-bls.img
- booting to windows and configuring the Windows bootloader to chain-load that file using BCDEDIT.exe utility (see guide here https://www.iceflatline.com/2009/09/how-to-dual-boot-windows-7-and-linux-using-bcdedit/)
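
The boot sector copy in the steps above can be sketched with a sanity check added before the file is handed to the Windows bootloader. Both function names are made up for illustration. The check only looks for the 0x55 0xAA signature in the last two bytes of the sector and does not prove that the sector contains GRUB:

```shell
#!/bin/sh
# Copy the first 512-byte sector of a partition (or image) to a file.
# save_boot_sector and has_boot_signature are hypothetical helper names.
save_boot_sector() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Return 0 if bytes 510 and 511 of the file are 0x55 0xAA, the usual
# boot sector signature.
has_boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Usage as root (sdaX stands for the /boot partition, as in the steps above):
#   save_boot_sector /dev/sdaX grub2-bls.img && has_boot_signature grub2-bls.img
```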

Quite a bit of unscheduled downtime, I had to remember what I had done 10 years ago when I installed Fedora to that old laptop.

Comment 79 Matt Prahl 2019-05-09 13:03:30 UTC
(In reply to Matt Prahl from comment #68)
> Hello,
> I just upgraded from a fully updated Fedora 29 to Fedora 30 today, and had
> the same issue described here. No Fedora entries were populated in the grub
> menu. Reverting to using grub.cfg.rpmsave did allow me to boot with the
> latest Fedora 29 kernel. For the time being, I manually added a grub entry
> to use the Fedora 30 kernel.
> 
> My original Fedora installation was Fedora 26. The one difference between my
> setup and a standard install is that my UEFI boot entry is called "Fedora
> Work" and located in /boot/efi/EFI/fedora-work. My `/etc/grub2-efi.cfg`
> symlink properly points to `../boot/efi/EFI/fedora-work/grub.cfg` though.
> 
> Any suggestions on what to do to get BLS working? I've read to not run
> `grub2-install` if UEFI is used, but I'd like to get confirmation here
> before I try anything.
> 
> Thank you for the help.

I tried reinstalling grub with the following instructions:
https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/#sec-Reinstalling_GRUB_2

I also ran `grub2-switch-to-blscfg` and the same issue returned.

I also verified that "blscfg" returns "command not found" from the Grub shell.

I don't know if this will help, but this is my Fedora entry from efibootmgr:
Boot0008* Fedora Work	HD(1,GPT,cae0a6d3-b7f7-4c63-8472-f4f7b830eb57,0x800,0x64000)/File(\EFI\fedora-work\shim.efi)

Should it be `shimx64.efi` instead?

Any help would be appreciated because every time there's a kernel update, I need to manually edit my grub configuration since `grub2-mkconfig` uses the non-EFI executables ("initrd" instead of "initrdefi"), and thus provides a broken configuration.

Comment 80 Javier Martinez Canillas 2019-05-09 14:11:33 UTC
(In reply to Matt Prahl from comment #79)
> (In reply to Matt Prahl from comment #68)
> > Hello,
> > I just upgraded from a fully updated Fedora 29 to Fedora 30 today, and had
> > the same issue described here. No Fedora entries were populated in the grub
> > menu. Reverting to using grub.cfg.rpmsave did allow me to boot with the
> > latest Fedora 29 kernel. For the time being, I manually added a grub entry
> > to use the Fedora 30 kernel.
> > 
> > My original Fedora installation was Fedora 26. The one difference between my
> > setup and a standard install is that my UEFI boot entry is called "Fedora
> > Work" and located in /boot/efi/EFI/fedora-work. My `/etc/grub2-efi.cfg`

Since you have a non-default setup, you are probably using a very old GRUB EFI binary, since the grub2-efi package installs it in /boot/efi/EFI/fedora/grubx64.efi.

If you want to keep using that device path in your Boot Entry, then you need to copy /boot/efi/EFI/fedora/grubx64.efi to /boot/efi/EFI/fedora-work/grubx64.efi (if that's the path you have in your Boot Entry).

Since you are using a Boot Entry with a different device path than the default, you need to make sure that the GRUB you are using is the latest one. 

> > symlink properly points to `../boot/efi/EFI/fedora-work/grub.cfg` though.
> > 
> > Any suggestions on what to do to get BLS working? I've read to not run
> > `grub2-install` if UEFI is used, but I'd like to get confirmation here
> > before I try anything.
> > 
> > Thank you for the help.
> 
> I tried reinstalling grub with the following instructions:
> https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-
> guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/
> #sec-Reinstalling_GRUB_2
> 
> I also ran `grub2-switch-to-blscfg` and the same issue returned.
> 
> I also verified that "blscfg" returns "command not found" from the Grub
> shell.
> 

This is probably because you are using an old GRUB as mentioned, since it was never updated due to not using the default path in the ESP.

> I don't know if this will help, but this is my Fedora entry from efibootmgr:
> Boot0008* Fedora Work
> HD(1,GPT,cae0a6d3-b7f7-4c63-8472-f4f7b830eb57,0x800,0x64000)/
> File(\EFI\fedora-work\shim.efi)
> 
> Should it be `shimx64.efi` instead?
> 
> Any help would be appreciated because every time there's a kernel update, I
> need to manually edit my grub configuration since `grub2-mkconfig` uses the
> non-EFI executables ("initrd" instead of "initrdefi"), and thus provides a
> broken configuration.

In the latest GRUB versions, the linux and initrd commands also work for EFI. There's no need to use the {linux,initrd}efi commands anymore.

Comment 81 Matt Prahl 2019-05-09 15:32:28 UTC
(In reply to Javier Martinez Canillas from comment #80)
> (In reply to Matt Prahl from comment #79)
> > (In reply to Matt Prahl from comment #68)
> > > Hello,
> > > I just upgraded from a fully updated Fedora 29 to Fedora 30 today, and had
> > > the same issue described here. No Fedora entries were populated in the grub
> > > menu. Reverting to using grub.cfg.rpmsave did allow me to boot with the
> > > latest Fedora 29 kernel. For the time being, I manually added a grub entry
> > > to use the Fedora 30 kernel.
> > > 
> > > My original Fedora installation was Fedora 26. The one difference between my
> > > setup and a standard install is that my UEFI boot entry is called "Fedora
> > > Work" and located in /boot/efi/EFI/fedora-work. My `/etc/grub2-efi.cfg`
> 
> Since you have a non-default setup, you are probably using a very old GRUB
> EFI binary since the gru2-efi package installs it in
> /boot/efi/EFI/fedora/grubx64.efi.
> 
> If you want to keep using that device path in your Boot Entry, then you need
> to copy /boot/efi/EFI/fedora/grubx64.efi to
> /boot/efi/EFI/fedora-work/grubx64.efi (if that's the path you have in your
> Boot Entry).
> 
> Since you are using a Boot Entry with a different device path than the
> default, you need to make sure that the GRUB you are using is the latest
> one. 
> 
> > > symlink properly points to `../boot/efi/EFI/fedora-work/grub.cfg` though.
> > > 
> > > Any suggestions on what to do to get BLS working? I've read to not run
> > > `grub2-install` if UEFI is used, but I'd like to get confirmation here
> > > before I try anything.
> > > 
> > > Thank you for the help.
> > 
> > I tried reinstalling grub with the following instructions:
> > https://docs.fedoraproject.org/en-US/fedora/rawhide/system-administrators-
> > guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/
> > #sec-Reinstalling_GRUB_2
> > 
> > I also ran `grub2-switch-to-blscfg` and the same issue returned.
> > 
> > I also verified that "blscfg" returns "command not found" from the Grub
> > shell.
> > 
> 
> This probably is because you are using an old GRUB as mentioned, since it
> was never updated due not using the default path in the ESP.
> 
> > I don't know if this will help, but this is my Fedora entry from efibootmgr:
> > Boot0008* Fedora Work
> > HD(1,GPT,cae0a6d3-b7f7-4c63-8472-f4f7b830eb57,0x800,0x64000)/
> > File(\EFI\fedora-work\shim.efi)
> > 
> > Should it be `shimx64.efi` instead?
> > 
> > Any help would be appreciated because every time there's a kernel update, I
> > need to manually edit my grub configuration since `grub2-mkconfig` uses the
> > non-EFI executables ("initrd" instead of "initrdefi"), and thus provides a
> > broken configuration.
> 
> In latest GRUB versions, the linux and initrd commands also work for EFI.
> There's no need to use the {linux,initrd}efi anymore.


Thanks so much. That worked.

Here were the commands that fixed it for me:
# Copy the latest grub/shim files
cp /boot/efi/EFI/fedora/*64* /boot/efi/EFI/fedora-work/
# Delete the "Fedora Work" EFI entry
efibootmgr -b 8 -B
# Recreate the "Fedora Work" EFI entry using the "shimx64.efi" loader instead of "shim.efi"
efibootmgr -c -b 8 -L "Fedora Work" -l "/EFI/fedora-work/shimx64.efi" -d /dev/nvme0n1p2
reboot

Comment 82 Chris Murphy 2019-05-10 01:39:34 UTC
(In reply to Dimitrios Apostolou from comment #78)
> #metoo
> 
> But the fix was not as straightforward, because I have GRUB2 installed on
> the *boot sector* of the /boot partition, 

This is explicitly not supported by Fedora anymore, not by the installer, and even upstream GRUB doesn't like it or recommend it. It's just untenable for Fedora to even remotely suggest we can handle this case when upstreams have given up on it.

It's also an example of a flawed policy, of not always reinstalling GRUB on BIOS during major version upgrades. That policy is there to avoid stepping on custom setups, just like Dimitrios has. But as a consequence of not stepping on his setup, we end up allowing the other 95% of Fedora users to have an installed bootloader that gets stale over time because it's never updated. There are way more people potentially hurt by stale bootloaders, than hurt by intentionally stomping on their preferred bootloader (that isn't GRUB).

This conversation is ongoing on devel@ in "Upgrade to F30 gone wrong" because this bug format is really about *the bug*, not about providing support or discussing future plans.
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/IZJ6VH5T645IWEWZKHYDS4CQJFREKIGL/

(In reply to Artem S. Tashkinov from comment #71)
 
> How on Earth did this bug slip through your QA/QC and it STILL NOT resolved?

It didn't slip through. It was discovered during pre-release, and we really can't block on it because pre-release criteria for upgrades are predicated on clean installs of Fedora n-1 and n-2. And because the GRUB core in the MBR gap (or BIOSBoot) of a clean install of F28 and F29 doesn't exhibit this bug when upgraded to Fedora 30, the criterion is met. So then we discussed the risk of just doing grub2-install (only on BIOS GRUB installs) during major version upgrades to make sure the installed bootloader isn't getting stale, and that too comes with a variety of consequences. And testing for the various use cases is non-trivial and not really tenable, so far as we have discussed up to this point.

And finally, while it could be placed in more prominent fashion, it has been recommended for a very long time (decade?) that users read Common Bugs before they do upgrades, and if you had been lucky enough to read it, you'd see it was recommended to 'grub2-install' before getting started, and thereby avoid the bug.

> There's nothing in F30 release notes either.

The link to Common Bugs is buried in the Welcome>Feedback section of the release notes. I agree it should be more prominent and I've filed a ticket about it.
https://docs.fedoraproject.org/en-US/fedora/f30/release-notes/welcome/Feedback/

> It's a gigantic f*** up to say the least.

Sure, and it's been a long time coming, because really we haven't been aggressive enough in keeping BIOS GRUB up to date, and have been too accommodating (in my opinion) to custom layouts by not stepping on them. On BIOS, it's just not possible to compromise; you have to pick your poison. And hindsight being 20/20, my view is we should stomp on the old bootloader with what's current and supportable, and those who must have custom setups will just have to reinstall their custom setup on their own. Hence custom. Custom setups do have consequences. It's not about right and wrong.

Comment 83 Juha Luoma 2019-05-10 05:43:02 UTC
I hit this issue too with an old desktop originally installed with Fedora 1x. No dual boot, only a Fedora installation. All upgrades have been working fine except now from 29 to 30. The system ended up in a fast reboot loop. I had to capture video of the boot loop in order to be able to read the messages. The messages shown are:

-------------------------------
GRUB loading.
Welcome to GRUB!

error: file '/grub2/local/en.mo.gz' not found.
-------------------------------

I was able to recover by booting from Fedora 30 live USB and by following instructions from
https://docs.pagure.org/docs-fedora/the-grub2-bootloader.html under "Restoring the bootloader using the Live disk"

The problem could have been avoided if system-upgrade had checked for this critical dependency before allowing the upgrade reboot. It could have automatically reinstalled grub if the configuration looked basic, or in other cases refused to start the upgrade before the user had manually reinstalled grub.

Comment 84 Javier Martinez Canillas 2019-05-10 09:31:15 UTC
*** Bug 1708377 has been marked as a duplicate of this bug. ***

Comment 85 Dominique Brazziel 2019-05-10 17:21:56 UTC
This was not fixed for x86_64 architecture as of 2.02-78-FC30. The info for Fedora Updates I received today read like this:

  grub2-2.02-79.fc30
===============================================================================
  Update ID: FEDORA-2019-37fbde5e5b
       Type: enhancement
    Updated: 2019-05-09 20:04:16
Description: The current blscfg module is only compatible with GRUB core images installed by Fedora 21 or newer releases. Legacy BIOS users upgrading a system originally installed with Fedora 20 or earlier won't have the GRUB menu populated due this incompatibility.
           :
           : This update makes the blscfg module to work with prior Fedora releases up to Fedora 19.
   Severity: None

Also, reference https://bugzilla.redhat.com/show_bug.cgi?id=1693515

Comment 86 Chris Murphy 2019-05-10 17:54:38 UTC
(In reply to Dominique Brazziel from comment #85)
> This was not fixed for x86_64 architecture as of 2.02-78-FC30.

It is arguably not a bug because the version of GRUB that created the embedded core.img is different from the modules that will be loaded. That mismatch is not supported upstream, and there is no practical way for Fedora to support it. That Javier is able to lessen the impact of this disconnect is a courtesy. It's not a requirement.

The proper fix is to use 'grub2-install', and it must be done manually because there's neither a policy nor a mechanism to automatically update it (either during minor updates or major upgrades). I recognize that asking users with legacy BIOS systems to do 'grub2-install' to update the embedded core.img and all modules on /boot to something current is tedious and esoteric, but that's the way it has always been with BIOS GRUB.

Comment 87 Alan Hamilton 2019-05-11 00:50:58 UTC
Well... the problem is that updating grub2 updates the modules that are available at boot, it updates the grub.cfg file... but it doesn't update the boot sector that uses these files. There's an argument that if it can't do this, it shouldn't update the other files either. Applying an upgrade and getting an unbootable system shouldn't happen. If the boot sector is too old, grub2-As I noted above, I had a system that I had to spend some time recovering because I didn't know it still had the boot sector from Fedora 17.

I agree that how this would happen is an issue. Just overwriting the MBR is dangerous given the unpredictable system setups people have. I'm not sure what facilities dnf system-upgrade has to warn users of incompatibilities. Ideally it should give some sort of warning that grub2-install <boot device> needs to be run since the MBR is out of date. Does system-upgrade have any provisions for detecting critical dependency failures?

Comment 88 Dominique Brazziel 2019-05-11 01:28:10 UTC
I suggested in another bug (https://bugzilla.redhat.com/show_bug.cgi?id=1693515) that dnf.plugin.system-upgrade should reference https://fedoraproject.org/wiki/Common_F{Releasever}_bugs and strongly urge the user to review it thoroughly. All of the pages of this bug and BZ#1693515 are post mortem and of little use to a user stuck at the GRUB prompt after the reboot stage of the upgrade. I was lucky to have access to another machine and a USB stick to get sysrescuecd, which I used to fix this problem:

http://www.system-rescue-cd.org/disk-partitioning/Repairing-a-damaged-Grub/

I feel much sorrow for users without rescue medium and access to another machine.

Comment 89 Claude Frantz 2019-05-11 14:56:05 UTC
Because, at my site, the solution was simply to run "grub2-install /dev/sda", I suggest the following improvement to dnf system-upgrade.

First, the grub2-install program should gain an additional function: it should test whether running "grub2-install /dev/sda" is necessary and/or possible, and report the result through its exit code and an appropriate diagnostic message. As a possible alternative, I see a command-line option for this program that performs the above-mentioned test and then, if no reason has been found to avoid the action, performs it, perhaps after a confirmation by the user.

I consider this the preferable way because grub2-install is the specialized program for such operations. It probably already contains a large part of the necessary additional code. Further, this is in accordance with the UNIX philosophy.

When this addition has been made, "dnf system-upgrade" can simply use it at the beginning of its operation. This additional function can probably be used in a further context too.

Comment 90 Dimitrios Apostolou 2019-05-11 23:17:01 UTC
(In reply to Chris Murphy from comment #82)
> (In reply to Dimitrios Apostolou from comment #78)
> > #metoo
> > 
> > But the fix was not as straightforward, because I have GRUB2 installed on
> > the *boot sector* of the /boot partition, 
> 
> This is explicitly not supported by Fedora anymore, not by the installer,
> and even upstream GRUB doesn't like it or recommend it. It's just untenable
> for Fedora to even remotely suggest we can handle this case when upstreams
> have given up on it.

Sure, I never expected Fedora to fix my custom setup. I would have appreciated a warning though during the "dnf system-upgrade download" phase. It could have detected that the system is not EFI and printed a message. Or print a message anyway, since the breakage affects so many.
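
The kind of firmware check suggested here could be sketched as follows. This is only an illustration, not something dnf system-upgrade does: the function names and the warning text are hypothetical, and the sketch assumes that the kernel exposes /sys/firmware/efi only when the system was booted through UEFI.

```shell
#!/bin/sh
# Hypothetical helper: report the firmware type by checking whether the
# EFI directory exists. The optional argument exists only so the check can
# be pointed at a test directory; the default is /sys/firmware/efi.
firmware_type() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "uefi"
    else
        echo "bios"
    fi
}

# Hypothetical warning; the wording is an assumption, not dnf output.
warn_if_bios() {
    if [ "$(firmware_type "$1")" = "bios" ]; then
        echo "WARNING: legacy BIOS boot detected; consider running grub2-install on the boot disk before upgrading."
    fi
}
```

Calling `warn_if_bios` with no argument checks the running system; it prints nothing on a UEFI boot.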


> It's also an example of a flawed policy, of not always reinstalling GRUB on
> BIOS during major version upgrades. That policy is there to avoid stepping
> on custom setups, just like Dimitrios has. 

Correct, and I appreciate that Fedora is not overwriting my MBR. There is a reason that I had put the Windows boot loader in the MBR.

> But as a consequence of not
> stepping on his setup, we end up allowing the other 95% of Fedora users to
> have an installed bootloader that gets stale over time because it's never
> updated. There are way more people potentially hurt by stale bootloaders,
> than hurt by intentionally stomping on their preferred bootloader (that
> isn't GRUB).

Do you really think that Fedora should overwrite the customisations that Linux users choose to make? Why don't you consider other, safer approaches:

+ Check if default version of grub is in the MBR, and only then write to the MBR, otherwise warn
+ Or just warn: print a message suggesting the user should refresh the MBR
+ Keep updating grub.cfg for some versions in order to retain backwards compatibility, before going exclusively to blscfg.

IMHO this ticket here is not about a software bug, but policy: Fedora pushed too much breakage, too fast, too silently, in a core package upgrade. And I don't believe the solution is to be more aggressive in enforcing policies, as you are suggesting.

Comment 91 Chris Murphy 2019-05-12 06:28:46 UTC
Guys, this is a bug reporting system. It's not a forum. If you want to have a policy discussion, it's over on devel@  - I mentioned this in comment 82.

And as for version testing GRUB, that's pretty unlikely for reasons I go into here:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/BXGDSGDSHJQRNMRWSXR7OBVUFRPN4FEA/

Comment 92 Claude Frantz 2019-05-12 17:45:47 UTC
You are right, Chris. But it's the lack of an appropriate feature at the right place which was at the origin of the bug. Please forward the request in the appropriate manner to the right persons, so that our remarks and ideas can be used to improve the upgrade procedure. You probably know better how to do this. Many thanks!

Comment 93 Claude Frantz 2019-05-17 18:19:57 UTC
This is still not working right for me.

After having entered "grub2-install /dev/sda", I was able to boot again. But after "dnf update" installed new kernels, grub2 is presenting the old fc28 kernels again in the boot menu. The newly installed kernels are not presented, although they are in the /boot directory.

I have tried to generate a new tentative grub.cfg in the /tmp/ directory using "grub2-mkconfig -o /tmp/grub.cfg", but this file is not well formed. 

What do I have to do in order to return to a working system ? At the present time, I'm working with an old kernel from fc28, which was presented in the grub2 boot menu.

Comment 94 Claude Frantz 2019-05-17 18:25:10 UTC
Created attachment 1570247 [details]
the generated grub.cfg in /tmp/ which is still defective

The attached file is the one I have generated according to the procedure cited just above. This file is obviously wrong.

Comment 95 Chris Murphy 2019-05-18 03:52:03 UTC
I don't see anything obviously wrong with it. There is a tuned section that my grub.cfg does not have but it looks benign.

The issue must be in /boot/loader/entries/

Since this bug is about blscfg not being found, and has been fixed, I suggest you file a new bug for this problem and attach:

/boot/grub2/grub.cfg
/boot/grub2/grubenv
/etc/default/grub
/boot/loader/entries/*   ## all of them

And also output from

$ sudo efibootmgr -v     ## confirm whether UEFI and what bootloader it points to

There is a 2.02-80.fc30.x86_64 that fixes a couple bugs that might make it worthwhile to update to that. On UEFI do not use 'grub2-install'. On BIOS do use it, pointed to the proper whole block device, not partition. And on both UEFI and BIOS use 'grub2-mkconfig' to replace the grub.cfg at the proper location.
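
The rule of thumb above can be written down as a small sketch that only prints the suggested commands and never runs them. The function name is hypothetical, /dev/sda is just an example disk, and the UEFI grub.cfg path assumes the default Fedora directory on the ESP; a system with a custom ESP directory would need a different path.

```shell
#!/bin/sh
# Print (do not execute) the commands suggested for a firmware type.
# $1: "bios" or "uefi"
# $2: whole block device for BIOS installs, e.g. /dev/sda (not a partition)
suggest_grub_commands() {
    case "$1" in
        bios)
            echo "grub2-install $2"
            echo "grub2-mkconfig -o /boot/grub2/grub.cfg"
            ;;
        uefi)
            # No grub2-install on UEFI; only regenerate the config.
            # Path assumes the default "fedora" directory on the ESP.
            echo "grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg"
            ;;
        *)
            echo "unknown firmware type: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `suggest_grub_commands bios /dev/sda` prints the grub2-install line followed by the grub2-mkconfig line.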

Comment 96 Claude Frantz 2019-05-18 05:11:14 UTC
I'm here because my initial bug report (Bug 1708377) has been merged. In my opinion, the bug which I have reported is another matter, as you mentioned too. Is it possible to revert the bug merging ?

At first, I'm using the old MBR structure, not EFI and my machine is running in the 32 bit x86 architecture. 

I have run "grub2-install /dev/sda" and I have got a new grub2 menu, but after "dnf update" having installed new kernels, the grub2 menu is not up to date. 

Is it a good idea to repeat "grub2-install /dev/sda" again ? What do I have to expect at the next "dnf update" ?

Comment 97 Matt Fagnani 2019-05-18 05:52:54 UTC
(In reply to Claude Frantz from comment #93)
> This is still not working right for me.
> 
> After having entered "grub2-install /dev/sda", I was able to boot again. But
> after "dnf update" has installed new kernels, grub2 is presenting the old
> fc28 kernels again in the boot menu. The new installed kernels are not
> presented, although there are in the /boot directory.
> 
> I have tried to generate a new tentative grub.cfg in the /tmp/ directory
> using "grub2-mkconfig -o /tmp/grub.cfg", but this file is not well formed. 
> 
> What do I have to do in order to return to a working system ? At the present
> time, I'm working with an old kernel from fc28, which was presented in the
> grub2 boot menu.

Claude, the grubby-deprecated package must be installed for newly installed kernels to be added to the GRUB menu if GRUB_ENABLE_BLSCFG=true was changed to GRUB_ENABLE_BLSCFG=false, or if it was commented out or removed from /etc/default/grub. Afterwards, regenerating the grub.cfg file might be needed by running
sudo grub2-mkconfig -o /boot/grub2/grub.cfg

Javier Martinez Canillas described this in more detail in comment 36

Comment 98 Javier Martinez Canillas 2019-05-18 07:49:48 UTC
(In reply to Claude Frantz from comment #96)
> I'm here because my initial bug report (Bug 1708377) has been merged. In my
> opinion, the bug which I have reported is another matter, as you mentioned
> too. Is it possible to revert the bug merging ?
> 

You mentioned that the problem you mentioned in that bug (the menu not being shown) has been fixed. So it was correctly set as a duplicate of this bug. If new installed kernels are not added, you need to file a new bug.

> At first, I'm using the old MBR structure, not EFI and my machine is running
> in the 32 bit x86 architecture. 
>
> I have run "grub2-install /dev/sda" and I have got a new grub2 menu, but
> after "dnf update" having installed new kernels, the grub2 menu is not up to
> date. 
>

Please provide the information that Chris asked for; otherwise it is hard to guess what your problem is.
 
> Is it a good idea to repeat "grub2-install /dev/sda" again ? What do I have
> to expect at the next "dnf update" ?

I would say that it's a good idea to run grub2-install /dev/sda every time that you upgrade the grub package. That way you are sure that you are using the latest version, which contains the fixes in the package. At least until it is decided whether that should be done automatically or not.

Comment 99 Steven Haigh 2019-05-18 13:45:24 UTC
It would be great if grub updates don't overwrite the value of GRUB_ENABLE_BLSCFG on every damn grub update.

I've lost count of the number of times I've set GRUB_ENABLE_BLSCFG="false" in /etc/default/grub only for a grub update to reset it again and leave me with an unbootable bunch of VMs without manual intervention in each one...

Comment 100 Claude Frantz 2019-05-18 14:08:26 UTC
Created attachment 1570505 [details]
The grub.cfg finally generated after the mentioned steps

Comment 101 Chris Murphy 2019-05-18 18:40:05 UTC
(In reply to Steven Haigh from comment #99)
> It would be great if grub updates don't overwrite the value of
> GRUB_ENABLE_BLSCFG on every damn grub update.
> 
> I've lost count of the number of times I've set GRUB_ENABLE_BLSCFG="false"
> in /etc/default/grub only for a grub update to reset it again at leave me
> with an unbootable bunch of VMs without manual intervention in each one...

That's a separate bug and needs a separate bug report. /etc/default/grub is user domain, I can't think of what would change this, but you could make the file immutable and then reinstall grub using verbose and rpm debugging options in dnf and see what script gets mad at being unable to change it. And attach that information to the new bug report.

(In reply to Claude Frantz from comment #100)
> Created attachment 1570505 [details]
> The grub.cfg finally generated after the mentioned steps

I don't see a difference. It's still a different bug and needs a new bug report.


Using this bug as a dumping ground is not workable, folks. It's cluttering up the bug report, and will screw up searches for people looking for solutions to their problem and cause them to land here rather than an appropriate, discrete bug report.

Comment 102 Javier Martinez Canillas 2019-05-19 12:34:16 UTC
(In reply to Steven Haigh from comment #99)
> It would be great if grub updates don't overwrite the value of
> GRUB_ENABLE_BLSCFG on every damn grub update.
> 
> I've lost count of the number of times I've set GRUB_ENABLE_BLSCFG="false"
> in /etc/default/grub only for a grub update to reset it again at leave me
> with an unbootable bunch of VMs without manual intervention in each one...

That already should be the case: the grub2-tools %posttrans scriptlet would only attempt to set GRUB_ENABLE_BLSCFG="true" if it isn't set to GRUB_ENABLE_BLSCFG="false":

https://src.fedoraproject.org/rpms/grub2/blob/f30/f/grub2.spec#_279

Comment 103 Steven Haigh 2019-05-19 13:11:43 UTC
(In reply to Javier Martinez Canillas from comment #102)
> That already should be the case, the grub2-tools %posttrans scriptlet would
> only attempt to set GRUB_ENABLE_BLSCFG="true" if isn't set to
> GRUB_ENABLE_BLSCFG="false":
> 
> https://src.fedoraproject.org/rpms/grub2/blob/f30/f/grub2.spec#_279

Ah - I see, it doesn't handle a quoted "false" - which is probably more correct... I'll check my other systems to see - as I'm pretty sure there's some ambiguity about this with options like:

GRUB_DISABLE_RECOVERY="true"

Whereas, "false" for GRUB_ENABLE_BLSCFG won't match.
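
To illustrate the quoting problem, here is a minimal sketch of a quote-tolerant lookup. It is not the code from the grub2 spec file; the function name is hypothetical and the approach (strip single and double quotes before comparing) is just one way a check could treat false and "false" alike.

```shell
#!/bin/sh
# Print the value of GRUB_ENABLE_BLSCFG from a defaults file with any
# surrounding single or double quotes removed. Prints nothing if the
# variable is not assigned. If it is assigned more than once, the last
# assignment is reported.
# $1: path to the file (defaults to /etc/default/grub)
blscfg_value() {
    file="${1:-/etc/default/grub}"
    sed -n 's/^GRUB_ENABLE_BLSCFG=//p' "$file" | tail -n 1 | tr -d "\"'"
}
```

With this, both `GRUB_ENABLE_BLSCFG=false` and `GRUB_ENABLE_BLSCFG="false"` yield the plain string false, so a later comparison against false matches either spelling.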

Comment 104 Javier Martinez Canillas 2019-05-20 07:07:35 UTC
(In reply to Steven Haigh from comment #103)
> (In reply to Javier Martinez Canillas from comment #102)
> > That already should be the case, the grub2-tools %posttrans scriptlet would
> > only attempt to set GRUB_ENABLE_BLSCFG="true" if isn't set to
> > GRUB_ENABLE_BLSCFG="false":
> > 
> > https://src.fedoraproject.org/rpms/grub2/blob/f30/f/grub2.spec#_279
> 
> Ah - I see, it doesn't handle a quoted "false" - which is probably more
> correct... I'll check my other systems to see - as I'm pretty sure there's
> some ambiguity about this with options like:
> 
> GRUB_DISABLE_RECOVERY="true"
> 
> Whereas, "false" for GRUB_ENABLE_BLSCFG won't match.

Right, that's a bug indeed. I'll fix it.

Comment 105 Bob Gustafson 2019-05-22 19:51:39 UTC
I sent in a bug comment. I was guessing that the 'component' was bootconf - no replies to that

https://bugzilla.redhat.com/show_bug.cgi?id=1711710

If you wish, you can merge my comments under this one, although my title is different

Comment 106 RobbieTheK 2019-06-18 15:01:27 UTC
*** Bug 1720911 has been marked as a duplicate of this bug. ***

Comment 107 r3obh 2019-06-21 03:59:05 UTC
Upgrade to Fedora 30 on a server with BIOS boot a few years old resulted in looping boot => grub fail => reboot => grub fail...  Despite using live image on USB stick to lvscan this, mount that, chroot, edit config files etc etc I could not recover the system except by a fresh install.  Start over on a laptop, taking measures to avoid trouble.  No joy, same nightmare.  I'm holding off on any more upgrades for now.

Linux for the masses!  As long as you don't mind unmitigated disasters, that even years of experience won't save you from.

Over and out,
Rob.

Comment 108 Adam Williamson 2019-06-21 04:21:10 UTC
Sorry for the trouble, Robert. Did you try just doing `grub2-install /dev/sda` (or whatever the relevant disk device is)? That usually fixes it.

Comment 109 Claude Frantz 2019-06-21 05:16:59 UTC
It is important to run `grub2-install /dev/sda` AFTER the system upgrade. To run it BEFORE was not sufficient, in my case. After the upgrade, the system was not able to boot itself, but the GRUB2 shell was available. Entering the right commands on this shell, I was able to boot a kernel and to enter `grub2-install /dev/sda` in a root shell.

Comment 110 Dominique Brazziel 2019-06-30 11:58:40 UTC
Grub menu still not populated after kernel upgrade. I set 'GRUB_ENABLE_BLSCFG=false' in '/etc/default/grub' but it is not honored due to this line in 
'/usr/lib/kernel/install.d/20-grub.install':

if [[ "x${GRUB_ENABLE_BLSCFG}" = "xtrue" ]] || [[ ! -f /sbin/new-kernel-pkg ]]; then

For some reason '/sbin/new-kernel-pkg' disappeared from grubby, it now exists in package 'grubby-deprecated'. Imagine the surprise
after reboot and the new kernel is not in the grub menu. A manual 'grub2-mkconfig -o /boot/grub2/grub.cfg' is still required unless
'grubby-deprecated' is installed (I think).
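
The effect of the quoted condition can be shown with a small sketch that mirrors only that one test. The function name and the "bls"/"legacy" labels are hypothetical; what the real script does in each branch is not shown in the line quoted above.

```shell
#!/bin/sh
# Mirrors the quoted test: the first branch is taken when
# GRUB_ENABLE_BLSCFG is "true" OR when the new-kernel-pkg helper is absent.
# $1: value of GRUB_ENABLE_BLSCFG
# $2: path to check for new-kernel-pkg (normally /sbin/new-kernel-pkg)
uses_bls_branch() {
    if [ "x$1" = "xtrue" ] || [ ! -f "$2" ]; then
        echo "bls"
    else
        echo "legacy"
    fi
}
```

So with GRUB_ENABLE_BLSCFG=false the second branch is reached only when the file given as the second argument exists, which matches the observation that the setting has no effect while /sbin/new-kernel-pkg is missing.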

Comment 111 Steven Haigh 2019-06-30 13:05:13 UTC
I still have problems with the grub.cfg not being updated in Xen DomU's....

Kernel updates don't appear until I run grub2-mkconfig manually.

I have tried this with both 'grubby-deprecated' installed, and not installed - same issue.

Something is still broken.

Comment 112 Adam Williamson 2019-06-30 15:47:18 UTC
Dominique: I believe that's all as intended. If you want to disable BLS, you have to install grubby-deprecated. Did you find docs or comments where this was not adequately explained?

Comment 113 Bob Gustafson 2019-07-01 05:16:57 UTC
The 'fixes' to the missing boot menu items depend on the type of system.

Comments should be identified as whether:

32 bit or 64 bit

UEFI or Legacy Boot.

My system is UEFI and 64 bit - and it still does not have F30 boot menu items.

Comment 114 Steven Haigh 2019-07-01 05:45:21 UTC
Mine are all Xen guests. 64 bit and considered legacy boot...

Lodged the full report here:
https://bugzilla.redhat.com/show_bug.cgi?id=1703700

Comment 115 Federic 2019-07-08 12:56:48 UTC
Jeez , what a mess this is. I hope some lessons have been learnt about the implications of redesigning such essential and low level stuff as grub. Leaving users without a viable boot menu is pretty serious karma.

Dimitrios:
"IMHO this ticket here is not about a software bug, but policy: Fedora pushed too much breakage, too fast, too silently, in a core package upgrade. "

Yes, it would seem obvious, the need to check if there was a legacy BIOS system and STOP and warn in that case.  I prefer BIOS and don't want UEFI, I have that choice in h/w config. 


Javier Martinez Canillas:
"I would say that it's a good idea to run grub2-install /dev/sda every time that you upgrade the grub package."

No. Fedora needs to handle this better.


This Fedora box was originally installed from Fed23. It is likely running on the same boot device since then, so no reason to re-run grub2-install.

I have done that now and appended the following to /etc/default/grub 
GRUB_ENABLE_BLSCFG=false

cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/

 grub2-mkconfig -o /boot/grub2/grub.cfg

I now have a recent timestamp. 

What part of the Fed30 upgrade do I now need to reproduce to actually upgrade to see whether this is resolved?

Comment 116 Chris Murphy 2019-07-08 15:01:30 UTC
> Jeez , what a mess this is. I hope some lessons have been learnt about the
> implications of redesigning such essential and low level stuff as grub.
> Leaving users without a viable boot menu is pretty serious karma.

Much has been learned. On BIOS systems, the bootloader becomes stale over time, and there is increasing recognition that it needs to be updated regularly, as it is on UEFI. Because it's low level and serious, exactly how to do this must be taken deliberately and cautiously.

Otherwise I refer you to comment 82, rather than repeating myself.

> Yes, it would seem obvious, the need to check if there was a legacy BIOS
> system and STOP and warn in that case.  I prefer BIOS and don't want UEFI, I
> have that choice in h/w config.

What is obvious is that would be a lot of new code, and that's not allowed during code freeze. Neither dnf nor GNOME Software have such a warning mechanism or way to test based on firmware type. The warning language would have to be sanity tested to make sure it doesn't confuse users who have no idea what either a bootloader or GRUB are. It's likely the warning needs translations, and the translation checkpoint had long passed.

As suboptimal as the bug is, there are two workarounds listed in Common Bugs. And the bug does not violate any release criteria.

> No. Fedora need to handle this better.

That is absolutely true, and no one is arguing otherwise. Making this better is being worked on. It's also true users are advised to read Common Bugs in advance of installation or upgrade. Just because it's a nasty bug doesn't mean it's reasonable to either block the release or revert the feature change.


> This Fedora box was originally installed from Fed23. It is likely running on
> the same boot device since then, so no reason to re-run grub2-install.

From the information you've provided, I can't tell whether you've run into this bug, or a different as yet undiscovered bug. But that is part of the consequence of the bootloader becoming stale. It's way too much testing to understand all of the permutations and liabilities. And a Fedora 23 era embedded bootloader is definitely stale, as in not based on currently installed and supported GRUB packages.

> I have done that now and appended the following to /etc/default/grub 
> GRUB_ENABLE_BLSCFG=false
> 
> cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
> cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> 
>  grub2-mkconfig -o /boot/grub2/grub.cfg
> 
> I now have a recent timestamp. 
> 
> What part of the Fed30 upgrade do I now need to reproduce to actually
> upgrade to see whether this is resolved?

I don't understand the question. Have you upgraded to Fedora 30? Or are you still on Fedora 28/29? Per Common Bugs, all you need to do is run 'grub2-install' prior to initiating the upgrade. If you didn't do that and are stuck without a grub menu, Common Bugs has a work around for that too, to use the old grub.cfg so you can boot, and then grub2-install will permanently fix the problem.

https://fedoraproject.org/wiki/Common_F30_bugs#blscfg-fail

If you still have a boot problem, chances are it's not this bug.

Comment 117 Chris Murphy 2019-07-08 15:08:39 UTC
These changes are not indicated:

> GRUB_ENABLE_BLSCFG=false
> 
> cp /usr/lib/grub/i386-pc/increment.mod /boot/grub2/i386-pc/
> cp /usr/lib/grub/i386-pc/blscfg.mod /boot/grub2/i386-pc/
> 
>  grub2-mkconfig -o /boot/grub2/grub.cfg


You should revert by changing first line back to true, and then remake the grub.cfg.

Comment 118 Federic 2019-07-08 15:31:06 UTC
Thanks for the comments Chris.

OK, to summarise: 

I was attempting to "upgrade" from a duly updated Fed29. I'm still stuck on that since the kernel does not update and the upgrade does not finish.

I did not end up with an empty grub. It simply is not changing (date stamp, content).
In case it was never done before since Fed23, I ran grub2-install today. I did not get the files listed above when I did that, so I copied them across anyway.

>> You should revert by changing first line back to true

That is not a reversion since that variable did not exist before. I have now changed it to true per your suggestion and remade grub.cfg 
 

I had read Common Bugs but since my system was newer than  "Fedora 20 or older" cited, I concluded it did not apply. It appears that advice may be inaccurate. 

What stage of the dnf upgrade plugin writes to grub.cfg ?  Or is that no longer the case with the new paradigm? 

thanks.

Comment 119 Javier Martinez Canillas 2019-07-08 15:58:36 UTC
(In reply to Federic from comment #118)
> Thanks for the comments Chris.
> 
> OK, to summarise: 
> 
> I was attempting to "upgrade" from a duely updated Fed29. I'm still stuck on
> that since the kernel does not update and upgrade does not finish.
>

I remember that there was a bug that caused the machine to not reboot after a correct F30 upgrade. Maybe that's still the case?
 
> I did not end up with and empty grub. It simply is not changing ( date stamp
> , content ). 

Do you mean that /boot/grub2/i386-pc/{increment,blscfg}.mod files didn't change after the upgrade?

> In case it was never done before since Fed23, I ran grub2-install today. I
> did not get the file listed above when I did that, so I copied them across
> anyway.
> 
> >> You should revert by changing first line back to true
> 
> That is not a reversion since that variable did not exist before. I have now
> changed it to true per your suggestion and remade grub.cfg 
>

You shouldn't need to set GRUB_ENABLE_BLSCFG to any value before the upgrade to F30. If it's not present in /etc/default/grub, then the grub2-switch-to-blscfg script is executed that switches your configuration to BLS and sets GRUB_ENABLE_BLSCFG to true. If GRUB_ENABLE_BLSCFG is set to false before the upgrade, then grub2-switch-to-blscfg won't be executed as part of the upgrade (that's the way that people have to opt-out switching to a BLS config during the F30 upgrade).

For the latter (GRUB_ENABLE_BLSCFG=false) though, the grubby-deprecated package must be installed or new entries won't be added when kernels are installed.
> 
> I had read Common Bugs but since my system was newer than  "Fedora 20 or
> older" cited, I concluded it did not apply. It appears that advice may be
> inaccurate. 
>

I tested with F20 and later and it did work for me, but as Chris said the recommendation is to not keep a stale GRUB core.img and run grub2-install before the upgrade.
 
> What stage of the dnf upgrade plugin writes to grub.cfg ?  Or is that no
> longer the case with the new paradigm? 
> 

The grub2-switch-to-blscfg is executed in a grub2 package %posttrans scriptlet. This script re-generates the grub.cfg using a BLS configuration.

Comment 120 Federic 2019-07-08 16:43:29 UTC
 >> but as Chris said the recommendation is to not keep a stale GRUB core.img and run grub2-install before the upgrade.

Where is that recommended? I did not see that anywhere before attempting the upgrade.


>> For the latter (GRUB_ENABLE_BLSCFG=false) though, the grubby-deprecated package must be installed  ....

dnf search grubby-deprecated
Last metadata expiration check: 0:10:05 ago on Mon 08 Jul 2019 17:19:38 BST.
No matches found.

Where is this "grubby-deprecated" ?  If that is supposed to be a joke, maybe it would be better to use correct names for packages in a bug report. 


>> Do you mean that /boot/grub2/i386-pc/{increment,blscfg}.mod files didn't change after the upgrade?

No, the para in Common Bugs says the grub menu can be empty and user gets a grub shell. That was not my case.

Nothing happens "after the upgrade" since, as I reported, it fails to complete. I'm stuck in fed29. What I did report was that those files were absent from /boot/grub and I had to copy them across by hand. Chris said those changes were "not indicated" , so I imagine he means they were not necessary. 

>>The grub2-switch-to-blscfg is executed in a grub2 package %posttrans scriptlet. This script re-generates the grub.cfg using a BLS configuration.

# which grub2-switch-to-blscfg
/sbin/grub2-switch-to-blscfg

I suspect some/most/all of the packages may have been updated; all seemed OK up to the reboot. It's neither upgraded nor not upgraded, it's all a bit of a mess. 

Key problem seems to be the kernel is not getting installed.

Comment 121 Adam Williamson 2019-07-08 17:37:40 UTC
Federic, it doesn't sound like your problem has much to do with this bug at all, to be honest. If your F29 to F30 upgrade is not completing, *that* is the problem. This bug is not about that at all.

grubby-deprecated doesn't exist in F29 because it wasn't deprecated in F29. It was deprecated in F30.

Comment 122 Federic 2019-07-08 18:11:25 UTC
Thanks Adam. Yes I had come to the same conclusion, I've opened another bug about my issue. 

So someone really named a Fedora package grubby-deprecated. Oh dear.

Comment 123 Chris Murphy 2019-07-08 19:44:13 UTC
> I was attempting to "upgrade" from a duely updated Fed29. I'm still stuck on
> that since the kernel does not update and upgrade does not finish.

That's a different problem than this one. This bug is specifically with a successful upgrade that completes, and then people have no GRUB menu entries at all.

> I did not end up with and empty grub. 

Yeah you're definitely experiencing something different than this bug. An empty GRUB menu, specifically a grub> prompt, is what this bug is about. You've somehow fallen into an upgrade failure of some kind. Possibly the upgrade partially worked and then failed, because it looks like some of it succeeded: e.g. you have the new GRUB_ENABLE_BLSCFG=true line in /etc/default/grub, but the kernel didn't get updated.

Extract the journal for the boot that performed the offline update, and attach that to a bug report against dnf-plugin-system-upgrade. You can use 'journalctl --list-boots' to get an idea which boot is responsible; and also do something like 'journalctl -b -5 | grep offline' and see if you get a bunch of lines with a process containing that word, along with the status of rpms being installed. You can iterate the boot number to find the actual boot. Then e.g. 'journalctl -b -6 > journal.log' and attach that log to the bug report.
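
Iterating the boot number by hand gets tedious, so the search described above could be sketched as a small loop. The helper names count_offline and scan_boots are made up for this sketch, and the default of 10 boots is an arbitrary assumption; only the journalctl and grep invocations come from the comment.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that contain the word "offline".
count_offline() {
    grep -c offline
}

# Hypothetical helper: print, for each of the last N boots, how many journal
# lines mention "offline". The boot with many matches is the likely candidate.
scan_boots() {
    max="${1:-10}"
    i=1
    while [ "$i" -le "$max" ]; do
        n=$(journalctl -b "-$i" 2>/dev/null | count_offline)
        echo "boot -$i: $n lines mentioning offline"
        i=$((i + 1))
    done
}

# Example use:
#   scan_boots 8
#   journalctl -b -6 > journal.log   # once the right boot number is known
```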

But then also how to fix your system is probably best discussed on either IRC freenode.net #fedora, or https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org/ which you can subscribe to like a conventional email list or you can get a user login and use the web interface.

Comment 124 Chris Murphy 2019-07-08 20:07:28 UTC
(In reply to Federic from comment #120)
>  >> but as Chris said the recommendation is to not keep a stale GRUB
> core.img and run grub2-install before the upgrade.
> 
> Where is that recommended? I did not see that anywhere before attempting the
> upgrade.

Yep, it's reasonable that you didn't update it because the Common Bugs only recommends it if the bootloader is from Fedora 20 or older. And you have a Fedora 23 or newer bootloader. But also I don't think you've run into this bug anyway.

Looking through Fedora documentation, we're not currently recommending users manually update their bootloader on BIOS to keep it from getting stale. I see it both ways: on the one hand I think Fedora just needs to automatically do the right thing and keep the bootloader up to date, not document that the user do it manually; on the other hand, we haven't actually been doing automatic bootloader updates on BIOS, so maybe we should have been recommending users manually do it all along.

Since the Fedora 31 system-wide change deadline has passed, and it's decently likely automatic bootloader updates on x86 BIOS would be a system-wide change, it's a valid question whether it makes sense to explicitly recommend all users do this manually, possibly for just one release.


> Nothing happens "after the upgrade" since, as I reported, it fails to
> complete. I'm stuck in fed29. What I did report was that those files were
> absent from /boot/grub and I had to copy them across by hand. Chris said
> those changes were "not indicated" , so I imagine he means they were not
> necessary. 

Correct: not indicated in the Common Bugs suggestion for avoiding *this* bug. But you have a different bug. Can you cite your new bug URL in this bug so it can be followed?

Comment 125 Claude Frantz 2019-07-09 09:48:09 UTC
As I have mentioned in my last comment, IMHO it's essential to reinstall grub2 in the upgraded system environment, with `grub2-install /dev/sda` (use your appropriate device), independently of the version you are coming from. In order to avoid bad surprises, running 'grub2-mkconfig -o /boot/grub2/grub.cfg' with 'GRUB_ENABLE_BLSCFG=true' is probably recommended. 
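
The two steps in this comment could be sketched as the following script for a BIOS system. The DISK default, the DRY_RUN switch and the function names are assumptions added for illustration; the script only prints the commands unless DRY_RUN is set to 0, and it assumes GRUB_ENABLE_BLSCFG=true is already set in /etc/default/grub.

```shell
#!/bin/sh
# DISK is an assumption: replace it with the device that holds your bootloader.
DISK="${DISK:-/dev/sda}"
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: either print or execute the given command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Reinstall the bootloader, then regenerate grub.cfg.
reinstall_grub() {
    run grub2-install "$DISK"
    run grub2-mkconfig -o /boot/grub2/grub.cfg
}

# Example use (as root, from the upgraded system):
#   DISK=/dev/sda DRY_RUN=0 sh this-script.sh   # after adding a call to reinstall_grub
```

Keeping the dry run as the default is a deliberate choice: grub2-install pointed at the wrong device can overwrite another bootloader.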

The real (if minor) challenge is to boot into the upgraded system when GRUB2 is not able to boot on its own. See my previous comment.

Comment 126 Steven Haigh 2019-07-20 06:29:18 UTC
So there have been so many changes to grub2 recently, and they now seem to be seeping into F29 as well...

What settings should be used now?

I have multiple Xen VMs running F30 and without manually re-creating grub.cfg on every kernel update, at least some will fail to boot when used with pygrub. All of these have grubby-deprecated installed, and they all have 'GRUB_ENABLE_BLSCFG=false' set.

My Fedora 29 Xen server won't pick up newer kernels in the 'Fedora, with Xen hypervisor' boot option without a manual run of grub2-mkconfig. That system has GRUB_ENABLE_BLSCFG=true

Really, this entire F30 set of grub problems has been a complete mess.

Comment 127 Chris Murphy 2019-07-23 16:53:13 UTC
(In reply to Steven Haigh from comment #126)
> My Fedora 29 Xen server won't pick up newer kernels in the 'Fedora, with Xen
> hypervisor' boot option without a manual run of grub2-mkconfig. That system
> has GRUB_ENABLE_BLSCFG=true

There's nothing in Fedora 29 updates that would have switched this to true. It must have been done manually using `grub2-switch-to-blscfg`, for which the BootLoaderSpecByDefault feature page says you also need to install the `grubby-deprecated` package.
https://fedoraproject.org/wiki/Changes/BootLoaderSpecByDefault#Upgrade.2Fcompatibility_impact

Anyway, this is off topic for this bug, the two aren't related. I suggest filing a new bug report against Fedora 29 if you can reproduce the problem; or if there's some problem with grubby-deprecated properly adding menu entries.

Comment 128 Germano Massullo 2019-09-21 13:20:55 UTC
I am very disappointed to have experienced this bug, regardless of the reasons.
The first Fedora that has been installed on this system was 17 and the motherboard is Gigabyte GA-890FXA-UD5

After reinstalling GRUB, I ran some tests on the GRUB_ENABLE_BLSCFG flag in /etc/default/grub.

1) If GRUB_ENABLE_BLSCFG=true then I experienced the problem when running grub2-mkconfig -o /boot/grub2/grub.cfg
2) If GRUB_ENABLE_BLSCFG=false then I have not experienced the problem when running grub2-mkconfig -o /boot/grub2/grub.cfg
3) If there was no GRUB_ENABLE_BLSCFG flag then I have not experienced the problem when running grub2-mkconfig -o /boot/grub2/grub.cfg
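
For anyone wanting to repeat the three cases above, switching the flag between them could be sketched like this. The helper name set_blscfg and the special value "unset" are made up for this sketch; it is meant to be tried on a copy of /etc/default/grub first.

```shell
#!/bin/sh
# Hypothetical helper: set GRUB_ENABLE_BLSCFG in a defaults file to the given
# value, or remove the line entirely when the value is the word "unset".
set_blscfg() {
    file="$1"
    value="$2"
    tmp=$(mktemp) || return 1
    # Drop any existing assignment, then append the new one if requested.
    grep -v '^[[:space:]]*GRUB_ENABLE_BLSCFG=' "$file" > "$tmp"
    if [ "$value" != "unset" ]; then
        echo "GRUB_ENABLE_BLSCFG=$value" >> "$tmp"
    fi
    # Write back in place so the file keeps its ownership and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example use, on a copy:
#   cp /etc/default/grub /tmp/grub.default
#   set_blscfg /tmp/grub.default true    # case 1
#   set_blscfg /tmp/grub.default false   # case 2
#   set_blscfg /tmp/grub.default unset   # case 3
```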

Comment 129 Germano Massullo 2019-09-21 13:21:59 UTC
(In reply to Germano Massullo from comment #128)
> I am very disappointed to have experienced this bug, regardless the reasons.
> The first Fedora that has been installed on this system was 17 and the
> motherboard is Gigabyte GA-890FXA-UD5
> 
> After having installed again Grub, I runned some tests about
> GRUB_ENABLE_BLSCFG flag in /etc/default/grub.
> 
> 1) If GRUB_ENABLE_BLSCFG=true then I experienced the problem when running
> grub2-mkconfig -o /boot/grub2/grub.cfg
> 2) If GRUB_ENABLE_BLSCFG=false then I have not experienced the problem when
> running grub2-mkconfig -o /boot/grub2/grub.cfg
> 3) If there was no GRUB_ENABLE_BLSCFG flag then I have not experienced the
> problem when running grub2-mkconfig -o /boot/grub2/grub.cfg

By the way with "the problem" I meant https://fedoraproject.org/wiki/Common_F30_bugs#GRUB_boot_menu_is_not_populated_after_an_upgrade

Comment 130 Javier Martinez Canillas 2019-10-15 07:37:19 UTC
*** Bug 1705927 has been marked as a duplicate of this bug. ***

Comment 131 Javier Martinez Canillas 2019-10-15 07:38:23 UTC
*** Bug 1706447 has been marked as a duplicate of this bug. ***

Comment 132 Javier Martinez Canillas 2019-10-15 07:39:42 UTC
*** Bug 1707199 has been marked as a duplicate of this bug. ***

Comment 133 Javier Martinez Canillas 2019-10-15 08:14:55 UTC
*** Bug 1686059 has been marked as a duplicate of this bug. ***

Comment 134 Federic 2019-10-15 09:39:52 UTC
Man this whole thing is a mess.  Users should not have to deal with this kind of crap when upgrading. 

The pre-Fed20 condition is maybe an excusable oversight, but this clearly has not been properly tested before being pushed for inclusion. 

grubby.x86_64 : Command line tool for updating bootloader configs

When I see the kind of juvenile mentality which names a package grubby, presumably because they find it amusing, instead of something informative like grub-cli I'm not surprised this is a mess. 

The presence of files is being assumed, the state of machines is being assumed; it does not seem that this got more than a quick VM "worksforme" test before being included in a major release.

I will keep my Fedora installation since I have invested time in it and don't want a clean new system where I have to recreate everything I have but will not be installing Fedora for others any longer.  I don't want to inflict this kind of mess on others or feel obliged to help them out when their system won't work. 

However this got included needs reviewing and a strategic review of package testing and approval needs to be done. That this got through, is far more wide ranging and worrying than the bug itself.

Thanks to Javier for his dedicated efforts to get this straightened out.

Comment 135 Bob Gustafson 2019-10-27 17:07:03 UTC
Will it be possible to successfully upgrade to Fedora 31 (after Tuesday) from Fedora 29 (or Fedora 30 in zombie-land - install thinks it is 30, but only booting 29 kernels)?

Or will this problem just go away in EOL land?

Comment 136 Enrique Meléndez 2019-11-05 20:21:04 UTC
Hi there.

I was hit by this bug in a most unfortunate way. I have had my Fedora box since fc16, to wit, 8 years. I have updated the OS ever since with only minor issues. Unbeknownst to me, in the update from fc29 to fc30 grub2 messed up: either no script to switch to BLS was executed or it was not successful, but for some reason a grub menu was maintained, albeit with the fc29 kernels. I was always booting to the same fc29 kernel. Bear in mind that it was originally a grub (not grub2) system, with lots of changes between then and now. I did not notice this failure. Sure enough, I rebooted when necessary, but I tend to do so remotely, at night, to save me the pain of an old computer reboot, so I don't actually see the grub menu. Never had a problem before.

Now I was about to make the upgrade from fc30 to fc31 and I checked if I was running the correct kernel and I was not. I googled around, found the info and decided a grub2-mkconfig would do the trick. Shame on me, I did not read enough and did not run grub2-install, so I found myself with a nonfunctional grub2 that was unable to find any config files, and did not even provide a grub> prompt.

I had to boot from a live USB, reinstall grub, reboot from the hard disk to have the system up and running and run grub2-switch-to-blscfg by hand (and have it updated to fc31).

Boot problems are what scare users most. Best scenario is they find themselves with a grub> prompt that is so very unpleasant to work with, to say the very least (lots of commands to be input, nowhere to copy from, no easy internet, info not at hand, no easy copy-paste, error prone). Worst case scenario is, well, still worse. And the uncertainty remains whether user data are still there.

I can't agree more with Federic. High QA should be exercised at all stages, but especially on the bootloader packages where a failure has severe consequences. If, as Javier says, the upgrade process is not deterministic and doubts remain whether the result is a bootable computer, this is really a serious matter that should have been carefully sorted out before release (of fc30).

Bob, I'd advise to make sure grub2 is installed properly, (-install and -mkconfig) and that /etc/default/grub reflects grub2-switch-to-blscfg has been run.

Comment 137 Bob Gustafson 2019-11-05 23:55:53 UTC
(In reply to Enrique Meléndez from comment #136)
> Hi there.
> 
> I was hit by this bug in a most unfortunate way. I have my Fedora box since
> fc16, to wit, 8 years ago. I have updated the OS ever since with only minor
>

See bug https://bugzilla.redhat.com/show_bug.cgi?id=1711710

Slightly different, but similar to your problem. Possibly the same underneath.

> 
> Bob, I'd advise to make sure grub2 is installed properly, (-install and
> -mkconfig) and that /etc/default/grub reflects grub2-switch-to-blscfg has
> been run.

Some of the bugs and notes I have read seem to indicate that they are working with BIOS systems and not UEFI.
Mine is UEFI

blscfg switch:
GRUB_ENABLE_BLSCFG=true

So far, no definitive solution from RH on the cyclic Error situation - see near the end of comments on
Bug https://bugzilla.redhat.com/show_bug.cgi?id=1711710

Comment 138 Enrique Meléndez 2019-11-06 17:26:44 UTC
(In reply to Bob Gustafson from comment #137)


> See bug https://bugzilla.redhat.com/show_bug.cgi?id=1711710
> 
> Slightly different, but similar to your problem. Possibly the same
> underneath.

You are correct, thank you. But "in the grand scheme of things", I think the underlying cause is less than adequate QA.

> Some of the bugs and notes I have read seem to indicate that they are
> working with BIOS systems and not UEFI.
> Mine is UEFI

That may be the case; mine is BIOS

Comment 139 Bob Gustafson 2019-11-06 17:41:29 UTC
> You are correct, thank you. But "in the grand scheme of things", I think the
> underlying cause is less than adequate QA.
> 

The Anaconda/Install/Upgrade period of a Fedora instance is very short compared to its running existence.

If it fails for a particular set of circumstances out of the hundreds of possible install environments..

One can say that it is not cost/resource effective to spend a lot of time on a particular problem.

In my case, I can buy a disk, install it in the problem system, copy the problem disk, wipe problem disk, do a fresh install of F31, and over the next month or so, copy back files from my 'backup'. Maybe a day of time total. Downtime of problem system (my gateway system..) maybe 2 hours.

I'm not going to switch to Ubuntu over the problem.

Comment 140 Federic 2019-11-07 12:47:28 UTC
> The Anaconda/Install/Upgrade period of a Fedora instance is very short compared to its running existence.

Well it is only short when it WORKS.  I have been trying to upgrade fed29 to fed30 for over two months. There has been basically no progress on this. 

It is important that it WORKS, having such an important and esoteric part of the system go tits up is just not acceptable. That is a reason why it has to be done PROPERLY, not a justification for a half baked release cycle with insufficient testing. 

I have no wish to do a clean installation and try to reconstruct all the software I've installed over 6 or 7 years. There's a bit more to it than copying across my firefox bookmarks from my home directory. 

I gave up using windows about 15 years ago because it seemed the only solution to most problems was clean re-installation. I hope RH is not going the same way.

Comment 141 Chris Murphy 2019-11-07 14:44:25 UTC
(In reply to Federic from comment #140)
> > The Anaconda/Install/Upgrade period of a Fedora instance is very short compared to its running existence.
> 
> Well it only short when it WORKS.  I have been trying to upgrade fed29 to
> fed30 for over two months. There has been basically no progress on this.

I don't know what you mean by "this". This bug is fixable only as described in the Common Bugs for Fedora 30: on BIOS systems you have to run 'grub2-install /dev/sdX'. That's just the reality. There is an idea of something owning the bootloader and making sure it's up to date, but right now this only happens on UEFI (and maybe uboot on ARM?).


> It is important that it WORKS, having such an important and esoteric part of
> the system go tits up is just not acceptable. That is a reason why it has to
> be done PROPERLY, not a justification for a half baked release cycle with
> insufficient testing.

You need to refine your argument. There was sufficient testing, and the problem was discovered before release.


> I have no wish to do a clean installation and try to reconstruct all the
> software I've installed over 6 or 7 years. There's a bit more to it than
> copying across my firefox bookmarks from my home directory. 
> 
> I gave up using windows about 15 years ago because it seemed the only
> solution to most problems was clean re-installation. I hope RH is not going
> the same way.

There simply aren't resources to guarantee any of this, even merely as a nice to have feature, let alone as anything release blocking. To be a release blocking bug, there must be a release criterion. The applicable release criterion in this case, related to upgrades, specifically says upgrades must work from Fn-2 and Fn-1 to Fn *from clean installs*. The problem is that the bootloader in the MBR gap is much older. That's a long standing known GRUB liability.

The alternative to finding a clever solution, is to do what Microsoft has done on BIOS since forever: always stomp on the bootloader, forcibly upgrade it always, even if it means stomping on some other bootloader. Right now we have no signatures or way to determine whether we own the bootloader in the MBR or MBR gap. It can be blindly updated. Or never updated. And the long standing strategy has been to never update it on BIOS while updating it (incidentally in some sense) on UEFI. For sure that strategy has a negative consequence, this bug. But the alternative strategy is making quite a lot of other users who don't use a Fedora GRUB bootloader pretty angry by always stomping on the MBR and MBR gap locations with a forced reinstall.

And I'd argue that forced reinstall of the bootloader is probably a better policy. But it will annoy more users, at least in the near term.

Comment 142 Federic 2019-11-07 16:47:59 UTC
Thanks for the reply Chris. By "this" I meant the whole BLS migration; it seems to have caused a stack of problems. I regard the fed20 or earlier bug as a tolerable oversight and don't think further windozification of Linux is a good idea, having spent years railing against MS for stomping on everyone's MBR. Having everyone do that means that nothing will work without constant retroaction. Not having dualboot easily installable will severely impact Linux adoption. 

I find Bob's argument that it doesn't really warrant much effort because its lifetime is very short, totally misses the point that when it goes wrong it's a show stopper which is not "short" at all.

Comment 143 Chris Murphy 2019-11-07 17:09:07 UTC
Right, so the problem on BIOS is that there's only one place for a bootloader, and there's no bootloader signature so we don't know whose bootloader (or variant of such) it is, and whether or not we've got some implicit invitation or obligation to keep it up to date.

And in the meantime, it's either, step on it. Or don't step on it. It's totally binary.

It might seem like an alternative is to not have done BLS migration on upgrades. Only do BLS by default on new clean installs. The problem there is that we then have to support and test two very different bootloader paradigms for at least two cycles. But then that leaves Fn+3 users where? Totally abandoned at some point. The only way to not abandon them was to bring them forward to BLS. And as a consequence of that, some of them have to do a manual update of their bootloader (via grub2-install).

Now if you're having other problems that prevent Fedora 29 from being upgraded to Fedora 30 or 31, that's a bug. And at the moment I'm not clear on what that problem is, but if it's not related to *this* bug ID, then it needs its own bug report. I also recommend that you post a description on the Fedora test@ list along with the new bug ID so there's a better chance it gets triaged. Feel free to cc my email on that bug report also.

Comment 144 Bob Gustafson 2019-11-07 19:08:38 UTC
(In reply to Chris Murphy from comment #143)

> Now if you're having other problems that prevent Fedora 29 from being
> upgraded to Fedora 30 or 31, that's a bug. And at the moment I'm not clear
> on what that problem is, but if it's not related to *this* bug ID, then it
> needs its own bug report. I also recommend that you post a description on
> the Fedora test@ list along with the new bug ID so there's a better chance
> it gets triaged. Feel free to cc my email on that bug report also.

My problem is detailed on:
See bug https://bugzilla.redhat.com/show_bug.cgi?id=1711710

It is also an upgrade problem 29->30, but happens on a UEFI system where the grub2 solution proposed here does not work.

Seems to be a problem which causes the 

    sudo dnf system-upgrade download --releasever=30

To error out and never complete.

Comment 145 Federic 2019-11-07 19:28:38 UTC
> Feel free to cc my email on that bug report also.

thanks Chris, details here:
https://bugzilla.redhat.com/show_bug.cgi?id=1761251

It got reassigned to Grub at one stage because I suspected one of these grub issues may be the cause, but I'm not so sure, so right now I have an upgrade problem with no clear indication.  It would be good if you could throw your 2c in there; it may help clarify what is going on.

