Bug 1548729 - grub.cfg isn't updated by kernel updates
Summary: grub.cfg isn't updated by kernel updates
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: grubby
Version: 27
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-02-24 15:10 UTC by Georg Sauthoff
Modified: 2018-12-09 17:15 UTC
CC List: 19 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-11-30 20:52:25 UTC
Type: Bug
Embargoed:



Description Georg Sauthoff 2018-02-24 15:10:53 UTC
Description of problem:
After regular system updates (via `dnf update`) I noticed that grub.cfg isn't updated and thus an outdated kernel is booted instead of the most recently installed one.


Version-Release number of selected component (if applicable):
kernel-4.15.4-300.fc27.x86_64

How reproducible: always

Steps to Reproduce:
1. regularly execute `dnf update`

Actual results:

The last dnf transaction that involved a kernel update:

Begin time     : Sat 24 Feb 2018 10:17:42 AM CET

    Erase    kernel-4.14.8-300.fc27.x86_64                     @updates
    Install  kernel-4.15.4-300.fc27.x86_64                     @updates
[..]

The installed kernels:
rpm -q kernel
kernel-4.14.11-300.fc27.x86_64
kernel-4.14.13-300.fc27.x86_64
kernel-4.15.4-300.fc27.x86_64

But the GRUB config referenced by /etc/grub2-efi.cfg is out of date:

grep vmlinuz-4 /etc/grub2-efi.cfg | sed 's/root.*//'
        linuxefi /vmlinuz-4.14.13-300.fc27.x86_64
        linuxefi /vmlinuz-4.14.11-300.fc27.x86_64
        linuxefi /vmlinuz-4.14.3-300.fc27.x86_64
        linuxefi /vmlinuz-4.13.16-302.fc27.x86_64



Expected results:

An up-to-date GRUB config whose entries are in sync with the `rpm -q kernel` output, i.e.:

	linuxefi /vmlinuz-4.15.4-300.fc27.x86_64 
	linuxefi /vmlinuz-4.14.13-300.fc27.x86_64 
	linuxefi /vmlinuz-4.14.11-300.fc27.x86_64 

Additional info:


The system only supports UEFI boot and was installed using the standard ISO.

I can work around this issue by manually recreating the grub.cfg:

mount /boot/efi
grub2-mkconfig -o  /etc/grub2-efi.cfg
grub2-set-default 0
shutdown -r now

The system then reboots fine into the latest kernel.
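
If /boot/efi was mounted and the regeneration succeeded, the newest kernel should now show up as the first menu entry; the grep from above can be reused as a quick check (not part of the original workaround):

grep vmlinuz-4 /etc/grub2-efi.cfg | sed 's/root.*//' | head -n 1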

FWIW, the /etc/grub2.cfg symlink points to a non-existent file:

ls -l /etc/grub2.cfg                                
lrwxrwxrwx. 1 root root 22 2018-01-23 22:25 /etc/grub2.cfg -> ../boot/grub2/grub.cfg
ls //boot/grub2/grub.cfg
ls: cannot access '//boot/grub2/grub.cfg': No such file or directory

And grub2-efi.cfg points to:

ls -l /etc/grub2-efi.cfg 
lrwxrwxrwx. 1 root root 31 2018-01-23 22:25 /etc/grub2-efi.cfg -> ../boot/efi/EFI/fedora/grub.cfg
ls -l /boot/efi/EFI/fedora/grub.cfg
-rwx------. 1 root root 6431 2018-02-24 15:36 /boot/efi/EFI/fedora/grub.cfg

Also, running

rpm -q --scripts kernel-4.15.4-300.fc27.x86_64

prints nothing, although I can see some %post scriptlet macro definitions in the corresponding spec file.
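
The scriptlets are probably attached to the kernel-core subpackage rather than the kernel meta package (a guess based on how the spec file splits its subpackages), so something like this might show them:

rpm -q --scripts kernel-core-4.15.4-300.fc27.x86_64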

Comment 1 Laura Abbott 2018-02-26 18:20:59 UTC
The updates to grub aren't handled by the kernel package itself. Moving to the appropriate component.

Comment 2 Georg Sauthoff 2018-04-07 19:17:42 UTC
I think all this is caused by a race condition between systemd units that results in /boot/efi not being mounted at the end of the boot process.

Thus, when I update a kernel, the grubby tooling likely calls `grub2-mkconfig -o /etc/grub2-efi.cfg`, which (silently) fails because the symlink

/etc/grub2-efi.cfg -> ../boot/efi/EFI/fedora/grub.cfg

points to a (currently) non-existent file.
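
A quick way to check for this state at update time would be something along these lines (illustrative only):

findmnt /boot/efi
test -e /etc/grub2-efi.cfg && echo "symlink target exists" || echo "dangling symlink"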

Besides fixing the root cause, this grubby change would be useful:

On EFI systems, check whether /boot/efi is mounted and error out if it is not. Also check the exit status of `grub2-mkconfig` and error out if it is non-zero (see the sketch below).

(Yes, grub2-mkconfig exits with code 1 when the output file is a broken symlink.)
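
A minimal sketch of such a check, in plain shell (my suggestion, not grubby's actual code):

# refuse to regenerate the config if the ESP isn't mounted
if ! mountpoint -q /boot/efi; then
    echo 'error: /boot/efi is not mounted' >&2
    exit 1
fi
# propagate a failing grub2-mkconfig instead of silently ignoring it
if ! grub2-mkconfig -o /etc/grub2-efi.cfg; then
    echo 'error: grub2-mkconfig failed' >&2
    exit 1
fi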


Additional information:

I installed the system using the standard graphical installer and configured a BTRFS mirror install (on 2 SSD drives).


The boot log shows these relevant entries:

Apr 07 19:14:48 systemd[1]: Started File System Check on /dev/disk/by-uuid/8898-7E5F.
Apr 07 19:14:48 systemd[1]: Started Timer to wait for more drives before activating degraded array..
Apr 07 19:14:48 systemd[1]: Mounting /boot...
Apr 07 19:14:48 kernel: EXT4-fs (md127): mounted filesystem with ordered data mode. Opts: (null)
Apr 07 19:14:48 systemd[1]: Mounted /boot.
Apr 07 19:14:48 systemd[1]: Mounting /boot/efi...
Apr 07 19:14:48 systemd[1]: Mounted /boot/efi.
Apr 07 19:14:48 systemd[1]: Reached target Local File Systems.
[..]
Apr 07 19:15:20 systemd[1]: Stopped target Local File Systems.
Apr 07 19:15:20 systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
Apr 07 19:15:20 systemd[1]: Starting Activate md array even though degraded...
Apr 07 19:15:20 systemd[1]: Unmounting /boot/efi...
Apr 07 19:15:20 systemd[1]: Started Activate md array even though degraded.
Apr 07 19:15:20 systemd[1]: Unmounted /boot/efi.
Apr 07 19:15:20 systemd[1]: Stopped File System Check on /dev/disk/by-uuid/8898-7E5F.
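
For reference, the relevant journal entries can be pulled out with something like this (assuming the usual systemd-generated unit names for the fstab mounts):

journalctl -b -u boot.mount -u boot-efi.mount -u 'mdadm-last-resort@*'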


I don't understand why systemd thinks the array is degraded, since the /proc/mdstat output looks fine:

md126 : active raid1 sdd2[1] sdc2[0]
      205760 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

Also, triggering a check via

echo check > /sys/block/md126/md/sync_action

yields no mismatches.
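
After the check has finished, the mismatch count can be read back directly (standard md sysfs interface; a value of 0 means no mismatches were found):

cat /sys/block/md126/md/mismatch_cnt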

A manual `fsck.vfat /dev/md126` immediately finishes and returns success:
     
fsck.fat 4.1 (2017-01-24)
/dev/md126: 23 files, 2744/51383 clusters

The filesystem also mounts manually without any issues:

# grep boot/efi /etc/fstab 
UUID=8898-7E5F          /boot/efi               vfat    umask=0077,shortname=winnt 0 2
# mount /boot/efi
# mount | grep boot/efi
/dev/md126 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro)


Another system, a single-disk (non-mirror) BTRFS install of Fedora 27, doesn't show this issue; /boot/efi is automatically mounted during boot, as expected.

Comment 3 Georg Sauthoff 2018-04-08 15:20:50 UTC
More details:

The /usr/lib/systemd/system/mdadm-last-resort@.timer unit causes this.

See also:

[systemd-devel] Errorneous detection of degraded array
https://lists.freedesktop.org/archives/systemd-devel/2017-January/038223.html

The workaround described in the thread fixes the issue for me:

# cp /usr/lib/systemd/system/mdadm-last-resort@.* /etc/systemd/system/
# sed -i 's@^Conflicts=sys-devices-virtual-block-%i.device@ConditionPathExists=/sys/devices/virtual/block/%i@' \
    /etc/systemd/system/mdadm-last-resort@.*
# shutdown -r now
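
To confirm that the copies under /etc/systemd/system override the packaged units, something like this should now show the ConditionPathExists line instead of Conflicts (a verification step, not part of the thread's workaround):

# systemctl cat 'mdadm-last-resort@md126.timer' | grep -E 'Conflicts|ConditionPathExists'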

After this, /boot/efi is properly mounted during boot and stays mounted once boot has finished:

# mount | grep boot
/dev/md127 on /boot type ext4 (rw,relatime,seclabel,data=ordered)
/dev/md126 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro)


See also:
Need a uni-directional version of "Conflicts", e.g. "Aborted-by".
https://github.com/systemd/systemd/issues/5266

RAID1 unmounted during boot as degraded but can be mounted fine manually (Fedora 26)
https://unix.stackexchange.com/questions/401308/raid1-unmounted-during-boot-as-degraded-but-can-be-mounted-fine-manually/436362#436362

Comment 4 Ben Cotton 2018-11-27 16:17:01 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 27 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 5 Ben Cotton 2018-11-30 20:52:25 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 6 Georg Sauthoff 2018-12-09 17:15:14 UTC
On a fresh Fedora 29 system (same hardware), this issue no longer occurs; no workaround is necessary anymore.

