Bug 1665060

Summary: "grubby --default-kernel" reports different kernel on s390/zipl
Product: Red Hat Enterprise Linux 8
Reporter: Lukáš Doktor <ldoktor>
Component: s390utils
Assignee: Dan Horák <dhorak>
Status: CLOSED CURRENTRELEASE
QA Contact: Vilém Maršík <vmarsik>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.0
CC: fmartine, jstancek, mhayden, thuth, vkabatov, vmarsik, wchadwic
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: s390utils-2.6.0-12.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-14 00:47:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description, flags):
"/etc/zipl" probably created by RHEL-8-ALPHA ("yum update"), flags: none
/boot/loader/entries, flags: none
[PATCH] Make kernel-install to update default if present in zipl.conf, flags: none

Description Lukáš Doktor 2019-01-10 12:37:36 UTC
Created attachment 1519773
"/etc/zipl" probably created by RHEL-8-ALPHA ("yum update")

Description of problem:
Running "grubby --default-kernel" reports a different kernel than the one that is booted by default. I also tried "grubby --default-kernel --zipl" to be sure the correct backend is used.

Note that this host was installed from RHEL-8-ALPHA and has been regularly updated since. Recently I noticed that "yum update kernel" does not change the default kernel, which is still the ancient kernel-4.18.0-9.el8.s390x. I'm mentioning this because, IIRC, the configuration mechanism changed between the alpha and now, so there might be some leftover config files.

The kernel that is actually booted is defined in "/etc/zipl.conf", but that file was probably created automatically (I haven't modified anything related to zipl).

Version-Release number of selected component (if applicable):
grubby-8.40-34.el8.s390x
kernel-4.18.0-51.el8.s390x
kernel-4.18.0-58.el8.s390x
kernel-4.18.0-9.el8.s390x
s390utils-base-2.6.0-11.el8.s390x

How reproducible:
Always on my machine

Steps to Reproduce:
1. reboot
2. grubby --default-kernel
3. uname -r

Actual results:
/boot/vmlinuz-4.18.0-58.el8.s390x
4.18.0-9.el8.s390x

Expected results:
/boot/vmlinuz-4.18.0-9.el8.s390x
4.18.0-9.el8.s390x

Additional info:
I also tried executing:

    # grubby --set-default=/boot/vmlinuz-4.18.0-58.el8.s390x
    The default is /boot/loader/entries/fbea514bceb9461aa6a653cade22a446-4.18.0-58.el8.s390x.conf with index 0 and kernel /boot/vmlinuz-4.18.0-58.el8.s390x

but it still booted into kernel-4.18.0-9.el8.s390x
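
The comparison in the steps above can be scripted. This is only a minimal sketch: the helper name default_matches_running is made up for illustration, and it assumes the image path has the /boot/vmlinuz-<release> form shown in this report.

```shell
#!/bin/sh
# default_matches_running IMAGE_PATH RELEASE   (hypothetical helper)
#   IMAGE_PATH: e.g. the output of "grubby --default-kernel"
#   RELEASE:    e.g. the output of "uname -r"
# Succeeds when the release embedded in IMAGE_PATH equals RELEASE.
default_matches_running() {
    image=$1
    release=$2
    # Drop everything up to and including "/vmlinuz-" to get the release.
    [ "${image##*/vmlinuz-}" = "$release" ]
}

# Typical use on the affected host, right after a reboot (not run here):
#   default_matches_running "$(grubby --default-kernel)" "$(uname -r)" \
#       || echo "default kernel differs from the running kernel"
```

With the values from "Actual results" the check fails, and with the values from "Expected results" it succeeds.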

Comment 1 Lukáš Doktor 2019-01-10 12:39:07 UTC
Created attachment 1519774
/boot/loader/entries

Comment 2 Lukáš Doktor 2019-01-10 12:44:30 UTC
I forgot to mention that this is executed on a z13 LPAR. After modifying /etc/zipl.conf and running zipl, it successfully booted into the desired kernel.

Comment 3 Javier Martinez Canillas 2019-01-10 12:55:16 UTC
As you mention, I think the problem is that the system was installed with RHEL-8-ALPHA, which did not use BLS by default, so the zipl.conf file has the following leftover entries that should be removed:

[4.18.0-9.el8.s390x]
        image=/boot/vmlinuz-4.18.0-9.el8.s390x
        parameters="root=/dev/mapper/rhel_kerneldev16-root crashkernel=auto rd.dasd=0.0.7007 rd.dasd=0.0.7107 rd.dasd=0.0.7207 rd.dasd=0.0.7307 rd.lvm.lv=rhel_kerneldev16/root rd.lvm.lv=rhel_kerneldev16/swap cio_ignore=all,!condev,!0.0.7308-0.0.730f rd.znet=qeth,0.0.0900,0.0.0901,0.0.0902,layer2=1,portno=0 LANG=en_US.UTF-8"
        ramdisk=/boot/initramfs-4.18.0-9.el8.s390x.img
[linux-0-rescue-fbea514bceb9461aa6a653cade22a446]
        image=/boot/vmlinuz-0-rescue-fbea514bceb9461aa6a653cade22a446
        ramdisk=/boot/initramfs-0-rescue-fbea514bceb9461aa6a653cade22a446.img
        parameters="root=/dev/mapper/rhel_kerneldev16-root crashkernel=auto rd.dasd=0.0.7007 rd.dasd=0.0.7107 rd.dasd=0.0.7207 rd.dasd=0.0.7307 rd.lvm.lv=rhel_kerneldev16/root rd.lvm.lv=rhel_kerneldev16/swap cio_ignore=all,!condev,!0.0.7308-0.0.730f rd.znet=qeth,0.0.0900,0.0.0901,0.0.0902,layer2=1,portno=0"
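
To find such leftover entries on another host, something like the following may help. It is a rough sketch, not part of any package: list_zipl_sections is a hypothetical helper that prints every section name in a zipl.conf-style file except [defaultboot], so each printed name is only a candidate to review by hand.

```shell
#!/bin/sh
# list_zipl_sections FILE   (hypothetical helper)
# Print the name of every "[section]" header in FILE except "defaultboot".
list_zipl_sections() {
    awk '/^\[.*\]/ {
        # Take the text between the leading "[" and the first "]".
        name = substr($0, 2, index($0, "]") - 2)
        if (name != "defaultboot") print name
    }' "$1"
}

# Typical use (not run here):
#   list_zipl_sections /etc/zipl.conf
```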

Comment 4 Lukáš Doktor 2019-01-10 13:41:37 UTC
Well, I don't really know the hierarchy, but I'd expect grubby to combine the information. The "[4.18.0-9.el8.s390x]" should not collide with "/boot/loader/entries/fbea514bceb9461aa6a653cade22a446-4.18.0-58.el8.s390x.conf" and the default in zipl.conf is `default=Red Hat Enterprise Linux (4.18.0-58.el8.s390x) 8.0 (Ootpa)`. So are you sure I should simply remove those 2 entries and it will magically start working? (do I need to re-run zipl?) I don't want to cripple my LPAR as I don't have access to the console.

Comment 5 Javier Martinez Canillas 2019-01-10 14:33:17 UTC
The old grubby tool was moved to the grubby-deprecated package and is no longer installed. The grubby tool is now just a wrapper script that only understands BLS snippets, which is why it doesn't know about the entries defined in the zipl.conf file.

The zipl command line tool does combine the entries in zipl.conf and the BLS snippets when preparing the IPL device for boot, but the entries defined in zipl.conf take precedence over the ones defined in /boot/loader/entries. So I think the problem is that the default in zipl.conf was set to 4.18.0-9.el8.s390x, which is why this kernel was booted.

Anaconda no longer sets default in zipl.conf on installation, since the idea with BLS is that adding a new kernel should just be dropping a file in /boot/loader/entries without the need to modify the zipl.conf file. The zipl tool will sort the BLS entries by filename using the RPM sorting algorithm.

One bug though is that on kernel installation the default should be updated if this was set in zipl.conf (e.g. with grubby --set-default) and UPDATEDEFAULT=yes is set in /etc/sysconfig/kernel.
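
The condition described here could be checked roughly as follows. This is a sketch under stated assumptions, not the logic of the kernel-install script or of any patch: should_update_default is a hypothetical helper, and it assumes the setting appears as a plain UPDATEDEFAULT=yes line (optionally quoted) in the sysconfig file.

```shell
#!/bin/sh
# should_update_default ZIPL_CONF SYSCONFIG_KERNEL   (hypothetical helper)
# Succeeds only if ZIPL_CONF contains a "default=" line and
# SYSCONFIG_KERNEL sets UPDATEDEFAULT=yes.
should_update_default() {
    grep -q '^default=' "$1" &&
        grep -Eq '^UPDATEDEFAULT="?yes"?[[:space:]]*$' "$2"
}

# Typical use (not run here):
#   should_update_default /etc/zipl.conf /etc/sysconfig/kernel \
#       && echo "default= should follow the newly installed kernel"
```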

I mentioned that the leftover entries in zipl.conf should be removed manually because otherwise they will always be added by the zipl tool. The default keyword should also be removed from zipl.conf, since otherwise that kernel will always be booted, at least until the kernel installation is fixed to update the default.

About your question: yes, you should always re-run zipl, since otherwise the changes are not written to the IPL device. Changes to zipl.conf or to the BLS snippets in /boot/loader/entries are not reflected until zipl is executed.

To make sure that you don't cripple your machine, you can run zipl with the verbose and dry-run options to see what changes would be made:

$ zipl -V --dry-run
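
Removing the default keyword can also be previewed before the real file is touched. The following is just a sketch: without_default is a hypothetical helper that prints a copy of the file with any line starting with "default=" dropped, leaving the original unchanged.

```shell
#!/bin/sh
# without_default FILE   (hypothetical helper)
# Print FILE with every line that starts with "default=" removed.
# FILE itself is not modified.
without_default() {
    sed -e '/^default=/d' "$1"
}

# Typical use (not run here): review the difference first, and only then
# edit the real file and run "zipl -V --dry-run" followed by "zipl".
#   without_default /etc/zipl.conf | diff -u /etc/zipl.conf -
```

The leftover sections still have to be removed by hand; this only covers the default line.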

Comment 6 Lukáš Doktor 2019-01-10 14:55:48 UTC
Thank you for the detailed explanation. I removed the sections and the "default" line, and it seems to work well now. So let's only focus on "grubby --set-default", as s390x people are used to "/etc/zipl.conf" and might still use it on their machines.

Comment 7 Javier Martinez Canillas 2019-01-10 16:31:52 UTC
Created attachment 1519871
[PATCH] Make kernel-install to update default if present in zipl.conf

This has to be fixed in the 20-zipl-kernel.install script that's in the s390utils package.

I've attached a patch with the fix for this package and also changed the component.

Comment 9 Javier Martinez Canillas 2019-01-11 10:25:58 UTC
To reproduce this bug, a default has to be set in zipl.conf (e.g. with grubby) and then the kernel package has to be updated. On kernel update, the zipl.conf default is not updated.

An easy reproducer is the following sequence of commands:

$ grubby --set-default /boot/vmlinuz-0-rescue-$(cat /etc/machine-id)

$ grep default /etc/zipl.conf
default=Red Hat Enterprise Linux (0-rescue-e7d8d6d4c5d14f2c82bcae6c79d3b8d8) 8.0 (Ootpa)

$ dnf reinstall kernel-core

$ grep default /etc/zipl.conf
default=Red Hat Enterprise Linux (0-rescue-e7d8d6d4c5d14f2c82bcae6c79d3b8d8) 8.0 (Ootpa)

When reinstalling the kernel-core package, the default in zipl.conf has to be updated with the installed kernel. But instead the previously set value remains.

After the fix, it should be:

$ dnf reinstall kernel-core

$ grep default /etc/zipl.conf
default=Red Hat Enterprise Linux (4.18.0-58.el8.s390x) 8.0 (Ootpa)

Comment 10 Dan Horák 2019-01-11 10:40:36 UTC
Let's see if we can get an exception approved.

Comment 15 Vilém Maršík 2019-02-20 13:00:36 UTC
Looks fixed in s390utils-base-2.6.0-13.el8.s390x:

# rpm -q s390utils-base
s390utils-base-2.6.0-11.el8.s390x

# grep default= /etc/zipl.conf
default=Red Hat Enterprise Linux (0-rescue-39f6a359b23c4a1f95fd2d56fade0cd1) 8.0 (Ootpa)

# dnf reinstall kernel-core
(...)
# grep default= /etc/zipl.conf
default=Red Hat Enterprise Linux (0-rescue-39f6a359b23c4a1f95fd2d56fade0cd1) 8.0 (Ootpa)

# dnf upgrade s390utils
(...)
# rpm -q s390utils-base
s390utils-base-2.6.0-13.el8.s390x

# grep default= /etc/zipl.conf
default=Red Hat Enterprise Linux (0-rescue-39f6a359b23c4a1f95fd2d56fade0cd1) 8.0 (Ootpa)

# dnf reinstall kernel-core
(...)
# grep default= /etc/zipl.conf
default=Red Hat Enterprise Linux (4.18.0-68.el8.s390x) 8.0 (Ootpa)

Comment 16 Jan Stancek 2019-04-26 22:30:57 UTC
(In reply to Javier Martinez Canillas from comment #9)
> One bug though is that on kernel installation the default should be updated if this was set in zipl.conf

Why is there a requirement for a default to be present before a new one is set?

As you mentioned, after installation it's not present:
# cat /etc/zipl.conf
[defaultboot]
defaultauto
prompt=1
timeout=5
target=/boot

So, after kernel update nothing happens:
...
+ sed -i -e 's,^default=.*,default=Red Hat Enterprise Linux (4.18.0-80.el8.cki.s390x) 8.0 (Ootpa),' /etc/zipl.conf
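
The no-op can be seen without touching the real file. This is a small sketch, not the install script itself: preview_default_update is a hypothetical helper that applies the same substitution as the trace above but prints the result to stdout instead of using -i. When the input has no "default=" line, the output is identical to the input.

```shell
#!/bin/sh
# preview_default_update FILE TITLE   (hypothetical helper)
# Apply the substitution from the trace above to FILE and print the
# result, leaving FILE untouched. TITLE must not contain "," or "&",
# which are special in this sed expression.
preview_default_update() {
    sed -e "s,^default=.*,default=$2," "$1"
}

# Typical use (not run here): no diff output means nothing would change.
#   preview_default_update /etc/zipl.conf \
#       'Red Hat Enterprise Linux (4.18.0-80.el8.cki.s390x) 8.0 (Ootpa)' \
#       | diff -u /etc/zipl.conf -
```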

Comment 17 Javier Martinez Canillas 2019-04-27 07:11:11 UTC
(In reply to Jan Stancek from comment #16)
> (In reply to Javier Martinez Canillas from comment #9)
> > One bug though is that on kernel installation the default should be updated if this was set in zipl.conf
> 
> Why is there a requirement for a default to be present before a new one is set?
> 
> As you mentioned, after installation it's not present:
> # cat /etc/zipl.conf
> [defaultboot]
> defaultauto
> prompt=1
> timeout=5
> target=/boot
> 
> So, after kernel update nothing happens:
> ...
> + sed -i -e 's,^default=.*,default=Red Hat Enterprise Linux
> (4.18.0-80.el8.cki.s390x) 8.0 (Ootpa),' /etc/zipl.conf

Yes, it's not set but that shouldn't be a problem since zipl will sort the entries before updating the IPL.

Comment 18 Jan Stancek 2019-04-27 07:25:14 UTC
(In reply to Javier Martinez Canillas from comment #17)
> Yes, it's not set but that shouldn't be a problem since zipl will sort the
> entries before updating the IPL.

So, I'm assuming this is a case where the new kernel is viewed as 'smaller' than the current one:
  4.18.0-80.el8.cki.s390x < 4.18.0-80.el8.s390x

Veronika's (and her team's) observation is that there is an inconsistency between arches: while other arches boot the 'cki' kernel after an update, s390 does not.

Comment 19 Javier Martinez Canillas 2019-04-29 07:53:39 UTC
(In reply to Jan Stancek from comment #18)
> (In reply to Javier Martinez Canillas from comment #17)
> > Yes, it's not set but that shouldn't be a problem since zipl will sort the
> > entries before updating the IPL.
> 
> So, I'm assuming this is a case where the new kernel is viewed as 'smaller'
> than the current one:
>   4.18.0-80.el8.cki.s390x < 4.18.0-80.el8.s390x
> 
> Veronika's (and her team's) observation is that there is an inconsistency
> between arches: while other arches boot the 'cki' kernel after an update,
> s390 does not.

Yes, I think that's bug 1698363. I have to take a look at that. In the meantime, as a workaround for the s390x case, they can run: grubby --set-default /boot/4.18.0-80.el8.cki.s390x && zipl

Comment 21 Major Hayden 🤠 2019-10-17 16:24:42 UTC
Javier -- We've had that workaround in our scripts[0] since April 2019, but it has stopped working very recently. We started seeing the same issues (higher version number kernel being selected as default) starting today.

Comment 22 Javier Martinez Canillas 2019-10-17 18:59:16 UTC
(In reply to Major Hayden from comment #21)
> Javier -- We've had that workaround in our scripts[0] since April 2019, but

It seems that you forgot to add the reference for [0]?

> it has stopped working very recently. We started seeing the same issues
> (higher version number kernel being selected as default) starting today.

That's strange since the grubby package has not been updated since May 2019 and I don't see a change in the s390utils package that could have caused that regression.

What's the version of the installed grubby and s390utils packages?

Comment 23 Javier Martinez Canillas 2019-10-18 10:45:22 UTC
(In reply to Javier Martinez Canillas from comment #19)
> (In reply to Jan Stancek from comment #18)
> > (In reply to Javier Martinez Canillas from comment #17)
> > > Yes, it's not set but that shouldn't be a problem since zipl will sort the
> > > entries before updating the IPL.
> > 
> > So, I'm assuming this is a case where the new kernel is viewed as 'smaller'
> > than the current one:
> >   4.18.0-80.el8.cki.s390x < 4.18.0-80.el8.s390x
> > 
> > Veronika's (and her team's) observation is that there is an inconsistency
> > between arches: while other arches boot the 'cki' kernel after an update,
> > s390 does not.
> 
> Yes, I think that's bug 1698363. I have to take a look at that. In the
> meantime, as a workaround for the s390x case, they can run: grubby
> --set-default /boot/4.18.0-80.el8.cki.s390x && zipl

So I had a typo here... sorry about that. I meant the following:

 $ grubby --set-default /boot/vmlinuz-4.18.0-80.el8.cki.s390x && zipl
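
In a script, the path can be built from the kernel release and checked before anything is changed, which would have caught the missing "vmlinuz-" prefix. This is only a sketch: kernel_image_path and set_default_kernel are hypothetical helper names, and the /boot/vmlinuz-<release> layout is assumed from the paths in this report.

```shell
#!/bin/sh
# kernel_image_path RELEASE   (hypothetical helper)
# Print the expected image path for a kernel release.
kernel_image_path() {
    printf '/boot/vmlinuz-%s\n' "$1"
}

# set_default_kernel RELEASE   (hypothetical helper)
# Refuse to do anything unless the image exists; otherwise set it as the
# default with grubby and run zipl, as in the workaround above.
set_default_kernel() {
    image=$(kernel_image_path "$1")
    if [ ! -e "$image" ]; then
        echo "no kernel image at $image" >&2
        return 1
    fi
    grubby --set-default "$image" && zipl
}

# Typical use (not run here):
#   set_default_kernel 4.18.0-80.el8.cki.s390x
```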

Comment 24 Major Hayden 🤠 2019-10-18 12:46:48 UTC
Thanks for that, Javier. Can you let us know if we got it right this time? https://github.com/CKI-project/tests-beaker/pull/409

Comment 25 Javier Martinez Canillas 2019-10-18 12:55:41 UTC
(In reply to Major Hayden from comment #24)
> Thanks for that, Javier. Can you let us know if we got it right this time?
> https://github.com/CKI-project/tests-beaker/pull/409

Yes, looks good. Please let me know if it is still not working after your change.