Bug 1394259 - RHV-H 4.0 grub2-mkconfig fails
Summary: RHV-H 4.0 grub2-mkconfig fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: rhev-hypervisor-ng
Version: 4.0.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.1.1
Target Release: ---
Assignee: Ryan Barry
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-11 13:49 UTC by Alessio
Modified: 2017-04-20 19:01 UTC (History)

Fixed In Version: imgbased-0.9.11-0.1.el7ev
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-20 19:01:49 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:
rbarry: needinfo-


Attachments
comment 9: grub files (2.04 KB, application/x-gzip)
2017-02-04 09:49 UTC, Huijuan Zhao
Comment 12: grub files after reboot several times (1.75 KB, application/x-gzip)
2017-02-06 03:06 UTC, Huijuan Zhao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2017:1114 | 0 | normal | SHIPPED_LIVE | redhat-virtualization-host bug fix and enhancement update | 2017-04-20 22:57:46 UTC
oVirt gerrit | 68749 | 0 | 'None' | MERGED | service: copy kernel and initrd to /boot on startup | 2020-04-24 15:53:58 UTC
oVirt gerrit | 70051 | 0 | 'None' | MERGED | service: copy kernel and initrd to /boot on startup | 2020-04-24 15:53:58 UTC
oVirt gerrit | 72047 | 0 | 'None' | MERGED | Add a grub2 config file | 2020-04-24 15:53:58 UTC

Description Alessio 2016-11-11 13:49:15 UTC
Description of problem:

Hello all,
our RHEV 4.0 infrastructure consists of:

1x RHEV-M hosted engine vm
2x RHEV-H hypervisors


We are facing a GRUB issue.
Consider two scenarios:

1) In the first scenario, we have just installed both hypervisors with default options. Running grub2-mkconfig -o /boot/grub2/grub.cfg correctly updates grub.cfg with the current kernel references.
At this point we are running the 3.10.0-327.28.2.el7.x86_64 kernel, and both its vmlinuz and initramfs files are present under /boot.

2) In the second scenario, we have updated both hypervisors and are now running 3.10.0-327.36.1.el7.x86_64. This time, running grub2-mkconfig -o /boot/grub2/grub.cfg overwrites grub.cfg and omits the new kernel references, because the new kernel's files are now located under the /boot/rhvh-4.0-0.20161012.0+1 directory:

# ls -l /boot
total 80344
-rw-r--r--. 1 root root   126431 Jun 27 20:52 config-3.10.0-327.28.2.el7.x86_64
drwxr-xr-x. 3 root root     4096 Aug 17 22:07 efi
-rw-r--r--. 1 root root   178176 Sep  5  2014 elf-memtest86+-4.20
drwxr-xr-x. 2 root root     4096 Aug 17 22:11 extlinux
drwx------. 6 root root     4096 Nov 11 13:26 grub2
-rw-r--r--. 1 root root 47977751 Sep 26 13:06 initramfs-3.10.0-327.28.2.el7.x86_64.img
-rw-r--r--. 1 root root 24429118 Nov 11 13:10 initramfs-3.10.0-327.36.1.el7.x86_64kdump.img
-rw-r--r--. 1 root root   603547 Aug 17 22:20 initrd-plymouth.img
drwx------. 2 root root    16384 Sep 26 13:02 lost+found
-rw-r--r--. 1 root root   176500 Sep  5  2014 memtest86+-4.20
drwxr-xr-x. 2 root root     4096 Sep 26 13:07 rhvh-4.0-0.20160817.0+1
drwxr-xr-x. 2 root root     4096 Oct 19 17:49 rhvh-4.0-0.20160919.0+1
drwxr-xr-x. 2 root root     4096 Nov  3 14:33 rhvh-4.0-0.20161012.0+1
-rw-r--r--. 1 root root   252632 Jun 27 20:54 symvers-3.10.0-327.28.2.el7.x86_64.gz
-rw-------. 1 root root  2964948 Jun 27 20:52 System.map-3.10.0-327.28.2.el7.x86_64
-rw-r--r--. 1 root root   326628 Nov 11  2014 tboot.gz
-rw-r--r--. 1 root root    12620 Nov 11  2014 tboot-syms
-rwxr-xr-x. 1 root root  5157728 Jun 27 20:52 vmlinuz-3.10.0-327.28.2.el7.x86_64



# ls -l /boot/rhvh-4.0-0.20161012.0+1
total 55044
-rw-r--r--. 1 root root   126431 Nov  3 14:32 config-3.10.0-327.36.1.el7.x86_64
-rw-r--r--. 1 root root 47860173 Nov  3 14:33 initramfs-3.10.0-327.36.1.el7.x86_64.img
-rw-r--r--. 1 root root   252739 Nov  3 14:32 symvers-3.10.0-327.36.1.el7.x86_64.gz
-rw-------. 1 root root  2965270 Nov  3 14:32 System.map-3.10.0-327.36.1.el7.x86_64
-rwxr-xr-x. 1 root root  5155840 Nov  3 14:32 vmlinuz-3.10.0-327.36.1.el7.x86_64



# grub2-mkconfig | grep 3.10.0
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-327.28.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.28.2.el7.x86_64.img
menuentry 'Red Hat Enterprise Linux (3.10.0-327.28.2.el7.x86_64) 7.2' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-327.28.2.el7.x86_64-advanced-/dev/mapper/rhvh-rhvh--4.0--0.20161012.0+1' {
        linux16 /vmlinuz-3.10.0-327.28.2.el7.x86_64 root=/dev/mapper/rhvh-rhvh--4.0--0.20161012.0+1 ro rd.lvm.lv=rhvh/rhvh-4.0-0.20161012.0+1 rd.lvm.lv=rhvh/swap rhgb hpsa.hpsa_allow_any=1 quiet
        initrd16 /initramfs-3.10.0-327.28.2.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-327.28.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.28.2.el7.x86_64.img
menuentry 'Red Hat Enterprise Linux GNU/Linux, with tboot 1.8.1 and Linux 3.10.0-327.28.2.el7.x86_64' --class red --class gnu-linux --class gnu --class os --class tboot {
        echo    'Loading Linux 3.10.0-327.28.2.el7.x86_64 ...'
        module /vmlinuz-3.10.0-327.28.2.el7.x86_64 /vmlinuz-3.10.0-327.28.2.el7.x86_64 root=/dev/mapper/rhvh-rhvh--4.0--0.20161012.0+1 ro rd.lvm.lv=rhvh/rhvh-4.0-0.20161012.0+1 rd.lvm.lv=rhvh/swap rhgb hpsa.hpsa_allow_any=1 quiet  intel_iommu=on
        module /initramfs-3.10.0-327.28.2.el7.x86_64.img /initramfs-3.10.0-327.28.2.el7.x86_64.img
done
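
The output above is consistent with grub2-mkconfig only considering kernel images that sit directly in /boot. The following is a minimal Python sketch of that behaviour, assuming the stock /etc/grub.d/10_linux script matches only top-level vmlinuz-* files and does not descend into subdirectories; the function name `visible_kernels` is hypothetical and is not part of grub2 or imgbased.

```python
import posixpath


def visible_kernels(paths, boot_dir="/boot"):
    """Return the kernel images that live directly in boot_dir.

    Approximates a non-recursive glob such as /boot/vmlinuz-*:
    files inside subdirectories of boot_dir are ignored.
    """
    found = []
    for path in paths:
        parent, name = posixpath.split(path)
        if parent == boot_dir and name.startswith("vmlinuz-"):
            found.append(path)
    return found


# Paths taken from the two directory listings in this report.
paths = [
    "/boot/vmlinuz-3.10.0-327.28.2.el7.x86_64",
    "/boot/rhvh-4.0-0.20161012.0+1/vmlinuz-3.10.0-327.36.1.el7.x86_64",
]
print(visible_kernels(paths))
# ['/boot/vmlinuz-3.10.0-327.28.2.el7.x86_64']
```

Under that assumption, the 3.10.0-327.36.1 kernel in the layer directory is never found, which matches the "Found linux image" lines above.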


I also noticed that the root logical volume was renamed during the last update:

# df -h /
Filesystem                                  Size  Used Avail Use% Mounted on
/dev/mapper/rhvh-rhvh--4.0--0.20161012.0+1  3.9G  1.8G  2.1G  47% /


# lvs | grep 2016
  rhvh-4.0-0.20160817.0                rhvh                                 Vwi---tz-k  14.78g pool00 root                                                      
  rhvh-4.0-0.20160817.0+1              rhvh                                 Vwi---tz--  14.78g pool00 rhvh-4.0-0.20160817.0                                     
  rhvh-4.0-0.20160919.0                rhvh                                 Vri---tz-k   3.81g pool00                                                           
  rhvh-4.0-0.20160919.0+1              rhvh                                 Vwi---tz--   3.81g pool00 rhvh-4.0-0.20160919.0                                     
  rhvh-4.0-0.20161012.0                rhvh                                 Vri---tz-k   3.81g pool00                                                           
  rhvh-4.0-0.20161012.0+1              rhvh                                 Vwi-aotz--   3.81g pool00 rhvh-4.0-0.20161012.0 47.11
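
For readers comparing the df and lvs output: the device-mapper name shown by df is derived from the volume group and logical volume names, with every hyphen inside a name doubled and a single hyphen joining the two. A small Python sketch of that mapping follows; the helper name `dm_name` is hypothetical.

```python
def dm_name(vg, lv):
    """Build the /dev/mapper path for an LVM logical volume.

    Hyphens inside the VG and LV names are doubled, then the two
    names are joined with a single hyphen.
    """
    return "/dev/mapper/{}-{}".format(
        vg.replace("-", "--"), lv.replace("-", "--")
    )


print(dm_name("rhvh", "rhvh-4.0-0.20161012.0+1"))
# /dev/mapper/rhvh-rhvh--4.0--0.20161012.0+1
```

So the root filesystem reported by df corresponds to the rhvh-4.0-0.20161012.0+1 volume, the only one marked active and open in the lvs listing.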
  

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Install rhev-h 4.0
2. Update the hypervisor
3. Run grub2-mkconfig

Actual results:
The most recently updated kernel is not offered at boot time

Expected results:
The most recently updated kernel is offered at boot time

Additional info:

Comment 1 Fabian Deutsch 2016-11-11 20:12:51 UTC
My first question is: what are you trying to achieve?

It looks like you are trying to install a custom kernel or a kernel update. Is this correct?

Please note that RHVH is only intended to receive image updates, not individual package updates (such as a kernel package).

Comment 2 Alessio 2016-11-12 17:29:28 UTC
I wanted to make the "hpsa.hpsa_allow_any=1" directive permanent, instead of interrupting the boot and editing the linux16 line in the GRUB menu.
So I added that parameter to /etc/default/grub and ran grub2-mkconfig -o /boot/grub2/grub.cfg
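
As an illustration of the edit described above, here is a minimal Python sketch that appends a kernel argument to the GRUB_CMDLINE_LINUX line of a grub defaults file held in a string. It assumes the value is wrapped in double quotes on a single line; the function name `add_kernel_arg` and the sample file content are hypothetical, and the sketch does not write to /etc/default/grub or run grub2-mkconfig.

```python
import re

# Matches a line such as: GRUB_CMDLINE_LINUX="rhgb quiet"
CMDLINE_RE = re.compile(r'^(GRUB_CMDLINE_LINUX=")([^"]*)(")$', re.MULTILINE)


def add_kernel_arg(grub_defaults, arg):
    """Return grub_defaults with arg appended to GRUB_CMDLINE_LINUX.

    The text is returned unchanged if arg is already present or if
    no matching GRUB_CMDLINE_LINUX line is found.
    """
    def _append(match):
        args = match.group(2).split()
        if arg not in args:
            args.append(arg)
        return match.group(1) + " ".join(args) + match.group(3)

    return CMDLINE_RE.sub(_append, grub_defaults)


# Hypothetical file content, for illustration only.
sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="rhgb quiet"\n'
print(add_kernel_arg(sample, "hpsa.hpsa_allow_any=1"))
```

The change only takes effect in the boot menu once grub.cfg is regenerated, which is the step that fails in this report.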

Comment 3 Fabian Deutsch 2016-11-12 20:40:30 UTC
Thanks - That's something we should fix.

For now I'd recommend manually editing grub.cfg until we have fixed the issue.

Comment 4 Sergio Seabra 2016-12-09 18:40:15 UTC
I've just hit this exact same bug (Internal Revenue Service in Portugal) on an upgrade of RHVH initiated from the UI.
This is rather critical, since any host that is upgraded via the UI exhibits this behaviour.
As a workaround, we moved the new vmlinuz image to /boot.

Comment 10 Huijuan Zhao 2017-02-04 09:49:11 UTC
Created attachment 1247661 [details]
comment 9: grub files

Comment 11 Ryan Barry 2017-02-04 16:19:00 UTC
Looks like this requires a follow-up patch.

Nice that grub2-mkconfig works. Not nice that the title is wrong (though it appears that it'll boot into RHVH correctly).

Can you please try rebooting after this (either twice, or once and checking grub.cfg)? I'm a little worried that we'll remove all those boot entries and leave the system unbootable, in which case this will need a new build (either to revert, which will also break virt-v2v again, or with path to /etc/grub2.conf.d/)

Comment 12 Huijuan Zhao 2017-02-06 03:04:53 UTC
(In reply to Ryan Barry from comment #11)
> Can you please try rebooting after this (either twice, or once and checking
> grub.cfg)? I'm a little worried that we'll remove all those boot entries and
> leave the system unbootable, in which case this will need a new build
> (either to revert, which will also break virt-v2v again, or with path to
> /etc/grub2.conf.d/)

After rebooting several times (once, twice, or three times), the boot entries change again and look like this:
----------------------------
tboot 1.8.1
----------------------------

Entering it (tboot 1.8.1) shows a submenu with these boot entries:
---------------------------
Red Hat Enterprise Linux GNU/Linux, with tboot 1.8.1 and Linux 3.10.0-514.6.1.el7.x86_64
Red Hat Enterprise Linux GNU/Linux, with tboot 1.8.1 and Linux 3.10.0-327.36.1.el7.x86_64
---------------------------

Selecting the first boot entry boots successfully into the new build (RHVH-4.1-20170202.0).
Selecting the second boot entry boots into emergency mode of the new build (RHVH-4.1-20170202.0).

So the old build (RHVH-4.0-20160919.0) can no longer be booted after rebooting twice.

Please refer to the attachment for detailed grub.cfg info.
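
To compare boot menus between reboots without reading grub.cfg by eye, the entry titles can be extracted with a short script. This is a rough Python sketch that assumes titles are quoted on the same line as the menuentry or submenu keyword; the function name `menu_titles` and the sample text are hypothetical and are not taken from the attached files.

```python
import re

# Captures the keyword and the quoted title that follows it.
ENTRY_RE = re.compile(
    r"""^\s*(menuentry|submenu)\s+(['"])(.*?)\2""", re.MULTILINE
)


def menu_titles(grub_cfg):
    """Return (kind, title) pairs for each menuentry/submenu line."""
    return [(m.group(1), m.group(3)) for m in ENTRY_RE.finditer(grub_cfg)]


# Hypothetical grub.cfg fragment, for illustration only.
sample_cfg = """
submenu "tboot 1.8.1" {
    menuentry 'Red Hat Enterprise Linux GNU/Linux, with tboot 1.8.1 and Linux 3.10.0-514.6.1.el7.x86_64' --class red {
    }
}
"""
for kind, title in menu_titles(sample_cfg):
    print(kind, "|", title)
```

Running it against the grub.cfg saved after each reboot would show which entries were dropped.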

Comment 13 Huijuan Zhao 2017-02-06 03:06:48 UTC
Created attachment 1247931 [details]
Comment 12: grub files after reboot several times

Comment 14 Ryan Barry 2017-02-06 04:34:54 UTC
Sandro -

The results of this were reported late, but we need to block/respin on this.

Either revert the patch which caused this (which will also break virt-v2v for the beta) while a grub2 script is written for GA, or write/patch before beta.

I'd guess the patch will be quick (~2 hours to write/test), but it's another thing to verify very late. Your call.

Comment 15 Sandro Bonazzola 2017-02-06 07:32:03 UTC
Let's fix this ASAP and go async if needed

Comment 16 Sandro Bonazzola 2017-02-06 08:01:47 UTC
(In reply to Ryan Barry from comment #11)
> I'm a little worried that we'll remove all those boot entries and
> leave the system unbootable, in which case this will need a new build
> (either to revert, which will also break virt-v2v again, or with path to
> /etc/grub2.conf.d/)

Can you please detail how this affects virt-v2v?

Comment 17 Ryan Barry 2017-02-06 13:10:52 UTC
See: https://bugzilla.redhat.com/show_bug.cgi?id=1392904

The fix for both bugs is to put the running kernel in /boot, which also causes the problem from comment#10 and comment#12.

Reverting the patch (to no longer put the kernel and initrd in /boot) will break v2v again, since /boot/kernel-... will no longer be correct.
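
The approach described in this comment (placing the running kernel and initrd in /boot) can be sketched roughly as follows. This is an illustrative Python sketch only, not the merged imgbased code: the function name `copy_boot_files` and its parameters are hypothetical, and the file naming simply follows the vmlinuz-<release> and initramfs-<release>.img names seen in the listings earlier in this report.

```python
import os
import shutil


def copy_boot_files(layer_dir, boot_dir, kernel_release):
    """Copy the kernel and initramfs for kernel_release into boot_dir.

    Files that do not exist in layer_dir are skipped. Returns the
    names of the files that were copied.
    """
    copied = []
    names = (
        "vmlinuz-" + kernel_release,
        "initramfs-" + kernel_release + ".img",
    )
    for name in names:
        src = os.path.join(layer_dir, name)
        if os.path.exists(src):
            # copy2 also preserves timestamps and permission bits.
            shutil.copy2(src, os.path.join(boot_dir, name))
            copied.append(name)
    return copied


# Example call (not executed here), using the running kernel release:
# copy_boot_files("/boot/rhvh-4.0-0.20161012.0+1", "/boot", os.uname().release)
```

With both files at the top level of /boot, grub2-mkconfig can find the current kernel again, at the cost of the extra menu entries reported in comment 10 and comment 12.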

Comment 18 Huijuan Zhao 2017-02-09 07:32:05 UTC
This issue, as described in comment 9 and comment 12, still occurs in redhat-virtualization-host-4.1-20170208.0.

From   redhat-virtualization-host-4.0-20170201.0
To     redhat-virtualization-host-4.1-20170208.0

Small differences:
1. After the first boot, the boot entries look like this:
---------------
Red Hat Enterprise Linux (3.10.0-514.6.1.el7.x86_64) 7.3
Red Hat Enterprise Linux (3.10.0-514.2.2.el7.x86_64) 7.3
tboot 1.9.4
---------------

2. After the second boot, the boot entries look like this:
---------------
tboot 1.9.4
---------------

Entering it shows these sub-entries:
---------------
Red Hat Enterprise Linux GNU/Linux, with tboot 1.9.4 and Linux 3.10.0-514.6.1.el7.x86_64
Red Hat Enterprise Linux GNU/Linux, with tboot 1.9.4 and Linux 3.10.0-514.2.2.el7.x86_64
---------------

Comment 23 Huijuan Zhao 2017-02-28 06:21:29 UTC
Test version:
From:
redhat-virtualization-host-4.0-20160919.0
To:
redhat-virtualization-host-4.1-20170222.0
imgbased-0.9.13-0.1.el7ev.noarch

Test steps:
Tested according to comment 9.

Test results:
1. The checks in step 3 are all correct.
2. In step 4, however, after running "# grub2-mkconfig -o /boot/grub2/grub.cfg" on the host:

2.1 After the first boot, the boot entries look like this:
--------------------------------
rhvh-4.1-0.20170223.0+1
rhvh-4.0-0.20160919.0+1
Red Hat Enterprise Linux (3.10.0-514.6.1.el7.x86_64) 7.3
Red Hat Enterprise Linux (3.10.0-327.36.1.el7.x86_64) 7.3
tboot 1.9.4
Red Hat Enterprise Linux Release 7.2 (on /dev/mapper/rhvh_dhcp--10--16-rhvh--4.0--0.20160919.0+1)
Advanced options for Red Hat Enterprise Linux release 7.2 (on /dev/mapper/rhvh_dhcp--10--16-rhvh--4.0--0.20160919.0+1)
Red Hat Enterprise Linux Release 7.2 (on /dev/mapper/rhvh_dhcp--10--16-root)
Advanced options for Red Hat Enterprise Linux release 7.2 (on /dev/mapper/rhvh_dhcp--10--16-root)
--------------------------------

Is this abnormal?

2.2 After the second and subsequent boots, the boot entries look like this:
--------------------------------
rhvh-4.1-0.20170223.0+1
rhvh-4.0-0.20160919.0+1
tboot 1.9.4
--------------------------------
This is correct.

So the boot entries after the first boot are not the expected result, is that right? Can I verify this bug?

Comment 24 Ryan Barry 2017-02-28 06:27:49 UTC
This is expected.

We could totally disable 10_linux, but this may have bad side effects when installing...

As it is, the new script relies on imgbased-clean-grub to pick out the right changes as normal, to avoid the risk of an unbootable system (for the same reason that the RHEL boot entries are only removed after the first boot).

Comment 25 Huijuan Zhao 2017-02-28 07:01:16 UTC
Thanks Ryan.

Based on comment 23 and comment 24, I am changing the status to VERIFIED.

Comment 26 errata-xmlrpc 2017-04-20 19:01:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1114

