Bug 2305981 - OSP16.2 to OSP17.1 upgrade breaks GRUB and makes it try to boot RHEL7
Summary: OSP16.2 to OSP17.1 upgrade breaks GRUB and makes it try to boot RHEL7
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 17.1 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: z4
Target Release: 17.1
Assignee: Juan Badia Payno
QA Contact: Archana Singh
URL:
Whiteboard:
Duplicates: 2327390
Depends On:
Blocks:
Reported: 2024-08-20 11:33 UTC by Kenny Tordeurs
Modified: 2025-04-03 04:25 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-14.3.1-17.1.20240919130753.el9ost openstack-tripleo-heat-templates-14.3.1-17.1.20240919123750.el8ost
Doc Type: Known Issue
Doc Text:
When you upgrade from RHOSP 16.2 to 17.1, during the system upgrade, a known issue causes GRUB to contain RHEL 7 entries instead of RHEL 8 entries. As a result, the hosts cannot reboot. This issue affects environments that previously ran RHOSP 13.0 or earlier. + *Workaround:* See the Red Hat Knowledgebase solution link:https://access.redhat.com/solutions/7096899[Openstack 16 to 17 FFU - During LEAPP upgrade UEFI systems do not boot due to invalid /boot/grub2/grub.cfg].
Clone Of:
Environment:
Last Closed: 2024-11-21 09:30:46 UTC
Target Upstream Version:
Embargoed:




Links
System                             ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker              OSP-32635       0        None      None    None     2024-08-22 02:51:47 UTC
Red Hat Knowledge Base (Solution)  7096899         0        None      None    None     2024-11-25 21:19:07 UTC
Red Hat Product Errata             RHSA-2024:9978  0        None      None    None     2024-11-21 09:30:49 UTC

Description Kenny Tordeurs 2024-08-20 11:33:26 UTC
Description of problem:
During the LEAPP upgrade phase of our servers, LEAPP breaks the GRUB configuration. We see this issue on a wide range of hosts when running `openstack overcloud upgrade run --yes --stack openstack07 --tags system_upgrade --limit openstackcontroller`

After this step, GRUB suddenly contains only RHEL 7 entries instead of the correct RHEL 8 and upgrade entries.
I can confirm that before the upgrade there were no issues with GRUB: it contained RHEL 8 entries and the hosts rebooted just fine.

We manually booted the nodes from the GRUB shell so that the upgrade could continue (very labor intensive).
But even after the upgrade the hosts cannot reboot on their own: they always end up in the RHEL 7 GRUB menu, which is most probably a leftover from the time this cluster was running OSP13 years ago and which LEAPP for some reason reinstated.


Version-Release number of selected component (if applicable):
17.1

How reproducible:
/

Steps to Reproduce:
/

Actual results:
GRUB configuration is corrupted; hosts end up in the RHEL 7 GRUB menu

Expected results:
no issues with grub

Additional info:
We hit this RHEL 7 boot issue on almost all nodes: controllers, Ceph nodes, and computes.

I actually just managed to find and fix the issue on the Ceph and compute nodes.
It appears that we were hitting the following known issue: https://access.redhat.com/solutions/7034430
/boot/grub2/grubenv was a regular file and not a symlink to /boot/efi/EFI/redhat/grubenv. After creating this symlink and running grub2-mkconfig, these nodes all boot fine by themselves.
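
The grubenv check described above can be sketched as a small read-only shell function. This is only an illustration, not the procedure from the Knowledgebase article: the function name `check_grubenv` and its return codes are hypothetical, and the two default paths are the ones quoted in this report. It changes nothing on the host.

```shell
#!/bin/sh
# Hypothetical helper: report whether a grubenv path is a symlink to the
# expected EFI grubenv. Read-only; it never creates or modifies files.
# Return codes (an assumption of this sketch):
#   0 = symlink to the expected file, 1 = symlink to something else,
#   2 = regular file (the situation described in this report), 3 = missing.
check_grubenv() {
    grubenv=${1:-/boot/grub2/grubenv}
    expected=${2:-/boot/efi/EFI/redhat/grubenv}

    if [ -L "$grubenv" ]; then
        # Resolve both paths so relative and absolute links compare equal.
        target=$(readlink -f "$grubenv" 2>/dev/null) || target=""
        want=$(readlink -f "$expected" 2>/dev/null) || want=""
        if [ -n "$target" ] && [ "$target" = "$want" ]; then
            echo "ok: $grubenv -> $target"
            return 0
        fi
        echo "mismatch: $grubenv -> $target"
        return 1
    elif [ -e "$grubenv" ]; then
        echo "regular file: $grubenv"
        return 2
    fi
    echo "missing: $grubenv"
    return 3
}

# Example call on a node (uses the default paths from this report):
#   check_grubenv
```

A result of "regular file" would match the state seen on the Ceph and compute nodes before the symlink was created; the controllers would be expected to report "ok".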

But our 3 controllers are a different case: they already had a correct /boot/grub2/grubenv symlink in place.
Unfortunately we don't have a screenshot of a successful boot before the upgrade, but once it failed for our controller1, I explicitly rebooted controller2 before running LEAPP and it booted fine and showed only RHEL 8.4 related GRUB entries.

For example when doing a grep on those controller nodes:
[root@openstackcontoller ~]# grep -ri menuentry /boot/
grep: /boot/grub2/i386-pc/gfxterm_menu.mod: binary file matches
grep: /boot/grub2/i386-pc/normal.mod: binary file matches
/boot/grub2/i386-pc/command.lst:*menuentry: normal
grep: /boot/grub2/i386-pc/syslinuxcfg.mod: binary file matches
grep: /boot/grub2/i386-pc/legacycfg.mod: binary file matches
/boot/grub2/grub.cfg:if [ x"${feature_menuentry_id}" = xy ]; then
/boot/grub2/grub.cfg:  menuentry_id_option="--id"
/boot/grub2/grub.cfg:  menuentry_id_option=""
/boot/grub2/grub.cfg:export menuentry_id_option
/boot/grub2/grub.cfg:menuentry 'Red Hat Enterprise Linux Server 7.9 Rescue d8a79cee73f84c04aa6da7a494db5c92 (3.10.0-1160.11.1.el7.x86_64)' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-1160.6.1.el7.x86_64-advanced-4b91c3b4-480b-48a0-94f2-a0c4f19923c2' {
/boot/grub2/grub.cfg:menuentry 'Red Hat Enterprise Linux Server (3.10.0-1160.11.1.el7.x86_64) 7.9 (Maipo)' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-1160.6.1.el7.x86_64-advanced-4b91c3b4-480b-48a0-94f2-a0c4f19923c2' {
/boot/grub2/grub.cfg:menuentry 'Red Hat Enterprise Linux Server (3.10.0-1160.6.1.el7.x86_64) 7.9 (Maipo)' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-1160.6.1.el7.x86_64-advanced-4b91c3b4-480b-48a0-94f2-a0c4f19923c2' {
/boot/grub2/grub.cfg:menuentry 'Red Hat Enterprise Linux Server (0-rescue-ba23abcd5d1f469f9a5fd4e16664c6f4) 7.9 (Maipo)' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-0-rescue-ba23abcd5d1f469f9a5fd4e16664c6f4-advanced-4b91c3b4-480b-48a0-94f2-a0c4f19923c2' {
/boot/efi/EFI/redhat/grub.cfg:if [ x"${feature_menuentry_id}" = xy ]; then
/boot/efi/EFI/redhat/grub.cfg:  menuentry_id_option="--id"
/boot/efi/EFI/redhat/grub.cfg:  menuentry_id_option=""
/boot/efi/EFI/redhat/grub.cfg:export menuentry_id_option
/boot/efi/EFI/redhat/grub.cfg:  menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {

As you can see, the two files are drastically different.

When I compare this to, for example, a compute in the same cluster, both files contain the same entries and not those static ones.
[root@openstackcompute ~]# grep -ri menuentry /boot/
grep: /boot/efi/EFI/redhat/grubx64.efi: binary file matches
/boot/efi/EFI/redhat/grub.cfg:if [ x"${feature_menuentry_id}" = xy ]; then
/boot/efi/EFI/redhat/grub.cfg:  menuentry_id_option="--id"
/boot/efi/EFI/redhat/grub.cfg:  menuentry_id_option=""
/boot/efi/EFI/redhat/grub.cfg:export menuentry_id_option
/boot/efi/EFI/redhat/grub.cfg:  menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
/boot/efi/EFI/BOOT/grub.cfg:if [ x"${feature_menuentry_id}" = xy ]; then
/boot/efi/EFI/BOOT/grub.cfg:  menuentry_id_option="--id"
/boot/efi/EFI/BOOT/grub.cfg:  menuentry_id_option=""
/boot/efi/EFI/BOOT/grub.cfg:export menuentry_id_option
/boot/efi/EFI/BOOT/grub.cfg:menuentry 'System setup' $menuentry_id_option 'uefi-firmware' {
/boot/grub2/grub.cfg:if [ x"${feature_menuentry_id}" = xy ]; then
/boot/grub2/grub.cfg:  menuentry_id_option="--id"
/boot/grub2/grub.cfg:  menuentry_id_option=""
/boot/grub2/grub.cfg:export menuentry_id_option
/boot/grub2/grub.cfg:menuentry 'System setup' $menuentry_id_option 'uefi-firmware' {
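
The comparison above was done by eye on the grep output. A minimal sketch of automating it follows, assuming that a stale entry can be recognised by `.el7.` or `(Maipo)` in the menuentry title, as in the controller output. The function name `find_el7_entries` is hypothetical, and the script only reads the files it is given.

```shell
#!/bin/sh
# Hypothetical helper: print "<file>: <title>" for every menuentry whose
# title looks like a RHEL 7 entry (contains ".el7." or "(Maipo)").
# Returns 0 if at least one such entry was found, 1 otherwise.
find_el7_entries() {
    found=1
    for cfg in "$@"; do
        [ -r "$cfg" ] || continue
        # Keep only menuentry lines with an el7-looking title, then
        # strip everything except the quoted title.
        matches=$(grep -E "^[[:space:]]*menuentry '[^']*(\.el7\.|\(Maipo\))" "$cfg" |
            sed -E "s/^[[:space:]]*menuentry '([^']*)'.*/\1/")
        if [ -n "$matches" ]; then
            found=0
            printf '%s\n' "$matches" | while IFS= read -r title; do
                printf '%s: %s\n' "$cfg" "$title"
            done
        fi
    done
    return $found
}

# Example call on a node (paths taken from the output above):
#   find_el7_entries /boot/grub2/grub.cfg /boot/efi/EFI/redhat/grub.cfg
```

On the controller shown above this would be expected to list the four static 7.9 entries from /boot/grub2/grub.cfg and nothing from /boot/efi/EFI/redhat/grub.cfg. Note that on RHEL 8 the kernel entries normally live as BLS snippets under /boot/loader/entries rather than as menuentry lines, which is consistent with the compute output containing only the firmware entry.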

Comment 35 errata-xmlrpc 2024-11-21 09:30:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHOSP 17.1.4 (openstack-tripleo-heat-templates) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:9978

Comment 36 Steve Baker 2024-11-25 21:19:07 UTC
*** Bug 2327390 has been marked as a duplicate of this bug. ***

Comment 38 Red Hat Bugzilla 2025-04-03 04:25:15 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

