Bug 2082813

Summary: [Azure][RHEL9] GEN1, GEN2 dual boot failing due to missing /etc/grub2-efi.cfg and /boot/efi/EFI/redhat/grubenv files in RHEL 9 RC 1
Product: Red Hat Enterprise Linux 9 Reporter: anujmaurya
Component: grub2 Assignee: Bootloader engineering team <bootloader-eng-team>
Status: CLOSED NOTABUG QA Contact: Release Test Team <release-test-team-automation>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 9.0 CC: eterrell, mamccoma, manikroy, mathapli, mlewando, rmetrich, sisatia, vkuznets, yacao, yuxisun
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-06-15 13:34:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                     Flags
Kickstart file used by packer   none
grub2 symlinks                  none

Description anujmaurya 2022-05-07 13:20:32 UTC
Description of problem:
In Azure, we support dual boot on GEN1/GEN2 in the same image.
For RHEL8, we were using the following configurations to meet the above-mentioned requirement.

# setup bios boot
grub2-install --target=i386-pc /dev/sda
rm -f /boot/grub2/grubenv
cp -pr /boot/efi/EFI/redhat/grubenv /boot/grub2/
rm -f /boot/efi/EFI/redhat/grubenv
/usr/sbin/grub2-mkconfig -o /etc/grub2.cfg

# setup uefi boot
fs_uuid="$(grub2-probe --target=fs_uuid /boot/grub2)"
cat << EOF > /etc/grub2-efi.cfg
search --no-floppy --fs-uuid --set=dev $fs_uuid
set prefix=(\$dev)/grub2
export \$prefix
configfile \$prefix/grub.cfg
EOF

According to the article https://fedoraproject.org/wiki/Changes/UnifyGrubConfig, the grubenv file and the /etc/grub2-efi.cfg symlink are no longer generated.
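
A minimal sketch for checking whether the EFI-side grub.cfg is only the small stub that chains to the real configuration, which is what the unified layout described in that article would produce. The helper name is_stub_cfg is hypothetical, and the check is a heuristic: it only looks for a "configfile" line and the absence of "menuentry" lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of grub2): succeeds if FILE looks like the
# small ESP stub that only chains to the real configuration, i.e. it has a
# "configfile" line and no "menuentry" lines.
is_stub_cfg() {
    cfg="$1"
    [ -f "$cfg" ] || return 1
    grep -q '^[[:space:]]*configfile ' "$cfg" || return 1
    ! grep -q '^[[:space:]]*menuentry ' "$cfg"
}

# Usage on an installed system (path taken from this report):
#   is_stub_cfg /boot/efi/EFI/redhat/grub.cfg && echo "ESP grub.cfg is a stub"
```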

How can we modify the scripts to support both UEFI and BIOS boot?

I tried to modify the script to support GEN1 boot:
# setup bios boot
grub2-install --target=i386-pc /dev/sda

/usr/sbin/grub2-mkconfig -o /etc/grub2.cfg

but the VM created from the vhd was not able to boot with the error:

 "out of range pointer..Aborted."


Version-Release number of selected component (if applicable):
RHEL-9.0.0-RC-1.0/BaseOS/x86_64/iso/RHEL-9.0.0-20220420.0-x86_64-dvd1.iso

How reproducible:
Always

Steps to Reproduce:
1. Create a VM on Hyper-V and install RHEL using a kickstart file containing the grub2 configuration above.
2. The build process above succeeds without boot setup.
3. Save the VHD, then boot the same VHD on Hyper-V. The VM is not able to boot.

Actual results:
The VM created from the VHD fails to boot.

Expected results:
The VM should boot successfully on both GEN1 and GEN2, thus supporting both BIOS and UEFI, with grub2 configured to enable dual boot on both generations of VMs.

Additional info:

Comment 1 mamccoma 2022-05-09 16:15:55 UTC
Hello,

MSFT is highlighting this as a blocker, as they are currently trying to build RHEL 9.0 from the RC 1 image.

Comment 2 Marta Lewandowska 2022-05-10 07:57:36 UTC
@anujmaurya Hi, did this work for you in RHEL-9 Beta, or just in RHEL-8?

Comment 3 anujmaurya 2022-05-10 08:48:32 UTC
No, we were not able to test dual boot on the RHEL 9 Beta either, as we were blocked on a previous step. REF: https://bugzilla.redhat.com/show_bug.cgi?id=2074800.

Since the Beta did not support installation of the systemd-resolved package, we moved to nightly builds and, more recently, RC1 for building the image.

Comment 4 anujmaurya 2022-05-10 09:36:56 UTC
Adding some more info about the RHEL 9 Test VM

>>cat /boot/efi/EFI/redhat/grub.cfg
search --no-floppy --fs-uuid --set=dev d5f10580-d676-4588-8765-5e394a7a9b67
set prefix=($dev)/grub2

export $prefix
configfile $prefix/grub.cfg


>>ls -ltr /boot/efi/EFI/redhat/
total 6160
-rwx------. 1 root root 2599440 Mar 10 17:18 grubx64.efi
-rwx------. 1 root root  929792 Apr 14 22:23 shimx64-redhat.efi
-rwx------. 1 root root  936808 Apr 14 22:23 shimx64.efi
-rwx------. 1 root root  936808 Apr 14 22:23 shim.efi
-rwx------. 1 root root  857304 Apr 14 22:23 mmx64.efi
-rwx------. 1 root root     108 Apr 14 22:23 BOOTX64.CSV
-rwx------. 1 root root    7458 May  9 15:54 grub.cfg.rpmsave
-rwx------. 1 root root     144 May  9 15:56 grub.cfg

>>ls -ltr /boot/grub2/
total 12
drwx------. 2 root root   25 May  9 15:54 fonts
-rw-r--r--. 1 root root 1024 May  9 15:56 grubenv
-rwx------. 1 root root 6825 May  9 15:56 grub.cfg


>>cat /boot/grub2/grubenv
# GRUB Environment Block
# WARNING: Do not edit this file by tools other than grub-editenv!!!
saved_entry=5d23009c6ea4498cb87575df7d4c2932-5.14.0-70.13.1.el9_0.x86_64

>>sfdisk -l /dev/sda
Disk /dev/sda: 64 GiB, 68719476736 bytes, 134217728 sectors
Disk model: Virtual Disk    
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: F2C7D0C1-5775-449E-B127-C4E909849050

Device       Start       End   Sectors  Size Type
/dev/sda1  1026048   2050047   1024000  500M Linux filesystem
/dev/sda2  2050048 134215679 132165632   63G Linux LVM
/dev/sda14    2048     10239      8192    4M BIOS boot
/dev/sda15   10240   1024000   1013761  495M EFI System

Partition table entries are not in disk order.


>> lsblk
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda                 8:0    0   64G  0 disk 
├─sda1              8:1    0  500M  0 part /boot
├─sda2              8:2    0   63G  0 part 
│ ├─rootvg-rootlv 253:0    0    2G  0 lvm  /
│ ├─rootvg-usrlv  253:1    0   10G  0 lvm  /usr
│ ├─rootvg-tmplv  253:2    0    2G  0 lvm  /tmp
│ ├─rootvg-homelv 253:3    0    1G  0 lvm  /home
│ └─rootvg-varlv  253:4    0    8G  0 lvm  /var
├─sda14             8:14   0    4M  0 part 
└─sda15             8:15   0  495M  0 part /boot/efi
sr0                11:0    1    8G  0 rom  



As you can see, sfdisk also reports the following:
Partition table entries are not in disk order.
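
The sfdisk listing above shows both a "BIOS boot" and an "EFI System" partition. A minimal sketch that automates that check is below; the helper name has_dual_boot_partitions is hypothetical, and it assumes the partition type is the last column of the sfdisk -l output, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: reads `sfdisk -l` output on stdin and succeeds only if
# it lists both a "BIOS boot" and an "EFI System" partition.
has_dual_boot_partitions() {
    awk '
        /BIOS boot[[:space:]]*$/  { bios = 1 }
        /EFI System[[:space:]]*$/ { efi = 1 }
        END { exit !(bios && efi) }
    '
}

# Usage (device name taken from this report):
#   sfdisk -l /dev/sda | has_dual_boot_partitions && echo "both boot partitions present"
```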

Comment 5 anujmaurya 2022-05-10 12:52:50 UTC
Created attachment 1878300 [details]
Kickstart file used by packer

Comment 7 Marta Lewandowska 2022-05-10 15:00:43 UTC
@anujmaurya Please try using the following in your kickstart: 

# Partitioning
clearpart --all --initlabel --disklabel=gpt
part prepboot  --size=4    --fstype=prepboot
part biosboot  --size=1    --fstype=biosboot
part /boot/efi --size=200  --fstype=efi
part /boot     --size=500  --fstype=xfs --label=boot
part / --fstype=xfs --size=5000 --grow=20

%packages
grub2-pc
grub2-efi-x64

%end

%post --erroronfail
grub2-install --target=i386-pc /dev/sda
parted /dev/sda disk_set pmbr_boot off
%end

You should not have to do anything with grub configuration. 
This works on KVM, so please try if it works for you, and let us know.

Thank you!

Comment 8 Renaud Métrich 2022-05-10 15:12:47 UTC
@anujmaurya Please try using the following in your kickstart (an amendment to Marta's proposal):

# Partitioning
clearpart --all --initlabel --disklabel=gpt
part biosboot  --size=1    --fstype=biosboot
part /boot/efi --size=200  --fstype=efi
part /boot     --size=500  --fstype=xfs --label=boot
part / --fstype=xfs --size=5000 --grow=20

%packages
grub2-pc
grub2-efi-x64

%end

%post --erroronfail
grub2-install --target=i386-pc /dev/sda
%end


With RHEL9, the proposal of Article https://access.redhat.com/articles/6718341 was implemented.
Note that your initial kickstart excerpt was not correct:
----
# setup bios boot
grub2-install --target=i386-pc /dev/sda
rm -f /boot/grub2/grubenv
cp -pr /boot/efi/EFI/redhat/grubenv /boot/grub2/
rm -f /boot/efi/EFI/redhat/grubenv
/usr/sbin/grub2-mkconfig -o /etc/grub2.cfg
----

In the excerpt above you were copying the "grubenv" file, which may result in a file that is no longer a single block.
grub2-editenv should have been used instead.
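
A minimal sketch of a sanity check for the environment block, assuming the standard 1024-byte block size and the header line visible in the grubenv dump earlier in this report. The helper name grubenv_looks_valid is hypothetical; the grub2-editenv calls in the usage comment are the supported way to create and inspect the file.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE looks like an intact GRUB environment
# block, i.e. exactly 1024 bytes long and starting with the expected header.
grubenv_looks_valid() {
    env_file="$1"
    [ -f "$env_file" ] || return 1
    size=$(wc -c < "$env_file")
    [ "$size" -eq 1024 ] || return 1
    [ "$(head -n 1 "$env_file")" = "# GRUB Environment Block" ]
}

# Usage on an installed system; grub2-editenv (re)creates the block if needed:
#   grubenv_looks_valid /boot/grub2/grubenv || grub2-editenv /boot/grub2/grubenv create
#   grub2-editenv /boot/grub2/grubenv list
```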

Comment 9 anujmaurya 2022-05-11 05:55:29 UTC
Thanks @rmetrich @mlewando @vkuznets for the kickstart changes.

I had already figured out that the following lines were unnecessary and removed them, but it still didn't work at that time:
rm -f /boot/grub2/grubenv
cp -pr /boot/efi/EFI/redhat/grubenv /boot/grub2/
rm -f /boot/efi/EFI/redhat/grubenv


I'll try the changes you suggested on Hyper-V and share updates here.

Was the issue due to the incorrect partitioning?


Thanks,
Anuj Maurya

Comment 10 Vitaly Kuznetsov 2022-05-11 12:25:15 UTC
I'm not exactly sure about the root cause but I also guess it's the partitioning.
For me, 'grub2-install --target=i386-pc /dev/sda' was succeeding but the system
wasn't able to boot with BIOS. UEFI always worked.

Comment 11 anujmaurya 2022-05-12 09:25:20 UTC
I was able to boot the image on both BIOS and UEFI in Hyper-V Manager with the following kickstart:


clearpart --all --initlabel --disklabel=gpt
part biosboot  --size=1    --fstype=biosboot
part /boot/efi --size=200  --fstype=vfat
part /boot     --size=500  --fstype=xfs --label=boot
# part / --fstype=xfs --size=5000 --grow
part pv.01 --fstype=lvmpv --size=32768 --grow
volgroup rootvg pv.01

logvol / --vgname=rootvg --fstype=xfs --size=2048 --name=rootlv
logvol /var --vgname=rootvg --fstype=xfs --size=8192 --name=varlv
logvol /home --vgname=rootvg --fstype=xfs --size=1024 --name=homelv
logvol /usr --vgname=rootvg --fstype=xfs --size=10240 --name=usrlv
logvol /tmp --vgname=rootvg --fstype=xfs --size=2048 --name=tmplv


%packages
grub2-pc
grub2-efi-x64

%end

%post --erroronfail
grub2-install --target=i386-pc /dev/sda
# parted /dev/sda disk_set pmbr_boot off
#enable ssh login for root
echo "PermitRootLogin yes" > /etc/ssh/sshd_config.d/01-permitrootlogin.conf
%end
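
A minimal sketch of a post-install check that both boot paths left their files behind. The helper name check_dual_boot_files is hypothetical, and the path list is an assumption based on the directory listings in this report plus the i386-pc directory that grub2-install --target=i386-pc is expected to populate under /boot/grub2.

```shell
#!/bin/sh
# Hypothetical helper: checks, under an optional root prefix, for the files a
# dual BIOS/UEFI install is expected to leave behind. The path list is an
# assumption, not an authoritative inventory.
check_dual_boot_files() {
    root="${1:-}"
    status=0
    for path in \
        /boot/grub2/i386-pc/core.img \
        /boot/efi/EFI/redhat/grubx64.efi \
        /boot/efi/EFI/redhat/grub.cfg \
        /boot/grub2/grub.cfg
    do
        if [ ! -e "$root$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Usage at the end of %post, or against a mounted image at /mnt/sysimage:
#   check_dual_boot_files || exit 1
#   check_dual_boot_files /mnt/sysimage
```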


We are facing another issue when provisioning a VM with walinuxagent on Azure.
To make the image Azure-compatible on RHEL 8, we used to edit the /etc/default/grub file and rebuild the grub configuration.

GRUB_CMDLINE_LINUX="console=tty1 console=ttyS0,115200n8 earlyprintk=ttyS0,115200 earlyprintk=ttyS0 net.ifnames=0"
GRUB_TERMINAL_OUTPUT="serial console"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"

Then we rebuilt the grub configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg

Will this work for both BIOS/UEFI?

Comment 12 anujmaurya 2022-05-12 09:29:38 UTC
Or do we need to build the EFI config separately?

grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

Comment 13 Renaud Métrich 2022-05-12 11:08:38 UTC
It will work with "grub2-mkconfig -o /boot/grub2/grub.cfg" because with the unified configuration it's now the real configuration file.
/boot/efi/EFI/redhat/grub.cfg is now a stub that should never be modified.

Hence it is better to use this: "grub2-mkconfig -o /etc/grub2.cfg"

I think /etc/grub2-efi.cfg is now a link to /boot/grub2/grub.cfg but I didn't verify.
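
For the /etc/default/grub settings from comment 11, a minimal sketch of applying them in a re-runnable way before regenerating the configuration. The helper name set_grub_default is hypothetical; it simply drops any earlier assignment of the key and appends the new one.

```shell
#!/bin/sh
# Hypothetical helper: sets KEY="VALUE" in a grub defaults file, replacing any
# existing assignment of KEY so the script can be re-run safely.
set_grub_default() {
    file="$1"; key="$2"; value="$3"
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Keep every line except earlier assignments of this key.
    grep -v "^${key}=" "$file" > "$tmp"
    printf '%s="%s"\n' "$key" "$value" >> "$tmp"
    # Write back in place so the original file's ownership and mode are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage with a value from comment 11, then regenerating the single config:
#   set_grub_default /etc/default/grub GRUB_TERMINAL_OUTPUT "serial console"
#   grub2-mkconfig -o /etc/grub2.cfg
```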

Comment 14 anujmaurya 2022-05-12 11:41:45 UTC
Hi @rmetrich,

Yes, both of them are symlinks to /boot/grub2/grub.cfg.

Would using the command below work for both UEFI and BIOS?

grub2-mkconfig -o /etc/grub2.cfg


Please find the screenshot attached.

Comment 15 anujmaurya 2022-05-12 11:42:14 UTC
Created attachment 1878909 [details]
grub2 symlinks

Comment 16 Renaud Métrich 2022-05-12 11:47:00 UTC
Yes it will work, it's the intent to store the configuration only in one place (/boot).
Please check the article I referenced for all details.
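
A minimal sketch for confirming from a script that the two /etc symlinks resolve to the same configuration file. The helper name same_target is hypothetical, and it assumes a readlink that supports -f (as in GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if both paths resolve to the same existing file.
same_target() {
    a=$(readlink -f "$1") || return 1
    b=$(readlink -f "$2") || return 1
    [ -e "$a" ] && [ "$a" = "$b" ]
}

# Usage on an installed system:
#   same_target /etc/grub2.cfg /etc/grub2-efi.cfg && echo "single shared grub.cfg"
```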

Comment 17 anujmaurya 2022-05-12 12:24:36 UTC
@rmetrich thanks for the confirmation.

Another question is regarding https://bugzilla.redhat.com/show_bug.cgi?id=2046431

Waagent is not able to mount /dev/sr0, which contains the provisioning VHD. Is there any change with respect to mount in RHEL 9?

Comment 18 Vitaly Kuznetsov 2022-05-12 12:57:04 UTC
(In reply to anujmaurya from comment #17)
> @rmetrich thanks for the confirmation.
> 
> Another question is regarding
> https://bugzilla.redhat.com/show_bug.cgi?id=2046431
> 
> Waagent is not able to mount /dev/sr0, which contains the provisioning VHD.
> Is there any change with respect to mount in RHEL 9?

(We should probably move the waagent-related discussion somewhere else, as this BZ is about the bootloader.)

Is there an error message? I'm not quite sure whether BZ#2046431 is related or not.

Also, is it possible to use cloud-init for provisioning?

Comment 19 Yuxin Sun 2022-05-13 13:20:56 UTC
Hi anujmaurya, 

There is a blocker issue when using WALA as the provisioning agent: BZ#2081944. I'm not quite sure whether you are hitting this issue. It causes VM provisioning with WALA to fail.

Comment 20 anujmaurya 2022-05-13 19:49:43 UTC
Yuxin, Can you provide me access to the bug https://bugzilla.redhat.com/show_bug.cgi?id=2081944

Comment 21 Yuxin Sun 2022-05-16 02:41:42 UTC
(In reply to anujmaurya from comment #20)
> Yuxin, Can you provide me access to the bug
> https://bugzilla.redhat.com/show_bug.cgi?id=2081944

Hi anujmaurya,

I've just added you to the BZ. Please check whether you can access it. The same issue is tracked on GitHub here: https://github.com/Azure/WALinuxAgent/issues/2582. Thanks!

Comment 22 Marta Lewandowska 2022-05-16 07:59:20 UTC
@anujmaurya Based on your comment #11, I understand that the original problem, which you opened the bug for, has been resolved. Could you please confirm?
Problems with WALA are addressed in another bug that you have been given access to.

Comment 23 anujmaurya 2022-06-15 11:42:38 UTC
This issue is fixed; you can close the BZ.