Bug 2220957
Summary: Completing an EFI base system build is impossible; the system always tries to boot from EFI Network after the final reboot

Product: Red Hat Satellite
Component: Compute Resources - VMWare
Version: 6.12.3
Hardware: x86_64
OS: Linux
Status: CLOSED MIGRATED
Severity: high
Priority: high
Reporter: Sayan Das <saydas>
Assignee: Shimon Shtein <sshtein>
QA Contact: Satellite QE Team <sat-qe-bz-list>
CC: ahumbe, chrobert, gtalreja, lstejska, mhulan, nalfassi, nshaik, rlavi, sshtein
Keywords: MigratedToJIRA, Triaged
Target Milestone: Unspecified
Target Release: Unused
Type: Bug
Last Closed: 2024-06-06 16:23:52 UTC
Description
Sayan Das
2023-07-06 16:47:05 UTC
Created attachment 1974328 [details]
Boot order settings being reconfigured in VMware console
If we look at the attachment in Comment 1, we will see something like this:

    This is the default boot order detected:
    BootOrder: 0003,0000,0001,0002
    BootCurrent: 0001
    Boot0000* EFI Virtual disk (0.0)
    Boot0001* EFI Network
    Boot0002* EFI Internal Shell (Unsupported option)
    Boot0003* Red Hat Enterprise Linux

So here we should not need to change anything: 0003 is the RHEL drive, and the expectation is that after the reboot the system will boot from 0003. But that does not happen (assuming I haven't used the efi_bootentry host parameter); it always boots from EFI Network, i.e. 0001.

Now, since I had used efi_bootentry, the following code comes into play:

https://github.com/theforeman/foreman/blob/develop/app/views/unattended/provisioning_templates/snippet/efibootmgr_netboot.erb#L31C1-L34C33

0003 was detected as the required boot entry, and the existing boot order was 0003,0000,0001,0002. We simply prepend 0003 to "0003,0000,0001,0002", so the new boot order becomes "0003,0003,0000,0001,0002". The duplication happens because we forget to remove the value of $id from $current before running "efibootmgr -o ${id},${current}".

Either way, the situation remains the same. One might think that having "0003,0003", i.e. the duplicate, is the issue here. So:

* I force-booted the system from the HDD
* Manually reconfigured the boot order: efibootmgr -o 0003,0000,0001,0002
* Forced the next boot to be 0003: efibootmgr -n 0003
* Rebooted

The result is the same: the VM still boots from Network instead of HDD. I repeated the same steps with 0000 and the result was unchanged.

I suspected the firmware might still hold a bad boot order, whereas it also shows the right boot order. Still, the firmware seems to honor only the boot order that was set while creating the host/VM, completely ignoring the settings from the UEFI Boot Manager.

I tested this with RHEL 7.9, 8.6, and 9.1 on VMware 7 infrastructure (without Secure Boot enabled), and the result is the same.
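The duplication described above can be avoided by stripping the detected entry from the current order before prepending it. Here is a minimal POSIX shell sketch of that idea (this is not the actual efibootmgr_netboot.erb code; the variable names only mirror the snippet's $id and $current, and the efibootmgr call is left commented out so the list logic can be exercised without real firmware):

```shell
# Sketch: prepend the wanted boot entry to the current BootOrder
# without duplicating it.
id="0003"
current="0003,0000,0001,0002"

# Drop every occurrence of $id from the comma-separated list,
# then rejoin the remaining entries with commas.
rest=$(printf '%s' "$current" | tr ',' '\n' | grep -vx "$id" | paste -sd, -)

if [ -n "$rest" ]; then
  neworder="${id},${rest}"
else
  neworder="$id"
fi

echo "$neworder"          # 0003,0000,0001,0002 -- no duplicate entry
# efibootmgr -o "$neworder"
```

With this guard in place, running the snippet repeatedly is idempotent: the entry is always moved to the front exactly once instead of accumulating copies.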
The only workaround is to disable the EFI Network device in the firmware, or to select "HDD, Network" as the boot device order before submitting the host/VM for build.

I also tested on a libvirt infrastructure [Satellite 6.13, RHEL 8.8 client VM (TianoCore edk2 UEFI)]. Almost everything remains the same, but on libvirt:

A) I don't need any manual hack: even if the VM boots from PXE/EFI Network, it gets the PXE menu and then successfully boots via "Chainload Grub2 EFI from ESP", i.e. it chainloads and boots me into the OS. That is very good news, unlike the VMware case.

B) Even if I do this on the VM:

    # efibootmgr
    BootCurrent: 0002
    Timeout: 0 seconds
    BootOrder: 0002,0003,0004,0005,0007,0001,0000,0006
    Boot0000* UiApp
    Boot0001* UEFI Misc Device
    Boot0002* UEFI PXEv4 (MAC:5254003C3993)
    Boot0003* UEFI PXEv6 (MAC:5254003C3993)
    Boot0004* UEFI HTTPv4 (MAC:5254003C3993)
    Boot0005* UEFI HTTPv6 (MAC:5254003C3993)
    Boot0006* EFI Internal Shell
    Boot0007* Red Hat Enterprise Linux

    # efibootmgr -o 0007,0002,0003,0004,0005,0001,0000,0006
    BootCurrent: 0002
    Timeout: 0 seconds
    BootOrder: 0007,0002,0003,0004,0005,0001,0000,0006
    Boot0000* UiApp
    Boot0001* UEFI Misc Device
    Boot0002* UEFI PXEv4 (MAC:5254003C3993)
    Boot0003* UEFI PXEv6 (MAC:5254003C3993)
    Boot0004* UEFI HTTPv4 (MAC:5254003C3993)
    Boot0005* UEFI HTTPv6 (MAC:5254003C3993)
    Boot0006* EFI Internal Shell
    Boot0007* Red Hat Enterprise Linux

it does not change the boot order of the devices in the firmware. Even if you change the boot order manually, the change persists only at runtime and then reverts. The only way I could change the boot order was to power off the VM, open it from the libvirt console, go to Boot Options, and fix the boot order there.

So perhaps the bigger question here is why, on VMware, "Chainload Grub2 EFI from ESP" fails to find any bootable device to chainload from, whereas no such issue exists with libvirt.
And the bigger question still: what in the world does "efibootmgr" do if it cannot even help with boot device selection, i.e. set a boot device priority that the firmware will honor?

Some more information is in https://bugzilla.redhat.com/show_bug.cgi?id=2058037 (which explains why VMware fails to do the chainloading).

https://github.com/theforeman/foreman/pull/9175/files added "connectefi scsi", but it is commented out, i.e. it has no effect by default.

Just to test whether it would work, I uncommented the "connectefi scsi" line in the already-deployed "/var/lib/tftpboot/grub2/grub.cfg-01-00-50-56-b4-34-19" and simply booted the VM, and it works: even if the VM boots from the network, it correctly chainloads the system from the HDD and boots into the OS.

For any new system build to work:

* We need to clone pxegrub2_chainload under the name "pxegrub2_chainload vmware", uncomment the "connectefi scsi" line, and save it with the correct Org and Location.
* We need to clone "PXEGrub2 default local boot" to "PXEGrub2 default local boot vmware", replace "pxegrub2_chainload" with "pxegrub2_chainload vmware", and save it with the correct Org, Location, and OS.
* In Administer --> Settings --> Provisioning, we need to set "PXEGrub2 default local boot vmware" as the value of "Local boot PXEGrub2 template".

A new system build will then work without any issues, on VMware 7 only. But even though it is one-time work, it is still a lot of effort for one small tweak.
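For orientation, the edited chainload entry in the deployed grub.cfg would look roughly like the fragment below. This is an illustrative sketch only: the menu title matches the one named in this report, but the surrounding search/chainloader lines are assumptions about what the rendered snippet contains; the one real change is uncommenting "connectefi scsi".

```
# Illustrative excerpt of a deployed /var/lib/tftpboot/grub2/grub.cfg-<MAC>
# after applying the cloned "pxegrub2_chainload vmware" snippet.
menuentry 'Chainload Grub2 EFI from ESP' {
  connectefi scsi                                 # was: #connectefi scsi
  search --file --set=root /EFI/redhat/grubx64.efi  # assumed search step
  chainloader /EFI/redhat/grubx64.efi               # assumed chainload path
}
```

The "connectefi scsi" command forces the firmware to connect EFI SCSI drivers before the search runs, which is why GRUB can then find the ESP on VMware where it previously saw no bootable device.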
In https://github.com/theforeman/foreman/blob/df3a0b0d970b229887e2d371568d16c8143c5aae/app/views/unattended/provisioning_templates/snippet/pxegrub2_chainload.erb#L46C2-L46C17 I propose we change

    #connectefi scsi

to

    <% if host_param_true?('vmware') -%>
    connectefi scsi
    <% end -%>

We can then add a note to the provisioning guide that users building VMs on VMware should add a "vmware" host parameter of type boolean with the value "true".

Upstream bug assigned to sshtein

Since https://github.com/theforeman/foreman/commit/7a77b3ce51a2c446c1389d7b742eb0749d821ef2, `connectefi` is no longer commented out. It should work by default.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates, and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of two footprints next to it and begin with "SAT-" followed by an integer. You can also find the issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

    "Bugzilla Bug" = 1234567

If you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.