Bug 1805122 - [Bare Metal][IPI] Installer does not change the boot order automatically
Summary: [Bare Metal][IPI] Installer does not change the boot order automatically
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.5.0
Assignee: Stephen Benjamin
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-20 10:13 UTC by Luis Arizmendi
Modified: 2020-05-27 11:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-02 16:31:26 UTC
Target Upstream Version:
Embargoed:


Description Luis Arizmendi 2020-02-20 10:13:26 UTC
Description of problem:

While running the deployment with openshift-baremetal-install, there is a point at which the nodes are powered on and PXE-boot a RHEL ironic image. That image installs CoreOS on the local disk, but the installer does not change the boot order, so when the node restarts it boots from PXE again instead of booting CoreOS. The installation does not progress unless you change the boot device manually during the deployment.

If you restart and select the local disk, CoreOS starts and fetches the ignition file, and after some time it reboots again. You then have to manually select boot from local disk a second time (the boot order is not changed at that point either).


Version-Release number of the following components:
4.4.0-0.nightly-2020-02-18-093529


How reproducible:

Steps to Reproduce:
1. Run baremetal IPI with openshift-baremetal-install
2. Wait until the installer uses the BMC to boot the nodes
3. Nodes boot from PXE again after reboot

Actual results:
The installer does not reconfigure the boot order

Expected results:
The installer reconfigures the boot order and the node boots from local disk when it reboots after the RHEL ironic image finishes

Additional info:
I used UEFI

Comment 2 Luis Arizmendi 2020-02-20 16:38:05 UTC
Just an additional note. I've found that during the PXE boot performed by metal3 (booting the workers), boot from NICs is removed from the boot order, so this behavior only occurs while PXE booting the master nodes from the bootstrap.

Comment 3 Stephen Benjamin 2020-02-20 16:44:32 UTC
Julia, does this sound familiar - could it be because we're not setting boot_capabilities on workers yet?

Comment 4 Julia Kreger 2020-02-20 17:46:22 UTC
Stephen, this sounds like the BMC may not be honoring the direction to boot from disk. The hardware that this was encountered on is somewhat known not to honor persistent boot commands, so the hardware should ideally be set to boot from disk by default at all times, and then ironic will signal it to boot from the network for installation. Also worth noting that UEFI is a bit of an afterthought to IPMI, and as such persistent boot commands are raw flags which are now actually blocked by some vendors' BMCs. We highly recommend using Redfish in this case.

Beyond this, there is not much I can say that can be done; this is not really a bug that can be discerned without logs from the ironic pod.
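
For reference, a minimal sketch of what a persistent boot-from-disk request looks like over Redfish, assuming the BMC implements the standard ComputerSystem `Boot` override properties (`BootSourceOverrideTarget`, `BootSourceOverrideEnabled`, `BootSourceOverrideMode`). The helper name `build_boot_override` is hypothetical and not part of ironic or the installer; the sketch only builds the JSON body and does not contact a BMC. The body would typically be sent as a PATCH to the system resource, whose path varies by vendor.

```python
import json


def build_boot_override(target="Hdd", persistent=True, uefi=True):
    """Build the JSON body for a Redfish PATCH on a ComputerSystem resource.

    Hypothetical helper for illustration only.
    target:     Redfish boot source, e.g. "Hdd" (local disk) or "Pxe" (network).
    persistent: "Continuous" keeps the override across reboots; "Once" applies
                to the next boot only.
    uefi:       selects the "UEFI" override mode instead of "Legacy".
    """
    return {
        "Boot": {
            "BootSourceOverrideTarget": target,
            "BootSourceOverrideEnabled": "Continuous" if persistent else "Once",
            "BootSourceOverrideMode": "UEFI" if uefi else "Legacy",
        }
    }


if __name__ == "__main__":
    # Persistent UEFI boot from local disk, as the reporter expected after install.
    print(json.dumps(build_boot_override(), indent=2))
    # One-time PXE boot, as used to start the installation image.
    print(json.dumps(build_boot_override(target="Pxe", persistent=False), indent=2))
```

Whether a given BMC honors the "Continuous" setting is exactly the hardware-dependent behavior discussed in this comment, so the request succeeding does not by itself guarantee the boot order changes.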

Comment 5 Amit Ugol 2020-02-21 05:29:52 UTC
I am not aware that this happened to others and if it were consistent we would have seen more errors such as this. What is the differentiator? Do you use specific hardware to install on?

Comment 7 Julia Kreger 2020-03-11 17:56:12 UTC
The server BIOS likely needs to be set to always boot from disk as opposed to always boot from network. We've seen that specific manufacturer's BMCs disregard persistent boot commands before, so I really think there is nothing code-wise that can really be "fixed" here.

Stephen, I think this is a candidate to be closed/wontfix.

Comment 8 Beth White 2020-04-02 16:31:26 UTC
Closing won't fix as per last comment from Julia.

