Bug 1680659
Summary: | Ironic iDrac driver not setting UEFI BIOS Boot Order | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Christopher Brown <chris.brown> |
Component: | openstack-ironic | Assignee: | Chris Dearborn <christopher_dearborn> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | mlammon |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 13.0 (Queens) | CC: | bfournie, chris.brown, christopher_dearborn, ggrimaux, mburns |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-06-04 13:06:37 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Christopher Brown
2019-02-25 13:26:44 UTC
Using UEFI requires a little setup prior to deploying OSP. Can you verify that the UEFI configuration is setup correctly? To do this:

1. login to the iDRAC GUI
2. navigate to Configuration->BIOS Settings->Network Settings
3. Verify that "PXE Device 1" is set to Enabled and PXE Device 2-4 are set to Disabled
4. Under "PXE Device 1 Settings", verify Interface is set to the correct PXE NIC port

When the overcloud is deployed, ironic sets the server to boot 1 time from the configured PXE NIC port, so if things are configured as above, there should be only 1 NIC port for it to boot from.

Hi Chris,

(In reply to Chris Dearborn from comment #1)
> Using UEFI requires a little setup prior to deploying OSP. Can you verify
> that the UEFI configuration is setup correctly? To do this:
> 1. login to the iDRAC GUI
> 2. navigate to Configuration->BIOS Settings->Network Settings
> 3. Verify that "PXE Device 1" is set to Enabled and PXE Device 2-4 are set
> to Disabled
> 4. Under "PXE Device 1 Settings", verify Interface is set to the correct PXE
> NIC port
>
> When the overcloud is deployed, ironic sets the server to boot 1 time from
> the configured PXE NIC port, so if things are configured as above, there
> should be only 1 NIC port for it to boot from.

Yes, this was set both manually and using the management software. We also tried with:

openstack baremetal node boot device set

and this temporarily set the correct boot device however the issue seems to be with "something" setting the boot device as the hard disk. For example, Ironic queues up a job to change the boot order. This happens, the node reboots but then another job queues to set it back to hard disk. So rather than failing to PXE boot, it never gets to PXE boot.

FYI, when ironic changes the boot order, it does not change it permanently, but instead changes it to pxe boot 1 time only. As a result, you won't see the permanent boot order change.
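The checklist from comment #1 (steps 3 and 4) can be sketched as a small validation function. This is only an illustration: the dictionary keys mirror the labels shown in the iDRAC GUI rather than real iDRAC attribute names, and `check_pxe_settings`, the `"PXE Device 1 Interface"` key and the NIC port string are hypothetical names introduced here. How the settings are read from the iDRAC is out of scope.

```python
from typing import Dict, List


def check_pxe_settings(settings: Dict[str, str], expected_interface: str) -> List[str]:
    """Return a list of problems; an empty list means the checklist passes.

    `settings` is assumed to map GUI labels to their current values
    (hypothetical keys, not real iDRAC attribute names).
    """
    problems = []
    # Step 3: PXE Device 1 must be Enabled, PXE Device 2-4 must be Disabled.
    if settings.get("PXE Device 1") != "Enabled":
        problems.append("PXE Device 1 is not Enabled")
    for n in (2, 3, 4):
        if settings.get(f"PXE Device {n}") != "Disabled":
            problems.append(f"PXE Device {n} is not Disabled")
    # Step 4: PXE Device 1 must point at the correct PXE NIC port.
    if settings.get("PXE Device 1 Interface") != expected_interface:
        problems.append("PXE Device 1 Interface is not the expected PXE NIC port")
    return problems


# Example with made-up values; "pxe-nic-port" is a placeholder, not a real port name.
example = {
    "PXE Device 1": "Enabled",
    "PXE Device 2": "Disabled",
    "PXE Device 3": "Enabled",
    "PXE Device 4": "Disabled",
    "PXE Device 1 Interface": "pxe-nic-port",
}
for problem in check_pxe_settings(example, "pxe-nic-port"):
    print(problem)
```

An empty result only says the settings match the checklist. As the rest of this thread shows, a node can still fail to PXE boot under UEFI with correct settings.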
For clarity, is the issue that you're seeing that the overcloud nodes are trying to boot from the local disk when they should be PXE booting?

Also for clarity, you are NOT seeing an issue where it is trying to PXE boot from the wrong NIC port?

Hi Chris,

(In reply to Chris Dearborn from comment #3)
> FYI, when ironic changes the boot order, it does not change it permanently,
> but instead changes it to pxe boot 1 time only. As a result, you won't see
> the permanent boot order change.

Yep, got that, although the fact that there is a --persistent option is misleading, but that's another matter.

> For clarity, is the issue that you're seeing that the overcloud nodes are
> trying to boot from the local disk when they should be PXE booting?

Yes.

> Also for clarity, you are NOT seeing an issue where it is trying to PXE boot
> from the wrong NIC port?

Correct. PXE boot is never attempted. As soon as we switch to Legacy BIOS, everything works.

We tested the redfish driver; however, iDrac 9 does not implement ForceRestart from the redfish specification, so this driver is unusable with hardware running iDrac 9 - separate issue though!

Hey Chris,

Can you check to see what BIOS version you are running on your overcloud nodes?

In my testing, it appears that a regression was introduced in the BIOS firmware starting with the 1.6.11 release. I've tested with the 1.4.9 and 1.5.6 releases, and both of those seem to work fine, so if you want a quick workaround, I would recommend falling back to the 1.5.6 release.

I am continuing to investigate possible workarounds and am also following up with folks on the firmware side.

Thanks,
Chris

Hi Chris,

(In reply to Chris Dearborn from comment #5)
> Hey Chris,
>
> Can you check to see what BIOS version you are running on your overcloud
> nodes?

1.6.12

>
> In my testing, it appears that a regression was introduced in the BIOS
> firmware starting with the 1.6.11 release.
> I've tested with the 1.4.9 and
> 1.5.6 releases, and both of those seem to work fine, so if you want a quick
> workaround, I would recommend falling back to the 1.5.6 release.

Thanks, we have had to workaround using Legacy BIOS for the moment.

> I am continuing to investigate possible workarounds and am also following up
> with folks on the firmware side.

Please do keep me updated. We have more deployment windows coming up in the next month so would potentially be able to test these fixes then.

*** Bug 1680927 has been marked as a duplicate of this bug. ***

Hey folks,

this issue has been fixed in the new BIOS version that is due to be shipped around mid-April.

Hi Chris,

(In reply to Chris Dearborn from comment #8)
> Hey folks,
>
> this issue has been fixed in the new BIOS version that is due to be shipped
> around mid-April.

Is this 3.30.30.30 or are we still waiting on this release?

Thanks

Hey Chris,

No, the fix isn't in 3.30.30.30. That's the latest version number for the Lifecycle Controller firmware, but the issue is actually in the BIOS firmware. The current release of the BIOS firmware for 14G is 1.6.13, and the bug is still present in that version. I can't supply the exact version number that the fix will be in because it is still changing internally as testing continues, but it will be in the BIOS firmware release that immediately follows 1.6.13, and it should ship in mid-April.

This issue should be fixed in BIOS version 2.1.8.

Hi Chris Brown - I see the case is closed and Chris D. has indicated the BIOS version it is fixed in. Can we close this bug or will you be able to test the updated BIOS?

Hi Bob,

(In reply to Bob Fournier from comment #12)
> Hi Chris Brown - I see the case is closed and Chris D. has indicated the
> BIOS version it is fixed in. Can we close this bug or will you be able to
> test the updated BIOS?

Thanks for following up. We are unable to test the updated BIOS but please feel free to close as this is not an issue with a Red Hat product.
An updated iDrac firmware shipped which enabled us to use redfish and therefore allowed us to switch to UEFI.

Thanks Chris.
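The BIOS versions mentioned in this thread can be summarised in a rough check. This is a minimal sketch, not a vendor-supplied rule: it assumes that every BIOS release from 1.6.11 up to, but not including, 2.1.8 is affected, which matches what was reported here (1.4.9 and 1.5.6 work; 1.6.11, 1.6.12 and 1.6.13 show the problem; 2.1.8 should contain the fix). Releases not named in the thread were not tested, and the function names are hypothetical.

```python
from typing import Tuple

# Version boundaries taken from the comments above; treating the range between
# them as uniformly affected is an assumption.
FIRST_AFFECTED = (1, 6, 11)  # regression reported "starting with the 1.6.11 release"
FIXED_IN = (2, 1, 8)         # "should be fixed in BIOS version 2.1.8"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '1.6.12' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def uefi_boot_order_affected(bios_version: str) -> bool:
    """Return True if the version falls in the range assumed to be affected."""
    return FIRST_AFFECTED <= parse_version(bios_version) < FIXED_IN


for version in ("1.5.6", "1.6.12", "2.1.8"):
    print(version, uefi_boot_order_affected(version))
```

Comparing integer tuples avoids the string-comparison trap where "1.6.9" would sort after "1.6.11".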