Bug 1394888

Summary: [Ironic] iDRAC hardware type does not work with UEFI boot mode
Product: Red Hat OpenStack
Reporter: arkady kanevsky <arkady_kanevsky>
Component: openstack-ironic
Assignee: Dmitry Tantsur <dtantsur>
Status: CLOSED ERRATA
QA Contact: mlammon
Severity: high
Priority: high
Version: 13.0 (Queens)
CC: arkady_kanevsky, audra_cooper, bfournie, cdevine, christopher_dearborn, david_paterson, dcain, dtantsur, jamsmith, jowood, kurt_hey, lmarsh, mburns, morazi, racedoro, rajini.karthik, randy_perryman, rhel-osp-director-maint, richard.pioso, srevivo, sumedh_sathaye
Target Milestone: beta
Keywords: OtherQA, Reopened, Triaged
Target Release: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-ironic-11.1.1-0.20180817221333.9ceb015.el7ost
Doc Type: Bug Fix
Doc Text:
This update fixes UEFI persistent boot mode support for Dell EMC PowerEdge 13th and 14th generation servers. Those servers now successfully boot into the deployed operating system in either persistent boot mode, BIOS or UEFI. The fix applies to servers managed through ironic's integrated Dell Remote Access Controller (iDRAC) management interface implementation ('idrac'), located in ironic.drivers.modules.drac.management. The bug is not resolved for PowerEdge 12th generation and earlier servers; however, BIOS boot mode continues to be supported on those servers. Prior to this update, the boot device would persist across subsequent reboots only when the server's boot mode was set to BIOS.
Clones: 1614964
Last Closed: 2019-01-11 11:47:00 UTC
Type: Feature Request
Bug Blocks: 1476902, 1588541, 1614964

Description arkady kanevsky 2016-11-14 16:13:38 UTC
Description of problem:
Add UEFI support to Ironic for Ocata

Create an upstream bug & BZ for this in Ocata timeframe
Need Red Hat help to implement in Ocata



Comment 1 arkady kanevsky 2016-11-14 16:15:46 UTC
Can be split into two BZs.

Comment 2 Dmitry Tantsur 2016-11-15 13:28:41 UTC
Hi! We've had full UEFI support in Ironic for quite a while, and in Director since Mitaka or even Liberty (enabled by default in Newton/OSP10). Please feel free to file specific bugs if you see some missing features or encounter any issues.

*** This bug has been marked as a duplicate of bug 1093245 ***

Comment 3 arkady kanevsky 2016-11-15 14:31:09 UTC
Last time we tried it on JS HW, it did not work as expected.
Adding a few folks to the CC list to dig deeper.

Comment 4 Dmitry Tantsur 2016-11-15 14:34:33 UTC
There were some problems with the RHEL 7.2 (and early RHEL 7.3) kernel, which should be fixed in the upcoming RHEL 7.3. Otherwise, please file bugs.

Comment 5 Dmitry Tantsur 2016-11-19 12:21:39 UTC
Reopening as a feature request against DRAC driver specifically.

Comment 8 Dmitry Tantsur 2017-04-12 10:41:44 UTC
The patch was abandoned due to Miles leaving the team.

Comment 9 Sean Merrow 2017-05-03 20:12:08 UTC
Hi Arkady,

For this one, the upstream patch [1] has been abandoned and is therefore not likely to make OSP 12. The recommendation is to assign someone from the Dell EMC team to drive this upstream, along with the QA. Red Hat engineering can offer guidance, help on the review side, and, of course, bring it downstream.

Regards,
Sean

[1] https://review.openstack.org/#/c/420107/

Comment 10 Sean Merrow 2017-05-03 20:15:16 UTC
Sorry, also forgot to add the comments from engineering I got recently:

This BZ should verify if the pxe_drac Ironic driver can be used on nodes with UEFI enabled.

The tests are simple: register a node in UEFI mode with Ironic and deploy an image, or just try to deploy OSP with director on a Dell node in UEFI mode.
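
A minimal sketch of the "register a node in UEFI mode" step, assuming the usual Ironic convention of requesting the boot mode through the node's properties/capabilities string (comma-separated key:value pairs such as boot_mode:uefi). The helper name set_boot_mode and the sample capabilities are hypothetical and only illustrate how the string could be built before it is applied to the node:

```python
def set_boot_mode(capabilities: str, boot_mode: str) -> str:
    """Return a capabilities string with boot_mode set, keeping other keys.

    Hypothetical helper: it only manipulates the "key:value,key:value"
    string; it does not talk to Ironic.
    """
    if boot_mode not in ("bios", "uefi"):
        raise ValueError(f"unsupported boot mode: {boot_mode!r}")

    pairs = {}
    for item in capabilities.split(","):
        if not item.strip():
            continue  # skip empty fragments, e.g. from an empty string
        key, _, value = item.partition(":")
        pairs[key.strip()] = value.strip()

    # Add boot_mode, or overwrite an existing value in place.
    pairs["boot_mode"] = boot_mode
    return ",".join(f"{key}:{value}" for key, value in pairs.items())


# Sample node properties (illustrative values, not from this bug).
node_properties = {"capabilities": "profile:compute,boot_option:local"}
node_properties["capabilities"] = set_boot_mode(
    node_properties["capabilities"], "uefi"
)
print(node_properties["capabilities"])
```

The resulting string would then be set as the node's capabilities property before deploying an image to the node.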

Miles Gould, who recently left Red Hat, noticed that changing the boot order from the pxe_drac driver wasn’t possible in UEFI mode. He tried making some changes in the driver, but they proved unstable and he decided to abandon them (https://review.openstack.org/#/c/420107/). Those changes were just part of the tests, but it could well be that UEFI on Dell hardware also requires specific settings in the firmware.

Could Dell verify this if they haven’t done that yet?

Do you know if JetStream uses pxe_drac and works in UEFI?

Comment 11 Chris Dearborn 2017-05-03 20:29:10 UTC
JS 10 uses pxe_drac, but deliberately puts the overcloud nodes into legacy boot mode because UEFI boot mode does not work.

Comment 12 Chris Dearborn 2017-05-03 20:55:11 UTC
I know that Miles had tried getting ironic to work with the pxe_drac driver and UEFI mode, and it didn't work.

I had spoken with him at an OpenStack summit, and we agreed that Dell would push RAID configuration forward, and he would push UEFI booting forward.  As a result, we've been focused on RAID config (we have quite a few changes to push upstream) and really haven't looked at UEFI booting at all other than to review his proposed patch.

Comment 15 Bob Fournier 2017-10-20 17:24:45 UTC
Hi Chris - just wondering if you will get to this during the Queens cycle?  If not we'll change the flags to push it off to rhos-14.

Comment 16 Chris Dearborn 2018-01-23 14:37:05 UTC
Hey Bob, we've started working on this, but it definitely won't make Queens.  Feel free to push it out.

Comment 17 Richard Pioso 2018-07-03 15:23:57 UTC
I submitted a change for review in Gerrit which resolves the bug -- https://review.openstack.org/#/c/545184/ . It is in the openstack/ironic project. No changes to the openstack/python-dracclient are needed.

It is marked Do Not Merge (DNM), because more automated unit tests and a release note are needed before it can be merged. Those will soon be forthcoming.

The functional code (production code) changes have been completed. They have been thoroughly tested against actual hardware configurations. The change adds support for UEFI boot mode to Dell EMC PowerEdge 13th and 14th generation servers with a wide variety of disk devices, including Dell PowerEdge RAID Controller (PERC) RAID volumes, Dell Boot Optimized Storage Solution (BOSS) RAID 1, Host Bus Adapter (HBA) controlled JBOD, BOSS SSD, and NVMe. Regression testing of legacy BIOS boot mode against those same configurations passed. Finally, 12th generation servers continue to work with legacy BIOS mode and fail with UEFI. The change is not intended to resolve the bug on 12th generation servers.

Comment 31 errata-xmlrpc 2019-01-11 11:47:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

Comment 32 Joe Wood 2019-03-22 20:32:21 UTC
Was there a backport of this to OSP 13? We are experiencing similar issues on two client deployments at the moment.

Comment 33 Richard Pioso 2019-03-22 20:53:53 UTC
(In reply to Joe Wood from comment #32)
> Was there a backport of this to OSP 13? We are experiencing similar issues on
> two client deployments at the moment.

Yes, it was backported to OSP 13. It first appeared in OSP 13 Zstream 2. More details are available at https://bugzilla.redhat.com/show_bug.cgi?id=1614964. Also, the upstream Ironic change was cherry-picked onto the stable/queens branch -- https://review.openstack.org/#/c/588843