Bug 1828885
Summary: | IPI deployment fails on Dell r640 nodes using redfish
---|---
Product: | OpenShift Container Platform
Reporter: | Michael Zamot <mzamot>
Component: | Bare Metal Hardware Provisioning
Bare Metal Hardware Provisioning sub component: | ironic
Assignee: | Dmitry Tantsur <dtantsur>
QA Contact: | Raviv Bar-Tal <rbartal>
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
CC: | beth.white, cpaquin, dtantsur, imelofer, jkreger, stbenjam
Version: | 4.4
Keywords: | OtherQA, Triaged
Target Release: | 4.6.0
Hardware: | x86_64
OS: | Linux
Doc Type: | Bug Fix
Doc Text: | Certain Dell firmware versions dropped support for configuring persistent boot via Redfish. A workaround has been provided to ensure successful deployment on such servers.
: | 1841216
Last Closed: | 2020-10-27 15:58:32 UTC
Type: | Bug
Bug Depends On: | 1841216
Description
Michael Zamot
2020-04-28 14:16:55 UTC
Unfortunately, this is a known issue caused by changes made to the Dell iDRAC firmware. Dell is aware that they created an incompatibility and has been working to correct it, although I thought it was fixed in 4.10.10.10. I'm following up with our Dell contacts to clarify.

Our Dell contacts indicate that they believe the fix is still pending release in firmware. They anticipate following up later today.

The suggested temporary workaround is to set the force_persistent_boot_device flag to True in a node's driver_info. There is, however, a huge caveat (one that does not manifest itself in the OpenStack context): if you ever want to reboot your node, you need to configure the boot sequence correctly (in the BIOS, outside of ironic). You have to make sure that the local disk comes *first*, followed by network boot. It's an unusual configuration. Failing to do so will result in the node booting into the introspection ramdisk on the next reboot.

> force_persistent_boot_device flag to True

Sorry, I meant "to Never".
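For anyone applying the workaround by hand, here is a minimal sketch of the JSON Patch body that could be sent in a PATCH request to the ironic node endpoint to set the flag. The helper name `persistent_boot_patch` is hypothetical, and the set of accepted values ("Default", "Always", "Never") reflects my understanding of the ironic documentation rather than anything stated in this bug, so please verify it against your ironic release.

```python
import json

# Assumption: these are the values ironic accepts for this driver_info key.
ALLOWED_VALUES = {"Default", "Always", "Never"}


def persistent_boot_patch(value="Never"):
    """Build a JSON Patch that sets driver_info/force_persistent_boot_device.

    Hypothetical helper for illustration; it only builds the request body
    and does not talk to ironic.
    """
    if value not in ALLOWED_VALUES:
        raise ValueError(
            "unexpected value %r, expected one of %s"
            % (value, sorted(ALLOWED_VALUES))
        )
    return [
        {
            "op": "add",
            "path": "/driver_info/force_persistent_boot_device",
            "value": value,
        }
    ]


if __name__ == "__main__":
    # Print the body; sending it to the node endpoint is left to the operator.
    print(json.dumps(persistent_boot_patch(), indent=2))
```

Note that the helper rejects the boolean True, which matches the correction above. The BIOS boot order caveat (local disk first, then network) still applies and is not handled by this patch.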
So, after some discussions and clarification from our Dell contacts about the availability of the fix, it seems we're going to have to implement the workaround, and I suspect we should go ahead and put up a giant warning. I should be able to whip up baremetal operator and upstream documentation changes to address this after my next call.

A nicer workaround that is limited to broken nodes and handles reboots: https://review.opendev.org/725239

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196