Bug 2088319

Summary: Redfish set boot device failed for node in OCP 4.9 latest RC
Product: OpenShift Container Platform
Component: Bare Metal Hardware Provisioning
Bare Metal Hardware Provisioning sub component: ironic
Reporter: Jacob Anders <janders>
Assignee: Jacob Anders <janders>
QA Contact: Amit Ugol <augol>
Docs Contact:
CC: augol, dtantsur, imelofer, manrodri, rpittau, tsedovic
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Version: 4.9
Keywords: OtherQA
Target Milestone: ---
Target Release: 4.9.z
Hardware: x86_64
OS: Linux
Fixed In Version:
Doc Type: Release Note
Doc Text:
Since ETag handling was added to Ironic (implemented during the upstream Yoga cycle; patches: https://review.opendev.org/c/openstack/sushy/+/818114 and https://review.opendev.org/c/openstack/sushy/+/818110), issues with ETag handling on old firmware versions have been observed increasingly often, in particular on HP machines. For example, on DL360G10, iLO 5 2.63 or later is required; otherwise, issues with ETag handling in the firmware may prevent Ironic from successfully provisioning the server. Running the latest firmware is always recommended, but when ETag issues occur, upgrading to the latest firmware is mandatory before taking any further troubleshooting steps.
Story Points: ---
Clone Of: 2088196
Environment:
Last Closed: 2022-05-25 04:30:33 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 2088716
Bug Blocks: 2088196
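
For readers unfamiliar with the ETag handling mentioned in the Doc Text, the following is a minimal sketch of how a Redfish client might send a conditional PATCH when setting the boot device. It is not the sushy implementation and not the fix referenced in this bug. The function names, the injectable `send` transport, the example path `/redfish/v1/Systems/1`, and the fallback order (ETag as returned, then the strong form of a weak ETag, then no precondition) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transport: (method, path, headers, body) -> (status, response headers).
# A real client would wrap an HTTP library here; this sketch keeps it injectable.
Response = Tuple[int, Dict[str, str]]
Transport = Callable[[str, str, Dict[str, str], Optional[dict]], Response]

# Example Redfish payload for a one-time PXE boot override.
BOOT_OVERRIDE = {
    "Boot": {
        "BootSourceOverrideTarget": "Pxe",
        "BootSourceOverrideEnabled": "Once",
    }
}


def if_match_candidates(etag: Optional[str]) -> List[Dict[str, str]]:
    """Return the header sets to try, in order (assumed fallback policy)."""
    candidates: List[Dict[str, str]] = []
    if etag:
        # First attempt: the ETag exactly as the BMC returned it.
        candidates.append({"If-Match": etag})
        if etag.startswith("W/"):
            # Second attempt: drop the weak-validator prefix, since some
            # firmware may only accept the strong form.
            candidates.append({"If-Match": etag[2:]})
    # Last resort: send the PATCH without any precondition.
    candidates.append({})
    return candidates


def patch_with_etag(send: Transport, path: str, body: dict) -> int:
    """GET the resource to learn its ETag, then PATCH with If-Match.

    Retries with the next candidate only on 412 Precondition Failed.
    Assumes the transport exposes the header under the exact key "ETag".
    """
    _, headers = send("GET", path, {}, None)
    status = 0
    for extra in if_match_candidates(headers.get("ETag")):
        status, _ = send("PATCH", path, extra, body)
        if status != 412:
            break
    return status
```

A caller would invoke `patch_with_etag(send, "/redfish/v1/Systems/1", BOOT_OVERRIDE)` and treat any non-2xx return value as a failure. Whether dropping the precondition entirely is acceptable depends on the deployment; it is included here only to show where such a fallback would sit.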

Comment 1 Jacob Anders 2022-05-19 13:01:33 UTC
A brief update on the current state of this BZ at the end of the day:
* suspected-missing backports are up for review upstream (links are posted above)
* ART Team have un-tagged the problem packages for 4.8/4.9 so they shouldn't be included in the upcoming z-stream release
* once we are able to test the new packages including the missing backports, we will validate whether this is a sufficient fix

Comment 2 Jacob Anders 2022-05-20 02:41:26 UTC
Note: we verified that the proposed fix works in https://bugzilla.redhat.com/show_bug.cgi?id=2088196#c7. I will now work through the OCP backport process to ensure prerequisites are met in order to be able to merge the fix across the affected releases (noop BZ for 4.10/4.11 may be required to meet process requirements).

Comment 3 Jacob Anders 2022-05-20 02:43:38 UTC
Adding the pending sushy release, currently under review, for tracking. It is a prerequisite for raising the OCP PR.

Comment 4 Jacob Anders 2022-05-20 09:27:34 UTC
Current status: we have a tested fix merged upstream (see https://bugzilla.redhat.com/show_bug.cgi?id=2088196#c7). We are waiting for the release of the library containing the fix so that we can raise a downstream PR to include it in the Ironic image corresponding to this OCP version.

Comment 5 Jacob Anders 2022-05-23 05:27:00 UTC
OCP 4.9 PR is now raised.

Comment 6 Jacob Anders 2022-05-24 08:28:06 UTC
This has been fixed by https://review.opendev.org/c/openstack/sushy/+/842461/. I verified the fix on 4.8 together with Manuel by manually patching the 4.8.40 container with it. The fix is identical on 4.8 and 4.9 and is an automatic cherry-pick, so this is sufficient to verify the fix on 4.9 as well. Setting status to VERIFIED/OtherQA.

Comment 7 Jacob Anders 2022-05-24 08:35:29 UTC
(I forgot the link to the 4.8 BZ confirming that verification was successful; it's here: https://bugzilla.redhat.com/show_bug.cgi?id=2088196#c11)

Comment 11 errata-xmlrpc 2022-05-25 04:30:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.35 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.