Bug 1888072 - Setting Supermicro node to PXE boot via Redfish doesn't take effect
Summary: Setting Supermicro node to PXE boot via Redfish doesn't take effect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Bob Fournier
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Depends On:
Blocks: 1888375 1892302
Reported: 2020-10-14 00:13 UTC by Bob Fournier
Modified: 2021-02-24 15:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: During deployment, Supermicro nodes require the setting of the boot mode along with the boot device when using Redfish, otherwise it reverts the boot mode to "Legacy". Consequence: Cannot deploy Supermicro nodes using Redfish. Fix: Set the boot mode in Supermicro if it got changed after setting the boot device. Result: Can properly deploy Supermicro nodes.
Clone Of:
: 1888375 1892302 (view as bug list)
Environment:
Last Closed: 2021-02-24 15:25:41 UTC
Target Upstream Version:
Embargoed:


Attachments
Ironic-conductor log file setting node to PXE boot which doesn't take effect (757.64 KB, text/plain)
2020-10-14 12:04 UTC, Bob Fournier


Links
System ID Private Priority Status Summary Last Updated
OpenStack Storyboard 2008252 0 None None None 2020-10-14 00:13:10 UTC
OpenStack gerrit 758856 0 None MERGED Sync boot mode when changing the boot device via Redfish 2021-02-18 19:28:13 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:26:09 UTC

Description Bob Fournier 2020-10-14 00:13:10 UTC
Description of problem:

Starting with a Supermicro node set to PXE boot (it was manually set via IPMI), Ironic successfully completes a deployment and sets the node to boot from disk using Redfish. However, a second deployment fails because the node keeps booting from disk. It appears the Redfish command that Ironic sends to change to PXE boot is not taking effect, perhaps because of the BootSourceOverrideEnabled setting.

The first time the node is set to boot from disk after writing the image:
2020-10-13 21:28:31.152 1 DEBUG sushy.connector [req-4841e280-8461-4351-a09a-5c8cfbe2c17a - - - - -] HTTP request: PATCH https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Hdd', 'BootSourceOverrideEnabled': 'Continuous'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

It takes effect and the node boots from disk:
'IndicatorLED': 'Off', 'PowerState': 'On', 'Boot': {'BootSourceOverrideEnabled': 'Continuous', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Hdd',

We then see the command that makes the boot device non-persistent:
2020-10-13 21:28:41.398 1 DEBUG sushy.connector [req-33f87403-3090-4afb-8c04-f8042aa61f81 - - - - -] HTTP request: PATCH https://mgmt-f06-h15-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Hdd'}};

which results in BootSourceOverrideEnabled 'Once'
'PowerState': 'On', 'Boot': {'BootSourceOverrideEnabled': 'Once', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Hdd',

And eventually:
'Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Non

=========

On the second deployment we see:
'Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'None',

Then the command to set the node back to PXE boot for introspection:
2020-10-13 19:56:45.095 1 DEBUG sushy.connector [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] HTTP request: PATCH https://mgmt-f06-h14-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideEnabled': 'Once'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102
2020-10-13 19:56:45.113 1 DEBUG sushy.connector [req-8b759858-1468-4051-98b8-a6bd4985df89 - - - - -] HTTP response for GET https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1: status code: 200 _op /usr/lib/python3.6/site-packages/sushy/connector.py:156

It is sent a second time shortly after:
2020-10-13 19:56:45.113 1 DEBUG sushy.connector [req-8b759858-1468-4051-98b8-a6bd4985df89 - - - - -] HTTP request: PATCH https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideEnabled': 'Once'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

We can see in a subsequent GET that BootSourceOverrideEnabled and BootSourceOverrideTarget have changed:
IndicatorLED': 'Off', 'PowerState': 'Off', 'Boot': {'BootSourceOverrideEnabled': 'Once', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideTarget': ['None', 'Pxe', 'Floppy', 'Cd', 'Usb', 'Hdd', 'BiosSetup']},

Ironic reboots the node (with this warning, which is a separate issue):
2020-10-13 19:56:59.846 1 WARNING sushy.resources.system.system [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] Could not figure out the allowed values for the reset system action for System 1
2020-10-13 19:56:59.846 1 DEBUG sushy.connector [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] HTTP request: POST https://mgmt-f06-h14-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1/Actions/ComputerSystem.Reset; headers: {'OData-Version': '4.0'}; body: {'ResetType': 'On'}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

** However the node boots to disk, not PXE. **

Eventually the node will return:
Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'None',


This is with:
Hardware - Supermicro 1029P
Firmware Revision: 01.71.17
BIOS Version: 3.0a
Redfish Version: 1.0.1

Comment 1 Bob Fournier 2020-10-14 12:04:57 UTC
Created attachment 1721450 [details]
Ironic-conductor log file setting node to PXE boot which doesn't take effect

Comment 2 Bob Fournier 2020-10-14 18:26:13 UTC
It looks like the issue is that we need to set the mode to UEFI prior to PXE booting, because it ends up reverting back to Legacy ('BootSourceOverrideMode': 'Legacy'). Working on a patch.

Comment 3 Bob Fournier 2020-10-14 22:44:17 UTC
The Supermicro seems to require the setting of the boot mode along with the boot device when using Redfish, otherwise it reverts the boot mode to "Legacy". This can be illustrated with a simple case:

Start with these boot settings:
$ curl -k --user XXXX https://10.1.41.239/redfish/v1/Systems/1/ | jq .
 "Boot": {
    "BootSourceOverrideEnabled": "Continuous",
    "BootSourceOverrideMode": "UEFI",
    "BootSourceOverrideTarget": "Pxe",

Then change only BootSourceOverrideEnabled and BootSourceOverrideTarget:
$ curl -k --user XXXX -X PATCH -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}' https://10.1.41.239/redfish/v1/Systems/1/

Reading the settings back shows that the mode has flipped to Legacy:
$ curl -k --user XXXX https://10.1.41.239/redfish/v1/Systems/1/ | jq .
 "Boot": {
    "BootSourceOverrideEnabled": "Once",
    "BootSourceOverrideMode": "Legacy",
    "BootSourceOverrideTarget": "Pxe",
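
The same case can be sketched in Python. This is only an illustration of the request bodies and the before/after comparison, not Ironic or sushy code; the helper names build_boot_patch and mode_flipped are made up for this example, and the two dictionaries are copied from the curl output above.

```python
import json


def build_boot_patch(target, enabled, mode=None):
    """Return the JSON body for a PATCH on the Systems resource.

    The mode is only included when it is passed explicitly.
    """
    boot = {
        "BootSourceOverrideTarget": target,
        "BootSourceOverrideEnabled": enabled,
    }
    if mode is not None:
        boot["BootSourceOverrideMode"] = mode
    return json.dumps({"Boot": boot})


def mode_flipped(before, after):
    """True if BootSourceOverrideMode differs between two Boot sections."""
    return before["BootSourceOverrideMode"] != after["BootSourceOverrideMode"]


# Boot sections as reported before and after the PATCH shown above.
before = {
    "BootSourceOverrideEnabled": "Continuous",
    "BootSourceOverrideMode": "UEFI",
    "BootSourceOverrideTarget": "Pxe",
}
after = {
    "BootSourceOverrideEnabled": "Once",
    "BootSourceOverrideMode": "Legacy",
    "BootSourceOverrideTarget": "Pxe",
}

# Body without the mode (the request that triggers the flip).
body_without_mode = build_boot_patch("Pxe", "Once")
# Body that also carries the mode.
body_with_mode = build_boot_patch("Pxe", "Once", mode="UEFI")

print(body_without_mode)
print(body_with_mode)
print("mode flipped:", mode_flipped(before, after))
```

The body that carries the mode could be passed to curl with -d in place of the one used above.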

Comment 4 Bob Fournier 2020-10-22 17:19:18 UTC
We've confirmed with Supermicro that the boot mode ("BootSourceOverrideMode") must be set in the Redfish request when setting the device ("BootSourceOverrideTarget" and "BootSourceOverrideEnabled"). This differs from other vendors such as Dell and HPE, which require that the mode NOT be set in the same request - see https://review.opendev.org/#/c/710846/.

I have a patch upstream that will fix the issue for Supermicro and not break other vendors - https://review.opendev.org/#/c/758856/.  We've verified that nodes boot properly to PXE with this patch.  It's still pending upstream reviews.
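
A rough sketch of the idea follows: set the device first without the mode, read the settings back, and only send the mode if it changed. This is not the upstream patch. FakeSupermicroBMC is a hypothetical in-memory stand-in that mimics the behavior seen in Comment 3, and set_boot_device_and_sync_mode is a made-up name. Sending the mode together with the device in the follow-up request is an assumption based on what Supermicro confirmed above.

```python
class FakeSupermicroBMC:
    """Hypothetical in-memory BMC that mimics the behavior in Comment 3."""

    def __init__(self, mode="UEFI"):
        self.boot = {
            "BootSourceOverrideEnabled": "Disabled",
            "BootSourceOverrideMode": mode,
            "BootSourceOverrideTarget": "None",
        }
        self.patches = []  # record of every PATCH body received

    def get_boot(self):
        return dict(self.boot)

    def patch_boot(self, settings):
        self.patches.append(dict(settings))
        self.boot.update(settings)
        # Observed quirk: a PATCH that omits the mode reverts it to Legacy.
        if "BootSourceOverrideMode" not in settings:
            self.boot["BootSourceOverrideMode"] = "Legacy"


def set_boot_device_and_sync_mode(bmc, target, enabled):
    """Set the boot device, then restore the boot mode if it changed."""
    wanted_mode = bmc.get_boot()["BootSourceOverrideMode"]
    device = {
        "BootSourceOverrideTarget": target,
        "BootSourceOverrideEnabled": enabled,
    }
    # First request carries no mode, as Dell and HPE require.
    bmc.patch_boot(device)
    if bmc.get_boot()["BootSourceOverrideMode"] != wanted_mode:
        # Assumption: resend the device together with the mode.
        bmc.patch_boot({**device, "BootSourceOverrideMode": wanted_mode})
    return bmc.get_boot()


uefi_bmc = FakeSupermicroBMC(mode="UEFI")
uefi_result = set_boot_device_and_sync_mode(uefi_bmc, "Pxe", "Once")

legacy_bmc = FakeSupermicroBMC(mode="Legacy")
legacy_result = set_boot_device_and_sync_mode(legacy_bmc, "Pxe", "Once")

print(uefi_result, len(uefi_bmc.patches))
print(legacy_result, len(legacy_bmc.patches))
```

In this sketch a BMC that keeps its mode after the first request never receives a request containing the mode, which is why other vendors should not be affected.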

Comment 9 errata-xmlrpc 2021-02-24 15:25:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

