Bug 1892302 - Setting Supermicro node to PXE boot via Redfish doesn't take effect
Summary: Setting Supermicro node to PXE boot via Redfish doesn't take effect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.6.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Bob Fournier
QA Contact: Raviv Bar-Tal
URL:
Whiteboard:
Depends On: 1888072
Blocks: 1888375
Reported: 2020-10-28 12:41 UTC by Bob Fournier
Modified: 2020-11-30 16:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1888072
Environment:
Last Closed: 2020-11-30 16:45:31 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 760119 0 None MERGED Sync boot mode when changing the boot device via Redfish 2021-02-17 14:26:33 UTC
Red Hat Product Errata RHBA-2020:5115 0 None None None 2020-11-30 16:45:53 UTC

Description Bob Fournier 2020-10-28 12:41:58 UTC
+++ This bug was initially created as a clone of Bug #1888072 +++

Description of problem:

Starting with a Supermicro node set to PXE boot (it was manually set via IPMI), we see Ironic successfully do a deployment and set the node to boot from disk using Redfish. However, deploying a second time fails because the node keeps booting to disk. It appears that the Redfish command Ironic sends to change to PXE boot is not taking effect, perhaps because of the BootSourceOverrideEnabled setting.

The first time the node is set to boot from disk after writing the image:
2020-10-13 21:28:31.152 1 DEBUG sushy.connector [req-4841e280-8461-4351-a09a-5c8cfbe2c17a - - - - -] HTTP request: PATCH https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Hdd', 'BootSourceOverrideEnabled': 'Continuous'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

It takes effect and the node does boot from disk:
'IndicatorLED': 'Off', 'PowerState': 'On', 'Boot': {'BootSourceOverrideEnabled': 'Continuous', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Hdd',

We then see the command for a non-persistent boot:
2020-10-13 21:28:41.398 1 DEBUG sushy.connector [req-33f87403-3090-4afb-8c04-f8042aa61f81 - - - - -] HTTP request: PATCH https://mgmt-f06-h15-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Hdd'}};

which results in BootSourceOverrideEnabled 'Once'
'PowerState': 'On', 'Boot': {'BootSourceOverrideEnabled': 'Once', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Hdd',

And eventually:
'Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Non

=========

On the second deployment we see:
'Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'None',

Then the command to set the node back to PXE boot for introspection:
2020-10-13 19:56:45.095 1 DEBUG sushy.connector [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] HTTP request: PATCH https://mgmt-f06-h14-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideEnabled': 'Once'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102
2020-10-13 19:56:45.113 1 DEBUG sushy.connector [req-8b759858-1468-4051-98b8-a6bd4985df89 - - - - -] HTTP response for GET https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1: status code: 200 _op /usr/lib/python3.6/site-packages/sushy/connector.py:156

It is sent a 2nd time shortly after:
2020-10-13 19:56:45.113 1 DEBUG sushy.connector [req-8b759858-1468-4051-98b8-a6bd4985df89 - - - - -] HTTP request: PATCH https://mgmt-f07-h13-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1; headers: {'OData-Version': '4.0'}; body: {'Boot': {'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideEnabled': 'Once'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

We can see in a subsequent get that the BootSourceOverrideEnabled and BootSourceOverrideTarget have changed:
IndicatorLED': 'Off', 'PowerState': 'Off', 'Boot': {'BootSourceOverrideEnabled': 'Once', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideTarget': ['None', 'Pxe', 'Floppy', 'Cd', 'Usb', 'Hdd', 'BiosSetup']},

Ironic reboots the node (with this warning, which is a separate issue):
2020-10-13 19:56:59.846 1 WARNING sushy.resources.system.system [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] Could not figure out the allowed values for the reset system action for System 1
2020-10-13 19:56:59.846 1 DEBUG sushy.connector [req-6194fdaf-04ad-4c58-a51d-678af46bb6d3 - - - - -] HTTP request: POST https://mgmt-f06-h14-000-1029p.rdu2.scalelab.redhat.com/redfish/v1/Systems/1/Actions/ComputerSystem.Reset; headers: {'OData-Version': '4.0'}; body: {'ResetType': 'On'}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.6/site-packages/sushy/connector.py:102

** However the node boots to disk, not PXE. **

Eventually the node will return:
Boot': {'BootSourceOverrideEnabled': 'Disabled', 'BootSourceOverrideMode': 'Legacy', 'BootSourceOverrideTarget': 'None',


This is with:
Hardware - Supermicro 1029P
Firmware Revision: 01.71.17
BIOS Version: 3.0a
Redfish Version: 1.0.1

--- Additional comment from Bob Fournier on 2020-10-14 12:04:57 UTC ---



--- Additional comment from Bob Fournier on 2020-10-14 18:26:13 UTC ---

It looks like the issue is that we need to set the boot mode to UEFI prior to PXE booting, as it otherwise ends up reverting to Legacy ('BootSourceOverrideMode': 'Legacy'). Working on a patch.

--- Additional comment from Bob Fournier on 2020-10-14 22:44:17 UTC ---

The Supermicro seems to require that the boot mode be set along with the boot device when using Redfish; otherwise it reverts the boot mode to "Legacy". This can be illustrated with a simple case:

Start with these boot settings:
$ curl -k --user XXXX https://10.1.41.239/redfish/v1/Systems/1/ | jq .
 "Boot": {
    "BootSourceOverrideEnabled": "Continuous",
    "BootSourceOverrideMode": "UEFI",
    "BootSourceOverrideTarget": "Pxe",

Then change only BootSourceOverrideEnabled and BootSourceOverrideTarget
$ curl -k --user XXXX -X PATCH -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}' https://10.1.41.239/redfish/v1/Systems/1/

The mode has flipped to Legacy:
$ curl -k --user XXXX https://10.1.41.239/redfish/v1/Systems/1/ | jq .
 "Boot": {
    "BootSourceOverrideEnabled": "Once",
    "BootSourceOverrideMode": "Legacy",
    "BootSourceOverrideTarget": "Pxe",
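
The flip shown above can be checked mechanically by comparing the "Boot" section before and after the PATCH and flagging any field that changed without being part of the request. The following is only a rough sketch in Python; the helper name changed_boot_fields is hypothetical and is not part of sushy or Ironic, and the dictionaries are copied by hand from the curl output in this comment.

```python
def changed_boot_fields(before, after):
    """Return {field: (old, new)} for every Boot field whose value differs."""
    keys = sorted(set(before) | set(after))
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


# "Boot" section from the first GET in this comment.
before = {
    "BootSourceOverrideEnabled": "Continuous",
    "BootSourceOverrideMode": "UEFI",
    "BootSourceOverrideTarget": "Pxe",
}

# Fields sent in the PATCH body (note: no BootSourceOverrideMode).
requested = {
    "BootSourceOverrideTarget": "Pxe",
    "BootSourceOverrideEnabled": "Once",
}

# "Boot" section reported after the PATCH.
after = {
    "BootSourceOverrideEnabled": "Once",
    "BootSourceOverrideMode": "Legacy",
    "BootSourceOverrideTarget": "Pxe",
}

changes = changed_boot_fields(before, after)

# Keep only the changes that the request did not ask for.
unrequested = {k: v for k, v in changes.items() if k not in requested}
print(unrequested)  # {'BootSourceOverrideMode': ('UEFI', 'Legacy')}
```

Only BootSourceOverrideMode shows up as an unrequested change, which matches the observation that the BMC reverts the mode when it is omitted from the request.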

--- Additional comment from Bob Fournier on 2020-10-22 17:19:18 UTC ---

We've confirmed with Supermicro that the boot mode ("BootSourceOverrideMode") must be set in the same Redfish request that sets the device ("BootSourceOverrideTarget" and "BootSourceOverrideEnabled"). This differs from other vendors such as Dell and HPE, which require that the mode NOT be set in the same request - see https://review.opendev.org/#/c/710846/.

I have a patch upstream that will fix the issue for Supermicro and not break other vendors - https://review.opendev.org/#/c/758856/.  We've verified that nodes boot properly to PXE with this patch.  It's still pending upstream reviews.
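
To make the vendor difference concrete, here is a minimal sketch in Python of how a PATCH body could be built either way. This is NOT the upstream patch and does not reflect how Ironic or sushy actually implement it; the function build_boot_patch and its send_mode_with_device flag are hypothetical names used only for illustration. Only the Redfish property names and values are taken from this bug.

```python
import json


def build_boot_patch(target, enabled, mode=None, send_mode_with_device=False):
    """Build a Redfish PATCH body for the "Boot" section.

    send_mode_with_device is a hypothetical flag: True models a BMC that
    needs BootSourceOverrideMode in the same request as the boot device
    (as confirmed for Supermicro), False models a BMC that must not
    receive the mode in that request.
    """
    boot = {
        "BootSourceOverrideTarget": target,
        "BootSourceOverrideEnabled": enabled,
    }
    if send_mode_with_device and mode is not None:
        boot["BootSourceOverrideMode"] = mode
    return {"Boot": boot}


# Supermicro-style request: device and mode together.
supermicro_body = build_boot_patch(
    "Pxe", "Once", mode="UEFI", send_mode_with_device=True
)

# Request for a BMC that must not receive the mode with the device.
other_body = build_boot_patch("Pxe", "Once", mode="UEFI")

print(json.dumps(supermicro_body))
print(json.dumps(other_body))
```

The first body carries all three properties; the second carries only the target and the enabled setting, matching the requests seen in the logs in the description.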

Comment 5 errata-xmlrpc 2020-11-30 16:45:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.6 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5115

