Bug 1986238 - Supermicro X12 fails to provision using Redfish BM HW Provisioning
Summary: Supermicro X12 fails to provision using Redfish BM HW Provisioning
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Jacob Anders
QA Contact: Dave Cain
Docs Contact: Padraig O'Grady
URL:
Whiteboard:
Depends On:
Blocks: 2003035
Reported: 2021-07-27 03:24 UTC by Dave Cain
Modified: 2021-10-18 17:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Supermicro X11/X12 BMCs do not accept the Inserted and WriteProtected attributes in a Redfish VirtualMedia.InsertMedia request. Consequence: Virtual media attachment fails, which breaks OpenShift installations that rely on virtual media for provisioning. Fix: The sushy library was modified with a conditional that stops sending these optional attributes when they are not strictly required. Result: The OpenShift Assisted Installer no longer fails to attach virtual media because the Inserted and WriteProtected attributes are not allowed in the VirtualMedia.InsertMedia request body.
Clone Of:
Environment:
Last Closed: 2021-10-18 17:41:26 UTC
Target Upstream Version:


Links
System                  ID                               Private  Priority  Status  Summary                                             Last Updated
Github                  openshift ironic-image pull 204  0        None      None    None                                                2021-08-04 13:29:02 UTC
OpenStack Storyboard    2009086                          0        None      None    None                                                2021-07-29 01:55:29 UTC
OpenStack gerrit        802690                           0        None      NEW     Removing optional fields from insert_media payload  2021-07-29 01:55:29 UTC
OpenStack gerrit        803197                           0        None      None    None                                                2021-08-02 12:03:16 UTC
Red Hat Product Errata  RHSA-2021:3759                   0        None      None    None                                                2021-10-18 17:41:42 UTC

Description Dave Cain 2021-07-27 03:24:51 UTC
Description of problem:
The OpenShift Infrastructure Operator (assisted installer) currently cannot provision Supermicro X12 based systems.  It does power up the system, but fails during the discovery ISO attach stage.


Relevant component information and versioning:
Supermicro Chassis SYS-210P-FRDN6T
Supermicro Board X12SPM-LN6TF
Supermicro Firmware Version 1.00.03
Supermicro Firmware Build Time 04/21/2021
Supermicro Redfish Version 1.8.0
Supermicro BIOS Firmware Version 1.1
Supermicro BIOS Build Time 04/29/2021
Supermicro SFT-DCMS-SINGLE license applied

Red Hat OpenShift version 4.8.2
RHCOS 48.84.202107202156-0
Kernel 4.18.0-305.10.2.el8_4.x86_64
assisted-service-operator.v99.0.0-unreleased
hive-operator.v1.1.9
openshift-gitops-operator.v1.1.2

How reproducible:
Every time


Steps to Reproduce:
1. Install the Assisted Installer Operator and Hive onto a 4.8 cluster that acts as the hub (management) cluster
2. Create the InfraEnv, AgentClusterInstall, ClusterDeployment, and BareMetalHost (BMH) files for an OCP 4.8.2 spoke cluster deployment that includes a Supermicro X12 system
3. Start the install process and observe the failure message for the X12 system

Actual results:

Events:
  Type    Reason               Age   From                         Message
  ----    ------               ----  ----                         -------
  Normal  Registered           111s  metal3-baremetal-controller  Registered new host
  Normal  InspectionSkipped    100s  metal3-baremetal-controller  disabled by annotation
  Normal  ProfileSet           100s  metal3-baremetal-controller  Hardware profile set: unknown
  Normal  BMCAccessValidated   100s  metal3-baremetal-controller  Verified access to BMC
  Normal  PowerOn              99s   metal3-baremetal-controller  Host powered on
  Normal  ProvisioningStarted  45s   metal3-baremetal-controller  Image provisioning started for https://assisted-service-assisted-installer.apps.volt.cars.lab/api/assisted-install/v1/clusters/07245160-7133-4f5d-ae93-6d0d51a4c84b/downloads/image?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiMDcyNDUxNjAtNzEzMy00ZjVkLWFlOTMtNmQwZDUxYTRjODRiIn0.zsbNAS89TsmXRhcggyfieepPNHVV7BaDIN0SLjiJPVeMecOiO6SjNqF0fD_2uR0xmnhkqBV0CS6Pw3S7WRN3Fw
  Normal  ProvisioningError    25s   metal3-baremetal-controller  Image provisioning failed: Failed to deploy. Exception: HTTP POST https://172.28.11.42/redfish/v1/Managers/1/VirtualMedia/CD1/Actions/VirtualMedia.InsertMedia returned code 400. Base.v1_4_0.GeneralError: The property Inserted is not in the list of valid properties for the resource. Extended information: [{'MessageId': 'Base.1.4.PropertyUnknown', 'Severity': 'Warning', 'Resolution': 'Remove the unknown property from the request body and resubmit the request if the operation failed.', 'Message': 'The property Inserted is not in the list of valid properties for the resource.', 'MessageArgs': ['Inserted'], 'RelatedProperties': ['Inserted']}, {'MessageId': 'Base.1.4.PropertyUnknown', 'Severity': 'Warning', 'Resolution': 'Remove the unknown property from the request body and resubmit the request if the operation failed.', 'Message': 'The property WriteProtected is not in the list of valid properties for the resource.', 'MessageArgs': ['WriteProtected'], 'RelatedProperties': ['WriteProtected']}, {'MessageId': 'Base.1.4.PropertyValueFormatError', 'Severity': 'Warning', 'Resolution': 'Correct the value for the property in the request body and resubmit the request if the operation failed.', 'Message': 'The value http://10.40.0.123:6180/redfish/boot-6cdaa3ba-2d1b-4b66-933f-c3a719bee062.iso?filename=tmptt81btxv.iso for the property Image is of a different format than the property can accept.', 'MessageArgs': ['http://10.40.0.123:6180/redfish/boot-6cdaa3ba-2d1b-4b66-933f-c3a719bee062.iso?filename=tmptt81btxv.iso', 'Image'], 'RelatedProperties': ['[`~#$%&*()=+{}| \t;"\',<>?]']}]
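
The extended information in the ProvisioningError event bundles several complaints into one response. A minimal Python sketch, assuming the list has already been decoded into dictionaries, separates them by message type. The `summarize` helper is hypothetical (it is not part of sushy or Ironic), and the entries are abridged from the event above to the two fields the sketch reads.

```python
# Entries abridged from the ProvisioningError event; only the fields used
# below are kept. summarize() is a hypothetical helper, not part of sushy.
extended_info = [
    {"MessageId": "Base.1.4.PropertyUnknown", "MessageArgs": ["Inserted"]},
    {"MessageId": "Base.1.4.PropertyUnknown", "MessageArgs": ["WriteProtected"]},
    {
        "MessageId": "Base.1.4.PropertyValueFormatError",
        "MessageArgs": [
            "http://10.40.0.123:6180/redfish/"
            "boot-6cdaa3ba-2d1b-4b66-933f-c3a719bee062.iso"
            "?filename=tmptt81btxv.iso",
            "Image",
        ],
    },
]


def summarize(entries):
    """Split entries into unknown properties and badly formatted properties."""
    unknown, bad_format = [], []
    for entry in entries:
        # "Base.1.4.PropertyUnknown" -> "PropertyUnknown"
        kind = entry["MessageId"].rsplit(".", 1)[-1]
        args = entry.get("MessageArgs", [])
        if kind == "PropertyUnknown":
            unknown.append(args[0])      # args: [property]
        elif kind == "PropertyValueFormatError":
            bad_format.append(args[1])   # args: [value, property]
    return unknown, bad_format


unknown, bad_format = summarize(extended_info)
print("unknown properties:", unknown)
print("format errors on:", bad_format)
```

Read this way, the response reports a format problem with the Image value in addition to the two unknown properties.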

$ cat 04-bmh-tesla-cars-lab.yaml
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: du1-ldc1-tesla-cars-lab
  namespace: assisted-installer
  labels:
    infraenvs.agent-install.openshift.io: "infraenv-ran-skylark-cars-lab"
  annotations:
    inspect.metal3.io: disabled
    bmac.agent-install.openshift.io/role: worker
spec:
  online: true
  bootMACAddress: 3c:ec:ef:30:52:64
  automatedCleaningMode: disabled
  bmc:
    address: redfish-virtualmedia://172.28.11.42/redfish/v1/Systems/1/
    credentialsName: bmc-du1-ldc1-tesla-cars-lab
    disableCertificateVerification: True


Expected results:
Successful provisioning with Redfish, like with other hardware OEM platforms: https://docs.openshift.com/container-platform/4.8/installing/installing_bare_metal_ipi/ipi-install-installation-workflow.html#ipi-install-configuration-files


Additional info:
Let me know if you need anything specific, happy to provide.

Comment 2 Jacob Anders 2021-07-27 03:39:53 UTC
This is caused by a mismatch between sushy (which tries to set the Inserted property while attaching vMedia) and the SuperMicro BMC (which treats the Inserted and WriteProtected attributes as read-only). Looking into this.
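
As a rough illustration of the mismatch, the sketch below builds an InsertMedia request body that carries only Image unless the caller asks for a non-default value. It is not the sushy patch: `build_insert_media_payload` is a hypothetical helper, the example URL is made up, and the sketch assumes that the BMC treats Inserted and WriteProtected as true when they are omitted.

```python
def build_insert_media_payload(image, inserted=True, write_protected=True):
    """Build a VirtualMedia.InsertMedia body (hypothetical helper, not sushy's API).

    Inserted and WriteProtected are left out when they have their assumed
    default value of True, so a BMC that rejects them never sees them.
    """
    payload = {"Image": image}
    if not inserted:
        payload["Inserted"] = False
    if not write_protected:
        payload["WriteProtected"] = False
    return payload


# Default call: only the Image property is sent.
print(build_insert_media_payload("http://example.com/boot.iso"))
# Non-default value: the optional property is included explicitly.
print(build_insert_media_payload("http://example.com/boot.iso", write_protected=False))
```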

Comment 4 Jacob Anders 2021-07-28 00:35:09 UTC
Fix candidate is under review (see external tracker link).

I believe we will also need a matching minor Ironic change; I will look into that too.

Comment 5 Jacob Anders 2021-07-28 01:32:30 UTC
Adding reference to proposed Ironic change.

Comment 6 Jacob Anders 2021-07-29 01:55:29 UTC
It looks like we may need to split this into backportable and non-backportable components. Changes are up, added extra links as well as the upstream story requested in reviews.

Comment 9 Jacob Anders 2021-07-30 01:02:20 UTC
I tested https://review.opendev.org/c/openstack/sushy/+/802690/5 on Dell R640, HP e910 and SuperMicro X11 and all tests passed. 

Waiting for upstream reviews.

Comment 12 Jacob Anders 2021-07-30 10:27:31 UTC
There has been a report that https://review.opendev.org/c/openstack/sushy/+/802690/5 breaks virtual media on Lenovo (model number SD530 I think).

I uploaded https://review.opendev.org/c/openstack/sushy/+/802690/6 with a fix that should enable sushy to support both. Re-tested successfully on Dell/HP/Supermicro, requested a fellow upstream contributor who has access to Lenovo SD530 to test.
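
One possible way to support both kinds of BMC is to send the minimal body first and retry with the optional attributes only if that is rejected. The sketch below shows that strategy under stated assumptions; it is not necessarily what patch set 6 does. `BadRequest`, `insert_media`, `strict_bmc` and `demanding_bmc` are all hypothetical stand-ins, and the idea that the Lenovo BMC requires the attributes is assumed for illustration.

```python
class BadRequest(Exception):
    """Stand-in for an HTTP 400 from the BMC (hypothetical, not sushy's class)."""


def insert_media(post, image):
    """Try the minimal payload first, then retry with the optional fields."""
    payload = {"Image": image}
    try:
        return post(payload)
    except BadRequest:
        # Second attempt for BMCs that insist on the optional attributes.
        return post({**payload, "Inserted": True, "WriteProtected": True})


def strict_bmc(payload):
    """Fake BMC that rejects the optional attributes, like the X12 in this bug."""
    if "Inserted" in payload or "WriteProtected" in payload:
        raise BadRequest("unknown property")
    return sorted(payload)


def demanding_bmc(payload):
    """Fake BMC that requires the optional attributes (assumed behaviour)."""
    if "Inserted" not in payload or "WriteProtected" not in payload:
        raise BadRequest("missing property")
    return sorted(payload)


print(insert_media(strict_bmc, "http://example.com/boot.iso"))
print(insert_media(demanding_bmc, "http://example.com/boot.iso"))
```

Retrying on every 400 would also hide unrelated errors, so a real implementation would need to inspect the error details before deciding to retry.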

Comment 13 Jacob Anders 2021-08-02 10:53:59 UTC
https://review.opendev.org/c/openstack/sushy/+/802690/6 has merged into master. I will start looking into backports now.

Comment 14 Jacob Anders 2021-08-02 12:04:30 UTC
Backport is in CI and up for reviews https://review.opendev.org/c/openstack/sushy/+/803197/

Comment 15 Jacob Anders 2021-08-04 12:00:50 UTC
Both master and stable/wallaby changes have merged. Will post an update when ironic-image including the fix is available.

Comment 16 Jacob Anders 2021-08-09 08:28:50 UTC
https://github.com/openshift/ironic-image/pull/204 has now merged.

Comment 18 Lubov 2021-08-16 07:24:05 UTC
Hi, we don't have Supermicro machines. Can this BZ be verified on your side, please?

Comment 19 Dave Cain 2021-08-16 10:21:35 UTC
I'm happy to test if I can be provided a method to do so, preferably in OpenShift 4.8.

Comment 23 Lubov 2021-08-17 12:34:46 UTC
Unfortunately I cannot verify this BZ: I have no Supermicro machine.
Closing as OtherQA

Comment 24 Jeff Uphoff 2021-08-17 12:39:05 UTC
I have a SuperMicro system that I plan to add to my testing pipeline today to try to verify this.

(Not sure if I should change bug status for this...?)

Comment 25 Lubov 2021-08-17 12:40:53 UTC
(In reply to Jeff Uphoff from comment #24)
> I have a SuperMicro system that I plan to add to my testing pipeline today
> to try to verify this.

I'd appreciate it :)

> 
> (Not sure if I should change bug status for this...?)

Feel free to return it to on-qa

Comment 26 Jeff Uphoff 2021-08-19 17:39:57 UTC
I didn't get this verified before I had to turn my hardware over to someone else for some testing work. I'll pick this back up once I have access to the hardware again.

Comment 57 Dmitry Tantsur 2021-09-10 10:56:22 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=2003035 is the backport request, but we cannot proceed until this patch is verified. Could someone please do it?

Comment 58 Bertrand 2021-09-13 15:58:20 UTC
Hello Jeff,

Are you able to verify this BZ while leveraging Dave's HW?

Thanks,

Bertrand

Comment 59 Jeff Uphoff 2021-09-15 14:16:02 UTC
Perhaps? I'd need info on where it is, IPs, BMC info, etc. to try adding it to our Jenkins pipeline.

Comment 61 Dmitry Tantsur 2021-09-20 08:14:54 UTC
In https://bugzilla.redhat.com/show_bug.cgi?id=2003035 it was confirmed that 4.9 works (while 4.8 does not), so I'm marking this as verified.

Comment 62 Bertrand 2021-09-20 13:52:35 UTC
Now that this BZ has been verified, are we clear to have it backported?

Where are we tracking the OCP 4.8 Backport? 

Do we need a BZ tracking this Backport in OCP 4.8?

Comment 63 Iury Gregory Melo Ferreira 2021-09-20 13:58:40 UTC
@Bertrand we are tracking 4.8 in https://bugzilla.redhat.com/show_bug.cgi?id=2003035

Dmitry marked the bug as verified, but I got an answer from Dave via e-mail today and we need to discuss if the validation was sufficient.

Comment 64 Bertrand 2021-09-20 14:35:58 UTC
Thanks Iury.

I was expecting the same Summary between 2003035 (OCP 4.9)  and 1986238 (OCP 4.8). 

All good.

Comment 65 Bertrand 2021-09-22 10:34:53 UTC
(In reply to Bertrand from comment #64)
> Thanks Iury.
> 
> I was expecting the same Summary between 2003035 (OCP 4.9)  and 1986238 (OCP
> 4.8). 
> 
> All good.

Self correcting for the record: 2003035 (OCP 4.8.z)  and 1986238 (OCP 4.9)

Comment 66 Dave Cain 2021-09-30 22:58:32 UTC
Unfortunately, this is not working for the Supermicro X12 using this backported fix.  :(

I see a different error message this time:

  Normal  ProvisioningError    15s   metal3-baremetal-controller  Image provisioning failed: Failed to deploy. Exception: HTTP POST https://172.28.11.42/redfish/v1/Managers/1/VirtualMedia/CD1/Actions/VirtualMedia.InsertMedia returned code 400. Base.v1_4_0.GeneralError: The value http://10.40.0.122:6180/redfish/boot-dc5f752f-314b-4cea-a93c-85f008f5ce4d.iso?filename=tmp1p2l6hew.iso for the property Image is of a different format than the property can accept. Extended information: [{'MessageId': 'Base.1.4.PropertyValueFormatError', 'Severity': 'Warning', 'Resolution': 'Correct the value for the property in the request body and resubmit the request if the operation failed.', 'Message': 'The value http://10.40.0.122:6180/redfish/boot-dc5f752f-314b-4cea-a93c-85f008f5ce4d.iso?filename=tmp1p2l6hew.iso for the property Image is of a different format than the property can accept.', 'MessageArgs': ['http://10.40.0.122:6180/redfish/boot-dc5f752f-314b-4cea-a93c-85f008f5ce4d.iso?filename=tmp1p2l6hew.iso', 'Image'], 'RelatedProperties': ['[`~#$%&*()=+{}| \t;"\',<>?]']}]

I have tried two different BMC firmware versions here.  Perhaps it doesn't like the '?'?

It is worth mentioning that the ISO does mount to a Supermicro X11 system.
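
A quick way to test the '?' hypothesis is to check the URL against the character class that the BMC returned in the RelatedProperties field. The Python sketch below does that. Treating that field as the set of characters the BMC rejects in Image is an assumption, and `rejected_characters` is a hypothetical helper.

```python
import re

# Character class copied from the 'RelatedProperties' field of the error.
# Interpreting it as "characters the BMC rejects in Image" is an assumption.
REJECTED = re.compile(r"""[`~#$%&*()=+{}| \t;"',<>?]""")


def rejected_characters(url):
    """Return the distinct rejected characters in url, in order of appearance."""
    seen = []
    for char in REJECTED.findall(url):
        if char not in seen:
            seen.append(char)
    return seen


url = (
    "http://10.40.0.122:6180/redfish/"
    "boot-dc5f752f-314b-4cea-a93c-85f008f5ce4d.iso?filename=tmp1p2l6hew.iso"
)
print(rejected_characters(url))
# The same URL without its query string contains none of the listed characters.
print(rejected_characters(url.split("?", 1)[0]))
```

If that reading of the field is right, both the '?' and the '=' of the query string fall inside the rejected set.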

Comment 67 Lubov 2021-10-03 09:19:37 UTC
(In reply to Dave Cain from comment #66)

Maybe this error should be a separate bug, WDYT?

Assigning the BZ QA contact to you.

Comment 68 Jacob Anders 2021-10-05 03:01:36 UTC
(In reply to Lubov from comment #67)

I discussed the problem described in the error message with Dave in real time and we've done some additional investigation on the X12 machine. Based on the outcomes of this investigation I am fairly confident that:
1) the original issue is resolved (otherwise we wouldn't move on to the next issue which this error describes)
2) the error message in the comment I'm replying to is a new issue.

Dave can you please open a new BZ to cover the overly restrictive validation of the virtual media URL in SuperMicro X12?

Comment 69 Jacob Anders 2021-10-05 03:23:40 UTC
(In reply to Jacob Anders from comment #68)

Dave, do you agree with my assessment as per the previous comment? I wanted to make sure we're all on the same page so that we can work towards closing this bug. Adding a needinfo.

Comment 70 Iury Gregory Melo Ferreira 2021-10-05 11:27:12 UTC
+1 to open a new BZ with the new information.

Comment 71 Dave Cain 2021-10-05 17:21:14 UTC
Yes, agree with you Jacob.  

The original issue surfaced in this BZ appears to be addressed (RedFish VirtualMedia.InsertMedia request).  I will open a new BZ for the URL problems exhibited in https://bugzilla.redhat.com/show_bug.cgi?id=1986238#c66.

Thanks much!

Comment 72 Jacob Anders 2021-10-06 02:28:04 UTC
Thank you Dave!

Comment 74 errata-xmlrpc 2021-10-18 17:41:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

