Bug 2106378

Summary: Spoke BMH stuck “provisioning” after changing a BIOS attribute via the converged workflow
Product: OpenShift Container Platform
Component: Bare Metal Hardware Provisioning
Sub Component: ironic
Version: 4.11
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.12.0
Reporter: tali <tali>
Assignee: Dmitry Tantsur <dtantsur>
QA Contact: tali <tali>
CC: ccrum, rpittau, sasha
Keywords: OtherQA, TestBlocker, Triaged
Last Closed: 2023-01-17 19:52:29 UTC
Type: Bug

Description tali@redhat.com 2022-07-12 13:39:36 UTC
Description of problem:

When a spoke cluster is deployed with a HostFirmwareSettings CR via the converged workflow, the BMH gets stuck in the “provisioning” state after a BIOS attribute is successfully changed during deployment. The clean step times out after the host is rebooted and installed.

oc get Agent -n cnfde11
NAME                                   CLUSTER   APPROVED   ROLE     STAGE
aba3d84c-44c5-f521-e1b8-f24d29c26080   cnfde11   true       master   Done

Name:         cnfde11.ptp.lab.eng.bos.redhat.com
Namespace:    cnfde11
Labels:       infraenvs.agent-install.openshift.io=cnfde11
Annotations:  argocd.argoproj.io/sync-wave: 1
              bmac.agent-install.openshift.io/hostname: cnfde11.ptp.lab.eng.bos.redhat.com
              bmac.agent-install.openshift.io/role: master
              ran.openshift.io/ztp-gitops-generated: {}
API Version:  metal3.io/v1alpha1
Kind:         BareMetalHost
Metadata:
  Creation Timestamp:  2022-07-12T00:31:06Z
  Finalizers:
    baremetalhost.metal3.io
  Generation:  2
  Managed Fields:
    API Version:  metal3.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:customDeploy:
          .:
          f:method:
    Manager:      assisted-service
    Operation:    Update
    Time:         2022-07-12T00:31:06Z
    API Version:  metal3.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"baremetalhost.metal3.io":
    Manager:      baremetal-operator
    Operation:    Update
    Time:         2022-07-12T00:31:06Z
    API Version:  metal3.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:argocd.argoproj.io/sync-wave:
          f:bmac.agent-install.openshift.io/hostname:
          f:bmac.agent-install.openshift.io/role:
          f:kubectl.kubernetes.io/last-applied-configuration:
          f:ran.openshift.io/ztp-gitops-generated:
        f:labels:
          .:
          f:infraenvs.agent-install.openshift.io:
      f:spec:
        .:
        f:automatedCleaningMode:
        f:bmc:
          .:
          f:address:
          f:credentialsName:
          f:disableCertificateVerification:
        f:bootMACAddress:
        f:bootMode:
        f:online:
        f:rootDeviceHints:
          .:
          f:deviceName:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2022-07-12T00:31:06Z
    API Version:  metal3.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:errorCount:
        f:errorMessage:
        f:goodCredentials:
          .:
          f:credentials:
            .:
            f:name:
            f:namespace:
          f:credentialsVersion:
        f:hardware:
          .:
          f:cpu:
            .:
            f:arch:
            f:clockMegahertz:
            f:count:
            f:flags:
            f:model:
          f:firmware:
            .:
            f:bios:
              .:
              f:date:
              f:vendor:
              f:version:
          f:hostname:
          f:nics:
          f:ramMebibytes:
          f:storage:
          f:systemVendor:
            .:
            f:manufacturer:
            f:productName:
            f:serialNumber:
        f:hardwareProfile:
        f:lastUpdated:
        f:operationHistory:
          .:
          f:deprovision:
            .:
            f:end:
            f:start:
          f:inspect:
            .:
            f:end:
            f:start:
          f:provision:
            .:
            f:end:
            f:start:
          f:register:
            .:
            f:end:
            f:start:
        f:operationalStatus:
        f:poweredOn:
        f:provisioning:
          .:
          f:ID:
          f:bootMode:
          f:image:
            .:
            f:url:
          f:raid:
            .:
            f:hardwareRAIDVolumes:
            f:softwareRAIDVolumes:
          f:rootDeviceHints:
            .:
            f:deviceName:
          f:state:
        f:triedCredentials:
          .:
          f:credentials:
            .:
            f:name:
            f:namespace:
          f:credentialsVersion:
    Manager:         baremetal-operator
    Operation:       Update
    Subresource:     status
    Time:            2022-07-12T01:18:31Z
  Resource Version:  9942763
  UID:               4623afc5-1827-4c5d-9269-83726645c3c1
Spec:
  Automated Cleaning Mode:  disabled
  Bmc:
    Address:                           redfish-virtualmedia+https://10.16.231.98/redfish/v1/Systems/1
    Credentials Name:                  bmh-secret
    Disable Certificate Verification:  true
  Boot MAC Address:                    3c:ec:ef:1e:d3:5e
  Boot Mode:                           UEFI
  Custom Deploy:
    Method:  start_assisted_install
  Online:    true
  Root Device Hints:
    Device Name:  /dev/sdb
Status:
  Error Count:    0
  Error Message:  
  Good Credentials:
    Credentials:
      Name:               bmh-secret
      Namespace:          cnfde11
    Credentials Version:  9910452
  Hardware:
    Cpu:
      Arch:             x86_64
      Clock Megahertz:  3900
      Count:            48
      Flags:
        3dnowprefetch
        abm
        acpi
        adx
        aes
        aperfmperf
        apic
        arat
        arch_capabilities
        arch_perfmon
        art
        avx
        avx2
        avx512_vnni
        avx512bw
        avx512cd
        avx512dq
        avx512f
        avx512vl
        bmi1
        bmi2
        bts
        cat_l3
        cdp_l3
        clflush
        clflushopt
        clwb
        cmov
        constant_tsc
        cpuid
        cpuid_fault
        cqm
        cqm_llc
        cqm_mbm_local
        cqm_mbm_total
        cqm_occup_llc
        cx16
        cx8
        dca
        de
        ds_cpl
        dtes64
        dtherm
        dts
        epb
        ept
        ept_ad
        erms
        est
        f16c
        flexpriority
        flush_l1d
        fma
        fpu
        fsgsbase
        fxsr
        hle
        ht
        ibpb
        ibrs
        ibrs_enhanced
        ida
        intel_ppin
        intel_pt
        invpcid
        invpcid_single
        lahf_lm
        lm
        mba
        mca
        mce
        md_clear
        mmx
        monitor
        movbe
        mpx
        msr
        mtrr
        nonstop_tsc
        nopl
        nx
        ospke
        pae
        pat
        pbe
        pcid
        pclmulqdq
        pdcm
        pdpe1gb
        pebs
        pge
        pku
        pln
        pni
        popcnt
        pse
        pse36
        pts
        rdrand
        rdseed
        rdt_a
        rdtscp
        rep_good
        sdbg
        sep
        smap
        smep
        smx
        ss
        ssbd
        sse
        sse2
        sse4_1
        sse4_2
        ssse3
        stibp
        syscall
        tm
        tm2
        tpr_shadow
        tsc
        tsc_adjust
        tsc_deadline_timer
        vme
        vmx
        vnmi
        vpid
        x2apic
        xgetbv1
        xsave
        xsavec
        xsaveopt
        xsaves
        xtopology
        xtpr
      Model:  Intel(R) Xeon(R) Gold 6212U CPU @ 2.40GHz
    Firmware:
      Bios:
        Date:     05/18/2021
        Vendor:   American Megatrends Inc.
        Version:  3.5
    Hostname:     api.cnfde11.ptp.lab.eng.bos.redhat.com
    Nics:
      Mac:          ac:1f:6b:e1:1d:d2
      Model:        0x8086 0x158b
      Name:         ens2f0
      Ip:           10.16.231.52
      Mac:          3c:ec:ef:1e:d3:5e
      Model:        0x8086 0x37d2
      Name:         eno1
      Mac:          ac:1f:6b:e1:1d:d3
      Model:        0x8086 0x158b
      Name:         ens2f1
      Mac:          3c:ec:ef:1e:d3:5f
      Model:        0x8086 0x37d2
      Name:         eno2
    Ram Mebibytes:  98304
    Storage:
      Model:       INTEL SSDPELKX010T8
      Name:        /dev/nvme0n1
      Size Bytes:  1000204886016
      Type:        NVME
    System Vendor:
      Manufacturer:   Supermicro
      Product Name:   Super Server (To be filled by O.E.M.)
      Serial Number:  SHUBIWC00001
  Hardware Profile:   unknown
  Last Updated:       2022-07-12T01:32:03Z
  Operation History:
    Deprovision:
      End:    <nil>
      Start:  <nil>
    Inspect:
      End:    2022-07-12T00:45:39Z
      Start:  2022-07-12T00:31:29Z
    Provision:
      End:    <nil>
      Start:  2022-07-12T01:32:03Z
    Register:
      End:             2022-07-12T00:31:29Z
      Start:           2022-07-12T00:31:06Z
  Operational Status:  OK
  Powered On:          false
  Provisioning:
    ID:         1adf6c09-f81a-432b-8e9b-aef12e4de699
    Boot Mode:  UEFI
    Image:
      URL:  
    Raid:
      Hardware RAID Volumes:  <nil>
      Software RAID Volumes:
    Root Device Hints:
      Device Name:  /dev/sdb
    State:          provisioning
  Tried Credentials:
    Credentials:
      Name:               bmh-secret
      Namespace:          cnfde11
    Credentials Version:  9910452
Events:
  Type    Reason               Age   From                         Message
  ----    ------               ----  ----                         -------
  Normal  Registered           65m   metal3-baremetal-controller  Registered new host
  Normal  BMCAccessValidated   64m   metal3-baremetal-controller  Verified access to BMC
  Normal  InspectionStarted    64m   metal3-baremetal-controller  Hardware inspection started
  Normal  InspectionComplete   50m   metal3-baremetal-controller  Hardware inspection completed
  Normal  ProfileSet           50m   metal3-baremetal-controller  Hardware profile set: unknown
  Normal                       19m   metal3-baremetal-controller  Timeout reached while cleaning the node. Please check if the ramdisk responsible for the cleaning is running on the node. Failed on step {'args': {'settings': [{'name': 'PowerButtonFunction', 'value': '4 Seconds Override'}]}, 'interface': 'bios', 'step': 'apply_configuration', 'abortable': False, 'priority': 0}.
  Normal  ProvisioningStarted  4m8s  metal3-baremetal-controller  Image provisioning started for 
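
The timeout event above embeds the failed clean step as a Python-style dict literal. The following is a minimal triage sketch, assuming the message keeps the "Failed on step {...}." shape shown in that event; the helper name extract_failed_step is hypothetical and not part of any tool mentioned in this bug.

```python
import ast
import re

# Event message copied from the BMH events above.
MESSAGE = (
    "Timeout reached while cleaning the node. Please check if the ramdisk "
    "responsible for the cleaning is running on the node. Failed on step "
    "{'args': {'settings': [{'name': 'PowerButtonFunction', "
    "'value': '4 Seconds Override'}]}, 'interface': 'bios', "
    "'step': 'apply_configuration', 'abortable': False, 'priority': 0}."
)


def extract_failed_step(message):
    """Return the failed clean step as a dict, or None if none is found.

    Hypothetical helper: it assumes the step is printed as a Python dict
    literal after the words "Failed on step", as in the event above.
    """
    # Greedy match: from the first "{" to the last "}" in the message.
    match = re.search(r"Failed on step (\{.*\})", message, re.DOTALL)
    if match is None:
        return None
    # literal_eval parses literals only; it does not execute code.
    return ast.literal_eval(match.group(1))


step = extract_failed_step(MESSAGE)
print(step["interface"], step["step"])
for setting in step["args"]["settings"]:
    print(setting["name"], "=", setting["value"])
```

Here the step that timed out is the bios interface's apply_configuration step for PowerButtonFunction, which matches the HostFirmwareSettings CR used in the reproduction steps.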


Version-Release number of selected component (if applicable):
- Latest upstream assisted-service-operator
- OCP 4.11 on hub (4.11.0-fc.3)
- 4.10 spoke

How reproducible:
100%

Steps to Reproduce:
1. Deploy OCP 4.11 hub with upstream assisted-service-operator
2. Deploy a spoke using manually created CRs, including a HostFirmwareSettings CR:

cat HostFirmwareSettings.yaml
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareSettings
metadata:
  name: "cnfde11.ptp.lab.eng.bos.redhat.com"
  namespace: "cnfde11"
spec:
  settings:
    PowerButtonFunction: "4 Seconds Override"

Actual results:
The BMH is stuck in the "provisioning" state.

Expected results:
The Supermicro server is deployed as expected.

Additional info:

Comment 1 tali@redhat.com 2022-07-12 13:42:47 UTC
The must-gather is available: https://drive.google.com/file/d/1nQhCrfHRwTT1c6TbvwOZG5wItXWSdVGU/view?usp=sharing

Comment 3 Alexander Chuzhoy 2022-07-12 16:56:39 UTC
Trying to reproduce with an SNO spoke on real bare metal (Dell PowerEdge R640):
4.11.0-0.nightly-2022-07-11-080250
multicluster-engine.v2.1.0

[kni@r640-u01 ~]$ oc get bmh
NAME         STATE       CONSUMER   ONLINE   ERROR   AGE
master-1-0   preparing              true             35m
[kni@r640-u01 ~]$ oc get hostfirmwaresettings.metal3.io  master-1-0 -o yaml
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareSettings
metadata:
  creationTimestamp: "2022-07-12T16:18:48Z"
  generation: 1
  name: master-1-0
  namespace: qe1
  resourceVersion: "1263089"
  uid: a9bdfd70-8eb0-4286-9c57-b197ab06bc84
spec:
  settings:
    SecureBoot: Disabled
status:
  conditions:
  - lastTransitionTime: "2022-07-12T16:19:13Z"
    message: ""
    observedGeneration: 1
    reason: Success
    status: "True"
    type: ChangeDetected
  - lastTransitionTime: "2022-07-12T16:19:13Z"
    message: ""
    observedGeneration: 1
    reason: Success
    status: "True"
    type: Valid
  lastUpdated: "2022-07-12T16:19:13Z"
  schema:
    name: schema-8b6476c0
    namespace: qe1
  settings:
    AcPwrRcvry: Last
    AcPwrRcvryDelay: Immediate
    AcPwrRcvryUserDelay: "60"
    AesNi: Enabled
    AssetTag: ""
    AuthorizeDeviceFirmware: Disabled
    BootMode: Uefi
    BootSeqRetry: Enabled
    ConTermType: Vt100Vt220
    ControlledTurbo: Disabled
    ControlledTurboMinusBin: "0"
    CorrEccSmi: Enabled
    CpuInterconnectBusLinkPower: Disabled
    CpuInterconnectBusSpeed: MaxDataRate
    CurrentEmbVideoState: Enabled
    DcuIpPrefetcher: Enabled
    DcuStreamerPrefetcher: Enabled
    DellAutoDiscovery: PlatformDefault
    DellWyseP25BIOSAccess: Enabled
    DynamicCoreAllocation: Disabled
    EmbSata: AhciMode
    EmbVideo: Enabled
    EnergyPerformanceBias: MaxPower
    ErrPrompt: Enabled
    ExtSerialConnector: Serial1
    FailSafeBaud: "115200"
    ForceInt10: Disabled
    GenericUsbBoot: Disabled
    HddFailover: Enabled
    HddPlaceholder: Enabled
    HttpDev1EnDis: Disabled
    HttpDev1Interface: NIC.Integrated.1-1-1
    HttpDev1Protocol: IPv4
    HttpDev1Uri: ""
    HttpDev1VlanEnDis: Disabled
    HttpDev1VlanId: "1"
    HttpDev1VlanPriority: "0"
    HttpDev2EnDis: Disabled
    HttpDev2Interface: NIC.Integrated.1-1-1
    HttpDev2Protocol: IPv4
    HttpDev2Uri: ""
    HttpDev2VlanEnDis: Disabled
    HttpDev2VlanId: "1"
    HttpDev2VlanPriority: "0"
    HttpDev3EnDis: Disabled
    HttpDev3Interface: NIC.Integrated.1-1-1
    HttpDev3Protocol: IPv4
    HttpDev3Uri: ""
    HttpDev3VlanEnDis: Disabled
    HttpDev3VlanId: "1"
    HttpDev3VlanPriority: "0"
    HttpDev4EnDis: Disabled
    HttpDev4Interface: NIC.Integrated.1-1-1
    HttpDev4Protocol: IPv4
    HttpDev4Uri: ""
    HttpDev4VlanEnDis: Disabled
    HttpDev4VlanId: "1"
    HttpDev4VlanPriority: "0"
    InBandManageabilityInterface: Enabled
    IntegratedNetwork1: Enabled
    IntegratedRaid: Enabled
    IntelTxt: "Off"
    InternalUsb: "On"
    IoatEngine: Disabled
    IscsiDev1Con1Auth: None
    IscsiDev1Con1ChapName: ""
    IscsiDev1Con1ChapSecret: ""
    IscsiDev1Con1ChapType: OneWay
    IscsiDev1Con1DhcpEnDis: Disabled
    IscsiDev1Con1EnDis: Disabled
    IscsiDev1Con1Gateway: ""
    IscsiDev1Con1Interface: NIC.Integrated.1-1-1
    IscsiDev1Con1Ip: ""
    IscsiDev1Con1IsId: ""
    IscsiDev1Con1Lun: "0"
    IscsiDev1Con1Mask: ""
    IscsiDev1Con1Port: "3260"
    IscsiDev1Con1Protocol: IPv4
    IscsiDev1Con1Retry: "3"
    IscsiDev1Con1RevChapName: ""
    IscsiDev1Con1RevChapSecret: ""
    IscsiDev1Con1TargetIp: ""
    IscsiDev1Con1TargetName: ""
    IscsiDev1Con1TgtDhcpEnDis: Disabled
    IscsiDev1Con1Timeout: "10000"
    IscsiDev1Con1VlanEnDis: Disabled
    IscsiDev1Con1VlanId: "1"
    IscsiDev1Con1VlanPriority: "0"
    IscsiDev1Con2Auth: None
    IscsiDev1Con2ChapName: ""
    IscsiDev1Con2ChapSecret: ""
    IscsiDev1Con2ChapType: OneWay
    IscsiDev1Con2DhcpEnDis: Disabled
    IscsiDev1Con2EnDis: Disabled
    IscsiDev1Con2Gateway: ""
    IscsiDev1Con2Interface: NIC.Integrated.1-1-1
    IscsiDev1Con2Ip: ""
    IscsiDev1Con2IsId: ""
    IscsiDev1Con2Lun: "0"
    IscsiDev1Con2Mask: ""
    IscsiDev1Con2Port: "3260"
    IscsiDev1Con2Protocol: IPv4
    IscsiDev1Con2Retry: "3"
    IscsiDev1Con2RevChapName: ""
    IscsiDev1Con2RevChapSecret: ""
    IscsiDev1Con2TargetIp: ""
    IscsiDev1Con2TargetName: ""
    IscsiDev1Con2TgtDhcpEnDis: Disabled
    IscsiDev1Con2Timeout: "10000"
    IscsiDev1Con2VlanEnDis: Disabled
    IscsiDev1Con2VlanId: "1"
    IscsiDev1Con2VlanPriority: "0"
    IscsiDev1ConOrder: Con1Con2
    IscsiDev1EnDis: Disabled
    IscsiInitiatorName: ""
    LogicalProc: Enabled
    MemFrequency: MaxPerf
    MemOpMode: OptimizerMode
    MemPatrolScrub: Standard
    MemRefreshRate: 1x
    MemTest: Disabled
    MemoryMappedIOH: 56TB
    MmioAbove4Gb: Enabled
    MonitorMwait: Enabled
    NodeInterleave: Disabled
    NumLock: "On"
    NvmeMode: NonRaid
    OneTimeBootMode: Disabled
    OneTimeUefiBootSeqDev: RAID.Integrated.1-1
    OppSrefEn: Disabled
    OsWatchdogTimer: Disabled
    PcieAspmL1: Disabled
    PowerCycleRequest: None
    Proc1Brand: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
    Proc1Id: 6-55-4
    Proc1L2Cache: 16x1 MB
    Proc1L3Cache: 22 MB
    Proc1NumCores: "16"
    Proc1TurboCoreNum: All
    Proc2Brand: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
    Proc2Id: 6-55-4
    Proc2L2Cache: 16x1 MB
    Proc2L3Cache: 22 MB
    Proc2NumCores: "16"
    Proc2TurboCoreNum: All
    ProcAdjCacheLine: Enabled
    ProcBusSpeed: 10.40 GT/s
    ProcC1E: Disabled
    ProcCStates: Disabled
    ProcConfigTdp: Nominal
    ProcCoreSpeed: 2.10 GHz
    ProcCores: All
    ProcHwPrefetcher: Enabled
    ProcPwrPerf: MaxPerf
    ProcTurboMode: Enabled
    ProcVirtualization: Enabled
    ProcX2Apic: Disabled
    PwrButton: Enabled
    PxeDev1EnDis: Enabled
    PxeDev1Interface: NIC.Integrated.1-1-1
    PxeDev1Protocol: IPv6
    PxeDev1VlanEnDis: Disabled
    PxeDev1VlanId: "1"
    PxeDev1VlanPriority: "0"
    PxeDev2EnDis: Disabled
    PxeDev2Interface: NIC.Integrated.1-1-1
    PxeDev2Protocol: IPv4
    PxeDev2VlanEnDis: Disabled
    PxeDev2VlanId: "1"
    PxeDev2VlanPriority: "0"
    PxeDev3EnDis: Disabled
    PxeDev3Interface: NIC.Integrated.1-1-1
    PxeDev3Protocol: IPv4
    PxeDev3VlanEnDis: Disabled
    PxeDev3VlanId: "1"
    PxeDev3VlanPriority: "0"
    PxeDev4EnDis: Disabled
    PxeDev4Interface: NIC.Integrated.1-1-1
    PxeDev4Protocol: IPv4
    PxeDev4VlanEnDis: Disabled
    PxeDev4VlanId: "1"
    PxeDev4VlanPriority: "0"
    RedirAfterBoot: Enabled
    RedundantOsLocation: None
    SataPortA: Auto
    SataPortACapacity: N/A
    SataPortADriveType: Unknown Device
    SataPortAModel: Unknown
    SataPortB: Auto
    SataPortBCapacity: N/A
    SataPortBDriveType: Unknown Device
    SataPortBModel: Unknown
    SataPortC: Auto
    SataPortCCapacity: N/A
    SataPortCDriveType: Unknown Device
    SataPortCModel: Unknown
    SataPortD: Auto
    SataPortDCapacity: N/A
    SataPortDDriveType: Unknown Device
    SataPortDModel: Unknown
    SataPortE: Auto
    SataPortECapacity: N/A
    SataPortEDriveType: Unknown Device
    SataPortEModel: Unknown
    SataPortF: Auto
    SataPortFCapacity: N/A
    SataPortFDriveType: Unknown Device
    SataPortFModel: Unknown
    SataPortG: Auto
    SataPortGCapacity: N/A
    SataPortGDriveType: Unknown Device
    SataPortGModel: Unknown
    SataPortH: Auto
    SataPortHCapacity: N/A
    SataPortHDriveType: Unknown Device
    SataPortHModel: Unknown
    SataPortI: Auto
    SataPortICapacity: N/A
    SataPortIDriveType: Unknown Device
    SataPortIModel: Unknown
    SataPortJ: Auto
    SataPortJCapacity: N/A
    SataPortJDriveType: Unknown Device
    SataPortJModel: Unknown
    SataPortK: Auto
    SataPortKCapacity: N/A
    SataPortKDriveType: Unknown Device
    SataPortKModel: Unknown
    SataPortL: Auto
    SataPortLCapacity: N/A
    SataPortLDriveType: Unknown Device
    SataPortLModel: Unknown
    SataPortM: Auto
    SataPortMCapacity: N/A
    SataPortMDriveType: Unknown Device
    SataPortMModel: Unknown
    SataPortN: Auto
    SataPortNCapacity: N/A
    SataPortNDriveType: Unknown Device
    SataPortNModel: Unknown
    SecureBoot: Enabled
    SecureBootMode: DeployedMode
    SecureBootPolicy: Standard
    SecurityFreezeLock: Enabled
    SerialComm: OnConRedirAuto
    SerialPortAddress: Serial1Com2Serial2Com1
    SetBootOrderDis: ""
    SetBootOrderEn: ""
    SetBootOrderFqdd1: ""
    SetBootOrderFqdd2: ""
    SetBootOrderFqdd3: ""
    SetBootOrderFqdd4: ""
    SetBootOrderFqdd5: ""
    SetBootOrderFqdd6: ""
    SetBootOrderFqdd7: ""
    SetBootOrderFqdd8: ""
    SetBootOrderFqdd9: ""
    SetBootOrderFqdd10: ""
    SetBootOrderFqdd11: ""
    SetBootOrderFqdd12: ""
    SetBootOrderFqdd13: ""
    SetBootOrderFqdd14: ""
    SetBootOrderFqdd15: ""
    SetBootOrderFqdd16: ""
    SetLegacyHddOrderFqdd1: ""
    SetLegacyHddOrderFqdd2: ""
    SetLegacyHddOrderFqdd3: ""
    SetLegacyHddOrderFqdd4: ""
    SetLegacyHddOrderFqdd5: ""
    SetLegacyHddOrderFqdd6: ""
    SetLegacyHddOrderFqdd7: ""
    SetLegacyHddOrderFqdd8: ""
    SetLegacyHddOrderFqdd9: ""
    SetLegacyHddOrderFqdd10: ""
    SetLegacyHddOrderFqdd11: ""
    SetLegacyHddOrderFqdd12: ""
    SetLegacyHddOrderFqdd13: ""
    SetLegacyHddOrderFqdd14: ""
    SetLegacyHddOrderFqdd15: ""
    SetLegacyHddOrderFqdd16: ""
    Slot1: Enabled
    Slot1Bif: x16
    Slot2: Enabled
    Slot2Bif: x16
    Slot3: Enabled
    Slot3Bif: x16
    SriovGlobalEnable: Disabled
    SubNumaCluster: Disabled
    SysMemSize: 192 GB
    SysMemSpeed: 2666 Mhz
    SysMemType: ECC DDR4
    SysMemVolt: 1.20 V
    SysMfrContactInfo: www.dell.com
    SysProfile: PerfOptimized
    SystemBiosVersion: 1.6.13
    SystemCpldVersion: 1.0.2
    SystemManufacturer: Dell Inc.
    SystemMeVersion: 4.0.4.401
    SystemModelName: PowerEdge R640
    SystemServiceTag: 176Q2W2
    TpmInfo: No TPM present
    TpmPpiBypassClear: Disabled
    TpmPpiBypassProvision: Disabled
    UefiComplianceVersion: "2.5"
    UefiVariableAccess: Standard
    UncoreFrequency: MaxUFS
    UpiPrefetch: Enabled
    UsbManagedPort: "On"
    UsbPorts: AllOn
    VideoMem: 16 MB
    WorkloadProfile: NotAvailable
    WriteCache: Disabled
    WriteDataCrc: Disabled
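
In the output above, spec.settings requests SecureBoot: Disabled while status.settings still reports SecureBoot: Enabled, so the requested change has not been applied yet. The following is a small sketch of that comparison, assuming the CR has already been loaded into a dict (for example from the JSON form of the same object). The function name pending_firmware_changes is hypothetical, and only a few of the status settings are reproduced.

```python
def pending_firmware_changes(hfs):
    """Return {name: (current, desired)} for settings not yet applied.

    Hypothetical helper. `hfs` is a HostFirmwareSettings object as a dict.
    """
    desired = hfs.get("spec", {}).get("settings", {})
    current = hfs.get("status", {}).get("settings", {})
    return {
        name: (current.get(name), value)
        for name, value in desired.items()
        # Assumption: status values are strings, so compare as strings.
        if current.get(name) != str(value)
    }


# Trimmed copy of the CR above; most status settings are omitted.
hfs = {
    "spec": {"settings": {"SecureBoot": "Disabled"}},
    "status": {
        "settings": {
            "BootMode": "Uefi",
            "SecureBoot": "Enabled",
            "SecureBootMode": "DeployedMode",
        }
    },
}

for name, (current, desired) in pending_firmware_changes(hfs).items():
    print(f"{name}: {current} -> {desired}")
```

An empty result would mean that every requested setting already matches the reported firmware state.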


[kni@r640-u01 ~]$ ssh core.134.3  sudo crictl ps -a|head
CONTAINER           IMAGE                                                                                                                        CREATED             STATE               NAME                                          ATTEMPT             POD ID              POD
21b96ac5e5faf       b4ce2d5db90a4b1f12ae9d799ccf1400fb4507b1cf5974bfe3814f631f89054e                                                             10 minutes ago      Exited              collect-profiles                              0                   f80cd95fc1cfa       collect-profiles-27627405-f99h5
6ce890c1f56a7       b4ce2d5db90a4b1f12ae9d799ccf1400fb4507b1cf5974bfe3814f631f89054e                                                             21 minutes ago      Exited              collect-profiles                              0                   da3f273ba0299       collect-profiles-27627390-gtfls
cc1898b6d972a       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a3143920c200b67fcd60d05abaae99c125a450b7b95fe682d1c84960f0a1e897       21 minutes ago      Running             dns                                           1                   c44cc526d67ca       dns-default-r4hds
4231d0dcfed58       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:884b46e766f92e6a4ddfd493b0eeab5b6ad72f01cab7651a7f457509d8d76fb0       21 minutes ago      Running             csi-snapshot-controller-operator              1                   ba5062861adce       csi-snapshot-controller-operator-86884c7b4d-lvzb7
5f19b6878dc03       f334e212c405aa88a76f9e23708952da3d5f4cf6715632895f4d50035f51538a                                                             22 minutes ago      Running             kube-rbac-proxy-thanos                        1                   318c46737f3ab       prometheus-k8s-0
f97bdfed2c2ab       f334e212c405aa88a76f9e23708952da3d5f4cf6715632895f4d50035f51538a                                                             22 minutes ago      Running             kube-rbac-proxy                               1                   318c46737f3ab       prometheus-k8s-0
a9f4533f88033       e081cc8c15586c74c1683dd70325ad55cbc08fc6627359bbe65801e1ac5e35f7                                                             22 minutes ago      Running             prometheus-proxy                              1                   318c46737f3ab       prometheus-k8s-0
260ae1e801a31       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d791a91d25a8f8d8eae66d888ede55de4eda82806d722e944b4ce8564444ace7       22 minutes ago      Running             thanos-query                                  1                   1c24c73bbad47       thanos-querier-7b57f85644-zgfhc
4bfc96d853025       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d791a91d25a8f8d8eae66d888ede55de4eda82806d722e944b4ce8564444ace7       22 minutes ago      Running             thanos-sidecar                                1                   318c46737f3ab       prometheus-k8s-0
[kni@r640-u01 ~]$

Comment 6 tali@redhat.com 2022-08-02 19:08:20 UTC
The solution to this issue should not introduce an additional reboot, since the host is already installed as expected at this stage.

Comment 8 Dmitry Tantsur 2022-08-11 15:23:47 UTC
Has anyone been able to reproduce this on a virtual setup? My naive attempt to create a BMH with Redfish, change a setting, and provision an image succeeded.

Comment 9 Dmitry Tantsur 2022-08-11 16:08:40 UTC
I think I'm onto something. Technical notes for myself: when fast-track is on, we never re-configure the ISO, even when rebooting the node after a clean step. Since we use one-time ISO boot by default, the node then boots into nothing. We need to re-configure the ISO before rebooting. This cannot be reproduced in a virtual environment because those don't support one-time boot.
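
The failure mode described in this comment can be illustrated with a toy model. This is not Ironic code: the class and function names are hypothetical, and the model assumes only what the comment states, namely that a one-time ISO boot override is consumed by the first boot and has to be set again before any later reboot.

```python
class ToyBMC:
    """Toy stand-in for a BMC that supports one-time boot overrides."""

    def __init__(self):
        self.one_time_boot = None  # boot target for the next boot only

    def set_one_time_boot(self, target):
        self.one_time_boot = target

    def reboot(self):
        """Boot the node and return what it booted into."""
        # A one-time override is consumed by the boot that uses it.
        target, self.one_time_boot = self.one_time_boot, None
        # Assumption: nothing bootable is on disk before installation,
        # so without an override the node "boots into nothing".
        return target or "nothing"


def run_clean_step(bmc, reconfigure_iso_before_reboot):
    """Boot the ramdisk ISO, apply a BIOS setting, then reboot once."""
    bmc.set_one_time_boot("ramdisk ISO")
    first_boot = bmc.reboot()           # agent comes up from the ISO
    if reconfigure_iso_before_reboot:   # the fix proposed in the comment
        bmc.set_one_time_boot("ramdisk ISO")
    second_boot = bmc.reboot()          # reboot required by the BIOS change
    return first_boot, second_boot


print(run_clean_step(ToyBMC(), reconfigure_iso_before_reboot=False))
print(run_clean_step(ToyBMC(), reconfigure_iso_before_reboot=True))
```

In the first run the second boot finds no override and the agent never returns, which in this model corresponds to the clean-step timeout; in the second run the ISO is set again and the node boots back into the ramdisk.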

Comment 10 Dmitry Tantsur 2022-08-23 15:46:18 UTC
The fix should be available now.

Comment 12 Dmitry Tantsur 2022-08-26 10:37:37 UTC
Would it be possible for you to test the fix in 4.12? I think it should solve the issue, but I haven't been able to reproduce the issue to begin with.

Comment 13 tali@redhat.com 2022-08-29 13:02:57 UTC
I tested the fix on 4.12.0-0.nightly-2022-08-23-031342. The BMH is still stuck in the “provisioning” state. The host booted to a live ISO when rebooting after the BIOS/RAID settings changes. A clean-step timeout event was also generated (although the change was applied successfully).

Comment 16 Dmitry Tantsur 2022-08-29 14:20:12 UTC
Could you retest with a newer nightly? The patch merged on the same day, so I have a feeling the nightly did not include it yet.

Comment 17 tali@redhat.com 2022-08-29 14:23:42 UTC
Ok, I will test the fix with a newer nightly.

Comment 18 tali@redhat.com 2022-08-31 01:06:56 UTC
I tested the fix on 4.12.0-0.nightly-2022-08-29-102035. The original problem has been fixed. The host was installed properly after changing a BIOS attribute.

However, I observed two reboots during cleaning. One reboot is intended to finish the clean step (although it acts as an extra reboot in comparison to non converged ZTP workflow).

The second reboot needs to be investigated.

Comment 19 Dmitry Tantsur 2022-08-31 09:55:29 UTC
> I tested the fix on 4.12.0-0.nightly-2022-08-29-102035. The original problem has been fixed. The host was installed properly after changing a BIOS attribute.

Great, thank you! I'll close this bug as verified by you.

> However, I observed two reboots during cleaning.

Let's file a new bug for this (that's done in Jira OCPBUGS nowadays). Please include a must-gather.

P.S.

> it acts as an extra reboot in comparison to non converged ZTP workflow

Well, the non-converged workflow did not have BIOS settings :) We need the reboot to make sure the settings are applied before deployment.

Comment 22 errata-xmlrpc 2023-01-17 19:52:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399