Description of Problem:
Note: This article is published in the public state at the request of Red Hat.
The Ironic node enters the clean failed state after the delete_configuration clean step fails. This happens when hardwareRAIDVolumes is nil and the target node doesn't have a RAID controller. The cause is that the BuildRAIDCleanSteps function does not consider this case and always runs delete_configuration.
We have already created a PR in the Metal3 community to address this case.
https://github.com/metal3-io/baremetal-operator/pull/942
This PR adds handling for the case where hardwareRAIDVolumes is nil: the actual RAID configuration is kept (delete_configuration is not run).

Version-Release number of selected component:
This issue was detected in the Pre-GA version.
Red Hat OpenShift Container Platform Version Number: 4.9.0-0.nightly-2021-07-26-071921
Release Number: 4.9
Kubernetes Version: 1.21
Cri-o Version: 0.1.0
Related Component: None
Related Middleware/Application: None
Underlying RHCOS Release Number: 4.9
Underlying RHCOS Architecture: x86_64
Underlying RHCOS Kernel Version: 4.18.0

Drivers or hardware or architecture dependency:
This error occurs when the target node doesn't have a RAID controller.

How reproducible:
Always

Steps to Reproduce:
1. Create install-config.yaml in clusterconfigs. The worker machine has no RAID card installed.
   $ vim ~/clusterconfigs/install-config.yaml
2. Create manifests:
   $ openshift-baremetal-install --dir ~/clusterconfigs create manifests
3. Create cluster:
   $ openshift-baremetal-install --dir ~/clusterconfigs --log-level debug create cluster

Actual Results:
The Ironic node enters the clean failed state.

Expected Results:
The Ironic node does not enter the clean failed state.

Summary of actions taken to resolve issue:
We need to merge the upstream (Metal3) and downstream (RHOCP) PRs.
- Upstream: https://github.com/metal3-io/baremetal-operator/pull/942
- Downstream: https://github.com/openshift/baremetal-operator/pull/170

Location of diagnostic data:
None

Hardware configuration:
Model: RX2540 M4

Target Release:
RHOCP4.9

Additional Info:
None
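For readers who want to see the intended behavior in code, here is a minimal sketch of the nil check described above. It is not the code from the PR: the types and the lowercase function buildRAIDCleanSteps are simplified, hypothetical stand-ins for the operator's real ones. The handling of an empty (non-nil) volume list and the create_configuration step are assumptions added for illustration; only the nil case is taken from this report.

```go
package main

import "fmt"

// HardwareRAIDVolume is a hypothetical, simplified stand-in for the
// operator's volume type.
type HardwareRAIDVolume struct {
	Name  string
	Level string
}

// RAIDConfig is a hypothetical, simplified stand-in for the host's RAID settings.
type RAIDConfig struct {
	HardwareRAIDVolumes []HardwareRAIDVolume
}

// CleanStep is a hypothetical, simplified stand-in for an Ironic clean step.
type CleanStep struct {
	Interface string
	Step      string
}

// buildRAIDCleanSteps sketches the behavior described in this report:
// when hardwareRAIDVolumes is nil, no RAID clean steps are produced, so a
// node without a RAID controller never runs delete_configuration.
func buildRAIDCleanSteps(raid *RAIDConfig) []CleanStep {
	// nil means "keep the actual RAID configuration" (from the report).
	if raid == nil || raid.HardwareRAIDVolumes == nil {
		return nil
	}
	// Assumption: an explicitly set list (even an empty one) clears the
	// existing configuration first.
	steps := []CleanStep{{Interface: "raid", Step: "delete_configuration"}}
	// Assumption: a non-empty list then creates the requested volumes.
	if len(raid.HardwareRAIDVolumes) > 0 {
		steps = append(steps, CleanStep{Interface: "raid", Step: "create_configuration"})
	}
	return steps
}

func main() {
	noRAID := &RAIDConfig{} // hardwareRAIDVolumes left nil
	cleared := &RAIDConfig{HardwareRAIDVolumes: []HardwareRAIDVolume{}}
	withVolume := &RAIDConfig{
		HardwareRAIDVolumes: []HardwareRAIDVolume{{Name: "vol0", Level: "1"}},
	}

	fmt.Println(len(buildRAIDCleanSteps(noRAID)))     // 0
	fmt.Println(len(buildRAIDCleanSteps(cleared)))    // 1
	fmt.Println(len(buildRAIDCleanSteps(withVolume))) // 2
}
```

The key design point is distinguishing a nil list from an empty one: nil is treated as "not specified", so the node is left alone, while an explicitly empty list can still express "remove all volumes".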
Could you please verify this BZ? We don't have Fujitsu machines to verify it.
Hi, Lubov. Yes, Fujitsu is going to verify it. Please wait. Best Regards, Yasuhiro Futakawa
Hi, Lubov, Fujitsu verified that it works correctly with 4.9.0-0.nightly-2021-08-23-192406. We also confirmed the fix of this BZ was included in this nightly build. Best Regards, Yasuhiro Futakawa
Good news, closing.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759