Bug 1891301
| Summary: | Deleting bmh with "oc delete bmh" gets stuck | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Nataf Sharabi <nsharabi> | ||||
| Component: | Bare Metal Hardware Provisioning | Assignee: | Andrea Fasano <afasano> | ||||
| Bare Metal Hardware Provisioning sub component: | baremetal-operator | QA Contact: | Adina Wolff <awolff> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | high | ||||||
| Priority: | medium | CC: | beth.white, kiran, lshilin, stbenjam, ykashtan, yprokule | ||||
| Version: | 4.8 | Keywords: | Triaged | ||||
| Target Milestone: | --- | Flags: | nsharabi: needinfo+ | ||||
| Target Release: | 4.8.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2021-07-27 22:33:58 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
Description
Nataf Sharabi
2020-10-25 12:42:36 UTC
Link to must-gather: https://drive.google.com/drive/folders/1pdoRJ92mcz7_TWBRKH_P-SkTPFQV9Frb?usp=sharing
Logs show it failing with this error:
2020-10-29T10:39:43.030 Reconciling BareMetalHost {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.030 Fetching Status from Annotation {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.030 No status cache found {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.030 adding finalizer {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2', existingFinalizers: [], newValue: 'baremetalhost.metal3.io'}
2020-10-29T10:39:43.042 Reconciling BareMetalHost {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.042 Fetching Status from Annotation {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.042 No status cache found {Request.Namespace: 'openshift-machine-api', Request.Name: 'openshift-worker-0-2'}
2020-10-29T10:39:43.042 Reconciler error {controller: 'metal3-baremetalhost-controller', request: 'openshift-machine-api/openshift-worker-0-2', error: 'failed to create provisioner: failed to parse BMC address information: failed to parse BMC address information: parse "<redfish://192.168.123.1:8000/redfish/v1/Systems/e2e8a52d-1012-4eec-a22b-dfd57f0df50b>": first path segment in URL cannot contain colon'}
Basically, because we can't figure out any valid driver for this URL, we fail and then never get to run the code that would remove the finalizer. It just keeps hitting this error.
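The parse failure in the log can be reproduced with Go's standard `net/url` package alone. The sketch below is illustrative only: `parseBMCAddress` is a hypothetical helper, not the operator's actual BMC address parser, and it assumes the address was stored with the literal angle brackets shown in the error message.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseBMCAddress is a hypothetical helper, not the operator's real parser.
// It only shows how net/url reacts to the address from the log.
func parseBMCAddress(address string) (*url.URL, error) {
	return url.Parse(address)
}

func main() {
	const system = "redfish://192.168.123.1:8000/redfish/v1/Systems/e2e8a52d-1012-4eec-a22b-dfd57f0df50b"

	// With the stray angle brackets, "<redfish" is not a valid scheme, so the
	// string is treated as a relative reference whose first path segment
	// ("<redfish:") contains a colon, which net/url rejects.
	if _, err := parseBMCAddress("<" + system + ">"); err != nil {
		fmt.Println("bracketed:", err)
	}

	// Without the brackets the same address parses cleanly.
	if u, err := parseBMCAddress(system); err == nil {
		fmt.Println("clean:", u.Scheme, u.Host)
	}
}
```

Because the scheme is never recognized, no driver can be selected from the address, which matches the "failed to create provisioner" error above.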
In the medium term, we should institute a webhook that prevents invalid values like this from being set. But in the short term we should probably just go ahead and remove the finalizer if we can't create a provisioner for the BMC and the DeletionTimestamp is set.
Workarounds would be to update the Host to have a correct address (possible since we haven't yet implemented a webhook to prevent changing the address either) or manually remove the finalizer.
Issue will be fixed by the upstream PR https://github.com/metal3-io/baremetal-operator/pull/838 in conjunction with @Zane's commit: https://github.com/metal3-io/baremetal-operator/commit/beea4d0ead807a8f19b38d538db3502ee3504b97.
Changes will be ported downstream by PR https://github.com/openshift/baremetal-operator/pull/142
verified on:
Client Version: 4.8.0-0.nightly-2021-06-13-101614
Server Version: 4.8.0-0.nightly-2021-07-09-181248
Kubernetes Version: v1.21.1+f36aa36
bmh is created but shows registration error:
openshift-machine-api openshift-worker-0-2 registering true registration error
bmh deletes without a problem
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438