Created attachment 1725940: baremetalhosts.metal3.io_crd

Version:
-------------------------
Client Version: 4.7.0-0.nightly-2020-10-27-051128
Server Version: 4.7.0-0.nightly-2020-10-27-051128
Kubernetes Version: v1.19.0+e67f5dc

Setup:
-------------------------
Provisioning_net_IPv6, Baremetal_net_IPv4, disconnected install

Platform:
-------------------------
libvirt IPI (automated install with `openshift-baremetal-install`)

What happened?
-------------------------
After deploying a cluster successfully, we expect the output of `oc get bmh -A -o yaml` to include the 'ErrorCount' field, as described here:
https://github.com/openshift/baremetal-operator/blob/01e0c1e89144deb1b05f689f29b45a749c660a3d/config/crd/bases/metal3.io_baremetalhosts.yaml#L252

After a scale-down failed (which is a known bug in 4.7), I expected the ErrorCount field to appear and hold a value, but it is present neither in the bmh output nor in the baremetalhosts.metal3.io CRD (attached to the bug).

$ oc get bmh -A -o yaml
...
  spec:
    bmc:
      address: redfish://192.168.123.1:8000/redfish/v1/Systems/b7a2977b-0375-47ce-aa23-4d14968f15fd
      credentialsName: openshift-worker-0-1-bmc-secret
      disableCertificateVerification: true
    bootMACAddress: 52:54:00:d6:06:21
    consumerRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: Machine
      name: ocp-edge-cluster-0-dh4qg-worker-0-5x94p
      namespace: openshift-machine-api
    hardwareProfile: unknown
    online: false
    rootDeviceHints:
      deviceName: /dev/sda
  status:
    errorMessage: ""
    goodCredentials:
      credentials:
        name: openshift-worker-0-1-bmc-secret
        namespace: openshift-machine-api
      credentialsVersion: "18225"
    hardware:
...

It seems that this field is also missing from the machine-api-operator CRD, as shown here:
https://github.com/openshift/machine-api-operator/blob/ed7858da22dec8c5d5d3302252a259e3cd743b6a/install/0000_30_machine-api-operator_08_baremetalhost.crd.yaml#L309

What did you expect to happen?
--------------------------------
We expect to see the 'ErrorCount' field in the metal3.io_baremetalhosts.yaml CRD.

How to reproduce it
--------------------------------
1. Deploy a disconnected environment with OCP 4.7, an IPv6 provisioning network, and an IPv4 baremetal network.
2. Run `oc get bmh -A -o yaml` and search for the 'ErrorCount' field.

must-gather:
---------------------------------
https://drive.google.com/drive/folders/1r9WyF4hdrE43Me68J7vLwGhbTQqO9ygY?usp=sharing
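The search in step 2 can also be run against the CRD schema itself rather than against a host object. The following is only a minimal sketch, not part of the original report: it assumes the CRD has been exported as JSON (for example with `oc get crd baremetalhosts.metal3.io -o json`), and the function name `find_property_paths` is hypothetical. The match is case-insensitive because the report writes the field as 'ErrorCount' while the serialized name may differ in case.

```python
def find_property_paths(node, wanted, path=""):
    """Return the dotted path of every schema property named `wanted`.

    `node` is a CRD (or any fragment of one) already parsed from JSON.
    The comparison is case-insensitive. This helper is illustrative only.
    """
    found = []
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict):
            # "properties" maps field names to their sub-schemas.
            for name, child in props.items():
                child_path = f"{path}.{name}" if path else name
                if name.lower() == wanted.lower():
                    found.append(child_path)
                found.extend(find_property_paths(child, wanted, child_path))
        # Walk the remaining keys (spec, versions, schema, items, ...)
        # without extending the path, since they are not field names.
        for key, child in node.items():
            if key != "properties":
                found.extend(find_property_paths(child, wanted, path))
    elif isinstance(node, list):
        for child in node:
            found.extend(find_property_paths(child, wanted, path))
    return found
```

An empty result for the exported CRD would match the symptom described above: the field is absent from the schema installed on the cluster, regardless of what the controller writes.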
*** Bug 1892243 has been marked as a duplicate of this bug. ***
Seems like the issue still exists (no error count field):

  status:
    errorMessage: ""
    goodCredentials:
      credentials:
        name: openshift-worker-0-1-bmc-secret
        namespace: openshift-machine-api
      credentialsVersion: "21138"
    hardware:
      cpu:

I've attached the updated CRD to the bug.
Created attachment 1727729: baremetalhosts.metal3.io_crd-new
Created attachment 1727730: get_bmh-new.txt
In 4.7 the BMH CRD will be managed by CBO (and no longer by MAO). CBO is still under development, so the current fix cannot be tested until its CVO integration is enabled (see https://github.com/openshift/cluster-baremetal-operator/blob/bc2f94fb67f989cf13525fa9f843bd3c59159e0e/Dockerfile#L12)
The current fix depends on the completion of the epic https://issues.redhat.com/browse/KNIDEPLOY-2171. I've tested it locally and it works fine; once CBO is complete, it can also be tested as part of the epic.
Does it hurt to update the CRD in MAO while we wait? It doesn't seem helpful to have the baremetal-operator controller and its CRD out of sync, even if we think it will be temporary.
Looks like the CBO migration will take a while to complete, so it could be a good idea to temporarily cover MAO as well. /cc @sadasu
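To see how far the two copies of the CRD have drifted, the schema property names in each can be compared. This is a rough sketch under assumptions that are not in the thread: both CRDs are already parsed from JSON, and the names `property_paths` and `schema_drift` are hypothetical. If a CRD defines several versions, their properties are merged into one set, which is good enough for spotting a missing field.

```python
def property_paths(node, path=""):
    """Collect the dotted paths of all schema properties found under `node`."""
    paths = set()
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict):
            for name, child in props.items():
                child_path = f"{path}.{name}" if path else name
                paths.add(child_path)
                paths |= property_paths(child, child_path)
        # Non-"properties" keys (versions, schema, items, ...) do not
        # contribute a path segment of their own.
        for key, child in node.items():
            if key != "properties":
                paths |= property_paths(child, path)
    elif isinstance(node, list):
        for child in node:
            paths |= property_paths(child, path)
    return paths


def schema_drift(operator_crd, shipped_crd):
    """Return (fields only in operator_crd, fields only in shipped_crd)."""
    operator_fields = property_paths(operator_crd)
    shipped_fields = property_paths(shipped_crd)
    return (sorted(operator_fields - shipped_fields),
            sorted(shipped_fields - operator_fields))
```

With the baremetal-operator CRD as the first argument and the copy shipped by MAO as the second, any entry in the first list is a field the controller knows about but the installed schema does not.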
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633