Version:
$ openshift-install version
Client Version: 4.7.0-0.nightly-2020-12-21-131655
Server Version: 4.7.0-0.nightly-2020-12-21-131655
Kubernetes Version: v1.20.0+87544c5
---------------------------------------------------------------
Platform:
#Please specify the platform type: aws, libvirt, openstack or baremetal etc.
libvirt
---------------------------------------------------------------
What happened?
We created a worker that holds an existing deployed worker's MAC address. We expected to get an error indicating that there is already a worker with the same MAC address, but instead the status of the worker got stuck in the registering state.

[kni@provisionhost-0-0 ~]$ oc get bmh -A
NAMESPACE               NAME                   STATUS   PROVISIONING STATUS      CONSUMER                                  BMC                                                                                    HARDWARE PROFILE   ONLINE   ERROR
default                 openshift-worker-0-2                                                                               redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25                      true
openshift-machine-api   openshift-master-0-0   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-0         redfish://192.168.123.1:8000/redfish/v1/Systems/20b39e3d-58c3-4bc4-94af-975200ae63b4                      true
openshift-machine-api   openshift-master-0-1   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-1         redfish://192.168.123.1:8000/redfish/v1/Systems/ce2645ae-08e3-4f8a-9622-fb9fd788b8ea                      true
openshift-machine-api   openshift-master-0-2   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-2         redfish://192.168.123.1:8000/redfish/v1/Systems/3094a111-e3aa-4ca3-a5b7-f87e6c916aa7                      true
openshift-machine-api   openshift-worker-0-0   OK       provisioned              ocp-edge-cluster-0-mnn2d-worker-0-bz8wr   redfish://192.168.123.1:8000/redfish/v1/Systems/1817896d-ecc8-4cb6-aae8-aa0b8d43a0e1   unknown            true
openshift-machine-api   openshift-worker-0-1   OK       provisioned              ocp-edge-cluster-0-mnn2d-worker-0-nbgkc   redfish://192.168.123.1:8000/redfish/v1/Systems/b4283036-875b-4bcc-aa4d-8d350c53f11d   unknown            true
openshift-machine-api   openshift-worker-0-2            registering                                                        redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25                      true
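Until the conflict is reported properly, a duplicate bootMACAddress can be caught locally before the manifest is applied. A minimal sketch, assuming the BMH manifests sit as *.yaml in the current directory (the glob is an assumption; it does not check hosts already registered on the cluster):

# Collect every bootMACAddress declared in the local BMH manifests
# and print any value that appears more than once (case-insensitive).
grep -h 'bootMACAddress:' *.yaml \
  | awk '{print tolower($2)}' \
  | sort | uniq -d

Empty output means no duplicates among the local manifests.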
[kni@provisionhost-0-0 ~]$ cat worker-0-2.yaml
apiVersion: v1
kind: Secret
metadata:
  name: openshift-worker-0-2-bmc-secret
type: Opaque
data:
  username: YWRtaW4K
  password: cGFzc3dvcmQK
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-worker-0-2
spec:
  online: true
  bmc:
    address: redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25
    credentialsName: openshift-worker-0-2-bmc-secret
    disableCertificateVerification: True
    username: admin
    password: password
  bootMACAddress: 52:54:00:1e:43:06
  hardwareProfile: unknown
---------------------------------------------------------------
What did you expect to happen?
We expected to see an error about the existing MAC address, or a ready state followed by an error after this step ($ oc scale machineset -n openshift-machine-api ocp-edge-cluster-0-worker-0 --replicas=N+1).
---------------------------------------------------------------
How to reproduce it (as minimally and precisely as possible)?
1. $ ssh kni@provisionhost-0-0
2. Create a file for the new BMH we want to deploy: $ vi new-nodeX.yaml
   Inside the file, put a MAC address and IP address identical to an existing deployed node's.

   apiVersion: v1
   kind: Secret
   metadata:
     name: openshift-worker-0-X-bmc-secret
   type: Opaque
   data:
     username: <YWRtaW4K>
     password: <cGFzc3dvcmQK>
   ---
   apiVersion: metal3.io/v1alpha1
   kind: BareMetalHost
   metadata:
     name: openshift-worker-0-X
   spec:
     online: true
     bmc:
       address: <redfish://192.168.123.1:8000/redfish/v1/Systems/e2e8a52d-1012-4eec-a22b-dfd57f0df50b>
       credentialsName: openshift-worker-0-X-bmc-secret
       disableCertificateVerification: True
       username: admin
       password: password
     bootMACAddress: <52:54:00:e4:d1:13>
     rootDeviceHints:
       deviceName: /dev/sda
3. Add the new BMH: $ oc create -f new-nodeX.yaml -n openshift-machine-api
4. Expected result: an error message indicating that there is already a BMH with the same MAC and IP address.
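The <YWRtaW4K> and <cGFzc3dvcmQK> placeholders in the Secret above are the base64-encoded BMC credentials; they can be reproduced and checked from any POSIX shell with the coreutils base64 tool:

# Encode the BMC credentials for the Secret's data fields.
# `echo` appends a trailing newline, which is why the encoded
# values in this report end in "K" (the encoded "\n").
echo admin    | base64   # YWRtaW4K
echo password | base64   # cGFzc3dvcmQK

# Decode to double-check what a Secret actually contains:
echo YWRtaW4K | base64 -d   # admin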
-------------------------------------------- must-gather - https://drive.google.com/drive/folders/1oVhbl0oXEu1LWAuSxs3sunOpPCHEE7tS?usp=sharing
I put a MAC address and IP address identical to an existing deployed node's. I didn't get an error message indicating that there is already a BMH with the same MAC and IP address; I only got a registration error. In this regard, I wanted to ask if this is enough so that I can move the bug to "verified"?

[kni@provisionhost-0-0 ~]$ oc get bmh -n openshift-machine-api
NAME                   STATE                    CONSUMER                                  ONLINE   ERROR
openshift-master-0-0   externally provisioned   ocp-edge-cluster-0-jmht5-master-0         true
openshift-master-0-1   externally provisioned   ocp-edge-cluster-0-jmht5-master-1         true
openshift-master-0-2   externally provisioned   ocp-edge-cluster-0-jmht5-master-2         true
openshift-worker-0-0   provisioned              ocp-edge-cluster-0-jmht5-worker-0-lqnwr   true
openshift-worker-0-1   provisioned              ocp-edge-cluster-0-jmht5-worker-0-6795b   true
openshift-worker-0-2   registering                                                        true     registration error
The error should read something like:

"MAC Address 00:e2:b4:d9:0a:f1 conflicts with existing host ostest-worker-0"

What do you mean by the MAC address being similar?

Could you attach the yaml output of the above command?
(In reply to Honza Pokorny from comment #3)
> The error should read something like:
>
> "MAC Address 00:e2:b4:d9:0a:f1 conflicts with existing host ostest-worker-0"
>
> What do you mean by the MAC address being similar?
>
> Could you attach the yaml output of the above command?

1. I gave the new node (worker-0-2) the same MAC address and IP address as an existing deployed node (worker-0-1).
2. I can see the error message in the yaml output:

  name: openshift-worker-0-2
  namespace: openshift-machine-api
  resourceVersion: "69903"
  uid: 63484031-70e8-428e-8c05-0e0332b03ded
spec:
  bmc:
    address: redfish://192.168.123.1:8000/redfish/v1/Systems/a832161c-fe24-422e-93f8-4ae721b872b5
    credentialsName: openshift-worker-0-2-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 52:54:00:af:6e:ab
  hardwareProfile: unknown
  online: true
status:
  errorCount: 7
  errorMessage: MAC address 52:54:00:af:6e:ab conflicts with existing node openshift-worker-0-1
  errorType: registration error

So is it enough that I can see the error only in the yaml output?
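Since the conflict only surfaces under status, it can be pulled out without scanning the whole resource. A sketch against the host from this report (the saved-file path is an assumption; the first step needs cluster access):

# Save the host's yaml first (needs cluster access):
#   oc get bmh openshift-worker-0-2 -n openshift-machine-api -o yaml > bmh.yaml
# The registration failure then sits right under .status:
grep -E '^  (errorMessage|errorType):' bmh.yaml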
Yes, I think this is sufficient. Thanks
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438