Version:

    $ openshift-install version
    Client Version: 4.7.0-0.nightly-2020-12-21-131655
    Server Version: 4.7.0-0.nightly-2020-12-21-131655
    Kubernetes Version: v1.20.0+87544c5

---------------------------------------------------------------
Platform: libvirt

---------------------------------------------------------------
What happened?

We created a worker that holds the MAC address of an existing, deployed worker. We expected an error indicating that a worker with the same MAC address already exists. Instead, the worker stayed stuck in the registering state.

    [kni@provisionhost-0-0 ~]$ oc get bmh -A
    NAMESPACE               NAME                   STATUS   PROVISIONING STATUS      CONSUMER                                  BMC                                                                                    HARDWARE PROFILE   ONLINE   ERROR
    default                 openshift-worker-0-2                                                                               redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25                      true
    openshift-machine-api   openshift-master-0-0   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-0         redfish://192.168.123.1:8000/redfish/v1/Systems/20b39e3d-58c3-4bc4-94af-975200ae63b4                      true
    openshift-machine-api   openshift-master-0-1   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-1         redfish://192.168.123.1:8000/redfish/v1/Systems/ce2645ae-08e3-4f8a-9622-fb9fd788b8ea                      true
    openshift-machine-api   openshift-master-0-2   OK       externally provisioned   ocp-edge-cluster-0-mnn2d-master-2         redfish://192.168.123.1:8000/redfish/v1/Systems/3094a111-e3aa-4ca3-a5b7-f87e6c916aa7                      true
    openshift-machine-api   openshift-worker-0-0   OK       provisioned              ocp-edge-cluster-0-mnn2d-worker-0-bz8wr   redfish://192.168.123.1:8000/redfish/v1/Systems/1817896d-ecc8-4cb6-aae8-aa0b8d43a0e1   unknown            true
    openshift-machine-api   openshift-worker-0-1   OK       provisioned              ocp-edge-cluster-0-mnn2d-worker-0-nbgkc   redfish://192.168.123.1:8000/redfish/v1/Systems/b4283036-875b-4bcc-aa4d-8d350c53f11d   unknown            true
    openshift-machine-api   openshift-worker-0-2            registering                                                        redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25                      true

    [kni@provisionhost-0-0 ~]$ cat worker-0-2.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: openshift-worker-0-2-bmc-secret
    type: Opaque
    data:
      username: YWRtaW4K
      password: cGFzc3dvcmQK
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: openshift-worker-0-2
    spec:
      online: true
      bmc:
        address: redfish://192.168.123.1:8000/redfish/v1/Systems/84c713cf-2bc4-43c5-8a00-86c8c2ba8d25
        credentialsName: openshift-worker-0-2-bmc-secret
        disableCertificateVerification: True
        username: admin
        password: password
      bootMACAddress: 52:54:00:1e:43:06
      hardwareProfile: unknown

---------------------------------------------------------------
What did you expect to happen?

We expected either an error about the already-existing MAC address, or the host to reach the ready state and an error to appear after the next step ($ oc scale machineset -n openshift-machine-api ocp-edge-cluster-0-worker-0 --replicas=N+1).

---------------------------------------------------------------
How to reproduce it (as minimally and precisely as possible)?

1. $ ssh kni@provisionhost-0-0

2. Create a file for the new BMH we want to deploy:

    $ vi new-nodeX.yaml

   Inside the file, use the same MAC address and IP address as an existing deployed node:

    apiVersion: v1
    kind: Secret
    metadata:
      name: openshift-worker-0-X-bmc-secret
    type: Opaque
    data:
      username: <YWRtaW4K>
      password: <cGFzc3dvcmQK>
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: openshift-worker-0-X
    spec:
      online: true
      bmc:
        address: <redfish://192.168.123.1:8000/redfish/v1/Systems/e2e8a52d-1012-4eec-a22b-dfd57f0df50b>
        credentialsName: openshift-worker-0-X-bmc-secret
        disableCertificateVerification: True
        username: admin
        password: password
      bootMACAddress: <52:54:00:e4:d1:13>
      rootDeviceHints:
        deviceName: /dev/sda

3. Add the new BMH:

    $ oc create -f new-nodeX.yaml -n openshift-machine-api

4. Expected result: an error message indicates that a BMH with the same MAC and IP address already exists.

---------------------------------------------------------------
must-gather: https://drive.google.com/drive/folders/1oVhbl0oXEu1LWAuSxs3sunOpPCHEE7tS?usp=sharing
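
For illustration, a duplicate like the one in this report could be spotted before running `oc create`. The following is a minimal sketch and not the operator's implementation: `normalize_mac` and `find_mac_conflicts` are hypothetical helper names, and the host dictionaries are assumed to have the shape of the items returned by `oc get bmh -n openshift-machine-api -o json`. Which existing worker owns the duplicated MAC is not stated in the report, so the existing host in the example is an assumption.

```python
from typing import Dict, List


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and unify '-' separators to ':'."""
    return mac.strip().lower().replace("-", ":")


def find_mac_conflicts(existing_hosts: List[Dict], new_host: Dict) -> List[str]:
    """Return the names of existing hosts whose bootMACAddress equals the new host's."""
    new_mac = new_host.get("spec", {}).get("bootMACAddress")
    if not new_mac:
        return []
    wanted = normalize_mac(new_mac)
    conflicts = []
    for host in existing_hosts:
        mac = host.get("spec", {}).get("bootMACAddress")
        if mac and normalize_mac(mac) == wanted:
            conflicts.append(host["metadata"]["name"])
    return conflicts


if __name__ == "__main__":
    # Assumption: openshift-worker-0-0 is the existing host that owns this MAC.
    existing = [
        {"metadata": {"name": "openshift-worker-0-0"},
         "spec": {"bootMACAddress": "52:54:00:1E:43:06"}},
    ]
    new = {"metadata": {"name": "openshift-worker-0-2"},
           "spec": {"bootMACAddress": "52:54:00:1e:43:06"}}
    print(find_mac_conflicts(existing, new))  # ['openshift-worker-0-0']
```

The comparison is done on normalized strings so that differences in letter case or separator style do not hide a duplicate.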
Honza, this might be related to some work [https://github.com/metal3-io/baremetal-operator/pull/581] you have been doing on the BMO to check if the MAC address in the BMH is already in use. Please feel free to re-assign if you are no longer working on it.
This has been fixed upstream:
https://github.com/metal3-io/baremetal-operator/pull/776
https://github.com/metal3-io/baremetal-operator/pull/780
But it has not yet been backported downstream; setting back to assigned until that happens.
It's not fixed; I checked in 4.7.3. The worker is still stuck in the registering state, with no error:

    [kni@provisionhost-0-0 ~]$ oc get bmh -n openshift-machine-api
    NAME                   STATUS   PROVISIONING STATUS      CONSUMER                                  BMC                                                                                    HARDWARE PROFILE   ONLINE   ERROR
    openshift-master-0-0   OK       externally provisioned   ocp-edge-cluster-0-s7q9d-master-0         redfish://192.168.123.1:8000/redfish/v1/Systems/a3d533e4-6c95-40b0-b280-d9778a8acd09                      true
    openshift-master-0-1   OK       externally provisioned   ocp-edge-cluster-0-s7q9d-master-1         redfish://192.168.123.1:8000/redfish/v1/Systems/7be4f29c-26b8-48a9-9376-89d8ce5891c0                      true
    openshift-master-0-2   OK       externally provisioned   ocp-edge-cluster-0-s7q9d-master-2         redfish://192.168.123.1:8000/redfish/v1/Systems/577c051c-423f-41f9-9ecd-e1c618599cda                      true
    openshift-worker-0-0   OK       provisioned              ocp-edge-cluster-0-s7q9d-worker-0-bbj2f   redfish://192.168.123.1:8000/redfish/v1/Systems/9cb149bf-609d-4a6d-8e50-2251c43b2a66   unknown            true
    openshift-worker-0-1            registering                                                        redfish://192.168.123.1:8000/redfish/v1/Systems/9cb149bf-609d-4a6d-8e50-2251c43b2a66                      true

From the YAML file:

      name: openshift-worker-0-1
      namespace: openshift-machine-api
      resourceVersion: "1381496"
      selfLink: /apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/openshift-worker-0-1
      uid: b81dd001-c3b9-41e8-9159-babdf67e2327
    spec:
      bmc:
        address: redfish://192.168.123.1:8000/redfish/v1/Systems/9cb149bf-609d-4a6d-8e50-2251c43b2a66
        credentialsName: openshift-worker-0-1-bmc-secret
        disableCertificateVerification: true
      bootMACAddress: 52:54:00:52:e5:4a
      hardwareProfile: unknown
      online: true
    status:
      errorCount: 0
      errorMessage: ""
      goodCredentials: {}
      hardwareProfile: ""
Verified in:

    [kni@provisionhost-0-0 ~]$ oc get clusterversion
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.7.0-0.nightly-2021-03-26-090502   True        False         13m     Cluster version is 4.7.0-0.nightly-2021-03-26-090502

---------------------------------------------------------------

    [kni@provisionhost-0-0 ~]$ oc get bmh -A
    NAMESPACE               NAME                   STATUS   PROVISIONING STATUS      CONSUMER                                  BMC                                                                                    HARDWARE PROFILE   ONLINE   ERROR
    openshift-machine-api   openshift-master-0-0   OK       externally provisioned   ocp-edge-cluster-0-lw7k6-master-0         redfish://192.168.123.1:8000/redfish/v1/Systems/f39229e7-8a2d-4b5c-bf6a-2fe7669b422e                      true
    openshift-machine-api   openshift-master-0-1   OK       externally provisioned   ocp-edge-cluster-0-lw7k6-master-1         redfish://192.168.123.1:8000/redfish/v1/Systems/d2dc9287-99ad-4e81-837b-b435250f1cda                      true
    openshift-machine-api   openshift-master-0-2   OK       externally provisioned   ocp-edge-cluster-0-lw7k6-master-2         redfish://192.168.123.1:8000/redfish/v1/Systems/587047ff-7951-40dd-b263-b6be34d8450d                      true
    openshift-machine-api   openshift-worker-0-0   OK       provisioned              ocp-edge-cluster-0-lw7k6-worker-0-mctlz   redfish://192.168.123.1:8000/redfish/v1/Systems/82f92ca2-6235-42df-8820-b1522b44fed9   unknown            true
    openshift-machine-api   openshift-worker-0-1   OK       provisioned              ocp-edge-cluster-0-lw7k6-worker-0-s8rz4   redfish://192.168.123.1:8000/redfish/v1/Systems/2df0b238-3885-464b-a442-c358f6717733   unknown            false
    openshift-machine-api   openshift-worker-0-2   error    registering                                                        redfish://192.168.123.1:8000/redfish/v1/Systems/2df0b238-3885-464b-a442-c358f6717733                      true     MAC address 52:54:00:ee:5e:a2 conflicts with existing node openshift-worker-0-1

---------------------------------------------------------------

      name: openshift-worker-0-2
      namespace: openshift-machine-api
      resourceVersion: "38677"
      selfLink: /apis/metal3.io/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/openshift-worker-0-2
      uid: 0a158267-398b-4802-9957-a588b4080880
    spec:
      bmc:
        address: redfish://192.168.123.1:8000/redfish/v1/Systems/2df0b238-3885-464b-a442-c358f6717733
        credentialsName: openshift-worker-0-2-bmc-secret
        disableCertificateVerification: true
      bootMACAddress: 52:54:00:ee:5e:a2
      hardwareProfile: unknown
      online: true
    status:
      errorCount: 3
      errorMessage: MAC address 52:54:00:ee:5e:a2 conflicts with existing node openshift-worker-0-1
      errorType: registration error
      goodCredentials: {}
      hardwareProfile: ""
      lastUpdated: "2021-03-26T16:13:46Z"
      operationHistory:
        deprovision:
          end: null
          start: null
        inspect:
          end: null
          start: null
        provision:
          end: null
          start: null
        register:
          end: null
          start: "2021-03-26T16:12:46Z"
      operationalStatus: error
      poweredOn: false
      provisioning:
        ID: ""
        image:
          checksum: ""
          url: ""
        state: registering
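
The difference between the failing and the verified behaviour can be checked mechanically from the `status` block. The sketch below is only an illustration under stated assumptions: `registration_error` is a hypothetical helper, and the input is assumed to be a BareMetalHost object parsed into a dictionary (for example from `oc get bmh <name> -o json`). The field names and values are the ones shown in the output above.

```python
from typing import Dict, Optional


def registration_error(bmh: Dict) -> Optional[str]:
    """Return the error message if the host reports a registration error, else None."""
    status = bmh.get("status", {})
    if status.get("operationalStatus") != "error":
        return None
    if status.get("errorType") != "registration error":
        return None
    # An empty errorMessage is treated the same as no error being reported.
    return status.get("errorMessage") or None


if __name__ == "__main__":
    # Status as reported in the verified build (values copied from the output above).
    fixed = {"status": {
        "errorCount": 3,
        "errorMessage": "MAC address 52:54:00:ee:5e:a2 conflicts with "
                        "existing node openshift-worker-0-1",
        "errorType": "registration error",
        "operationalStatus": "error",
    }}
    # Status as reported in 4.7.3, where the host was stuck without an error.
    stuck = {"status": {"errorCount": 0, "errorMessage": ""}}

    print(registration_error(fixed))
    print(registration_error(stuck))  # None
```

A host that is in the registering state and for which this check returns `None` matches the originally reported symptom.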
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.4 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0957