Scaling machineset down should check if the number of replicas matches the number of bmh

Version: 4.4.0-0.nightly-2020-01-09-013524

Steps to reproduce:

1. Have 3 workers in the bmh list.

2. Remove a worker with:

oc delete bmh openshift-worker-1 -n openshift-machine-api

3. Check the list of bmh (note that we now have only 2 workers):

[kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
openshift-worker-0   OK       provisioned              ocp-edge-cluster-worker-0-d2fvm   ipmi://192.168.123.1:6233   unknown            true
openshift-worker-9   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6239   unknown            true

4. Scale the number of replicas to 2 (expecting no action, since we already have 2 workers):

oc scale machineset -n openshift-machine-api ocp-edge-cluster-worker-0 --replicas=2

Actual results: a worker gets deprovisioned and then provisioned again. (ironic)

[kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
openshift-worker-0   OK       deprovisioning           ocp-edge-cluster-worker-0-5bdrv   ipmi://192.168.123.1:6233   unknown            false
openshift-worker-9   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6239   unknown            true

[kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
openshift-worker-0   OK       provisioning             ocp-edge-cluster-worker-0-d2fvm   ipmi://192.168.123.1:6233   unknown            true
openshift-worker-9   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6239   unknown            true

[kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
openshift-worker-0   OK       provisioned              ocp-edge-cluster-worker-0-d2fvm   ipmi://192.168.123.1:6233   unknown            true
openshift-worker-9   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6239   unknown            true
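As an aside (not part of the original report), one quick way to confirm the machineset's desired and ready replica counts before and after scaling; the machineset name is taken from the scale command above, and the jsonpath fields are the standard MachineSet spec/status fields:

# Check desired vs. ready replicas on the worker machineset
oc get machineset ocp-edge-cluster-worker-0 -n openshift-machine-api -o jsonpath='{.spec.replicas} {.status.readyReplicas}{"\n"}'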
I think this is not MCO but rather machine-api; passing to Alberto to take a look.
How many replicas were there originally? I reckon this is because, when you scale down, there's no guarantee that the machine whose bmh was deleted is the one selected for deletion. Assigning to Sandhya for baremetal-specific insight.
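To see which Machine each host is bound to (and thus which machine the machineset controller actually removed), a sketch assuming the standard BareMetalHost spec.consumerRef field:

# Map each BareMetalHost to the Machine that consumes it
oc get bmh -n openshift-machine-api -o custom-columns=HOST:.metadata.name,MACHINE:.spec.consumerRef.name
# List the machines in the machine-api namespace for comparison
oc get machines -n openshift-machine-api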
The baremetal code uses an annotation to manage which machine is removed when the set is scaled down. https://github.com/metal3-io/metal3-docs/blob/master/design/remove-host.md
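For reference, a minimal sketch of how a specific machine can be steered to be the one removed on scale-down, assuming the machine-api delete-machine annotation (machine.openshift.io/cluster-api-delete-machine; upstream cluster-api used cluster.k8s.io/delete-machine) and a machine name taken from the output above purely for illustration:

# Assumed annotation key and illustrative machine name; see the design doc linked above
oc annotate machine ocp-edge-cluster-worker-0-d2fvm -n openshift-machine-api machine.openshift.io/cluster-api-delete-machine="true"
# The machineset controller should then prefer this machine when scaling down
oc scale machineset ocp-edge-cluster-worker-0 -n openshift-machine-api --replicas=2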
*** This bug has been marked as a duplicate of bug 1812588 ***