This feature seems to have been disabled on vSphere according to PR https://github.com/openshift/origin/pull/27236.
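A quick way to check whether the pre-drain deletion hook is actually set on a control-plane machine (a minimal sketch; the EtcdQuorumOperator hook name and clusteroperator/etcd owner are assumptions based on how the etcd operator is expected to register the hook, not taken from this report; <master-machine-name> is a placeholder):

# Inspect the lifecycle hooks on a control-plane machine; an empty result
# means the pre-drain protection is not in place for that machine.
oc get machine <master-machine-name> -n openshift-machine-api \
  -o jsonpath='{.spec.lifecycleHooks.preDrain}'
# Expected when the hook is set (assumed values):
# [{"name":"EtcdQuorumOperator","owner":"clusteroperator/etcd"}]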
The same issue is observed on the BM platform as well.

OCP version: 4.11.0-0.nightly-arm64-2022-06-09-060907

Steps followed: delete the master machine before adding a new one.

oc delete machine adistefa-ipibm-vjwqv-master-0

The machine gets deleted:

oc get machines
NAME                                        PHASE     TYPE   REGION   ZONE   AGE
adistefa-ipibm-vjwqv-master-1               Running                          20h
adistefa-ipibm-vjwqv-worker-0-qhqc6         Running                          20h
adistefa-ipibm-vjwqv-worker-0-xgd4j         Running                          5h46m
adistefa-ipibm-vjwqv-worker-0-zv9qh         Running                          20h
adistefa-ipibm-vjwqv-worker-miyadav-vsdkm   Running                          5h13m
master-03                                   Running                          4h57m

oc get nodes
NAME                                                          STATUS   ROLES    AGE     VERSION
master-01.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    master   20h     v1.24.0+bb9c2f1
master-03.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    master   4h49m   v1.24.0+bb9c2f1
worker-00.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   20h     v1.24.0+bb9c2f1
worker-01.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   20h     v1.24.0+bb9c2f1
worker-02.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   5h38m   v1.24.0+bb9c2f1
worker-03.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com   Ready    worker   5h5m    v1.24.0+bb9c2f1

master-0 gets deleted; the deletion should have been prevented.

machine-controller logs (openshift-machine-api/pods/machine-api-controllers-85b9c7f7d6-4jbfp/machine-controller/machine-controller/logs/current.log):

2022-06-13T13:39:11.850370075Z I0613 13:39:11.850296 1 controller.go:709] evicting pod openshift-etcd/revision-pruner-16-master-00.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com
2022-06-13T13:39:11.873421269Z I0613 13:39:11.873374 1 controller.go:432] Evicted pod from Nodepodrevision-pruner-16-master-00.adistefa-ipibm.qeclusters.arm.eng.rdu2.redhat.com/openshift-etcd
2022-06-13T13:39:11.873462751Z I0613 13:39:11.873449 1 controller.go:460] drain successful for machine "adistefa-ipibm-vjwqv-master-0"
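For anyone reproducing this without a must-gather, the same drain/eviction messages can be followed live; a sketch, assuming the standard machine-api-controllers deployment and machine-controller container names that also appear in the log path above:

# Stream the machine-controller logs and watch for drain/eviction events
# while the master machine deletion is in progress.
oc logs -n openshift-machine-api deploy/machine-api-controllers \
  -c machine-controller -f | grep -E 'drain|evict'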
*** Bug 2094919 has been marked as a duplicate of this bug. ***
Checked on the BM platform.

OCP version: 4.11.0-0.nightly-2022-06-23-153912

oc delete machine skundu-bm-ww96d-master-0

oc get machines
NAME                             PHASE      TYPE   REGION   ZONE   AGE
skundu-bm-ww96d-master-0         Deleting                          100m
skundu-bm-ww96d-master-1         Running                           101m
skundu-bm-ww96d-master-2         Running                           101m
skundu-bm-ww96d-worker-0-gkxrz   Running                           76m
skundu-bm-ww96d-worker-0-vsrg7   Running                           76m

The machine remains in the "Deleting" state.

oc get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
openshift-qe-013.lab.eng.rdu2.redhat.com   Ready    master   80m   v1.24.0+284d62a
openshift-qe-014.lab.eng.rdu2.redhat.com   Ready    master   80m   v1.24.0+284d62a
openshift-qe-015.lab.eng.rdu2.redhat.com   Ready    master   80m   v1.24.0+284d62a
openshift-qe-016.lab.eng.rdu2.redhat.com   Ready    worker   54m   v1.24.0+284d62a
openshift-qe-023.lab.eng.rdu2.redhat.com   Ready    worker   52m   v1.24.0+284d62a

As seen above, the node also remains in the Ready state.

etcd operator logs:

skip removing the deletion hook from machine skundu-bm-ww96d-master-0 since its member is still present with any of: [{InternalIP } {InternalIP } {InternalIP fe80::f602:70ff:feb8:d8f0%eno1.194} {InternalIP 10.8.1.143} {InternalIP 2620:52:0:800:f602:70ff:feb8:d8f0} {InternalIP } {InternalIP } {InternalIP } {Hostname openshift-qe-013.lab.eng.rdu2.redhat.com} {InternalDNS openshift-qe-013.lab.eng.rdu2.redhat.com}]
I0624 12:04:02.561798 1 machinedeletionhooks.go:121] current members [ID:2648565165544474566 name:"openshift-qe-014.lab.eng.rdu2.redhat.com" peerURLs:"https://10.8.1.144:2380" clientURLs:"https://10.8.1.144:2379" ID:16601722864613429937 name:"openshift-qe-013.lab.eng.rdu2.redhat.com" peerURLs:"https://10.8.1.143:2380" clientURLs:"https://10.8.1.143:2379" ID:17820969981482329470 name:"openshift-qe-015.lab.eng.rdu2.redhat.com" peerURLs:"https://10.8.1.145:2380" clientURLs:"https://10.8.1.145:2379"] with IPSet: map[10.8.1.143:{} 10.8.1.144:{} 10.8.1.145:{}]
I0624 12:04:02.561880 1 machinedeletionhooks.go:135] skip removing the deletion hook from machine skundu-bm-ww96d-master-0 since its member is still present with any of: [{InternalIP } {InternalIP } {InternalIP fe80::f602:70ff:feb8:d8f0%eno1.194} {InternalIP 10.8.1.143} {InternalIP 2620:52:0:800:f602:70ff:feb8:d8f0} {InternalIP } {InternalIP } {InternalIP } {Hostname openshift-qe-013.lab.eng.rdu2.redhat.com} {InternalDNS openshift-qe-013.lab.eng.rdu2.redhat.com}]

It works as expected on the BM platform.
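To confirm from the etcd side that the member really is still present (which is why the operator skips removing the deletion hook), the member list can be checked directly; a sketch, assuming etcdctl is available in the etcd pod's default container as in a standard deployment:

# List current etcd members from one of the surviving etcd pods; the
# deleted machine's member (10.8.1.143 / openshift-qe-013) should still
# appear, matching the "skip removing the deletion hook" log above.
oc rsh -n openshift-etcd etcd-openshift-qe-014.lab.eng.rdu2.redhat.com \
  etcdctl member list -w table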
The positive scenario of scaling up etcd works fine on the BM platform.

oc get machines
NAME                             PHASE     TYPE   REGION   ZONE   AGE
adistefa-ipi2-tlxp7-master-1     Running                          4h12m
adistefa-ipi2-tlxp7-master-2     Running                          4h12m
adistefa-ipi2-tlxp7-master-new   Running                          131m

The new machine has successfully replaced the deleted machine.

oc get nodes
NAME                                                         STATUS   ROLES    AGE     VERSION
master-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   Ready    master   3h52m   v1.24.0+284d62a
master-02.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   Ready    master   3h52m   v1.24.0+284d62a
node-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com     Ready    master   121m    v1.24.0+284d62a

The new node has successfully replaced the deleted node. Control plane pods are also replicated on the new node:

oc get po -n openshift-etcd
etcd-master-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   5/5   Running   0   117m
etcd-master-02.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   5/5   Running   0   116m
etcd-node-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com     5/5   Running   0   114m

oc get po -n openshift-kube-apiserver
kube-apiserver-master-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   5/5   Running   0   125m
kube-apiserver-master-02.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com   5/5   Running   0   128m
kube-apiserver-node-01.adistefa-ipi2.qeclusters.arm.eng.rdu2.redhat.com     5/5   Running   0   135m
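For reference, one common way to create the replacement master machine used in this scenario is to copy an existing control-plane Machine under a new name; a sketch only, the exact fields to strip can vary by platform and the file name here is illustrative:

# Dump an existing control-plane Machine, rename it, and re-create it.
oc get machine adistefa-ipi2-tlxp7-master-1 -n openshift-machine-api \
  -o yaml > new-master.yaml
# Edit new-master.yaml: set metadata.name to adistefa-ipi2-tlxp7-master-new,
# and remove status, metadata.uid, metadata.resourceVersion, and
# spec.providerID so a fresh instance is provisioned.
oc create -f new-master.yaml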
Verified with 4.11.0-0.nightly-2022-06-25-081133 on IPI vSphere; the hook works well.
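For completeness, the behavior verified here can be observed with a watch on the machine phase; a sketch, with <master-machine> as a placeholder:

# With the hook working, a deleted master should sit in "Deleting"
# (hook not removed) instead of being torn down immediately.
oc delete machine <master-machine> -n openshift-machine-api --wait=false
oc get machine <master-machine> -n openshift-machine-api -w
# Expected: PHASE stays "Deleting" until the etcd member is removed.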
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069