When the Machine controller deletes a Node, the log messages are confusing because the parameters are passed in the wrong order. Log messages should be of the form "<Machine name>: ...", but when a Node is deleted you instead see one of the form "<Node name>: deleting node <Machine name> for machine". This is independent of the platform in use.
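The swapped-argument pattern can be illustrated with a small Go sketch. The function names below are illustrative, modeled on the controller.go log line quoted in this report, not the actual controller code:

```go
package main

import "fmt"

// buggyMsg reproduces the suspected bug: the node name is passed as the
// leading argument, so the log prefix shows the Node name instead of the
// Machine name.
func buggyMsg(machineName, nodeName string) string {
	return fmt.Sprintf("%v: deleting node %q for machine", nodeName, machineName)
}

// fixedMsg passes the arguments in the intended order, matching the
// "<Machine name>: ..." convention used by the rest of the controller logs.
func fixedMsg(machineName, nodeName string) string {
	return fmt.Sprintf("%v: deleting node %q for machine", machineName, nodeName)
}

func main() {
	machine := "zhsungcp918-cd2rn-worker-c-pnmw7"
	node := "zhsungcp918-cd2rn-worker-c-pnmw7.c.openshift-qe.internal"
	fmt.Println("buggy: " + buggyMsg(machine, node))
	fmt.Println("fixed: " + fixedMsg(machine, node))
}
```

With the fix, the message leads with the Machine name and quotes the Node name, which is the shape verified in the later comments.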
Verification failed; the message is still of the form "<Node name>: deleting node <Machine name> for machine".

clusterversion: 4.6.0-0.nightly-2020-09-18-002612

$ oc get node
NAME                                                       STATUS   ROLES    AGE    VERSION
zhsungcp918-cd2rn-master-0.c.openshift-qe.internal         Ready    master   137m   v1.19.0+b4ffb45
zhsungcp918-cd2rn-master-1.c.openshift-qe.internal         Ready    master   137m   v1.19.0+b4ffb45
zhsungcp918-cd2rn-master-2.c.openshift-qe.internal         Ready    master   138m   v1.19.0+b4ffb45
zhsungcp918-cd2rn-worker-a-zkgff.c.openshift-qe.internal   Ready    worker   128m   v1.19.0+b4ffb45
zhsungcp918-cd2rn-worker-b-r64rd.c.openshift-qe.internal   Ready    worker   128m   v1.19.0+b4ffb45
zhsungcp918-cd2rn-worker-c-pnmw7.c.openshift-qe.internal   Ready    worker   128m   v1.19.0+b4ffb45

$ oc get machine
NAME                               PHASE     TYPE            REGION        ZONE            AGE
zhsungcp918-cd2rn-master-0         Running   n1-standard-4   us-central1   us-central1-a   142m
zhsungcp918-cd2rn-master-1         Running   n1-standard-4   us-central1   us-central1-b   142m
zhsungcp918-cd2rn-master-2         Running   n1-standard-4   us-central1   us-central1-c   142m
zhsungcp918-cd2rn-worker-a-zkgff   Running   n1-standard-4   us-central1   us-central1-a   131m
zhsungcp918-cd2rn-worker-b-r64rd   Running   n1-standard-4   us-central1   us-central1-b   131m
zhsungcp918-cd2rn-worker-c-pnmw7   Running   n1-standard-4   us-central1   us-central1-c   131m

$ oc delete machine zhsungcp918-cd2rn-worker-c-pnmw7
machine.machine.openshift.io "zhsungcp918-cd2rn-worker-c-pnmw7" deleted

I0918 09:04:00.936404       1 controller.go:247] zhsungcp918-cd2rn-worker-c-pnmw7.c.openshift-qe.internal: deleting node "zhsungcp918-cd2rn-worker-c-pnmw7" for machine
The Machine controller is vendored in each cluster-api-provider's tree, so the fix currently applies only to vSphere (which lives in-tree in the machine-api-operator). The other providers will pick up the fix as they update their vendored packages.
I am working on pull requests for the other providers now and should have them posted today.
All PRs are now merged. The GCP PR was replaced by https://github.com/openshift/cluster-api-provider-gcp/pull/122, which is linked to a separate BZ.
Verified.

clusterversion: 4.6.0-0.nightly-2020-10-08-210814

aws:
I1009 05:26:34.314348       1 controller.go:248] zhsun109aws-tsn5h-worker-us-east-2c-frj7w: deleting node "ip-10-0-206-48.us-east-2.compute.internal" for machine

gcp:
$ oc logs machine-api-controllers-65fbf64c45-nk55m -c machine-controller | grep deleting
I1009 06:36:04.994387       1 controller.go:248] zhsun109gcp-m2fw5-worker-c-dwvsb: deleting node "zhsun109gcp-m2fw5-worker-c-dwvsb.c.openshift-qe.internal" for machine

azure:
I1009 06:43:20.246529       1 controller.go:248] zhsun109az-hnqvv-worker-northcentralus-rwrv8: deleting node "zhsun109az-hnqvv-worker-northcentralus-rwrv8" for machine

vsphere:
I1009 07:49:44.218593       1 controller.go:248] zhsun109vsp-5vlnn-worker-l5rlm: deleting node "zhsun109vsp-5vlnn-worker-l5rlm" for machine
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196