Description of problem:
MachineHealthCheck triggers machine deletion and the new node joins the cluster, but the old machine couldn't be deleted.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-07-25-053632

How reproducible:
Always

Steps to Reproduce:
1. Edit the featuregate to enable the MachineHealthCheck controller (one way to do this is sketched under Additional info below)
2. Create a MachineHealthCheck:

apiVersion: healthchecking.openshift.io/v1alpha1
kind: MachineHealthCheck
metadata:
  name: example
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: zhsun1-lpwf6
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: zhsun1-lpwf6-worker-us-east-2c

3. Stop the kubelet on a node
4. Check the machine and node, and monitor the controller log

Actual results:
$ oc get machine
NAME                                   INSTANCE              STATE     TYPE        REGION      ZONE         AGE
zhsun1-lpwf6-master-0                  i-075ff28732222dd75   running   m4.xlarge   us-east-2   us-east-2a   23h
zhsun1-lpwf6-master-1                  i-05ca21332af6d4e4e   running   m4.xlarge   us-east-2   us-east-2b   23h
zhsun1-lpwf6-master-2                  i-0e9471fbe5a713bf0   running   m4.xlarge   us-east-2   us-east-2c   23h
zhsun1-lpwf6-worker-us-east-2b-sb6dx   i-09679f1ffaefc5fc4   running   m4.large    us-east-2   us-east-2b   4h30m
zhsun1-lpwf6-worker-us-east-2c-rghdq   i-0be214301c27cf5d8   running   m4.large    us-east-2   us-east-2c   4h18m
zhsun1-lpwf6-worker-us-east-2c-rz5wl   i-0343708b0798c2bf7   running   m4.large    us-east-2   us-east-2c   12m

$ oc get node
NAME                                         STATUS                        ROLES    AGE     VERSION
ip-10-0-136-167.us-east-2.compute.internal   Ready                         master   23h     v1.14.0+bd34733a7
ip-10-0-146-55.us-east-2.compute.internal    Ready                         master   23h     v1.14.0+bd34733a7
ip-10-0-155-18.us-east-2.compute.internal    Ready                         worker   4h27m   v1.14.0+bd34733a7
ip-10-0-161-35.us-east-2.compute.internal    Ready                         worker   8m49s   v1.14.0+bd34733a7
ip-10-0-163-156.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker   4h14m   v1.14.0+bd34733a7
ip-10-0-165-203.us-east-2.compute.internal   Ready                         master   23h     v1.14.0+bd34733a7

I0726 07:38:49.624616 1 controller.go:193] Reconciling machine "zhsun1-lpwf6-worker-us-east-2c-rghdq" triggers delete
I0726 07:38:49.702854 1 info.go:16] ignoring DaemonSet-managed pods: tuned-z9274, dns-default-s4fh6, node-ca-gxv4d, machine-config-daemon-8684g, node-exporter-kqkvm, multus-54tlk, ovs-wn95s, sdn-qxjps; deleting pods with local storage: alertmanager-main-1, grafana-b8978fdd6-rdr9n, kube-state-metrics-548b88d5b9-2gs4r, prometheus-adapter-78b65745c7-qvxqm, prometheus-k8s-1
I0726 07:39:09.768250 1 info.go:16] ignoring DaemonSet-managed pods: tuned-z9274, dns-default-s4fh6, node-ca-gxv4d, machine-config-daemon-8684g, node-exporter-kqkvm, multus-54tlk, ovs-wn95s, sdn-qxjps; deleting pods with local storage: alertmanager-main-1, grafana-b8978fdd6-rdr9n, kube-state-metrics-548b88d5b9-2gs4r, prometheus-adapter-78b65745c7-qvxqm, prometheus-k8s-1
I0726 07:39:09.768294 1 info.go:20] failed to evict pods from node "ip-10-0-163-156.us-east-2.compute.internal" (pending pods: alertmanager-main-1,certified-operators-cc5b64cf4-x65qj,community-operators-64d697dc9c-ml7qf,grafana-b8978fdd6-rdr9n,image-registry-76bfc6c458-7cn5g,kube-state-metrics-548b88d5b9-2gs4r,openshift-state-metrics-77848f6cdd-7pgtp,prometheus-adapter-78b65745c7-qvxqm,prometheus-k8s-1,prometheus-operator-5c54dbc6d8-8nhvx,redhat-operators-79967f6d7b-6fz6r,router-default-7dcb86744-s42nf,telemeter-client-6c4b8c889b-skn7v): Drain did not complete within 20s
I0726 07:39:09.768316 1 info.go:16] Drain did not complete within 20s
I0726 07:39:09.768325 1 info.go:20] unable to drain node "ip-10-0-163-156.us-east-2.compute.internal"
I0726 07:39:09.768336 1 info.go:20] there are pending nodes to be drained: ip-10-0-163-156.us-east-2.compute.internal
W0726 07:39:09.768350 1 controller.go:286] drain failed for machine "zhsun1-lpwf6-worker-us-east-2c-rghdq": Drain did not complete within 20s
E0726 07:39:09.768366 1 controller.go:202] Failed to drain node for machine "zhsun1-lpwf6-worker-us-east-2c-rghdq": requeue in: 20s
I0726 07:39:09.768386 1 controller.go:352] Actuator returned requeue-after error: requeue in: 20s

Expected results:
The unhealthy machine can be deleted.

Additional info:
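For step 1 above, a minimal sketch of enabling the feature set that gates the MachineHealthCheck controller; assuming TechPreviewNoUpgrade is the gating feature set in this build (note this feature set cannot be unset once applied):

$ oc patch featuregate cluster --type merge -p '{"spec":{"featureSet":"TechPreviewNoUpgrade"}}'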
Hi sunzhaohua, based on the logs from the machine controller:

> I0726 07:39:09.768294 1 info.go:20] failed to evict pods from node "ip-10-0-163-156.us-east-2.compute.internal" (pending pods: alertmanager-main-1,certified-operators-cc5b64cf4-x65qj,community-operators-64d697dc9c-ml7qf,grafana-b8978fdd6-rdr9n,image-registry-76bfc6c458-7cn5g,kube-state-metrics-548b88d5b9-2gs4r,openshift-state-metrics-77848f6cdd-7pgtp,prometheus-adapter-78b65745c7-qvxqm,prometheus-k8s-1,prometheus-operator-5c54dbc6d8-8nhvx,redhat-operators-79967f6d7b-6fz6r,router-default-7dcb86744-s42nf,telemeter-client-6c4b8c889b-skn7v): Drain did not complete within 20s

I assume there is a PDB resource deployed in the cluster that forbids pods from being quickly deleted (i.e. rescheduled to a different node). It takes a while before all the pods are properly re-deployed (or just deleted in the case of daemon sets). Can you check whether the pods are eventually deleted? Depending on the workload, it might take minutes before the node draining is finished. If the pods are still not drained after a few minutes and the machine is still not deleted, can you describe all the pods in the `pending pods` list that are to be drained, to see why they are stuck? Thanks, Jan
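For example, a quick way to check whether a PDB is blocking eviction, and to inspect one of the stuck pods (pod name and namespace below taken from the report; standard oc usage):

$ oc get pdb --all-namespaces
$ oc describe pod alertmanager-main-1 -n openshift-monitoring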
Jan Chaloupka, I tried again on IPI Azure, 4.2.0-0.nightly-2019-07-28-222114. These pods cannot be deleted even after a few hours and are stuck in Terminating status.

$ oc get pod --all-namespaces | grep Terminating
openshift-image-registry   image-registry-5846554c9c-gpnxp   1/1   Terminating   0   23h
openshift-monitoring       alertmanager-main-0               3/3   Terminating   0   23h
openshift-monitoring       grafana-78f4f9b797-fzt5r          2/2   Terminating   0   23h
openshift-monitoring       prometheus-k8s-1                  6/6   Terminating   1   23h

I0730 03:01:57.498769 1 controller.go:193] Reconciling machine "zhsun2-g4ft2-worker-centralus1-rj8rc" triggers delete
I0730 03:01:57.565637 1 info.go:16] ignoring DaemonSet-managed pods: tuned-dz4t6, dns-default-9kjkm, node-ca-nhxjs, machine-config-daemon-vrc96, node-exporter-rghnh, multus-n45pv, ovs-h2r69, sdn-s9zww; deleting pods with local storage: alertmanager-main-0, grafana-78f4f9b797-fzt5r, prometheus-k8s-1
I0730 03:02:17.764275 1 info.go:16] ignoring DaemonSet-managed pods: tuned-dz4t6, dns-default-9kjkm, node-ca-nhxjs, machine-config-daemon-vrc96, node-exporter-rghnh, multus-n45pv, ovs-h2r69, sdn-s9zww; deleting pods with local storage: alertmanager-main-0, grafana-78f4f9b797-fzt5r, prometheus-k8s-1
I0730 03:02:17.764300 1 info.go:20] failed to evict pods from node "zhsun2-g4ft2-worker-centralus1-rj8rc" (pending pods: alertmanager-main-0,grafana-78f4f9b797-fzt5r,image-registry-5846554c9c-gpnxp,prometheus-k8s-1): Drain did not complete within 20s
I0730 03:02:17.764313 1 info.go:16] Drain did not complete within 20s
I0730 03:02:17.764321 1 info.go:20] unable to drain node "zhsun2-g4ft2-worker-centralus1-rj8rc"
I0730 03:02:17.764325 1 info.go:20] there are pending nodes to be drained: zhsun2-g4ft2-worker-centralus1-rj8rc
W0730 03:02:17.764331 1 controller.go:286] drain failed for machine "zhsun2-g4ft2-worker-centralus1-rj8rc": Drain did not complete within 20s
E0730 03:02:17.764338 1 controller.go:202] Failed to drain node for machine "zhsun2-g4ft2-worker-centralus1-rj8rc": requeue in: 20s
I0730 03:02:17.764346 1 controller.go:352] Actuator returned requeue-after error: requeue in: 20s
I0730 03:02:37.764686 1 controller.go:129] Reconciling Machine "zhsun2-g4ft2-worker-centralus1-rj8rc"
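For reference, a pod stuck in Terminating on an unreachable node can be removed by hand; this bypasses graceful termination, so it is only a manual workaround, not a fix (pod name taken from the output above):

$ oc delete pod prometheus-k8s-1 -n openshift-monitoring --grace-period=0 --force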
Created attachment 1594481 [details]
Describe output for all the pods in Terminating status
*** This bug has been marked as a duplicate of bug 1732614 ***
- There's a code bug being fixed to avoid leaking goroutines: https://bugzilla.redhat.com/show_bug.cgi?id=1733708
- This issue describes a design behaviour we have today that we should improve. See https://jira.coreos.com/browse/CLOUD-638 / https://github.com/openshift/cluster-api/pull/61

Hence reopening.
I believe the root cause of this bug is here: https://bugzilla.redhat.com/show_bug.cgi?id=1743741
*** Bug 1745420 has been marked as a duplicate of this bug. ***
(In reply to Michael Gugino from comment #6)
> I believe the root cause of this bug is here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1743741

I no longer believe this to be the cause; it is somewhat unrelated.
Here are the steps to precisely replicate the issue (a one-shot variant of step 1 is sketched after these steps):

## STEPS ##

1. Stop kubelet on a worker node that has a pod with local storage/data, such as alertmanager-main-1 in namespace openshift-monitoring:

   #!/bin/bash
   ./oc debug --image=rhel7/rhel-tools nodes/$1
   # wait for the prompt to be ready, then run inside the debug shell:
   chroot /host
   systemctl stop kubelet

2. Drain the node before the server notices something is wrong:

   ./oc adm drain <node> --ignore-daemonsets --delete-local-data

   This will hang after it hits that monitoring pod (or possibly a different pod with local storage). After a bit, exit out, and you can query pods on that node; all the non-daemonset pods will be stuck in Terminating state (or at least the pods with local storage).

3. Delete the machine associated with that node; drain will be stuck in the same spot.

## END STEPS ##
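A one-shot, non-interactive variant of step 1, assuming `oc debug` is permitted to run a command directly against the node (node name passed as the first argument):

#!/bin/bash
# Stop kubelet on the given node via a debug pod, without an interactive shell
oc debug "nodes/$1" --image=rhel7/rhel-tools -- chroot /host systemctl stop kubelet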
This can happen for two legitimate reasons:
- A deadlocked PDB
- Kubernetes will not delete stateful pods within unreachable nodes: https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod

As a user you can manually add the machine.openshift.io/exclude-node-draining annotation after inspecting the node, to proceed with the deletion if that's intended. Although we are actively working on improving the experience for such legitimate cases (e.g. https://github.com/openshift/cluster-api-provider-gcp/pull/59), this should not be a blocker for the 4.2 release.
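A sketch of that manual workaround, using a machine name from this report as an example:

# Skip node draining for this machine, then delete it; the controller removes
# the instance without waiting for pod eviction
$ oc annotate machine zhsun2-g4ft2-worker-centralus1-rj8rc -n openshift-machine-api machine.openshift.io/exclude-node-draining=
$ oc delete machine zhsun2-g4ft2-worker-centralus1-rj8rc -n openshift-machine-api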
*** Bug 1797588 has been marked as a duplicate of this bug. ***
Moving back to ASSIGNED because more PRs are needed.
*** Bug 1803764 has been marked as a duplicate of this bug. ***
Verified.
clusterversion: 4.4.0-0.nightly-2020-02-18-042756

1. Stop kubelet on a worker node that has a pod with local storage/data, such as alertmanager-main-1 in namespace openshift-monitoring:

   #!/bin/bash
   ./oc debug --image=rhel7/rhel-tools nodes/$1
   chroot /host
   systemctl stop kubelet

2. Drain the node before the server notices something is wrong:

   ./oc adm drain <node> --ignore-daemonsets --delete-local-data

   This will hang after it hits that monitoring pod (or possibly a different pod with local storage). After a bit, exit out, and you can query pods on that node; all the non-daemonset pods will be stuck in Terminating state (or at least the pods with local storage).

3. Delete the machine associated with that node; the machine was deleted successfully and the node drained successfully.

AWS:
$ oc debug node/ip-10-0-165-100.us-east-2.compute.internal
sh-4.2# chroot /host
sh-4.4# systemctl stop kubelet

$ oc adm drain ip-10-0-165-100.us-east-2.compute.internal --ignore-daemonsets --delete-local-data
node/ip-10-0-165-100.us-east-2.compute.internal cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-z846q, openshift-dns/dns-default-2s5ch, openshift-image-registry/node-ca-cjbw2, openshift-machine-config-operator/machine-config-daemon-dlxkh, openshift-monitoring/node-exporter-gcqsp, openshift-multus/multus-5t24s, openshift-sdn/ovs-f26mr, openshift-sdn/sdn-987hl
evicting pod "grafana-755b7df4f9-nggz9"
evicting pod "csi-snapshot-controller-operator-5c695fc45-crztf"
evicting pod "redhat-marketplace-6b555969b5-95pvx"
evicting pod "image-registry-6c5b6656c4-v9cj4"
evicting pod "community-operators-58d6f48fb9-4xmgk"
evicting pod "certified-operators-8558655f99-fhq5b"
evicting pod "router-default-5ccb96d54b-t2xcb"
evicting pod "redhat-operators-6b58656c49-5rhk5"
evicting pod "migrator-54b9f4568d-s4c8k"
evicting pod "alertmanager-main-2"
pod/redhat-marketplace-6b555969b5-95pvx evicted
pod/certified-operators-8558655f99-fhq5b evicted
pod/image-registry-6c5b6656c4-v9cj4 evicted
pod/migrator-54b9f4568d-s4c8k evicted
pod/csi-snapshot-controller-operator-5c695fc45-crztf evicted
pod/redhat-operators-6b58656c49-5rhk5 evicted
pod/community-operators-58d6f48fb9-4xmgk evicted
pod/router-default-5ccb96d54b-t2xcb evicted
pod/grafana-755b7df4f9-nggz9 evicted
pod/alertmanager-main-2 evicted
node/ip-10-0-165-100.us-east-2.compute.internal evicted

$ oc delete machine zhsun4-s2gcg-worker-us-east-2c-d25ct
machine.machine.openshift.io "zhsun4-s2gcg-worker-us-east-2c-d25ct" deleted

GCP:
$ oc debug node/zhsung-5bn2h-w-b-9rvtp.c.openshift-qe.internal
Starting pod/zhsung-5bn2h-w-b-9rvtpcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.32.3
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl stop kubelet
Removing debug pod ...

$ oc adm drain zhsung-5bn2h-w-b-9rvtp.c.openshift-qe.internal --ignore-daemonsets --delete-local-data
node/zhsung-5bn2h-w-b-9rvtp.c.openshift-qe.internal cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-lp254, openshift-dns/dns-default-l6wtj, openshift-image-registry/node-ca-wbtcp, openshift-machine-config-operator/machine-config-daemon-2w5zw, openshift-monitoring/node-exporter-q87zj, openshift-multus/multus-2jfwf, openshift-sdn/ovs-dn4sq, openshift-sdn/sdn-2fx7f
evicting pod "migrator-54b9f4568d-49q54"
evicting pod "csi-snapshot-controller-operator-7cdd579b87-sp4ck"
evicting pod "redhat-operators-dfd49b4d5-vkp8r"
evicting pod "kube-state-metrics-7f8b6cc5cb-4crdp"
evicting pod "router-default-6944c99f7b-6xq4r"
evicting pod "prometheus-adapter-845d7776c9-qxcw9"
evicting pod "openshift-state-metrics-9648b55dd-btl5c"
evicting pod "alertmanager-main-0"
evicting pod "thanos-querier-864b78449f-xpcdw"
evicting pod "grafana-755b7df4f9-2j5kp"
pod/prometheus-adapter-845d7776c9-qxcw9 evicted
pod/grafana-755b7df4f9-2j5kp evicted
pod/thanos-querier-864b78449f-xpcdw evicted
pod/router-default-6944c99f7b-6xq4r evicted
pod/csi-snapshot-controller-operator-7cdd579b87-sp4ck evicted
pod/kube-state-metrics-7f8b6cc5cb-4crdp evicted
pod/openshift-state-metrics-9648b55dd-btl5c evicted
pod/migrator-54b9f4568d-49q54 evicted
pod/alertmanager-main-0 evicted
pod/redhat-operators-dfd49b4d5-vkp8r evicted
node/zhsung-5bn2h-w-b-9rvtp.c.openshift-qe.internal evicted

$ oc delete machine zhsung-5bn2h-w-b-9rvtp
machine.machine.openshift.io "zhsung-5bn2h-w-b-9rvtp" deleted

Azure:
$ oc debug node/zhsunazure-zz8pb-worker-centralus3-n62bs
Starting pod/zhsunazure-zz8pb-worker-centralus3-n62bs-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.32.7
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl stop kubelet
Removing debug pod ...

$ oc adm drain zhsunazure-zz8pb-worker-centralus3-n62bs --ignore-daemonsets --delete-local-data
node/zhsunazure-zz8pb-worker-centralus3-n62bs cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-x4l9m, openshift-dns/dns-default-b8hdm, openshift-image-registry/node-ca-mkw4m, openshift-machine-config-operator/machine-config-daemon-hjn5s, openshift-monitoring/node-exporter-8v9s9, openshift-multus/multus-wh62g, openshift-sdn/ovs-p8gtr, openshift-sdn/sdn-hr9tc
evicting pod "alertmanager-main-0"
evicting pod "prometheus-k8s-1"
pod/alertmanager-main-0 evicted
pod/prometheus-k8s-1 evicted
node/zhsunazure-zz8pb-worker-centralus3-n62bs evicted

$ oc delete machine zhsunazure-zz8pb-worker-centralus3-n62bs
machine.machine.openshift.io "zhsunazure-zz8pb-worker-centralus3-n62bs" deleted
Verified on GCP.
clusterversion: 4.4.0-0.nightly-2020-09-13-231918

$ oc debug node/zhsungcp9141-2r2jp-worker-c-pxf5b.c.openshift-qe.internal
Starting pod/zhsungcp9141-2r2jp-worker-c-pxf5bcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.32.4
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl stop kubelet
Removing debug pod ...

$ oc adm drain zhsungcp9141-2r2jp-worker-c-pxf5b.c.openshift-qe.internal --ignore-daemonsets --delete-local-data
node/zhsungcp9141-2r2jp-worker-c-pxf5b.c.openshift-qe.internal cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-947lw, openshift-dns/dns-default-hfxv8, openshift-image-registry/node-ca-jq7px, openshift-machine-config-operator/machine-config-daemon-qv2ps, openshift-monitoring/node-exporter-7m46s, openshift-multus/multus-r7t2b, openshift-sdn/ovs-kkhg7, openshift-sdn/sdn-mj8m4
evicting pod openshift-image-registry/image-registry-66f4c6d65f-ks64x
evicting pod openshift-cluster-storage-operator/csi-snapshot-controller-746f87d7d4-h5htf
evicting pod openshift-monitoring/alertmanager-main-0
evicting pod openshift-monitoring/prometheus-k8s-0
evicting pod openshift-monitoring/thanos-querier-77d54d5d7d-zxkvm
pod/prometheus-k8s-0 evicted
pod/image-registry-66f4c6d65f-ks64x evicted
pod/csi-snapshot-controller-746f87d7d4-h5htf evicted
pod/thanos-querier-77d54d5d7d-zxkvm evicted
pod/alertmanager-main-0 evicted
node/zhsungcp9141-2r2jp-worker-c-pxf5b.c.openshift-qe.internal evicted

$ oc delete machine zhsungcp9141-2r2jp-worker-c-pxf5b
machine.machine.openshift.io "zhsungcp9141-2r2jp-worker-c-pxf5b" deleted
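To double-check that the delete fully completed rather than requeueing as in the original report, the machine and node lists can be watched until the old entries disappear and the MachineSet provisions a replacement:

$ oc get machine -n openshift-machine-api -w
$ oc get node -w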