Description of problem:
IPI on OSP: when a node that has a pod with local storage is unreachable, draining the node fails.

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-03-03-175205

How reproducible:
Always

Steps to Reproduce:
1. Stop kubelet on a worker node that has a pod with local storage/data, such as alertmanager-main-1 in namespace openshift-monitoring
2. Drain the node
$ oc adm drain zhsun9-x9smf-worker-gc2pb --ignore-daemonsets --delete-local-data
node/zhsun9-x9smf-worker-gc2pb cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-qhc69, openshift-dns/dns-default-brjwk, openshift-image-registry/node-ca-pvwk6, openshift-machine-config-operator/machine-config-daemon-kcgvg, openshift-monitoring/node-exporter-h8tp4, openshift-multus/multus-qdlwn, openshift-sdn/ovs-ct6ft, openshift-sdn/sdn-6tdvb
evicting pod "csi-snapshot-controller-operator-6f959d8749-f4vzp"
evicting pod "alertmanager-main-2"
evicting pod "openshift-state-metrics-58dddb6776-6p87v"
evicting pod "csi-snapshot-controller-6cccfcb78b-lvmzm"
evicting pod "grafana-7dd7875c69-sddzn"
evicting pod "router-default-6d84d84994-8j86r"
evicting pod "migrator-6b788867c8-59m8c"
evicting pod "kube-state-metrics-95445d6d7-fm6kf"
evicting pod "redhat-marketplace-9c7d9ffd4-4t9hd"
evicting pod "alertmanager-main-0"
evicting pod "alertmanager-main-1"
evicting pod "prometheus-k8s-0"
evicting pod "redhat-operators-668f9c4f6c-mtr25"
evicting pod "prometheus-adapter-6f949b8994-7mz2s"
evicting pod "prometheus-k8s-1"
evicting pod "image-registry-6b7dc776db-vwzrq"
evicting pod "thanos-querier-678f7fbcbf-cl6fp"
3. Delete the machine associated with that node
4. Check the logs

Actual results:
After 2 hours the machine is still stuck in the Deleting phase.

$ oc get machine
NAME                          PHASE      TYPE        REGION      ZONE   AGE
zhsun9-x9smf-master-0         Running    m1.xlarge   regionOne   nova   29h
zhsun9-x9smf-master-1         Running    m1.xlarge   regionOne   nova   29h
zhsun9-x9smf-master-2         Running    m1.xlarge   regionOne   nova   29h
zhsun9-x9smf-worker-2-tl8bp   Running    m1.xlarge   regionOne   nova   21h
zhsun9-x9smf-worker-cxddw     Running    m1.xlarge   regionOne   nova   16h
zhsun9-x9smf-worker-gc2pb     Deleting   m1.xlarge   regionOne   nova   28h
zhsun9-x9smf-worker-n6xgw     Running    m1.xlarge   regionOne   nova   126m
zhsun9-x9smf-worker-smf2x     Running    m1.xlarge   regionOne   nova   19h
zhsun9-x9smf-worker-wr7dd     Running    m1.xlarge   regionOne   nova   15h

Machine controller logs:
I0305 03:43:46.044197       1 controller.go:164] Reconciling Machine "zhsun9-x9smf-worker-gc2pb"
I0305 03:43:46.044285       1 controller.go:376] Machine "zhsun9-x9smf-worker-gc2pb" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0305 03:43:46.066928       1 machineservice.go:229] Cloud provider CA cert not provided, using system trust bundle
I0305 03:43:46.737975       1 controller.go:284] Reconciling machine "zhsun9-x9smf-worker-gc2pb" triggers idempotent update
I0305 03:51:04.258108       1 controller.go:164] Reconciling Machine "zhsun9-x9smf-worker-gc2pb"
I0305 03:51:04.258160       1 controller.go:376] Machine "zhsun9-x9smf-worker-gc2pb" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
......
I0305 06:21:41.865370       1 info.go:20] unable to drain node "zhsun9-x9smf-worker-gc2pb"
I0305 06:21:41.865378       1 info.go:20] there are pending nodes to be drained: zhsun9-x9smf-worker-gc2pb
W0305 06:21:41.865390       1 controller.go:364] drain failed for machine "zhsun9-x9smf-worker-gc2pb": [error when waiting for pod "csi-snapshot-controller-operator-6f959d8749-f4vzp" terminating: timed out waiting for the condition, error when waiting for pod "image-registry-6b7dc776db-vwzrq" terminating: timed out waiting for the condition, error when waiting for pod "router-default-6d84d84994-8j86r" terminating: timed out waiting for the condition, error when waiting for pod "csi-snapshot-controller-6cccfcb78b-lvmzm" terminating: timed out waiting for the condition, error when waiting for pod "redhat-marketplace-9c7d9ffd4-4t9hd" terminating: timed out waiting for the condition, error when waiting for pod "redhat-operators-668f9c4f6c-mtr25" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-0" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-k8s-1" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-1" terminating: timed out waiting for the condition, error when waiting for pod "thanos-querier-678f7fbcbf-cl6fp" terminating: timed out waiting for the condition, error when waiting for pod "migrator-6b788867c8-59m8c" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-2" terminating: timed out waiting for the condition, error when waiting for pod "openshift-state-metrics-58dddb6776-6p87v" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-k8s-0" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-adapter-6f949b8994-7mz2s" terminating: timed out waiting for the condition, error when waiting for pod "grafana-7dd7875c69-sddzn" terminating: timed out waiting for the condition, error when waiting for pod "kube-state-metrics-95445d6d7-fm6kf" terminating: timed out waiting for the condition]
E0305 06:21:41.865425       1 controller.go:235] Failed to drain node for machine "zhsun9-x9smf-worker-gc2pb": requeue in: 20s
I0305 06:21:41.865443       1 controller.go:409] Actuator returned requeue-after error: requeue in: 20s

$ oc get po --all-namespaces -o wide | grep Terminating
openshift-csi-snapshot-controller-operator   csi-snapshot-controller-operator-6f959d8749-f4vzp   1/1   Terminating   0    29h     10.131.0.4     zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-csi-snapshot-controller            csi-snapshot-controller-6cccfcb78b-lvmzm            1/1   Terminating   63   28h     10.131.0.16    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-image-registry                     image-registry-6b7dc776db-vwzrq                     1/1   Terminating   0    28h     10.131.0.11    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-ingress                            router-default-6d84d84994-8j86r                     1/1   Terminating   0    28h     192.168.0.24   zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-kube-storage-version-migrator      migrator-6b788867c8-59m8c                           1/1   Terminating   0    28h     10.131.0.12    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-marketplace                        redhat-marketplace-9c7d9ffd4-4t9hd                  1/1   Terminating   0    28h     10.131.0.14    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-marketplace                        redhat-operators-668f9c4f6c-mtr25                   1/1   Terminating   0    28h     10.131.0.9     zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         alertmanager-main-0                                 3/3   Terminating   0    28h     10.131.0.29    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         alertmanager-main-1                                 3/3   Terminating   0    28h     10.131.0.28    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         alertmanager-main-2                                 3/3   Terminating   0    28h     10.131.0.24    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         grafana-7dd7875c69-sddzn                            2/2   Terminating   0    28h     10.131.0.25    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         kube-state-metrics-95445d6d7-fm6kf                  3/3   Terminating   0    28h     10.131.0.6     zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         openshift-state-metrics-58dddb6776-6p87v            3/3   Terminating   0    28h     10.131.0.5     zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         prometheus-adapter-6f949b8994-7mz2s                 1/1   Terminating   0    5h12m   10.131.0.43    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         prometheus-k8s-0                                    7/7   Terminating   1    28h     10.131.0.30    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         prometheus-k8s-1                                    7/7   Terminating   1    19h     10.131.0.38    zhsun9-x9smf-worker-gc2pb   <none>   <none>
openshift-monitoring                         thanos-querier-678f7fbcbf-cl6fp                     4/4   Terminating   0    27h     10.131.0.31    zhsun9-x9smf-worker-gc2pb   <none>   <none>

Expected results:
The node is drained successfully and the machine can be deleted.

Additional info:
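For reference, a minimal diagnostic sketch (pod and node names are taken from the output above; the exact commands are illustrative) to confirm that a pod on the node uses local/emptyDir storage, which is what makes --delete-local-data necessary here, and to list everything still scheduled on the affected node:

$ oc -n openshift-monitoring get pod alertmanager-main-1 -o yaml | grep -A1 'emptyDir'
$ oc get pods --all-namespaces -o wide --field-selector spec.nodeName=zhsun9-x9smf-worker-gc2pb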
This is probably just honouring the PDB, which is expected. You can signal admin intent to force delete by setting the annotation `machine.openshift.io/exclude-node-draining: ""` on the machine. I would expect the log message to mention the PDB or be more specific, though, so I'm leaving this open to dig into the logs.
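For reference, a minimal sketch of how that annotation could be applied with oc (the machine name is taken from the report above; this is illustrative, not a prescribed fix):

$ oc -n openshift-machine-api annotate machine zhsun9-x9smf-worker-gc2pb \
    machine.openshift.io/exclude-node-draining=""

With the annotation set, the machine controller should skip the drain step on its next reconcile and proceed with deleting the instance.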
@alberto This bug is similar to this one: https://bugzilla.redhat.com/show_bug.cgi?id=1733474. It happens because the node I drained has pods with local storage. On AWS, GCP, and Azure there is no such issue; with the same steps, only OSP fails.
@sunzhaohua Would you mind trying this again with a newer version of the installer? My understanding of the scenario is that this has been fixed in cluster-api and should no longer be a problem.
@egarcia I tried again; the machine still gets stuck in the Deleting status for about 2 hours.

$ oc adm drain zhsun78-jw9rt-worker-z7gfp --ignore-daemonsets --delete-local-data
node/zhsun78-jw9rt-worker-z7gfp cordoned
evicting pod "alertmanager-main-2"
evicting pod "router-default-646f7d7959-fhf2z"
evicting pod "alertmanager-main-1"
evicting pod "prometheus-adapter-7ccc9d5c6d-mdc5c"
evicting pod "grafana-74d584c555-5llss"
evicting pod "prometheus-k8s-0"
evicting pod "telemeter-client-7cc8d5d84f-dfhch"
evicting pod "thanos-querier-6cd75c56f5-ws5zz"
pod/alertmanager-main-1 evicted
E0313 14:08:45.060297   10670 request.go:924] Unexpected error when reading response body: read tcp 10.72.13.139:54482->10.0.101.20:6443: read: invalid argument
pod/router-default-646f7d7959-fhf2z evicted
pod/grafana-74d584c555-5llss evicted
pod/prometheus-adapter-7ccc9d5c6d-mdc5c evicted
pod/alertmanager-main-2 evicted
pod/telemeter-client-7cc8d5d84f-dfhch evicted
pod/thanos-querier-6cd75c56f5-ws5zz evicted
pod/prometheus-k8s-0 evicted
node/zhsun78-jw9rt-worker-z7gfp evicted

$ oc get pod -o wide --all-namespaces | grep Terminating
openshift-ingress      router-default-646f7d7959-fhf2z       1/1   Terminating   0   4h40m   192.168.0.22   zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   alertmanager-main-1                   3/3   Terminating   0   4h32m   10.128.2.10    zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   alertmanager-main-2                   3/3   Terminating   0   4h32m   10.128.2.6     zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   grafana-74d584c555-5llss              2/2   Terminating   0   4h32m   10.128.2.9     zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   prometheus-adapter-7ccc9d5c6d-mdc5c   1/1   Terminating   0   4h32m   10.128.2.5     zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   prometheus-k8s-0                      7/7   Terminating   1   4h31m   10.128.2.13    zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   telemeter-client-7cc8d5d84f-dfhch     3/3   Terminating   0   4h31m   10.128.2.11    zhsun78-jw9rt-worker-z7gfp   <none>   <none>
openshift-monitoring   thanos-querier-6cd75c56f5-ws5zz       4/4   Terminating   0   4h31m   10.128.2.12    zhsun78-jw9rt-worker-z7gfp   <none>   <none>

$ oc get node
NAME                         STATUS                        ROLES    AGE     VERSION
zhsun78-jw9rt-master-0       Ready                         master   4h49m   v1.17.1
zhsun78-jw9rt-master-1       Ready                         master   4h49m   v1.17.1
zhsun78-jw9rt-master-2       Ready                         master   4h49m   v1.17.1
zhsun78-jw9rt-worker-9v2hb   Ready                         worker   4h31m   v1.17.1
zhsun78-jw9rt-worker-jqj26   Ready                         worker   4h39m   v1.17.1
zhsun78-jw9rt-worker-z28r2   Ready                         worker   135m    v1.17.1
zhsun78-jw9rt-worker-z7gfp   NotReady,SchedulingDisabled   worker   4h37m   v1.17.1

Machine controller logs:
I0313 04:01:14.193862       1 controller.go:164] Reconciling Machine "zhsun78-jw9rt-worker-z7gfp"
I0313 04:01:14.193979       1 controller.go:376] Machine "zhsun78-jw9rt-worker-z7gfp" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0313 04:01:14.270598       1 machineservice.go:229] Cloud provider CA cert not provided, using system trust bundle
I0313 04:01:14.817282       1 controller.go:284] Reconciling machine "zhsun78-jw9rt-worker-z7gfp" triggers idempotent update
I0313 06:32:27.089538       1 info.go:16] ignoring DaemonSet-managed pods: tuned-4ppm4, dns-default-qp8p2, node-ca-cnjmq, machine-config-daemon-blkdh, node-exporter-9772q, multus-q2bjt, ovs-phcds, sdn-l9fpn; deleting pods with local storage: alertmanager-main-1, alertmanager-main-2, grafana-74d584c555-5llss, prometheus-adapter-7ccc9d5c6d-mdc5c, prometheus-k8s-0
I0313 06:32:27.089573       1 info.go:20] failed to evict pods from node "zhsun78-jw9rt-worker-z7gfp" (pending pods: alertmanager-main-1,alertmanager-main-2,grafana-74d584c555-5llss,prometheus-adapter-7ccc9d5c6d-mdc5c,prometheus-k8s-0,router-default-646f7d7959-fhf2z,telemeter-client-7cc8d5d84f-dfhch,thanos-querier-6cd75c56f5-ws5zz): [error when waiting for pod "prometheus-adapter-7ccc9d5c6d-mdc5c" terminating: timed out waiting for the condition, error when waiting for pod "grafana-74d584c555-5llss" terminating: timed out waiting for the condition, error when waiting for pod "thanos-querier-6cd75c56f5-ws5zz" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-1" terminating: timed out waiting for the condition, error when waiting for pod "router-default-646f7d7959-fhf2z" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-2" terminating: timed out waiting for the condition, error when waiting for pod "telemeter-client-7cc8d5d84f-dfhch" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-k8s-0" terminating: timed out waiting for the condition]
I0313 06:32:27.089606       1 info.go:16] [error when waiting for pod "prometheus-adapter-7ccc9d5c6d-mdc5c" terminating: timed out waiting for the condition, error when waiting for pod "grafana-74d584c555-5llss" terminating: timed out waiting for the condition, error when waiting for pod "thanos-querier-6cd75c56f5-ws5zz" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-1" terminating: timed out waiting for the condition, error when waiting for pod "router-default-646f7d7959-fhf2z" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-2" terminating: timed out waiting for the condition, error when waiting for pod "telemeter-client-7cc8d5d84f-dfhch" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-k8s-0" terminating: timed out waiting for the condition]
I0313 06:32:27.089616       1 info.go:20] unable to drain node "zhsun78-jw9rt-worker-z7gfp"
I0313 06:32:27.089621       1 info.go:20] there are pending nodes to be drained: zhsun78-jw9rt-worker-z7gfp
W0313 06:32:27.089627       1 controller.go:364] drain failed for machine "zhsun78-jw9rt-worker-z7gfp": [error when waiting for pod "prometheus-adapter-7ccc9d5c6d-mdc5c" terminating: timed out waiting for the condition, error when waiting for pod "grafana-74d584c555-5llss" terminating: timed out waiting for the condition, error when waiting for pod "thanos-querier-6cd75c56f5-ws5zz" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-1" terminating: timed out waiting for the condition, error when waiting for pod "router-default-646f7d7959-fhf2z" terminating: timed out waiting for the condition, error when waiting for pod "alertmanager-main-2" terminating: timed out waiting for the condition, error when waiting for pod "telemeter-client-7cc8d5d84f-dfhch" terminating: timed out waiting for the condition, error when waiting for pod "prometheus-k8s-0" terminating: timed out waiting for the condition]
E0313 06:32:27.089646       1 controller.go:235] Failed to drain node for machine "zhsun78-jw9rt-worker-z7gfp": requeue in: 20s
I0313 06:32:27.089663       1 controller.go:409] Actuator returned requeue-after error: requeue in: 20s
I0313 06:32:47.089897       1 controller.go:164] Reconciling Machine "zhsun78-jw9rt-worker-z7gfp"
I0313 06:32:47.090025       1 controller.go:376] Machine "zhsun78-jw9rt-worker-z7gfp" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0313 06:32:47.090067       1 controller.go:227] Reconciling machine "zhsun78-jw9rt-worker-z7gfp" triggers delete
I0313 06:32:47.173906       1 info.go:16] ignoring DaemonSet-managed pods: tuned-4ppm4, dns-default-qp8p2, node-ca-cnjmq, machine-config-daemon-blkdh, node-exporter-9772q, multus-q2bjt, ovs-phcds, sdn-l9fpn; deleting pods with local storage: alertmanager-main-1, alertmanager-main-2, grafana-74d584c555-5llss, prometheus-adapter-7ccc9d5c6d-mdc5c, prometheus-k8s-0
Hmm, I will look a little further into this. What version of the installer did you get these results with? Also, do you have any custom scripts that might be deleting servers?
There are possibly two or more issues here.

First, the actual VM disappeared from OpenStack before the node was drained. This should never happen (unless you are intentionally skipping drain, which isn't the case here).

Second, the OpenStack provider must be using an outdated version of the machine-api. We have a workaround in place that ignores an unresponsive node (kubelet not communicating) and skips any pods that already have a deletion timestamp; see the diagnostic sketch below. That does not appear to be working in this bug, and it is not specific to any one provider.

Third, there may be something causing nodes on OpenStack to go unready/unreachable when draining. This could be a consequence of the first issue (the actual VM is already gone), or it could be that some pod required for the kubelet to communicate (such as a proxy or similar component) is itself being drained and then effectively blocks communication. If OpenStack has such a component, it should run as a DaemonSet.

The first step is to update your cluster-api dependencies in each release branch. In 4.3 and newer, we have moved the cluster-api library code into the machine-api-operator repo. Here's what we're doing for the AWS provider:
4.4: https://github.com/openshift/cluster-api-provider-aws/blob/release-4.4/go.mod#L13
4.3: https://github.com/openshift/cluster-api-provider-aws/blob/release-4.3/go.mod#L13
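A minimal diagnostic sketch for the two conditions that workaround keys on, i.e. the node reporting unreachable and the stuck pods already carrying a deletion timestamp (node and pod names are taken from the retest output above; the commands are illustrative):

# Ready condition should be "Unknown" (or "False") when the kubelet stops communicating
$ oc get node zhsun78-jw9rt-worker-z7gfp \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}{"\n"}'

# A non-empty deletionTimestamp means the eviction already marked the pod for deletion
$ oc -n openshift-monitoring get pod prometheus-k8s-0 \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'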
Verified on 4.5.0-0.nightly-2020-05-04-113741. "oc delete machine" drains the node correctly; the node, the machine, and the OpenStack instance are all deleted.
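For reference, a minimal sketch of that verification flow (the machine name placeholder is illustrative):

$ oc -n openshift-machine-api delete machine <worker-machine-name>
$ oc -n openshift-machine-api get machine    # the machine should drain, pass through Deleting and disappear
$ oc get node                                # the corresponding node should be removed
$ openstack server list                      # the backing OpenStack instance should be gone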
*** Bug 1811400 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409