Bug 1832999 - [OSP] Failed to drain node for machine
Summary: [OSP] Failed to drain node for machine
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.4.z
Assignee: egarcia
QA Contact: sunzhaohua
Depends On: 1810400 1848755
Reported: 2020-05-07 15:45 UTC by egarcia
Modified: 2020-07-14 01:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1810400
Last Closed: 2020-07-14 01:43:52 UTC
Target Upstream Version:

System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-api-provider-openstack pull 98 | None | closed | [release-4.4] Bug 1832999: Migrate to MAO | 2020-07-06 01:55:08 UTC
Red Hat Product Errata | RHBA-2020:2871 | None | None | None | 2020-07-14 01:44:15 UTC

Comment 6 sunzhaohua 2020-07-06 05:34:54 UTC
Clusterversion: 4.4.0-0.nightly-2020-07-04-120349
1. Stop kubelet on a worker node that runs a pod with local storage/data, such as alertmanager-main-1 in namespace openshift-monitoring
2. Drain the node
$ oc adm drain zhsunosp76-xfctt-worker-pgvsn --ignore-daemonsets --delete-local-data
node/zhsunosp76-xfctt-worker-pgvsn cordoned
WARNING: ignoring DaemonSet-managed Pods: openshift-cluster-node-tuning-operator/tuned-zjplm, openshift-dns/dns-default-jhwkz, openshift-image-registry/node-ca-xn64c, openshift-machine-config-operator/machine-config-daemon-r4r4j, openshift-monitoring/node-exporter-5kckd, openshift-multus/multus-wm6bw, openshift-sdn/ovs-5qbsd, openshift-sdn/sdn-thnz9
evicting pod openshift-monitoring/alertmanager-main-1
evicting pod openshift-ingress/router-default-78c798c4c4-4q852
evicting pod openshift-monitoring/alertmanager-main-2
evicting pod openshift-monitoring/prometheus-k8s-0
evicting pod openshift-monitoring/thanos-querier-5b9dbdc464-mkc6r
pod/router-default-78c798c4c4-4q852 evicted
pod/alertmanager-main-2 evicted
pod/prometheus-k8s-0 evicted
pod/thanos-querier-5b9dbdc464-mkc6r evicted
pod/alertmanager-main-1 evicted
node/zhsunosp76-xfctt-worker-pgvsn evicted

3. Delete the machine associated with that node
$ oc delete machine zhsunosp76-xfctt-worker-pgvsn
machine.machine.openshift.io "zhsunosp76-xfctt-worker-pgvsn" deleted
4. Check the logs

I0706 04:45:15.911697       1 controller.go:247] zhsunosp76-xfctt-worker-pgvsn: deleting node "zhsunosp76-xfctt-worker-pgvsn" for machine
I0706 04:45:15.912715       1 reflector.go:175] Starting reflector *v1.Node (10h32m54.332782437s) from sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:224
I0706 04:45:15.912806       1 reflector.go:211] Listing and watching *v1.Node from sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:224
I0706 04:45:16.053459       1 controller.go:261] zhsunosp76-xfctt-worker-pgvsn: machine deletion successful
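
The log check in step 4 can be scripted. The following is a minimal sketch, not part of the original verification: `machine_deleted` is a hypothetical helper name, and it only assumes the two message formats seen in the log lines above ("deleting node" and "machine deletion successful"). It reads log text on stdin and succeeds only if both lines are present for the given machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): reads controller log text on stdin
# and returns 0 only if the log shows both the node deletion and the final
# "machine deletion successful" message for the machine named in $1.
machine_deleted() {
    machine="$1"
    log="$(cat)"   # buffer stdin so it can be searched twice
    # -F matches fixed strings; the leading "] " anchors the match to the
    # start of the message so a longer machine name does not match by suffix.
    printf '%s\n' "$log" | grep -Fq "] ${machine}: deleting node" &&
    printf '%s\n' "$log" | grep -Fq "] ${machine}: machine deletion successful"
}

# Example use. The namespace, deployment, and container names here are
# assumptions about a typical 4.4 cluster, not taken from this report:
#   oc logs -n openshift-machine-api deployment/machine-api-controllers \
#       -c machine-controller | machine_deleted zhsunosp76-xfctt-worker-pgvsn
```

A non-zero exit status means at least one of the two messages was missing, which would suggest the deletion is still blocked (for example, on a failed drain).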

Comment 8 errata-xmlrpc 2020-07-14 01:43:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

