Bug 1835739 - Support 'oc adm node drain' without --ignore-daemonsets=true --delete-local-data=true flags
Summary: Support 'oc adm node drain' without --ignore-daemonsets=true --delete-local-d...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Maciej Szulik
QA Contact: zhou ying
URL:
Whiteboard:
Depends On: 1835628
Blocks:
Reported: 2020-05-14 12:14 UTC by Maciej Szulik
Modified: 2020-06-17 22:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A wrong condition in the code caused the logic to ignore deleted pods. Consequence: oc adm node drain did not properly account for daemon sets and local data attached to pods when draining a node. Fix: The logic was corrected so that all pods are accounted for when draining a node. Result: When draining a node that runs a DaemonSet-managed pod, or a pod with local volume data attached, the drain command fails and points to the flags that override these two checks.
Clone Of: 1835628
Environment:
Last Closed: 2020-06-17 22:26:36 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift oc pull 420 | 0 | None | closed | [release-4.4] Bug 1835739: Fix oc drain to ignore daemonsets and others | 2020-10-21 05:46:16 UTC
Red Hat Product Errata | RHBA-2020:2445 | 0 | None | None | None | 2020-06-17 22:26:53 UTC

Description Maciej Szulik 2020-05-14 12:14:28 UTC
+++ This bug was initially created as a clone of Bug #1835628 +++

Description of problem:

Running 'oc adm node drain' without the --ignore-daemonsets and --delete-local-data flags results in an error:
error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): openshift-cluster-node-tuning-operator/tuned-2kls9, openshift-dns/dns-default-c675s, openshift-image-registry/node-ca-d57th, openshift-machine-config-operator/machine-config-daemon-c5kx4, openshift-monitoring/node-exporter-jf5cr, openshift-multus/multus-cxqwz, openshift-ovn-kubernetes/ovnkube-node-zbzpv, openshift-ovn-kubernetes/ovs-node-kbjgf
cannot delete Pods with local storage (use --delete-local-data to override): openshift-image-registry/image-registry-5bcb67497d-gvq6b

Version-Release number of selected component (if applicable):
Client Version: 4.5
Server Version: 4.5

How reproducible:
Always with 4.5 client version

Steps to Reproduce:
Run 'oc adm drain NODE'

Actual results:
Error reported, drain aborted

Expected results:
Node drained, application pods evicted

Additional info:
1. With client version 4.4, the command runs without the flags and reports no error.
2. A similar problem was reported for CNV (see linked bug). Our setup is bare metal without CNV.

--- Additional comment from Maciej Szulik on 2020-05-14 13:40:00 CEST ---



--- Additional comment from Maciej Szulik on 2020-05-14 14:13:21 CEST ---

This is not a bug. It was reported upstream in https://github.com/kubernetes/kubectl/issues/803,
and only kubectl 1.17 and the accompanying oc 4.4 are affected. If you try an older version of oc, such as 4.3 or 4.2,
you'll get a similar error. I'm closing this as not a bug and I'll try to cherry-pick the fix into 4.4.

Comment 1 Maciej Szulik 2020-05-14 12:15:24 UTC
Actually, it's the opposite: we need to pick the fix from https://github.com/kubernetes/kubernetes/pull/87361 so that
oc adm drain warns about daemonsets and local storage.
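
For readers unfamiliar with the intended behavior, here is a minimal sketch of the check in Python. It is a simplified model, not the actual kubectl or oc code: the Pod class, its fields, and the plan_drain function are hypothetical names invented for illustration. Only the two error messages and the two flag names are taken from this report.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    """Hypothetical, simplified stand-in for a pod on the node being drained."""
    name: str
    owner_kind: str = "ReplicaSet"   # e.g. "DaemonSet" or "ReplicaSet"
    has_local_storage: bool = False  # assumed flag: pod has local volume data


def plan_drain(pods, ignore_daemonsets=False, delete_local_data=False):
    """Return (pods_to_evict, errors); a non-empty errors list means abort."""
    daemonset_pods, local_storage_pods, to_evict = [], [], []
    for pod in pods:
        if pod.owner_kind == "DaemonSet":
            # DaemonSet pods are never evicted in this model; without the
            # flag they block the drain, with the flag they are skipped.
            if not ignore_daemonsets:
                daemonset_pods.append(pod.name)
            continue
        if pod.has_local_storage and not delete_local_data:
            local_storage_pods.append(pod.name)
            continue
        to_evict.append(pod.name)

    errors = []
    if daemonset_pods:
        errors.append(
            "cannot delete DaemonSet-managed Pods "
            "(use --ignore-daemonsets to ignore): " + ", ".join(daemonset_pods)
        )
    if local_storage_pods:
        errors.append(
            "cannot delete Pods with local storage "
            "(use --delete-local-data to override): " + ", ".join(local_storage_pods)
        )
    # Any error aborts the whole drain, so nothing is evicted in that case.
    return ([] if errors else to_evict), errors
```

In this model, calling plan_drain without either flag on a node that hosts a DaemonSet pod returns an error and evicts nothing, which matches the behavior the cherry-picked fix is meant to restore.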

Comment 2 Maciej Szulik 2020-05-20 09:39:39 UTC
PR waiting in queue https://github.com/openshift/oc/pull/420

Comment 5 zhou ying 2020-06-01 06:39:19 UTC
Confirmed with the latest oc: draining a node that has DaemonSet-managed pods or pods with volumes attached now gives the warning and aborts the drain.

[root@dhcp-140-138 ~]# oc version -o yaml 
clientVersion:
  buildDate: "2020-05-29T06:43:55Z"
  compiler: gc
  gitCommit: 1960dd73b123241730531db09489d951228ad853
  gitTreeState: clean
  gitVersion: 4.4.0-202005290638-1960dd7
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.4.0-0.nightly-2020-05-30-022631
serverVersion:
  buildDate: "2020-05-30T01:52:40Z"
  compiler: gc
  gitCommit: f5fb168
  gitTreeState: clean
  gitVersion: v1.17.1+f5fb168
  goVersion: go1.13.4
  major: "1"
  minor: 17+
  platform: linux/amd64

[root@dhcp-140-138 ~]# oc adm drain node/ip-10-0-187-6.us-east-2.compute.internal
node/ip-10-0-187-6.us-east-2.compute.internal cordoned
error: unable to drain node "ip-10-0-187-6.us-east-2.compute.internal", aborting command...

There are pending nodes to be drained:
 ip-10-0-187-6.xxx-2.compute.internal
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): openshift-cluster-node-tuning-operator/tuned-m4rdr, openshift-dns/dns-default-6lhrh, openshift-image-registry/node-ca-6lkn2, openshift-machine-config-operator/machine-config-daemon-hzg4r, openshift-monitoring/node-exporter-n4k45, openshift-multus/multus-647pc, openshift-sdn/ovs-mzkjd, openshift-sdn/sdn-tnd4z
cannot delete Pods with local storage (use --delete-local-data to override): openshift-monitoring/alertmanager-main-2, openshift-monitoring/kube-state-metrics-5595b5958b-bzcpj
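
When scripting around a drain, it can help to know which pods blocked it. The following Python sketch splits the two error lines above into per-flag pod lists. It assumes the exact message format shown in this report; parse_drain_errors is a hypothetical helper, not part of oc.

```python
import re

# Assumes the message format shown in this report:
#   cannot delete <reason> (use --<flag> to <verb>): <pod>, <pod>, ...
_BLOCKED_LINE = re.compile(
    r"^cannot delete (?P<reason>.+?) \(use (?P<flag>--[\w-]+) to \w+\): (?P<pods>.+)$"
)


def parse_drain_errors(output):
    """Map each suggested flag to the list of pods that blocked the drain."""
    blocked = {}
    for line in output.splitlines():
        match = _BLOCKED_LINE.match(line.strip())
        if match:
            pods = [p.strip() for p in match.group("pods").split(",")]
            blocked[match.group("flag")] = pods
    return blocked
```

Lines that do not match the pattern, such as the "cordoned" and "aborting command" lines, are ignored.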

Comment 9 errata-xmlrpc 2020-06-17 22:26:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2445

