Bug 1709422 - sdn daemonset should tolerate taints (3.10)
Summary: sdn daemonset should tolerate taints (3.10)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.z
Assignee: Joseph Callen
QA Contact: Siva Reddy
URL:
Whiteboard:
Depends On:
Blocks: 1690200
 
Reported: 2019-05-13 14:57 UTC by Siva Reddy
Modified: 2019-06-27 16:41 UTC (History)
0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-27 16:41:12 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2019:1607 (last updated 2019-06-27 16:41:15 UTC)

Description Siva Reddy 2019-05-13 14:57:21 UTC
Description of problem:
  The sdn pods should tolerate all taints, like the other similar daemonset pods (sync, ovs and logging).
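  For reference, a minimal sketch of what such a blanket toleration looks like in a
  DaemonSet pod template (illustrative only; the exact spec rendered by
  openshift-ansible may differ). A toleration with operator: Exists and no key
  matches every taint:

    spec:
      template:
        spec:
          tolerations:
          # an empty key with operator Exists tolerates all taints
          - operator: Exists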

version:
# oc version
oc v3.10.145
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://xxxx-master-etcd-1:8443
openshift v3.10.139
kubernetes v1.10.0+b81c8f8

openshift-ansible-3.10.145-1.git.0.b76c9df.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. $node is the compute node where the pods are running
2. Note the sync, ovs and sdn pods running on $node
   # oc get pods -n openshift-node -o wide | grep $node ;  oc get pods -n openshift-sdn -o wide | grep $node ;
    sync-vkxwk   1/1       Running   
    ovs-gjkts   1/1       Running   
    sdn-c7c7f   1/1       Running   
3. Note that one pod each for sync, ovs and sdn is running on the node (3 pods in total)
4. Taint the node
    # oc adm taint node $node NodeWithImpairedVolumes=true:NoExecute
    # oc describe node $node | grep -i taint
      Taints:             NodeWithImpairedVolumes=true:NoExecute
5. Note the sync, ovs and sdn pods on the node.
# oc get pods -n openshift-node -o wide | grep $node ;  oc get pods -n openshift-sdn -o wide | grep $node ;
sync-mtjx7   1/1       Running   0          18m       10.240.0.51
ovs-sb8qz   1/1       Running   0          18m       10.240.0.51 

Actual Results:
   The ovs and sync pods tolerate the taint, but the sdn pod does not and is terminated.
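   One way to confirm the difference is to read the tolerations straight off the
   DaemonSets. A sketch, assuming the DaemonSets carry the same names as the pod
   prefixes above and that sync lives in openshift-node while ovs and sdn live in
   openshift-sdn (adjust names and namespaces to match the cluster):

   # oc get ds sync -n openshift-node -o jsonpath='{.spec.template.spec.tolerations}'
   # oc get ds ovs -n openshift-sdn -o jsonpath='{.spec.template.spec.tolerations}'
   # oc get ds sdn -n openshift-sdn -o jsonpath='{.spec.template.spec.tolerations}'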

Expected results:
   The sdn pod should tolerate the taint, like the sync and ovs pods do.

Additional info:
   Note that this also affects the logging pods (https://bugzilla.redhat.com/show_bug.cgi?id=1690200)

Comment 3 Joseph Callen 2019-05-16 16:30:51 UTC
PR: https://github.com/openshift/openshift-ansible/pull/11616

Comment 5 Siva Reddy 2019-06-17 15:12:09 UTC
   The sdn pods are no longer affected by taints, so moving this to VERIFIED.

# oc version
oc v3.10.149
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.10.149
kubernetes v1.10.0+b81c8f8

openshift-ansible-3.10.149-1.git.0.eb0262c.el7.noarch.rpm

Steps to verify:
1. $node is the compute node where the pods are running
2. Note the sync, ovs and sdn pods running on $node
   # oc get pods -n openshift-node -o wide | grep $node ;  oc get pods -n openshift-sdn -o wide | grep $node ;
3. Note that one pod each for sync, ovs and sdn is running on the node (3 pods in total)
4. Taint the node
    # oc adm taint node $node NodeWithImpairedVolumes=true:NoExecute
    # oc describe node $node | grep -i taint
      Taints:             NodeWithImpairedVolumes=true:NoExecute
5. Note the sync, ovs and sdn pods on the node.
# oc get pods -n openshift-node -o wide | grep $node ;  oc get pods -n openshift-sdn -o wide | grep $node ;

      Even after applying the taint, the sdn pod is still running, unlike before the fix, when it was terminated.

7. Check the fluentd pods
  # oc project
  Using project "openshift-logging" on server 
  # oc get pods 
8. Recreate the fluentd pods by toggling the logging-infra-fluentd label from true to false and back to true
  # oc label node --all --overwrite logging-infra-fluentd=false
    -- note that all fluentd pods get terminated
  # oc label node --all --overwrite logging-infra-fluentd=true
    -- note that all fluentd pods come back up



Also verified that the logging pods are not affected by taints.
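
To spot-check the same thing on the logging side, the fluentd DaemonSet's tolerations
can be read directly. A sketch, assuming the DaemonSet is named logging-fluentd in the
openshift-logging project (both names are assumptions here; adjust as needed):

   # oc get ds logging-fluentd -n openshift-logging -o jsonpath='{.spec.template.spec.tolerations}'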

Comment 7 errata-xmlrpc 2019-06-27 16:41:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1607

