Bug 1690200

Summary: logging fluentd daemonset should tolerate all taints (3.10)
Product: OpenShift Container Platform
Reporter: Siva Reddy <schituku>
Component: Installer
Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA
QA Contact: Siva Reddy <schituku>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, ewolinet, gpei, jdesousa, jialiu, jokerman, jupierce, mmccomas, rmeggins, schituku, shiywang, smossber, vrutkovs, wmeng
Target Milestone: ---
Target Release: 3.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1685952
Environment:
Last Closed: 2019-06-27 16:41:12 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1635462, 1685952, 1685970, 1709422
Bug Blocks:
Attachments (description / flags):
  The fluentd daemonset file used to create the ds (flags: none)
  The yaml of the pod that is stuck (flags: none)

Comment 1 Siva Reddy 2019-03-19 17:39:18 UTC
Created attachment 1545757 [details]
The fluentd daemonset file used to create the ds

# oc project
Using project "openshift-logging" on server 
# oc get ds
NAME              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                AGE
logging-fluentd   3         3         3         3            3           logging-infra-fluentd=true   2h
# oc get ds logging-fluentd > logging-fluentd.yaml

Comment 12 Siva Reddy 2019-04-22 14:36:47 UTC
Created attachment 1557175 [details]
The yaml of the pod that is stuck

Comment 16 Vadim Rutkovsky 2019-04-25 08:32:41 UTC
3.10 PR - https://github.com/openshift/openshift-ansible/pull/11552

Comment 17 Vadim Rutkovsky 2019-05-02 09:38:20 UTC
Fix is available in openshift-ansible-3.10.143-1
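For reference, a DaemonSet tolerates every taint when its pod template carries a key-less toleration with `operator: Exists`. The manifest below is a trimmed, hypothetical sketch illustrating that pattern, not the exact change from the linked PR; the image name is a placeholder:

```yaml
# Sketch of a catch-all toleration on the fluentd DaemonSet pod template.
# A toleration with no "key", no "effect", and operator "Exists" matches
# every taint, including NoSchedule and NoExecute effects.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-fluentd
  namespace: openshift-logging
spec:
  selector:
    matchLabels:
      component: fluentd
  template:
    metadata:
      labels:
        component: fluentd
    spec:
      tolerations:
      - operator: Exists          # tolerate all taints
      containers:
      - name: fluentd
        image: registry.example.com/logging-fluentd:latest  # placeholder image
```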

Comment 21 Siva Reddy 2019-05-13 15:00:48 UTC
The taint does not affect fluentd pods that are already running, but new fluentd pods are not coming up because the taints also affect the sdn pods. Marking this bug as verified; however, for the fluentd pods to come up, the sdn pods also need to tolerate the taints, which is tracked in a separate bug.

version:
# oc version
oc v3.10.145
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://xxxx-master-etcd-1:8443
openshift v3.10.139
kubernetes v1.10.0+b81c8f8

openshift-ansible-3.10.145-1.git.0.b76c9df.el7.noarch.rpm

Steps to verify the bug:

1. Taint the node:
   # oc adm taint node $node NodeWithImpairedVolumes=true:NoExecute
   # oc describe node $node | grep -i taint
2. Check the fluentd pods:
   # oc project
   Using project "openshift-logging" on server 
   # oc get pods
   -- Note that there is one fluentd pod running per node; the taint does not affect the existing pods.
3. Recreate the pods by toggling the label from true to false and back to true:
   # oc label node --all --overwrite logging-infra-fluentd=false
   -- Note that all fluentd pods are terminated.
   # oc label node --all --overwrite logging-infra-fluentd=true
   -- Note that the fluentd pods come back up on all nodes except the tainted node, where the pod is stuck in ContainerCreating.
   # oc get pods -o wide | grep $node
logging-fluentd-52dwf                      0/1       ContainerCreating   0          14m       <none>

    As mentioned in comment 21, even though the fluentd pod on the tainted node is not yet running, it is verified that the taints do not prevent the fluentd pods from being created (the remaining blocker is the sdn pods, tracked in a separate bug); hence the bug is marked verified.
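A quick way to confirm the fix on an exported manifest is to look for a key-less `operator: Exists` toleration. The snippet below is a sketch that checks a local file; the embedded manifest is a trimmed, hypothetical example, and on a live cluster the file would instead come from `oc get ds logging-fluentd -o yaml`:

```shell
# Write a trimmed, hypothetical DaemonSet manifest (illustration only; on a
# real cluster: oc get ds logging-fluentd -o yaml > /tmp/logging-fluentd.yaml).
cat > /tmp/logging-fluentd.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-fluentd
spec:
  template:
    spec:
      tolerations:
      - operator: Exists
EOF

# A key-less toleration with operator Exists matches every taint, so the
# DaemonSet pods tolerate all taints.
if grep -q 'operator: Exists' /tmp/logging-fluentd.yaml; then
  echo "logging-fluentd tolerates all taints"
else
  echo "no catch-all toleration found"
fi
```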

Comment 23 errata-xmlrpc 2019-06-27 16:41:12 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1607