Bug 1459745 - When DefaultTolerationSeconds is enabled, DaemonSet pods shouldn't have tolerationSeconds
Summary: When DefaultTolerationSeconds is enabled, DaemonSet pods shouldn't have toleration...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.0.2
Assignee: Ryan Phillips
QA Contact: Xiaoli Tian
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-06-08 05:02 UTC by DeShuai Ma
Modified: 2019-11-21 18:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when the DefaultTolerationSeconds admission plugin was enabled, DaemonSet pods were created with the default NoExecute tolerations and a tolerationSeconds value of 300, which caused them to be evicted after 300 seconds in case of node problems. This fix ensures that DaemonSet pods are created with NoExecute tolerations that have no tolerationSeconds (that is, they tolerate the taints indefinitely), so they are not evicted in case of node problems.
Clone Of:
Environment:
Last Closed: 2019-11-21 18:38:34 UTC
Target Upstream Version:
Embargoed:


Description DeShuai Ma 2017-06-08 05:02:04 UTC
Description of problem:
When DefaultTolerationSeconds is enabled, DaemonSet pods should be created with NoExecute tolerations for node.alpha.kubernetes.io/unreachable and node.alpha.kubernetes.io/notReady with no tolerationSeconds. This ensures that DaemonSet pods are never evicted due to these problems, which matches the behavior when this feature is disabled.
Currently, however, the DaemonSet pods are created with tolerationSeconds set on both tolerations.
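
The interaction described above can be modelled roughly as follows. This is a simplified sketch, not the actual admission plugin or DaemonSet controller code: the function names apply_default_tolerations and daemonset_tolerations are hypothetical, and the assumption is that the plugin only adds a default toleration for a taint key the pod does not already tolerate. The 300-second value is the one seen in this report.

```python
# Simplified model of toleration defaulting; all names here are hypothetical.
NOT_READY = "node.alpha.kubernetes.io/notReady"
UNREACHABLE = "node.alpha.kubernetes.io/unreachable"
DEFAULT_SECONDS = 300  # value observed in the `oc describe` output of this report


def apply_default_tolerations(tolerations):
    """Return a copy of the list with a 300s NoExecute toleration added
    for each watched key the pod does not already tolerate (assumed behavior)."""
    result = [dict(t) for t in tolerations]
    for key in (NOT_READY, UNREACHABLE):
        already_tolerated = any(
            t.get("key") == key and t.get("effect") == "NoExecute"
            for t in result
        )
        if not already_tolerated:
            result.append({
                "key": key,
                "operator": "Exists",
                "effect": "NoExecute",
                "tolerationSeconds": DEFAULT_SECONDS,
            })
    return result


def daemonset_tolerations():
    """Tolerations a DaemonSet pod is expected to carry: no tolerationSeconds."""
    return [
        {"key": key, "operator": "Exists", "effect": "NoExecute"}
        for key in (NOT_READY, UNREACHABLE)
    ]
```

Under this model, a pod that reaches the plugin without the two tolerations gets the 300-second defaults (the behavior reported here), while a pod that already carries the unbounded tolerations is left unchanged.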

Version-Release number of selected component (if applicable):
openshift v3.6.96
kubernetes v1.6.1+5115d708d7
etcd 3.1.0


How reproducible:
Always

Steps to Reproduce:
1. Enable DefaultTolerationSeconds
admissionConfig:
  pluginConfig:
    DefaultTolerationSeconds:
      configuration:
        kind: DefaultAdmissionConfig
        apiVersion: v1
        disable: false
2. Create a daemonset
[root@qe-dma36-master-1 ~]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/daemon/daemonset.yaml -n dma
daemonset "hello-daemonset" created
[root@qe-dma36-master-1 ~]# oc get ds -n dma
NAME              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
hello-daemonset   2         2         0         2            0           <none>          4s

3. Check the daemonset pods' tolerations
[root@qe-dma36-master-1 ~]# oc describe po hello-daemonset-3scfh -n dma | grep -i NoExecute
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s

Actual results:
3. "tolerationSeconds": 300

Expected results:
3. no "tolerationSeconds"

Additional info:
In upstream:
[root@dhcp-140-98 ~]# kubectl describe po hello-daemonset-mz1z | grep NoExecute
Tolerations:    node.alpha.kubernetes.io/notReady=:Exists:NoExecute
        node.alpha.kubernetes.io/unreachable=:Exists:NoExecute

Comment 1 Avesh Agarwal 2017-06-14 19:37:04 UTC
Sent PR to origin: https://github.com/openshift/origin/pull/14653

Comment 2 DeShuai Ma 2017-06-21 05:14:06 UTC
Tested on openshift v3.6.121; this bug is fixed.

# oc describe po hello-daemonset-x5q5g -n dma | grep -i NoExecute
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute

        "tolerations": [
            {
                "effect": "NoExecute",
                "key": "node.alpha.kubernetes.io/notReady",
                "operator": "Exists"
            },
            {
                "effect": "NoExecute",
                "key": "node.alpha.kubernetes.io/unreachable",
                "operator": "Exists"
            }
        ],
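
The same check could be scripted instead of grepping the describe output. The following is a minimal sketch, assuming the pod is exported as JSON (for example with oc get po <pod> -o json) and that the tolerations sit under spec.tolerations; the function name find_bounded_tolerations is hypothetical.

```python
import json

# Taint keys from this report that DaemonSet pods should tolerate indefinitely.
WATCHED_KEYS = {
    "node.alpha.kubernetes.io/notReady",
    "node.alpha.kubernetes.io/unreachable",
}


def find_bounded_tolerations(pod_json_text):
    """Return the watched NoExecute tolerations that carry tolerationSeconds.

    An empty list means the pod matches the expected result of this bug.
    """
    pod = json.loads(pod_json_text)
    tolerations = pod.get("spec", {}).get("tolerations") or []
    return [
        t for t in tolerations
        if t.get("key") in WATCHED_KEYS
        and t.get("effect") == "NoExecute"
        and "tolerationSeconds" in t
    ]
```

For the pod shown in this comment the function would return an empty list; for the pod in the original description it would return both tolerations.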

Comment 3 DeShuai Ma 2017-06-21 05:15:22 UTC
Could you help move the bug to ON_QA status? I'll verify it.

Comment 4 Avesh Agarwal 2018-02-09 17:59:07 UTC
Not sure what the target release should be here. Is just closing it enough?

