Bug 1459745 - When DefaultTolerationSeconds is enabled, DaemonSet pods shouldn't have tolerationSeconds
Status: VERIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Avesh Agarwal
QA Contact: DeShuai Ma
Reported: 2017-06-08 01:02 EDT by DeShuai Ma
Modified: 2017-10-18 10:29 EDT (History)
Doc Type: Bug Fix
Doc Text:
Previously, when the DefaultTolerationSeconds admission plugin was enabled, DaemonSet pods were created with the default NoExecute tolerations and a tolerationSeconds value of 300, so they could be evicted when a node had problems. This fix ensures that DaemonSet pods are created with NoExecute tolerations that have no tolerationSeconds (that is, they tolerate the taint indefinitely), so they are not evicted when a node has problems.
Type: Bug

Description DeShuai Ma 2017-06-08 01:02:04 EDT
Description of problem:
When DefaultTolerationSeconds is enabled, DaemonSet pods should be created with NoExecute tolerations for node.alpha.kubernetes.io/unreachable and node.alpha.kubernetes.io/notReady with no tolerationSeconds. This ensures that DaemonSet pods are never evicted due to these problems, which matches the behavior when this feature is disabled.
However, the DaemonSet pods are currently created with tolerationSeconds set.
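
The expected behavior can be illustrated with a minimal Python sketch. This is a hypothetical model, not the actual admission plugin code: it assumes the plugin adds a timed toleration only for taint keys the pod does not already tolerate, and the function names and the 300-second default parameter are illustrative.

```python
# Hypothetical model of toleration defaulting; not the real plugin code.
NOT_READY = "node.alpha.kubernetes.io/notReady"
UNREACHABLE = "node.alpha.kubernetes.io/unreachable"


def tolerates(tolerations, key):
    """Return True if a toleration already covers `key` for NoExecute."""
    for t in tolerations:
        # An empty or missing effect is treated as matching all effects.
        if t.get("key") == key and t.get("effect") in ("NoExecute", "", None):
            return True
    return False


def apply_default_tolerations(tolerations, default_seconds=300):
    """Add a timed NoExecute toleration for each key not yet tolerated."""
    result = [dict(t) for t in tolerations]
    for key in (NOT_READY, UNREACHABLE):
        if not tolerates(result, key):
            result.append({
                "key": key,
                "operator": "Exists",
                "effect": "NoExecute",
                "tolerationSeconds": default_seconds,
            })
    return result


# A DaemonSet pod that already carries untimed tolerations stays unchanged.
daemonset_pod = [
    {"key": NOT_READY, "operator": "Exists", "effect": "NoExecute"},
    {"key": UNREACHABLE, "operator": "Exists", "effect": "NoExecute"},
]
# An ordinary pod with no tolerations receives the timed defaults.
plain_pod = []
```

Under this model, a pod that arrives with untimed tolerations for both keys passes through untouched, while a pod without them gets the timed defaults.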

Version-Release number of selected component (if applicable):
openshift v3.6.96
kubernetes v1.6.1+5115d708d7
etcd 3.1.0


How reproducible:
Always

Steps to Reproduce:
1. Enable DefaultTolerationSeconds:
admissionConfig:
  pluginConfig:
    DefaultTolerationSeconds:
      configuration:
        kind: DefaultAdmissionConfig
        apiVersion: v1
        disable: false
2. Create a DaemonSet:
[root@qe-dma36-master-1 ~]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/daemon/daemonset.yaml -n dma
daemonset "hello-daemonset" created
[root@qe-dma36-master-1 ~]# oc get ds -n dma
NAME              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
hello-daemonset   2         2         0         2            0           <none>          4s

3. Check the DaemonSet pod's tolerations:
[root@qe-dma36-master-1 ~]# oc describe po hello-daemonset-3scfh -n dma | grep -i NoExecute
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s

Actual results:
3. "tolerationSeconds": 300

Expected results:
3. no "tolerationSeconds"
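
The check in step 3 can also be done programmatically. The following is a small sketch, assuming the pod is available as JSON (for example from `oc get po <pod> -n dma -o json`); the helper name `timed_noexecute_tolerations` and the sample document are illustrative and contain only the relevant fields.

```python
import json


def timed_noexecute_tolerations(pod):
    """Return the NoExecute tolerations in a pod dict that set tolerationSeconds."""
    tolerations = pod.get("spec", {}).get("tolerations") or []
    return [
        t for t in tolerations
        if t.get("effect") == "NoExecute" and "tolerationSeconds" in t
    ]


# Sample shaped like the actual (buggy) result reported above.
sample = json.loads("""
{
  "spec": {
    "tolerations": [
      {"effect": "NoExecute",
       "key": "node.alpha.kubernetes.io/notReady",
       "operator": "Exists",
       "tolerationSeconds": 300},
      {"effect": "NoExecute",
       "key": "node.alpha.kubernetes.io/unreachable",
       "operator": "Exists",
       "tolerationSeconds": 300}
    ]
  }
}
""")

offending = timed_noexecute_tolerations(sample)
```

An empty list means the pod matches the expected result; any entries returned are the tolerations that still carry tolerationSeconds.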

Additional info:
In upstream Kubernetes, the tolerations have no tolerationSeconds:
[root@dhcp-140-98 ~]# kubectl describe po hello-daemonset-mz1z | grep NoExecute
Tolerations:    node.alpha.kubernetes.io/notReady=:Exists:NoExecute
        node.alpha.kubernetes.io/unreachable=:Exists:NoExecute
Comment 1 Avesh Agarwal 2017-06-14 15:37:04 EDT
Sent a PR to origin: https://github.com/openshift/origin/pull/14653
Comment 2 DeShuai Ma 2017-06-21 01:14:06 EDT
Tested on openshift v3.6.121; this bug is fixed.

# oc describe po hello-daemonset-x5q5g -n dma | grep -i NoExecute
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute

        "tolerations": [
            {
                "effect": "NoExecute",
                "key": "node.alpha.kubernetes.io/notReady",
                "operator": "Exists"
            },
            {
                "effect": "NoExecute",
                "key": "node.alpha.kubernetes.io/unreachable",
                "operator": "Exists"
            }
        ],
Comment 3 DeShuai Ma 2017-06-21 01:15:22 EDT
Could you help move the bug to ON_QA status? I'll verify it.
