Bug 1459745
| Summary: | When DefaultTolerationSeconds is enabled, daemonset pods shouldn't have tolerationSeconds | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> |
| Component: | Node | Assignee: | Ryan Phillips <rphillips> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Xiaoli Tian <xtian> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.6.0 | CC: | aos-bugs, decarr, jokerman, mmccomas |
| Target Milestone: | --- | ||
| Target Release: | 3.0.2 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Previously, when the DefaultTolerationSeconds admission plugin was enabled, DaemonSet pods were created with default NoExecute tolerations that had tolerationSeconds set to 300, which caused them to be evicted in case of node problems. This fix ensures that DaemonSet pods are created with no tolerationSeconds (an infinite toleration), so they are not evicted in case of node problems. | ||
| Story Points: | --- | ||
| Clone Of: | Environment: | ||
| Last Closed: | 2019-11-21 18:38:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Sent a PR to origin: https://github.com/openshift/origin/pull/14653

Tested on openshift v3.6.121. This bug is fixed.
# oc describe po hello-daemonset-x5q5g -n dma | grep -i NoExecute
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute
"tolerations": [
{
"effect": "NoExecute",
"key": "node.alpha.kubernetes.io/notReady",
"operator": "Exists"
},
{
"effect": "NoExecute",
"key": "node.alpha.kubernetes.io/unreachable",
"operator": "Exists"
}
],
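The manual check above could also be scripted. The following is a minimal sketch, assuming the pod JSON comes from something like `oc get pod <name> -o json` and that the tolerations sit under `spec.tolerations`; the helper name `timed_noexecute_tolerations` is hypothetical and not part of any OpenShift tooling.

```python
import json

# Taint keys taken from the report; the helper below is a hypothetical name.
NODE_PROBLEM_KEYS = {
    "node.alpha.kubernetes.io/notReady",
    "node.alpha.kubernetes.io/unreachable",
}


def timed_noexecute_tolerations(pod):
    """Return node-problem NoExecute tolerations that carry tolerationSeconds."""
    tolerations = pod.get("spec", {}).get("tolerations") or []
    return [
        t for t in tolerations
        if t.get("effect") == "NoExecute"
        and t.get("key") in NODE_PROBLEM_KEYS
        and "tolerationSeconds" in t
    ]


# Pod JSON shaped like the verified output (assumed to live under spec.tolerations).
fixed_pod = json.loads("""
{"spec": {"tolerations": [
  {"effect": "NoExecute", "key": "node.alpha.kubernetes.io/notReady", "operator": "Exists"},
  {"effect": "NoExecute", "key": "node.alpha.kubernetes.io/unreachable", "operator": "Exists"}
]}}
""")

# An empty result means the pod tolerates these node problems indefinitely.
print(timed_noexecute_tolerations(fixed_pod))
```

A non-empty result for a DaemonSet pod would indicate the behavior reported in this bug.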
Could you help move the bug to ON_QA status? I'll verify it.

Not sure what the target release should be here. Is just closing it enough?
Description of problem:
When DefaultTolerationSeconds is enabled, DaemonSet pods should be created with NoExecute tolerations for node.alpha.kubernetes.io/unreachable and node.alpha.kubernetes.io/notReady with no tolerationSeconds. This ensures that DaemonSet pods are never evicted due to these problems, which matches the behavior when this feature is disabled. Currently, however, these tolerations have tolerationSeconds set.

Version-Release number of selected component (if applicable):
openshift v3.6.96
kubernetes v1.6.1+5115d708d7
etcd 3.1.0

How reproducible:
Always

Steps to Reproduce:
1. Enable DefaultTolerationSeconds
admissionConfig:
  pluginConfig:
    DefaultTolerationSeconds:
      configuration:
        kind: DefaultAdmissionConfig
        apiVersion: v1
        disable: false
2. Create a daemonset
[root@qe-dma36-master-1 ~]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/daemon/daemonset.yaml -n dma
daemonset "hello-daemonset" created
[root@qe-dma36-master-1 ~]# oc get ds -n dma
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
hello-daemonset   2         2         0       2            0           <none>          4s
3. Check the daemonset pods' tolerations
[root@qe-dma36-master-1 ~]# oc describe po hello-daemonset-3scfh -n dma | grep -i NoExecute
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s

Actual results:
3. "tolerationSeconds": 300

Expected results:
3. no "tolerationSeconds"

Additional info:
In upstream:
[root@dhcp-140-98 ~]# kubectl describe po hello-daemonset-mz1z | grep NoExecute
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute
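To illustrate the expected behavior, here is a simplified sketch of how defaulting of this kind could work. It is not the plugin's actual code: the function name `apply_default_tolerations` and the matching rule are assumptions made for illustration. The idea is that a default 300-second toleration is added only when the pod does not already tolerate the taint, so a DaemonSet pod that already carries a toleration without tolerationSeconds is left untouched.

```python
# Default from the report: 300 seconds. Function name and matching rule are
# illustrative assumptions, not the admission plugin's real implementation.
DEFAULT_TOLERATION_SECONDS = 300
NODE_PROBLEM_KEYS = (
    "node.alpha.kubernetes.io/notReady",
    "node.alpha.kubernetes.io/unreachable",
)


def apply_default_tolerations(tolerations):
    """Return a copy of tolerations with timed defaults added where missing."""
    result = [dict(t) for t in tolerations]
    for key in NODE_PROBLEM_KEYS:
        # Treat a missing or empty effect as matching NoExecute (assumption).
        already_tolerated = any(
            t.get("key") == key and t.get("effect") in ("NoExecute", "", None)
            for t in result
        )
        if not already_tolerated:
            result.append({
                "key": key,
                "operator": "Exists",
                "effect": "NoExecute",
                "tolerationSeconds": DEFAULT_TOLERATION_SECONDS,
            })
    return result


# An ordinary pod with no tolerations receives the timed defaults.
ordinary = apply_default_tolerations([])

# A DaemonSet pod that already tolerates both taints without a time limit
# keeps its tolerations unchanged, so it is never evicted for these taints.
daemonset_pod = apply_default_tolerations([
    {"key": k, "operator": "Exists", "effect": "NoExecute"}
    for k in NODE_PROBLEM_KEYS
])
```

Under this sketch, the bug corresponds to DaemonSet pods reaching the defaulting step without their own untimed tolerations and therefore receiving the 300-second default.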