Bug 1543727
| Summary: | Met daemonset quickly recreate pod issue | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> |
| Component: | Installer | Assignee: | Scott Dodson <sdodson> |
| Status: | CLOSED DUPLICATE | QA Contact: | DeShuai Ma <dma> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.9.0 | CC: | aos-bugs, dma, jokerman, mfojtik, mmccomas, sdodson |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 3.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-03-13 18:55:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 1
DeShuai Ma
2018-02-09 06:10:52 UTC
Is the environment still running? I was hoping to encounter it again. The most important step is to find out why the pod failed, given that the restart policy is "Always". If you could capture the YAML for the failed pod, that would be great.

Sorry, the environment no longer exists.

There should be just two cases where a pod with `restartPolicy: Always` can fail: eviction and failing to match the node selector (MatchNodeSelector). I think this is a consequence of one of those scenarios and not an actual DaemonSet bug. Feel free to reopen if you encounter it again, but without additional information there is nothing to be done here.

Reproduced on OCP 3.9.7. Since projectConfig.defaultNodeSelector was added in PR https://github.com/openshift/openshift-ansible/pull/7364, creating a DaemonSet with `restartPolicy: Always` (the default policy) causes its pod to be recreated rapidly:

[root@host-172-16-120-96 ~]# oc version
oc v3.9.7
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://172.16.120.96:8443
openshift v3.9.7

[root@host-172-16-120-96 ~]# oc adm new-project dma
Created project dma

[root@host-172-16-120-96 ~]# oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/daemonset/daemonset.yaml -n dma
daemonset "hello-daemonset" created

[root@host-172-16-120-96 ~]# oc get ds -n dma
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
hello-daemonset 2 2 0 2 0 <none> 4s

[root@host-172-16-120-96 ~]# oc get po -n dma
NAME READY STATUS RESTARTS AGE
hello-daemonset-2d9z5 1/1 Running 0 9s
hello-daemonset-vgjdl 0/1 Pending 0 1s

[root@host-172-16-120-96 ~]# oc get po -n dma -w
NAME READY STATUS RESTARTS AGE
hello-daemonset-2d9z5 1/1 Running 0 12s
hello-daemonset-ftbcm 0/1 Pending 0 0s
hello-daemonset-ftbcm 0/1 MatchNodeSelector 0 0s
hello-daemonset-ftbcm 0/1 Terminating 0 0s
hello-daemonset-ftbcm 0/1 Terminating 0 0s
hello-daemonset-559l4 0/1 Pending 0 0s
hello-daemonset-559l4 0/1 MatchNodeSelector 0 1s
hello-daemonset-559l4 0/1 Terminating 0 1s
hello-daemonset-559l4 0/1 Terminating 0 1s
hello-daemonset-7xfs5 0/1 Pending 0 0s
hello-daemonset-7xfs5 0/1 MatchNodeSelector 0 0s
hello-daemonset-7xfs5 0/1 Terminating 0 0s
hello-daemonset-7xfs5 0/1 Terminating 0 0s
hello-daemonset-hs4wg 0/1 Pending 0 0s
hello-daemonset-hs4wg 0/1 MatchNodeSelector 0 1s
hello-daemonset-hs4wg 0/1 Terminating 0 1s
hello-daemonset-hs4wg 0/1 Terminating 0 1s
hello-daemonset-qwlfn 0/1 Pending 0 0s
hello-daemonset-qwlfn 0/1 MatchNodeSelector 0 1s
hello-daemonset-qwlfn 0/1 Terminating 0 1s
hello-daemonset-qwlfn 0/1 Terminating 0 1s
hello-daemonset-sz2vv 0/1 Pending 0 0s
hello-daemonset-sz2vv 0/1 MatchNodeSelector 0 0s
hello-daemonset-sz2vv 0/1 Terminating 0 0s
hello-daemonset-sz2vv 0/1 Terminating 0 0s
hello-daemonset-hhpzs 0/1 Pending 0 0s
hello-daemonset-hhpzs 0/1 MatchNodeSelector 0 1s
hello-daemonset-hhpzs 0/1 Terminating 0 1s
hello-daemonset-hhpzs 0/1 Terminating 0 1s
hello-daemonset-f58q7 0/1 Pending 0 0s
hello-daemonset-f58q7 0/1 MatchNodeSelector 0 1s
hello-daemonset-f58q7 0/1 Terminating 0 1s
hello-daemonset-f58q7 0/1 Terminating 0 1s
hello-daemonset-ptw29 0/1 Pending 0 0s
hello-daemonset-ptw29 0/1 MatchNodeSelector 0 0s
hello-daemonset-ptw29 0/1 Terminating 0 0s
hello-daemonset-ptw29 0/1 Terminating 0 0s
hello-daemonset-khh5p 0/1 Pending 0 0s
hello-daemonset-khh5p 0/1 MatchNodeSelector 0 1s
hello-daemonset-khh5p 0/1 Terminating 0 1s
hello-daemonset-khh5p 0/1 Terminating 0 1s
hello-daemonset-zjspp 0/1 Pending 0 0s
hello-daemonset-zjspp 0/1 MatchNodeSelector 0 0s
hello-daemonset-zjspp 0/1 Terminating 0 0s
hello-daemonset-zjspp 0/1 Terminating 0 0s
hello-daemonset-p99h7 0/1 Pending 0 0s

We need to limit the interval/rate at which the pod is recreated. As a short-term fix, we need to revert https://github.com/openshift/openshift-ansible/pull/7364

dma, good catch with the project default node selectors and Ansible ;) Project default node selectors are incompatible with DaemonSets and we should avoid setting them. Detailed explanation here: https://bugzilla.redhat.com/show_bug.cgi?id=1501514#c9

*** This bug has been marked as a duplicate of bug 1501514 *** |
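
The recreate loop in the transcript can be illustrated with a small, simplified model. This is only a sketch under stated assumptions, not the actual controller or kubelet logic: the node names, the `role` label, and the functions `effective_selector`, `node_matches`, and `simulate` are all hypothetical. It assumes the project default node selector is merged into the pod's node selector, that the DaemonSet controller places a pod on every node regardless of that merged selector, and that a node whose labels do not satisfy the selector rejects the pod with MatchNodeSelector, after which the controller tries again.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def effective_selector(project_default: Labels, pod_selector: Labels) -> Labels:
    """Simplified model: the project default selector is merged into the pod's own."""
    merged = dict(project_default)
    merged.update(pod_selector)
    return merged


def node_matches(node_labels: Labels, selector: Labels) -> bool:
    """A node matches when it carries every key/value pair in the selector."""
    return all(node_labels.get(key) == value for key, value in selector.items())


def simulate(nodes: Dict[str, Labels], project_default: Labels,
             rounds: int) -> List[Tuple[str, str]]:
    """Replay `rounds` reconcile passes of a DaemonSet that has no nodeSelector.

    Each pass creates a pod on every node that does not yet run one. A pod is
    rejected with MatchNodeSelector when the node does not satisfy the merged
    selector, so the next pass creates a replacement for it.
    """
    selector = effective_selector(project_default, {})
    running = set()
    events: List[Tuple[str, str]] = []
    for _ in range(rounds):
        for name, labels in sorted(nodes.items()):
            if name in running:
                continue
            if node_matches(labels, selector):
                running.add(name)
                events.append((name, "Running"))
            else:
                events.append((name, "MatchNodeSelector"))
    return events


# Hypothetical two-node cluster: only node-a satisfies the project default selector.
NODES = {"node-a": {"role": "compute"}, "node-b": {"role": "master"}}
EVENTS = simulate(NODES, {"role": "compute"}, rounds=3)

if __name__ == "__main__":
    for node, status in EVENTS:
        print(node, status)
```

In this model, one pod settles into Running while the other is rejected on every pass, which mirrors the single Running pod and the repeating MatchNodeSelector/Terminating rows above. With an empty project default selector, both pods start on the first pass and the loop stops.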