Bug 1886726
Summary: | On some rare occasions pods will start without optional secrets created before pod creation | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Juan Luis de Sousa-Valadas <jdesousa> |
Component: | Node | Assignee: | Harshal Patil <harpatil> |
Node sub component: | Kubelet | QA Contact: | MinLi <minmli> |
Status: | CLOSED WORKSFORME | Docs Contact: | |
Severity: | medium | | |
Priority: | medium | CC: | aos-bugs, deads, dosmith, harpatil, jokerman |
Version: | 4.6 | | |
Target Milestone: | --- | | |
Target Release: | 4.7.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-12-11 04:27:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Juan Luis de Sousa-Valadas
2020-10-09 08:48:13 UTC
This is impacting 4.6 jobs: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#release-openshift-ocp-installer-e2e-aws-ovn-4.6. Sippy reports 52% failure over the past three days.

Lowering priority and severity because we have a workaround in OVN which is making the secret mandatory.

I had a quick look at the journal again and the secretVolumeManager received the data at 14:53:59, but the container was started at 14:55:43 and died at 15:00:59. During these 5 minutes the files were never present in the filesystem.

    $ zcat journal | grep secret.go:207 | grep ovn-cert | grep master-1
    Oct 08 14:53:35.560817 ci-op-07tdp2hv-23229-hzznd-master-1 hyperkube[1813]: I1008 14:53:35.560703 1813 secret.go:207] Received secret openshift-ovn-kubernetes/ovn-cert containing (0) pieces of data, 0 total bytes
    Oct 08 14:53:35.662890 ci-op-07tdp2hv-23229-hzznd-master-1 hyperkube[1813]: I1008 14:53:35.662458 1813 secret.go:207] Received secret openshift-ovn-kubernetes/ovn-cert containing (0) pieces of data, 0 total bytes
    Oct 08 14:53:59.039225 ci-op-07tdp2hv-23229-hzznd-master-1 hyperkube[1813]: I1008 14:53:59.039111 1813 secret.go:207] Received secret openshift-ovn-kubernetes/ovn-cert containing (2) pieces of data, 4099 total bytes
    Oct 08 14:54:00.042725 ci-op-07tdp2hv-23229-hzznd-master-1 hyperkube[1813]: I1008 14:54:00.042529 1813 secret.go:207] Received secret openshift-ovn-kubernetes/ovn-cert containing (2) pieces of data, 4099 total bytes

    {
      "containerID": "cri-o://3be754f81faa64fac7c59af6b60fc66803b0d0ee956ddec6420114c06f77d4a1",
      "image": "registry.build01.ci.openshift.org/ci-op-07tdp2hv/stable@sha256:a7a3af407006529edd36c1dc94171533ed9b985a017e6da85130b0ed6275a7c8",
      "imageID": "registry.build01.ci.openshift.org/ci-op-07tdp2hv/stable@sha256:a7a3af407006529edd36c1dc94171533ed9b985a017e6da85130b0ed6275a7c8",
      "lastState": {
        "terminated": {
          "containerID": "cri-o://91d7441a52750ff525fe077d41618c3b15e2baa27ea6e418dfe676113698c0fc",
          "exitCode": 137,
          "finishedAt": "2020-10-08T15:00:59Z",
          "message": "<snip>",
          "reason": "Error",
          "startedAt": "2020-10-08T14:55:43Z"
        }
      },
      "name": "sbdb",
      "ready": true,
      "restartCount": 1,
      "started": true,
      "state": {
        "running": {
          "startedAt": "2020-10-08T15:01:30Z"
        }
      }
    }

Since OVN did not exist prior to the 4.6 release, this is not a regression. Moving to 4.7. We will continue to look at it.

Ryan, I don't have anything against moving this to 4.7, but I don't know if this happens only on OVN. We only detected it in OVN because a lot of critical components were accidentally relying on this feature.

*** Bug 1872874 has been marked as a duplicate of this bug. ***