Bug 2151248
| Summary: | SSP pods moving to CrashLoopBackOff state for a long duration when tlsSecurityProfile is changed often | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Geetika Kapoor <gkapoor> |
| Component: | Infrastructure | Assignee: | opokorny |
| Status: | ASSIGNED | QA Contact: | Geetika Kapoor <gkapoor> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | CC: | sasundar, stirabos, ycui |
| Version: | 4.12.0 | Target Milestone: | --- |
| Target Release: | 4.14.0 | Hardware: | Unspecified |
| OS: | Unspecified | Type: | Bug |
I have seen a similar issue and reported a bug: https://bugzilla.redhat.com/show_bug.cgi?id=2150333. I believe this is the same issue.

*** Bug 2150333 has been marked as a duplicate of this bug. ***

Currently SSP restarts itself on each change to tlsSecurityProfile. This is somewhat acceptable for end users, who will probably amend the configuration only once, but it is definitely cumbersome for automated tests that apply several changes in a row.

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy says:

> After containers in a Pod exit, the kubelet restarts them with an exponential back-off delay (10s, 20s, 40s, …), that is capped at five minutes. Once a container has executed for 10 minutes without any problems, the kubelet resets the restart backoff timer for that container.

So, to avoid paying the CrashLoopBackOff time, you should wait 10 minutes between one configuration change and the next.
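The back-off quoted above can be sketched as a quick calculation (assumptions: the documented 10s starting delay, doubling, and a 300s cap; the loop below is only a model, not anything the kubelet exposes):

```shell
# Model of the kubelet restart back-off: delays double from 10s,
# capped at 300s (5 minutes), so each later restart costs ~5 minutes.
delay=10
total=0
for restart in 1 2 3 4 5 6 7; do
    [ "$delay" -gt 300 ] && delay=300
    echo "restart $restart: wait ${delay}s"
    total=$((total + delay))
    delay=$((delay * 2))
done
echo "total back-off after 7 restarts: ${total}s"
```

After a handful of quick successive crashes the per-restart delay is already pinned at the 300s cap, which matches the "nearly ~5 mins" observed in this bug.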
Description of problem:

The ssp-operator pod stays in CrashLoopBackOff for nearly ~5 minutes when tlsSecurityProfile is changed often.

1. Set the HCO tlsSecurityProfile to "old":

```
oc get hco kubevirt-hyperconverged -n openshift-cnv -ojsonpath={.spec.tlsSecurityProfile}
{"old":{},"type":"Old"}
```

2. Set the SSP tlsSecurityProfile explicitly to "custom":

```
oc patch ssp -n openshift-cnv --type=json ssp-kubevirt-hyperconverged -p '[{"op": "replace", "path": "/spec/tlsSecurityProfile", "value": {"custom": {"minTLSVersion": "VersionTLS13", "ciphers": ["TLS_AES_128_GCM_SHA256", "TLS_CHACHA20_POLY1305_SHA256"]}, "type": "Custom"}}]'
```

3. Expected: HCO should propagate its TLS settings back to SSP:

```
$ oc get ssp ssp-kubevirt-hyperconverged -n openshift-cnv -ojsonpath={.spec.tlsSecurityProfile}
{"old":{},"type":"Old"}
```

However, during this whole procedure the SSP pod (ssp-operator-79bbc48bc5-tch2n) stays in CrashLoopBackOff for nearly ~5 minutes:

```
oc get pods -A -w | grep -i ssp
openshift-cnv    ssp-operator-79bbc48bc5-tch2n    0/1    CrashLoopBackOff    10 (4m54s ago)    28h
```

Version-Release number of selected component (if applicable):
4.12

How reproducible:
always

Steps to Reproduce:
1. As described above.

Actual results:
The ssp-operator pod crashes, sometimes too often and for a long time.

Expected results:
The ssp-operator pod should not crash often, and the wait time should be shorter.

Additional info:
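Until the restart-on-change behavior is fixed, a practical mitigation for automated tests is to wait for the operator rollout to settle between consecutive profile changes instead of applying them back to back. A sketch, assuming the default deployment name `ssp-operator` in the `openshift-cnv` namespace; the `apply_tls_profile` helper is made up here for illustration and is not part of any product CLI:

```shell
# Sketch only: apply one tlsSecurityProfile change, then block until the
# ssp-operator deployment is Available again before the next change, so
# the kubelet's restart back-off timer never accumulates between changes.
apply_tls_profile() {
    profile_json=$1
    oc patch hco kubevirt-hyperconverged -n openshift-cnv --type=merge \
        -p "{\"spec\":{\"tlsSecurityProfile\":${profile_json}}}"
    # 600s leaves room for one full 5-minute back-off plus startup time.
    oc rollout status deployment/ssp-operator -n openshift-cnv --timeout=600s
}

if command -v oc >/dev/null 2>&1; then
    apply_tls_profile '{"old":{},"type":"Old"}'
    apply_tls_profile '{"intermediate":{},"type":"Intermediate"}'
else
    echo "oc not found; run this against a CNV cluster" >&2
fi
```

Waiting on `oc rollout status` rather than a fixed sleep keeps the test fast when the operator settles quickly, while still covering the worst-case back-off.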