Summary: | csi-snapshot-controller needs to handle API server downtime gracefully in SNO | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Fabio Bertinatto <fbertina> |
Component: | Storage | Assignee: | Fabio Bertinatto <fbertina> |
Storage sub component: | Kubernetes External Components | QA Contact: | Rohit Patil <ropatil> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | aos-bugs, jsafrane, nelluri, ropatil, vgrinber |
Version: | 4.9 | ||
Target Milestone: | --- | ||
Target Release: | 4.9.0 | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | chaos | ||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1986215 | Environment: | |
Last Closed: | 2021-10-18 17:45:44 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | 1986215 | ||
Bug Blocks: | 1984730 |
Comment 1
Fabio Bertinatto
2021-08-10 20:18:17 UTC
Hi All,

Verified on build: 4.9.0-0.nightly-2021-08-29-010334. No restarts were observed on the csi-snapshot-controller.

Observation: because the csi-snapshot-controller never restarted, there is no previous container instance, so requesting the previous container's logs fails with the message "Error from server (BadRequest): previous terminated container".

Execution steps

1. Created an SNO cluster.

    rohitpatil@ropatil-mac Downloads % oc get nodes
    NAME                                         STATUS   ROLES           AGE   VERSION
    ip-10-0-129-189.us-west-1.compute.internal   Ready    master,worker   59m   v1.22.0-rc.0+b708912

2. Observed the pod status.

    rohitpatil@ropatil-mac Downloads % oc get pods -n openshift-cluster-storage-operator
    NAME                                                READY   STATUS    RESTARTS      AGE
    cluster-storage-operator-7566bdd45c-scfws           1/1     Running   1 (76m ago)   79m
    csi-snapshot-controller-85b5595f65-65w6h            1/1     Running   0             77m
    csi-snapshot-controller-operator-7f6c5fb9b4-2w5b9   1/1     Running   0             79m
    csi-snapshot-webhook-696b489f7b-z95sq               1/1     Running   0             77m

3. Triggered the oc patch command.

    rohitpatil@ropatil-mac Downloads % oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATION1"}}'
    kubeapiserver.operator.openshift.io/cluster patched

4. Observed the pod status again.

    NAME                                                READY   STATUS    RESTARTS      AGE
    cluster-storage-operator-7566bdd45c-scfws           1/1     Running   1 (67m ago)   70m
    csi-snapshot-controller-85b5595f65-65w6h            1/1     Running   0             68m
    csi-snapshot-controller-operator-7f6c5fb9b4-2w5b9   1/1     Running   0             69m
    csi-snapshot-webhook-696b489f7b-z95sq               1/1     Running   0             68m

5. Error message:

    rohitpatil@ropatil-mac Downloads % oc logs -p csi-snapshot-controller-85b5595f65-65w6h -n openshift-cluster-storage-operator
    Error from server (BadRequest): previous terminated container "snapshot-controller" in pod "csi-snapshot-controller-85b5595f65-65w6h" not found

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
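The manual check in the verification steps (compare the RESTARTS column before and after forcing the kube-apiserver redeployment) could be automated along the following lines. This is only a sketch, not part of the fix or of any QA tooling: the function names `restart_counts` and `restarted_pods` are hypothetical, and it assumes the captured text has the same column layout as the `oc get pods` output quoted in this report, including the optional "(76m ago)" suffix in the RESTARTS column.

```python
import re

# One data row of `oc get pods` output as quoted in this report:
# NAME  READY  STATUS  RESTARTS [(<time> ago)]  AGE
# The header row does not match because READY must look like "1/1".
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ready>\d+/\d+)\s+(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\([^)]*\))?\s+(?P<age>\S+)\s*$"
)


def restart_counts(oc_output):
    """Map pod name to restart count for every row that parses."""
    counts = {}
    for line in oc_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group("name")] = int(match.group("restarts"))
    return counts


def restarted_pods(before, after, prefix):
    """Return pods whose name starts with `prefix` and whose restart
    count is higher in `after` than in `before`.

    Assumption: a pod absent from `before` is compared against 0, so a
    freshly created replacement pod with 0 restarts is not reported.
    """
    old = restart_counts(before)
    new = restart_counts(after)
    return sorted(
        name
        for name, count in new.items()
        if name.startswith(prefix) and count > old.get(name, 0)
    )
```

Feeding the outputs from steps 2 and 4 to `restarted_pods(before, after, "csi-snapshot-controller")` should give an empty list when the controller survived the API server downtime. Note that this prefix also matches the csi-snapshot-controller-operator pod.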