Description of problem:
The oVirt CSI driver operator has been restarting constantly since its inception.

Containers:
  ovirt-csi-driver-operator:
    Container ID:  cri-o://4aded21619cec53cd9c6c06ffd1988909059d66acc23720a0895431ce775968d
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:86a675ddbace0069c6d860629724f1dcebccc639fc032093afa04ec7e13b1940
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:86a675ddbace0069c6d860629724f1dcebccc639fc032093afa04ec7e13b1940
    Port:          <none>
    Host Port:     <none>
    Args:
      start
      --node=$(KUBE_NODE_NAME)
      -v=2
    State:          Running
      Started:      Wed, 17 Feb 2021 13:12:15 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 17 Feb 2021 13:01:08 +0200
      Finished:     Wed, 17 Feb 2021 13:12:14 +0200
    Ready:          True
    Restart Count:  945

This happens because the configInformers in the operator code were not started [1]. As a result, the ConfigObserver is unable to sync its caches.

The error in the operator log:
  706171 1 shared_informer.go:266] stop requested
  707427 1 base_controller.go:95] unable to sync caches for ConfigObserver

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Should happen on any cluster >4.7 with the oVirt CSI driver operator

Actual results:
The oVirt CSI driver operator pod keeps restarting.

Expected results:
The operator should not restart unless there is a real issue.

[1] https://github.com/openshift/ovirt-csi-driver-operator/blob/master/pkg/operator/starter.go#L128
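The failure mode described above can be illustrated with a minimal, standard-library-only Go sketch. It is a simplified model, not the operator's code: `fakeInformer`, `waitForCacheSync`, and `runController` are hypothetical names invented here, and the 100 ms timeout is arbitrary. The point is only that a controller waiting on an informer that was never started can never observe a synced cache, so the wait ends in an error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fakeInformer is a hypothetical stand-in for a shared informer.
// Its cache is only reported as synced after Start has been called.
type fakeInformer struct {
	started bool
}

func (i *fakeInformer) Start()          { i.started = true }
func (i *fakeInformer) HasSynced() bool { return i.started }

// waitForCacheSync polls HasSynced until it returns true or ctx expires.
func waitForCacheSync(ctx context.Context, inf *fakeInformer) error {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		if inf.HasSynced() {
			return nil
		}
		select {
		case <-ctx.Done():
			return errors.New("unable to sync caches")
		case <-ticker.C:
		}
	}
}

// runController models a controller start-up. When startInformer is false,
// the informer is never started and the cache sync can only time out.
func runController(startInformer bool) error {
	inf := &fakeInformer{}
	if startInformer {
		inf.Start()
	}
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	return waitForCacheSync(ctx, inf)
}

func main() {
	fmt.Println("informer not started:", runController(false))
	fmt.Println("informer started:", runController(true))
}
```

In this model the fix is the single `inf.Start()` call before the controller waits. The real change belongs in the operator's starter code referenced in [1].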
Steps to reproduce:

1) oc project openshift-cluster-csi-drivers

2) oc status

   In project openshift-cluster-csi-drivers on server https://api.primary.ocp.rhev.lab.eng.brq.redhat.com:6443

   deployment/ovirt-csi-driver-controller deploys quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0feb29efe901393bf80594af53ec8bbef34bbc6303c71cdfb7c779bacc461531,quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:def80d6439c31c03f4d5e5bfa4f209bddfd3b7423d38d90f483a1ad1a10c0e01,quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b3a0f319143cdd04122e50490ffa60e93024e18ace3c105041c432f2daf961fa,quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f14455d69f404747e4458528744fd8ab9c2f5243004b3f7bff0323e73072b681
     deployment #1 running for 13 days - 1 pod

   deployment/ovirt-csi-driver-operator deploys quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:86a675ddbace0069c6d860629724f1dcebccc639fc032093afa04ec7e13b1940
     deployment #1 running for 13 days - 1 pod (warning: 974 restarts)

3) The output shows a restart warning for the operator deployment.

4) [root@ocp-qe-1 primary]# oc get pods
   NAME                                          READY   STATUS    RESTARTS   AGE
   ovirt-csi-driver-controller-7db477884c-tflht  4/4     Running   0          7d14h
   ovirt-csi-driver-node-8qnxw                   3/3     Running   0          13d
   ovirt-csi-driver-node-h5xvc                   3/3     Running   0          13d
   ovirt-csi-driver-node-jtf7s                   3/3     Running   1          13d
   ovirt-csi-driver-node-lnxmx                   3/3     Running   0          13d
   ovirt-csi-driver-node-sg2td                   3/3     Running   0          13d
   ovirt-csi-driver-node-wvnbm                   3/3     Running   0          13d
   ovirt-csi-driver-operator-89d7bb77b-rn2m5     1/1     Running   975        7d14h

5) oc logs pod/ovirt-csi-driver-operator-89d7bb77b-rn2m5 -n openshift-cluster-csi-drivers

6) oc describe pod/ovirt-csi-driver-operator-89d7bb77b-rn2m5 -n openshift-cluster-csi-drivers
This is not a blocker for OCP 4.7.0, but it needs to be fixed in the first available OCP 4.7.z stream.
ocp: 4.8.0-0.nightly-2021-02-22-111248
ovirt: 4.4.2.6-1.el8

Steps to reproduce:

1) Install a 4.8 cluster

2) oc project openshift-cluster-csi-drivers

3) oc status -> I don't see any warning

4) oc get pods
   NAME                                          READY   STATUS    RESTARTS   AGE
   ovirt-csi-driver-controller-5bcbbd4c47-7kvld  4/4     Running   0          127m
   ovirt-csi-driver-node-4r2mg                   3/3     Running   0          127m
   ovirt-csi-driver-node-6tx6f                   3/3     Running   0          127m
   ovirt-csi-driver-node-7prgg                   3/3     Running   1          112m
   ovirt-csi-driver-node-bf54l                   3/3     Running   0          113m
   ovirt-csi-driver-node-r5jxd                   3/3     Running   0          127m
   ovirt-csi-driver-operator-8487469d4f-j72ft    1/1     Running   1          127m

No pod shows a large number of restarts.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438