Description of problem:

When deploying the CSI driver manually, and probably also during IPI installation, a modified CRD of the *ovirt-csi-driver-operator* gets deployed:
~~~
spec:
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
+           driverConfig:
+             description: CSIDriverConfig is the CSI driver specific configuration
+             properties:
+               driverName:
+                 description: DriverName holds the name of the CSI driver
+                 enum:
+                 - ebs.csi.aws.com
+                 - manila.csi.openstack.org
+                 - csi.ovirt.org
+                 type: string
+             required:
+             - driverName
+             type: object
~~~
along with other minor adjustments. This definition gets reconciled by the Cluster-Version-Operator during its reconcile run:
~~~
I1005 15:58:34.592743       1 apiext.go:66] Updating CRD clustercsidrivers.operator.openshift.io due to diff:   &v1.CustomResourceDefinition{
~~~
Additionally, the ClusterResource defined with:
~~~
spec:
  driverConfig:
    driverName: csi.ovirt.org
~~~
gets reverted to no longer contain `driverConfig` at all.

A subsequent test of a deployment without the modified CRD, and with a CR without `driverConfig`, showed that the ovirt-csi-driver does not seem to make use of this property.

Version-Release number of selected component (if applicable):
~~~
% oc version
Client Version: 4.8.13
Server Version: 4.8.11
Kubernetes Version: v1.21.1+9807387
~~~

How reproducible:

Steps to Reproduce:
1. Install an RHV-IPI cluster with RHV-CSI enabled.
2. Validate the output of `oc get clustercsidriver/csi.ovirt.org -o go-template='{{.spec}} {{"\n"}}'` and verify whether `driverConfig` is present.
3. Check the output of `oc get crd/clustercsidrivers.operator.openshift.io -o yaml | grep "driverConfig"` and verify whether `driverConfig` is present.

Actual results:
After CVO reconciliation, neither the CRD nor the CR contains `driverConfig`.

Expected results:
Neither the `driverConfig` property in the CRD nor in the CR exists in the initial deployment.

Additional info:
When deploying the CSI-driver operator using ArgoCD, validation of the YAML files is enforced (unlike a deployment using the oc command).
This causes the deployment to fail once the CVO has reconciled the CRD, which invalidates the enriched CR. As the `driverConfig` property does not seem to be used, we should remove the CRD manifest entirely and the additional property from the CR.
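To make the failure mode concrete, here is a minimal, hedged Python sketch (not OpenShift code) of the difference between ArgoCD-style strict validation and Kubernetes structural-schema pruning once the CVO has reverted the CRD. The `ALLOWED_SPEC_FIELDS` set is a simplified stand-in for the reconciled schema; the field names mirror this bug.

```python
# Simplified stand-in for the reconciled CRD schema: the CVO has removed
# driverConfig, so it is no longer a known field of spec.
ALLOWED_SPEC_FIELDS = {"managementState", "logLevel", "operatorLogLevel"}

def validate_strict(spec: dict) -> None:
    """ArgoCD-style validation: an unknown field is a hard error."""
    unknown = set(spec) - ALLOWED_SPEC_FIELDS
    if unknown:
        raise ValueError(f"unknown fields in spec: {sorted(unknown)}")

def prune(spec: dict) -> dict:
    """Structural-schema pruning: unknown fields are silently dropped."""
    return {k: v for k, v in spec.items() if k in ALLOWED_SPEC_FIELDS}

# The CR as shipped in the operator's manifests directory:
cr_spec = {"driverConfig": {"driverName": "csi.ovirt.org"}, "logLevel": "Normal"}

# Applying via oc: the field is pruned away without complaint.
print(prune(cr_spec))          # {'logLevel': 'Normal'}

# Applying via ArgoCD with validation enforced: the deployment fails.
try:
    validate_strict(cr_spec)
except ValueError as e:
    print(e)                   # unknown fields in spec: ['driverConfig']
```

This is why the same manifests work with `oc` but break the ArgoCD deployment: the field is dropped in one path and rejected in the other.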
Are you sure that this is happening on a regular installation? I'll verify it next week when I install a new cluster, but just from looking at it, this is how I explain the bug.

Generally, all the operators other than ovirt have no manifest dir; they deleted it because it is just not used by the driver and it is hard to make sure the manifests stay in sync with whatever is being deployed by the storage operator. We kept it mainly for the day-2 stuff.

Looking at our manifest directory, it looks like we have the CRD [1] and the CR [2]. The CRD actually comes from the OCP API [3], so we probably just need to remove that file from our manifests dir. The CR comes from the storage operator, so I guess it was reconciled after you installed the driver or something, but that section was removed [4].

I will send a PR to update the manifests dir.

[1] https://github.com/openshift/ovirt-csi-driver-operator/blob/master/manifests/00_crd.yaml#L43-L54
[2] https://github.com/openshift/ovirt-csi-driver-operator/blob/master/manifests/08_cr.yaml#L7-L8
[3] https://docs.openshift.com/container-platform/4.6/rest_api/operator_apis/clustercsidriver-operator-openshift-io-v1.html
[4] https://github.com/openshift/cluster-storage-operator/commit/0463e2adc1a6c8cbbc578387baef341edc537097
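For the day-2 case, the manual fix is to drop the `driverConfig:` stanza from the CR manifest before applying it. A naive, illustrative Python sketch of that edit (this is not what the PR does, which simply changes the manifest files; the line filter keys on indentation and would also match any key with the same prefix):

```python
def strip_stanza(yaml_text: str, key: str = "driverConfig") -> str:
    """Remove a mapping key and its indented children from YAML text.

    Naive line filter for illustration only: tracks the indentation of the
    matched key and skips every deeper-indented line that follows it.
    """
    out = []
    skip_indent = None
    for line in yaml_text.splitlines():
        indent = len(line) - len(line.lstrip())
        stripped = line.strip()
        if skip_indent is not None:
            if stripped and indent > skip_indent:
                continue            # still inside the stanza being removed
            skip_indent = None      # stanza ended
        if stripped.startswith(key + ":"):
            skip_indent = indent    # start skipping this stanza
            continue
        out.append(line)
    return "\n".join(out)

# Example CR fragment, mirroring manifests/08_cr.yaml from this bug:
cr = """apiVersion: operator.openshift.io/v1
kind: ClusterCSIDriver
metadata:
  name: csi.ovirt.org
spec:
  driverConfig:
    driverName: csi.ovirt.org
  logLevel: Normal
"""
print(strip_stanza(cr))  # same CR without the driverConfig stanza
```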
Created a PR; please take a look, and if you see that it solves your issue, please comment: /unhold /lgtm
Hi Gal,

Not sure about a fresh RHV-IPI cluster, as I only have resources to run one OCP cluster at a time.

But I applied the changes from the pull request to my deployment, and the driver is much less chatty about missing permissions and other issues. Currently I'm still checking the functionality of the driver, but as of now all looks good.

No pod restarts observed after 3 hours of runtime. Claiming and removal of disks also works.

Will the /manifests directory remain in the image? Currently I'm extracting the YAML files from there in order to do the deployment.
(In reply to Andreas Bleischwitz from comment #6)
> Hi Gal,
>
> Not sure about a fresh RHV-IPI cluster, as I only have resources to run one
> OCP-cluster at a time.
>
> But I applied the changes from the pull-request into my deployment and the
> driver is way less chatty about missing permissions and other stuff.
> Currently I'm still checking the functionality of the driver, but as of now
> all looks good.
>
> No restarts of pods after 3 hours of runtime realized. Claiming and removal
> of disks also works.

Great, update me whenever you are comfortable with it and I will merge and backport.

> Will the /manifests directory remain in the image?

Yes. I looked at your day-2 repo and it just updates the manifest directory.

> Currently I'm extracting the yaml files from there in order to do the deployment.
After having the deployment running for several days now, I didn't see any pod restarts caused by missing permissions. The CVO no longer reconciles the CRD, and the CR can be applied without providing `driverConfig`:
```
% oc get crd/clustercsidrivers.operator.openshift.io -o yaml | grep "driverConfig" | wc -l
0
```
Setting to verified.
Forgot to mention: I tested based on the ovirt-csi-driver-operator image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1a49f947d7bc4be906e931434ded09da176821f9ad58ee1dc2c0462777165b05 with manual adjustments of the deployments based on https://github.com/openshift/ovirt-csi-driver-operator/pull/72
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056