Bug 2010946
| Summary: | concurrent CRD from ovirt-csi-driver-operator gets reconciled by CVO after deployment, changing CR as well | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andreas Bleischwitz <ableisch> |
| Component: | Storage | Assignee: | aos-storage-staff <aos-storage-staff> |
| Storage sub component: | oVirt CSI Driver | QA Contact: | Andreas Bleischwitz <a.bleischwitz> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | medium | CC: | a.bleischwitz, aos-bugs, jpasztor, mburman |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-10 16:17:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2024491 | | |
Description (Andreas Bleischwitz, 2021-10-05 16:44:59 UTC)
Are you sure that this is happening on a regular installation? I'll verify it next week when I install a new cluster, but just from looking at it, this is how I explain the bug: all the operators other than oVirt have no manifests dir. They deleted it because it is simply not used by the driver, and it is hard to keep it in sync with whatever is being deployed by the storage operator; we kept it mainly for the day-2 use case.

Looking at our manifests directory, we still carry both the CRD [1] and the CR [2]. The CRD actually comes from the OCP API [3], so we probably just need to remove that file from our manifests dir. The CR comes from the storage operator, so I guess it was reconciled after you installed the driver, but that section was removed [4]. I will send a PR to update the manifests dir.

[1] https://github.com/openshift/ovirt-csi-driver-operator/blob/master/manifests/00_crd.yaml#L43-L54
[2] https://github.com/openshift/ovirt-csi-driver-operator/blob/master/manifests/08_cr.yaml#L7-L8
[3] https://docs.openshift.com/container-platform/4.6/rest_api/operator_apis/clustercsidriver-operator-openshift-io-v1.html
[4] https://github.com/openshift/cluster-storage-operator/commit/0463e2adc1a6c8cbbc578387baef341edc537097

Created a PR; please take a look, and if you see that it solves your issue, please comment:

/unhold
/lgtm

Hi Gal,

Not sure about a fresh RHV-IPI cluster, as I only have resources to run one OCP cluster at a time. But I applied the changes from the pull request to my deployment, and the driver is far less chatty about missing permissions and other issues. I'm still checking the functionality of the driver, but so far everything looks good: no pod restarts observed after 3 hours of runtime, and claiming and removing disks also works.

Will the /manifests directory remain in the image?
Currently I'm extracting the yaml files from there in order to do the deployment.

(In reply to Andreas Bleischwitz from comment #6)
> Hi Gal,
>
> Not sure about a fresh RHV-IPI cluster, as I only have resources to run one
> OCP-cluster at a time.
>
> But I applied the changes from the pull-request into my deployment and the
> driver is way less chatty about missing permissions and other stuff.
> Currently I'm still checking the functionality of the driver, but as of now
> all looks good.
>
> No restarts of pods after 3 hours of runtime realized. Claiming and removal
> of disks also works.

Great, update me whenever you are comfortable with it and I will merge and backport.

> Will the /manifests directory remain in the image?

Yes, I looked at your day-2 repo and it just updates the manifests directory.

> Currently I'm extracting the yaml files from there in order to do the deployment.

After having the deployment running for several days now, I haven't seen any pod restarts caused by missing permissions. CVO no longer reconciles the CRD, and the CR can be applied without providing "driverConfig":

```
% oc get crd/clustercsidrivers.operator.openshift.io -o yaml | grep "driverConfig" | wc -l
0
```

Setting to verified.

Forgot to mention: I tested based on ovirt-csi-driver-operator quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1a49f947d7bc4be906e931434ded09da176821f9ad58ee1dc2c0462777165b05 and manual adjustments of the deployments based on https://github.com/openshift/ovirt-csi-driver-operator/pull/72

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
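For context, the manifests fix discussed above amounts to deleting the duplicated CRD file (00_crd.yaml, since the CRD ships with the OCP API) and keeping a CR that carries no `driverConfig` stanza for CVO to fight over. A rough sketch of what such a trimmed ClusterCSIDriver CR might look like; the `metadata.name` and `spec` values here are assumptions for illustration, not copied from the PR:

```yaml
# Hypothetical sketch only: a ClusterCSIDriver CR with no driverConfig
# section. The name and spec fields are assumptions, not taken from the PR.
apiVersion: operator.openshift.io/v1
kind: ClusterCSIDriver
metadata:
  name: csi.ovirt.org
spec:
  managementState: Managed
```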