Description of problem (please be as detailed as possible and provide log snippets):

The ODF operator tries to set a controller reference on a StorageCluster that it didn't create. This prevents the ODF operator from reconciling when a controller reference already exists on the StorageCluster CR, which is the case for storage clusters deployed using the OCS OSD Deployer (ODF Managed Service).

The controller reference was added in response to the following BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=2004030

Version of all relevant components (if applicable):
odf-operator 4.9.1

Does this issue impact your ability to continue to work with the product? (please explain in detail what is the user impact)?
Yes, it prevents deployment of ODF 4.9 in the Managed Service use case.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Can this issue be reproduced?
Yes, deploy ODF using the ocs-osd-deployer.

Can this issue reproduce from the UI?
No

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
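For context, controller-runtime allows only one controller owner reference per object, so SetControllerReference refuses to overwrite an existing one. A minimal sketch of the failure mode (the helper name, receiver, and API package aliases below are illustrative, not the actual odf-operator code):

    import (
        "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
    )

    // ensureStorageCluster is a hypothetical helper showing why reconciliation stalls.
    func (r *StorageSystemReconciler) ensureStorageCluster(ss *odfv1alpha1.StorageSystem, sc *ocsv1.StorageCluster) error {
        // A StorageCluster created by the OCS OSD Deployer already carries a
        // controller owner reference, so this call returns an AlreadyOwnedError
        // and the StorageSystem reconcile never completes.
        if err := controllerutil.SetControllerReference(ss, sc, r.Scheme); err != nil {
            return err
        }
        return nil
    }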
A possible resolution: replace SetControllerReference with SetOwnerReference here:
https://github.com/red-hat-storage/odf-operator/blob/release-4.9/controllers/vendors.go#L77
This would allow the StorageCluster to be owned by the StorageSystem without marking the StorageSystem as its controller.
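A rough sketch of that change, assuming the same reconcile context (variable names are illustrative, not the exact vendors.go code): SetOwnerReference adds the StorageSystem as a plain owner without the controller flag, so an existing controller reference (e.g. the one set by the OCS OSD Deployer) stays untouched.

    // Before (release-4.9 vendors.go, roughly):
    // err := controllerutil.SetControllerReference(storageSystem, storageCluster, r.Scheme)

    // After: add a non-controller owner reference instead.
    if err := controllerutil.SetOwnerReference(storageSystem, storageCluster, r.Scheme); err != nil {
        return err
    }

Kubernetes permits any number of owner references on an object but only one may have controller: true, which is why the non-controller variant avoids the conflict while still tying the StorageCluster to the StorageSystem for garbage collection.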
I agree that this should be fixed, so giving devel_ack+. I'll leave it up to others to determine which versions this will need to be backported to.
Able to deploy a cluster using the managed service add-ons ocs-provider-qe and ocs-consumer-qe; this issue is resolved.

Verified onboarding on OCS 4.10.0-197, OCP 4.9.23.

oc get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.0                      NooBaa Operator               4.10.0                                                      Succeeded
ocs-operator.v4.10.0                      OpenShift Container Storage   4.10.0                                                      Succeeded
ocs-osd-deployer.v2.0.0                   OCS OSD Deployer              2.0.0                                                       Succeeded
odf-csi-addons-operator.v4.10.0           CSI Addons                    4.10.0                                                      Succeeded
odf-operator.v4.10.0                      OpenShift Data Foundation     4.10.0                                                      Succeeded
ose-prometheus-operator.4.8.0             Prometheus Operator           4.8.0                                                       Succeeded
route-monitor-operator.v0.1.406-54ff884   Route Monitor Operator        0.1.406-54ff884   route-monitor-operator.v0.1.404-e29b74b   Succeeded

Provider:
======= storagecluster ==========
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   24m   Ready              2022-03-22T07:21:56Z

======= cephcluster ==========
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          23m   Ready   Cluster created successfully   HEALTH_OK

======= cluster health status =====
HEALTH_OK

Consumer:
oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   24h   Ready   true       2022-03-22T08:46:20Z

======= cluster health status =====
HEALTH_OK

====== cephcluster ==========
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE       MESSAGE                          HEALTH      EXTERNAL
ocs-storagecluster-cephcluster                                36m   Connected   Cluster connected successfully   HEALTH_OK   true

Both consumer and provider onboarded successfully, hence moving this BZ to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372