Created attachment 1681505 [details]
Catalog Operator Logs

Description of problem (please be as detailed as possible and provide log snippets):
On a fresh OCP cluster, we used the OCS 4.4-rc2 build to deploy OCS 4.3 (via the stable-4.3 channel), but the installation did not succeed. It reports conflicts with CRD ownership.

Version of all relevant components (if applicable):
ocs-olm-operator:4.4.0-rc2
OCP 4.4 CI

Does this issue impact your ability to continue to work with the product (please explain in detail what the user impact is)?
Yes

Is there any workaround available to the best of your knowledge?
No (not when using 4.4-rc2)

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
4

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
Probably yes, because things were working as expected before.

Steps to Reproduce:
1. Deploy a custom CatalogSource for OCS with the image quay.io/rhceph-dev/ocs-olm-operator:4.4.0-rc2 (a sketch of such a manifest follows the log below).
2. Install the operator using this custom catalog.
3. Choose the `stable-4.3` channel instead of `stable-4.4`.

Actual results:
The installation gets stuck due to a dependency resolution conflict in OLM.

Expected results:
There should be no conflict, as we own all the CRDs that are reported to be conflicting.

Additional info:
Using OCS 4.4-rc2 for installing OCS 4.4 worked fine. The dev builds we used while trying to replicate this problem did not have this issue. If this were released now, OCS 4.3 would probably be uninstallable too.

The log is similar to what we see in https://bugzilla.redhat.com/show_bug.cgi?id=1823937, but the cause is different. There are no OCS logs, as nothing was created. The following part of the OLM log shows the conflict:
```
time="2020-04-24T10:38:52Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2020-04-24T10:40:16Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:16.867398 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:16.892887 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephClient (cephclients) already provided by ocs-operator.v4.3.0
E0424 10:40:16.924082 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
time="2020-04-24T10:40:16Z" level=info msg=syncing event=delete reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:16.941146 1 reconciler.go:257] unexpected subscription state in installplan reconciler *subscription.subscriptionDeletedState
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:32.228215 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageClusterInitialization (storageclusterinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:32.249414 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/BucketClass (bucketclasses) already provided by ocs-operator.v4.3.0
E0424 10:40:32.423202 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:32.818993 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:33.219068 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:33.620300 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephClient (cephclients) already provided by ocs-operator.v4.3.0
E0424 10:40:34.020881 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:34.420049 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:35.220042 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/NooBaa (noobaas) already provided by ocs-operator.v4.3.0
E0424 10:40:35.620484 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.020533 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.427634 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.819191 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/NooBaa (noobaas) already provided by ocs-operator.v4.3.0
E0424 10:40:37.219918 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:37.619084 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:38.019977 1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephObjectStoreUser (cephobjectstoreusers) already provided by ocs-operator.v4.3.0
```
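For reference, "deploy a custom CatalogSource" in step 1 amounts to applying a manifest along the following lines. This is a minimal sketch, not the exact manifest used: the metadata.name and displayName are illustrative, and only the image is taken from the steps above.

```
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ocs-catalogsource            # illustrative name
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/rhceph-dev/ocs-olm-operator:4.4.0-rc2
  displayName: OCS (custom catalog)  # illustrative
  publisher: Red Hat
```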
Hi Umanga,

Could you please explain the use case for choosing the `stable-4.3` channel while installing 4.4.0-rc2?
Thanks, Boris, for the explanation.

> The way we build ocs-olm now, you can install any z-stream version from the latest CSV, you just need to manually select it, i.e. we are shipping all the CSVs for all the released content, including all the z-stream releases.

Do you have any documentation for this? AFAIK, during OCS installation you only select a channel (stable-4.3 or stable-4.4, for example), and it automatically takes the latest version available in that channel. That is also how auto-upgrade automatically rolls to the latest version, and if a customer installs a new OCS cluster from scratch, they get the latest version in that channel. If there is a way to install, for example, 4.2.0 instead of 4.2.3, I am more than happy to hear how to achieve it.
That is the info I got from Umanga the last time we talked about it. IIRC, there was supposed to be an option to select it from the web UI.

@Umanga: Can you elaborate on how to (manually) select and deploy a release that is not the latest in a channel?
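For reference, the generic OLM mechanism for pinning a release that is not the latest in a channel is the Subscription's startingCSV field, usually combined with manual InstallPlan approval so OLM does not immediately upgrade past the pinned version. A minimal sketch, assuming the custom catalog from the report (the source name is illustrative); whether the web UI exposes this is the open question here:

```
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: stable-4.3
  name: ocs-operator
  source: ocs-catalogsource          # illustrative; the custom CatalogSource
  sourceNamespace: openshift-marketplace
  startingCSV: ocs-operator.v4.3.0   # pin a specific, not necessarily latest, CSV
  installPlanApproval: Manual        # prevent an immediate automatic upgrade
```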
@umanga, since Petr also reproduced this issue with the newer RC build of OCS 4.4 (i.e. 4.4.0-rc3), we should now consider this BZ a blocker and try to bring in the fix. AFAIU, since this involves installation of OCS 4.3 from the 4.4 bundle, we would need to fix it in the same 4.4 release.

Marking it as a blocker for now. Let us know if this needs to change; we can all have a discussion.
When doing this installation, is there an OCS catalog entry from the redhat-operators catalog that has the OCS 4.3 GA CSV? If so, there may be a conflict because the OCS 4.3 GA CSV is offered from both redhat-operators and ocs-olm-operator. Does that make sense? Can you try disabling the redhat-operators catalog and see if the installation succeeds?
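For anyone trying this: one way to disable the default redhat-operators catalog on OCP 4.x is to edit the cluster-scoped OperatorHub resource (e.g. via `oc apply -f` or `oc edit operatorhub cluster`). A sketch, assuming spec.sources has not been customized already:

```
apiVersion: config.openshift.io/v1
kind: OperatorHub
metadata:
  name: cluster
spec:
  sources:
  - name: redhat-operators
    disabled: true
```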
For now, I am adding the must-gather logs I collected on a cluster deployed on Thursday with lib-bucket-provisioner v2: http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/bz-1827689/

I will try to reproduce with the redhat-operators catalog disabled today and will get back to you with the results.
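For anyone reproducing this, must-gather logs like these are typically collected with the standard oc adm must-gather flow; the image below is a placeholder for whichever ocs-must-gather build matches the cluster:

```
$ oc adm must-gather --image=<ocs-must-gather-image> --dest-dir=./must-gather
```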
The must-gather logs are in the same path I shared last time, in this tar file: http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/bz-1827689/logs-with-disabled-rh-operator.tar.gz
This is now tracking an OCP/OLM BZ. Acking.
As decided by the stakeholders, we are falling back on reverting the CRD-owning change in ocs-operator. https://github.com/openshift/ocs-operator/pull/518 has been merged to the release branch.
4.3 was verified here: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/7921/parameters/
4.2 was verified here: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/7893/parameters/
With build: quay.io/rhceph-dev/ocs-olm-operator:4.4.0-428.ci
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2393