Bug 1827689 - OCS 4.3 Installation fails when using OCS 4.4-rc2 registry bundle
Summary: OCS 4.3 Installation fails when using OCS 4.4-rc2 registry bundle
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat
Component: build
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: OCS 4.4.0
Assignee: Christina Meno
QA Contact: Petr Balogh
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-24 14:32 UTC by umanga
Modified: 2020-09-23 09:06 UTC
CC List: 11 users

Fixed In Version: 4.4.0-rc6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1834936
Environment:
Last Closed: 2020-06-04 12:54:39 UTC
Target Upstream Version:


Attachments
Catalog Operator Logs (21.33 KB, application/octet-stream)
2020-04-24 14:32 UTC, umanga


Links
System ID Private Priority Status Summary Last Updated
Github openshift ocs-operator pull 518 0 None closed Bug 1823937: [release-4.4] Revert owning OB and OBC CRDs 2020-12-30 12:14:19 UTC
Red Hat Product Errata RHBA-2020:2393 0 None None None 2020-06-04 12:54:53 UTC

Description umanga 2020-04-24 14:32:15 UTC
Created attachment 1681505 [details]
Catalog Operator Logs

Description of problem (please be as detailed as possible and provide log
snippets):

On a fresh OCP cluster, I used the OCS 4.4-rc2 build to deploy OCS 4.3 (via the stable-4.3 channel), but
the installation did not succeed. The logs show conflicts over CRD ownership.

Version of all relevant components (if applicable):

ocs-olm-operator:4.4.0-rc2
OCP 4.4 CI 

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
NO (not when using 4.4-rc2)

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
4

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Probably yes, because things were working as expected before.

Steps to Reproduce:
1. deploy custom CatalogSource for OCS with image quay.io/rhceph-dev/ocs-olm-operator:4.4.0-rc2
2. Install operator using this custom catalog
3. Choose the `stable-4.3` channel instead of `stable-4.4`
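The steps above can be sketched as a pair of manifests (a rough sketch; the CatalogSource name, display name, and publisher are illustrative, not taken from the actual reproduction):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ocs-catalogsource            # hypothetical name
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/rhceph-dev/ocs-olm-operator:4.4.0-rc2
  displayName: OCS 4.4.0-rc2         # illustrative
  publisher: Red Hat
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: stable-4.3                # 4.3 channel served from the 4.4 bundle
  name: ocs-operator
  source: ocs-catalogsource          # must match the CatalogSource above
  sourceNamespace: openshift-marketplace
```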


Actual results:
The installation will get stuck due to dependency resolution conflict in OLM.

Expected results:
There should be no conflict, as we own all the CRDs that are reported as conflicting.

Additional info:

Using OCS 4.4-rc2 for installing OCS 4.4 worked fine.

Dev builds that we used to try to replicate this problem didn't have this issue.

If this were released now, OCS 4.3 would probably be uninstallable too.
The log is similar to what we see in https://bugzilla.redhat.com/show_bug.cgi?id=1823937,
but the cause is different.

There are no OCS logs as nothing was created.

The following is the part of the OLM log that shows the conflict:

```
time="2020-04-24T10:38:52Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2020-04-24T10:40:16Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:16.867398       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:16.892887       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephClient (cephclients) already provided by ocs-operator.v4.3.0
E0424 10:40:16.924082       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
time="2020-04-24T10:40:16Z" level=info msg=syncing event=delete reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:16.941146       1 reconciler.go:257] unexpected subscription state in installplan reconciler *subscription.subscriptionDeletedState
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
time="2020-04-24T10:40:32Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=/apis/operators.coreos.com/v1alpha1/namespaces/openshift-storage/subscriptions/ocs-operator
E0424 10:40:32.228215       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageClusterInitialization (storageclusterinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:32.249414       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/BucketClass (bucketclasses) already provided by ocs-operator.v4.3.0
E0424 10:40:32.423202       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:32.818993       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:33.219068       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:33.620300       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephClient (cephclients) already provided by ocs-operator.v4.3.0
E0424 10:40:34.020881       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:34.420049       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:35.220042       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/NooBaa (noobaas) already provided by ocs-operator.v4.3.0
E0424 10:40:35.620484       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.020533       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.427634       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/StorageCluster (storageclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:36.819191       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: noobaa.io/v1alpha1/NooBaa (noobaas) already provided by ocs-operator.v4.3.0
E0424 10:40:37.219918       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephCluster (cephclusters) already provided by ocs-operator.v4.3.0
E0424 10:40:37.619084       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ocs.openshift.io/v1/OCSInitialization (ocsinitializations) already provided by ocs-operator.v4.3.0
E0424 10:40:38.019977       1 queueinformer_operator.go:290] sync "openshift-storage" failed: error calculating generation changes due to new bundle: ceph.rook.io/v1/CephObjectStoreUser (cephobjectstoreusers) already provided by ocs-operator.v4.3.0
```

Comment 3 Elad 2020-04-24 17:47:53 UTC
Hi Umanga, 

Could you please explain the use case for choosing `stable-4.3` while installing 4.4.0-rc2?

Comment 9 Petr Balogh 2020-04-30 09:11:47 UTC
Thanks, Boris, for the explanation.

> The way we build ocs-olm now, you can install any z-stream version from the latest CSV, you just need to manually select it, i.e. we are shipping all the CSVs for all the released content, including all the z-stream releases.

Do you have any documentation for this? AFAIK, during OCS installation you just select a channel (e.g. stable-4.3 or stable-4.4), and OLM automatically takes the latest version available in that channel. That is how auto-upgrade automatically rolls forward to the latest version, and a customer installing a new OCS cluster from scratch will get the latest version in that channel. If there is a way to install, for example, 4.2.0 instead of 4.2.3, I am more than happy to hear how to achieve it.
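For reference, if the bundle really ships all z-stream CSVs, one standard OLM mechanism for pinning a specific (non-latest) version is `spec.startingCSV` together with manual InstallPlan approval. A sketch; the CatalogSource name and CSV version are assumptions, and whether the web UI exposed this option here is unclear:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: stable-4.2
  name: ocs-operator
  source: ocs-catalogsource          # hypothetical CatalogSource name
  sourceNamespace: openshift-marketplace
  startingCSV: ocs-operator.v4.2.0   # pin a specific CSV instead of the channel head
  installPlanApproval: Manual        # prevent OLM from auto-upgrading past it
```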

Comment 10 Boris Ranto 2020-04-30 13:11:43 UTC
That is the info I got from Umanga the last time we were talking about it. IIRC, you were supposed to have an option to select it from the web UI.

@Umanga: Can you elaborate on how to (manually) select to deploy a release that is not latest for a channel?

Comment 16 Neha Berry 2020-05-08 16:19:56 UTC
@umanga, since Petr also reproduced this issue with the newer RC build of OCS 4.4 (i.e. 4.4.0-rc3), we should now consider this BZ a blocker and try to bring in the fix.

AFAIU, since this involves installation of OCS 4.3 from the 4.4 bundle, we would need to fix it in the 4.4 release itself.

Marking it as a blocker for now. Let us know if this needs to change; we can all have a discussion.

Comment 17 Jose A. Rivera 2020-05-08 16:51:43 UTC
When doing this installation, is there an OCS catalog entry from the redhat-operators catalog that has the OCS 4.3 GA CSV? If so, there may be a conflict caused by the OCS 4.3 GA CSV being offered from both redhat-operators and ocs-olm-operator. Does that make sense?

Can you try disabling the redhat-operators catalog to see if the installation succeeds?
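Disabling the default redhat-operators catalog is done by marking the source disabled on the cluster-scoped OperatorHub resource (a sketch of the standard OCP 4.x mechanism):

```yaml
apiVersion: config.openshift.io/v1
kind: OperatorHub
metadata:
  name: cluster
spec:
  sources:
  - name: redhat-operators
    disabled: true     # hide the default Red Hat catalog while testing
```

Apply with `oc apply -f <file>`; re-enable later by setting `disabled: false` or removing the entry.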

Comment 18 Petr Balogh 2020-05-11 12:43:46 UTC
For now, I am adding the must-gather logs I collected on a cluster deployed on Thursday with lib-bucket-provisioner v2:
http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/bz-1827689/

I will try to reproduce with the redhat-operators catalog disabled today and will get back to you with the results.

Comment 20 Petr Balogh 2020-05-11 14:57:18 UTC
The must-gather logs are at the same path I shared last time, in this tar file:
http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/bz-1827689/logs-with-disabled-rh-operator.tar.gz

Comment 23 Michael Adam 2020-05-14 12:34:50 UTC
This is now tracking the OCP/OLM BZ. Acking.

Comment 26 Michael Adam 2020-05-15 23:38:57 UTC
As decided by the stakeholders, we are falling back to reverting the CRD-owning change in ocs-operator.

https://github.com/openshift/ocs-operator/pull/518
has been merged to the release branch.

Comment 28 Petr Balogh 2020-05-27 10:56:51 UTC
4.3 was verified here:
https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/7921/parameters/

4.2 was verified here:
https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/7893/parameters/

With build: quay.io/rhceph-dev/ocs-olm-operator:4.4.0-428.ci

Comment 31 errata-xmlrpc 2020-06-04 12:54:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2393

