Description of problem:

A consumer cluster with add-on deployer v2.0.1 on OCP 4.8.36 failed to upgrade to add-on deployer v2.0.2.

While preparing for the deployer upgrade from v2.0.1 to v2.0.2 on the staging stable add-on, we used two types of cluster setup:

Setup 1. Provider with OCP 4.10.14 + ODF add-on v2.0.1, and 2 consumers with OCP 4.10.14 and ODF consumer add-on v2.0.1
Setup 2. Provider with OCP 4.10.14 + ODF add-on v2.0.1, and 2 consumers with OCP 4.8.36 and ODF consumer add-on v2.0.1

The upgrade succeeded on the Setup 1 provider and consumers, but failed on the consumers of the Setup 2 clusters.

Version-Release number of selected component (if applicable):

$ oc get csv -n openshift-storage
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.2                      NooBaa Operator               4.10.2            mcg-operator.v4.10.1                      Succeeded
ocs-operator.v4.10.0                      OpenShift Container Storage   4.10.0                                                      Succeeded
ocs-osd-deployer.v2.0.1                   OCS OSD Deployer              2.0.1             ocs-osd-deployer.v2.0.0                   Succeeded
odf-operator.v4.10.0                      OpenShift Data Foundation     4.10.0                                                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.418-6459408   Route Monitor Operator        0.1.418-6459408   route-monitor-operator.v0.1.408-c2256a2   Succeeded

OpenShift version: 4.8.36
Add-on: ocs-consumer in staging env

How reproducible: 4/4

Steps to Reproduce:
1. Create an appliance provider cluster with OCP 4.10 and the ocs-provider add-on:
   rosa create service --type ocs-provider --name $CLUSTER_NAME --size 20 --onboarding-validation-key $CONSUMER_KEY --subnet-ids $SUBNET_IDS
2. Create a ROSA consumer cluster with OCP 4.8 and the ocs-consumer add-on.
3. Initiate the upgrade:
   https://gitlab.cee.redhat.com/service/managed-tenants/-/merge_requests/2376
   https://gitlab.cee.redhat.com/service/managed-tenants/-/merge_requests/2377

Actual results:
Consumer cluster with OCP 4.8 and ODF 2.0.1 failed to upgrade to deployer version v2.0.2.

Expected results:
Consumer cluster with OCP 4.8 and ODF 2.0.1 should also upgrade to deployer version v2.0.2.

Additional info:

Logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-m26c1/sgatfane-m26c1_20220526T145938/openshift-cluster-dir/bz_upgrade_2091594/

A similar issue was observed while upgrading the QE add-on. More discussion in the Slack thread:
https://coreos.slack.com/archives/C01L46M0FQC/p1652954205830049
=> https://coreos.slack.com/archives/C01L46M0FQC/p1653912629377239?thread_ts=1652954205.830049&cid=C01L46M0FQC
https://coreos.slack.com/archives/C01L46M0FQC/p1653914999880409

Gchat room thread: https://chat.google.com/room/AAAASHA9vWs/xmAh4PDRZh0

Probable reason mentioned in the Gchat thread:
`Previously addon catalogSource was created in openshift-marketplace but MT-SRE have updated tooling such that addon catalog will get created in targetNamespace That’s the reason they have to create network policy for catalogSource`
RCA:

ODF 4.10 deployments include an operator named odf-csi-addons-operator, for which odf-operator creates a subscription object in code. Because the subscription is created manually rather than through OLM dependencies, it is created with a static catalog namespace, openshift-marketplace.

On ODF MS deployments, we override the marketplace catalog with a local catalog inside the openshift-storage namespace. This works for all dependencies that are resolved via OLM dependencies, including ocs-operator and mcg-operator. For odf-csi-addons-operator, however, the subscription still refers to the openshift-marketplace catalog. On OCP 4.8 deployments, OLM is unable to satisfy that subscription.

The OLM operator upgrade model is "all or nothing" inside a single namespace: a single unsatisfied subscription blocks any other subscription updates/changes until that issue is resolved. Because we have a broken subscription in the namespace, the add-on (deployer) upgrade is halted and will not continue until the odf-csi-addons-operator subscription is either deleted or updated.

--------------------------------------------------

Manual Mitigation (workaround):

An SRE will have to go into the openshift-storage namespace and edit the subscription for odf-csi-addons-operator, changing the catalog namespace from openshift-marketplace to openshift-storage. This workaround was tried and proven successful.

--------------------------------------------------

Fix:

The product needs to add odf-csi-addons-operator to the odf-operator dependencies.yaml so that it is resolved by OLM.
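
Sketch of the manual mitigation (illustrative only, not the SRE procedure): the snippet below shows one way to spot subscriptions that still point at the wrong catalog namespace and to build the corresponding `oc patch` command without running it. The helper names, the sample data, and the subscription name `odf-csi-addons-operator` are assumptions made for illustration; the actual subscription name on a cluster may differ, and if the local catalog source has a different name, `spec.source` would need the same treatment.

```python
import json
import shlex

# Assumption: the local catalog lives in openshift-storage, as described in the RCA.
LOCAL_CATALOG_NAMESPACE = "openshift-storage"


def find_stale_subscriptions(subscriptions, expected=LOCAL_CATALOG_NAMESPACE):
    """Return names of Subscriptions whose spec.sourceNamespace is not `expected`."""
    return [
        sub["metadata"]["name"]
        for sub in subscriptions
        if sub.get("spec", {}).get("sourceNamespace") != expected
    ]


def build_patch_command(name, namespace=LOCAL_CATALOG_NAMESPACE,
                        catalog_namespace=LOCAL_CATALOG_NAMESPACE):
    """Build (but do not run) an `oc patch` argv that repoints one Subscription."""
    patch = json.dumps({"spec": {"sourceNamespace": catalog_namespace}})
    return [
        "oc", "patch", "subscriptions.operators.coreos.com", name,
        "-n", namespace, "--type", "merge", "-p", patch,
    ]


if __name__ == "__main__":
    # Hypothetical sample, shaped like the `.items` list returned by
    # `oc get subscriptions.operators.coreos.com -n openshift-storage -o json`.
    sample = [
        {"metadata": {"name": "ocs-operator"},
         "spec": {"sourceNamespace": "openshift-storage"}},
        {"metadata": {"name": "odf-csi-addons-operator"},
         "spec": {"sourceNamespace": "openshift-marketplace"}},
    ]
    for stale in find_stale_subscriptions(sample):
        # Print the command for review instead of executing it.
        print(shlex.join(build_patch_command(stale)))
```

The command is only printed so that it can be reviewed before anyone applies it to a live cluster.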