Description of problem (please be as detailed as possible and provide log snippets):
upgrade not started for ODF 4.10

Version of all relevant components (if applicable):
upgrade from: ocs-operator.v4.9.0
upgrade to: ocs-registry:4.10.0-50

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Not able to upgrade

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue reproducible?
2/2

Can this issue reproduce from the UI?
Not tried

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. run the test_upgrade test case using ocs-ci
2. check whether the upgrade has started

Actual results:
upgrade is not started

Expected results:
upgrade should succeed

Additional info:

> $ oc get csv
NAME                  DISPLAY                       VERSION   REPLACES   PHASE
mcg-operator.v4.9.0   NooBaa Operator               4.9.0                Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0                Succeeded
odf-operator.v4.9.0   OpenShift Data Foundation     4.9.0                Succeeded

> $ oc get subscription
NAME                                                              PACKAGE        SOURCE             CHANNEL
mcg-operator-stable-4.9-redhat-operators-openshift-marketplace    mcg-operator   redhat-operators   stable-4.9
ocs-operator-stable-4.9-redhat-operators-openshift-marketplace    ocs-operator   redhat-operators   stable-4.9
odf-operator                                                      odf-operator   redhat-operators   stable-4.10

> $ oc describe subscriptions.operators.coreos.com odf-operator
Name:         odf-operator
Namespace:    openshift-storage
Labels:       operators.coreos.com/odf-operator.openshift-storage=
Annotations:  <none>
API Version:  operators.coreos.com/v1alpha1
Kind:         Subscription
Spec:
  Channel:           stable-4.10
  Name:              odf-operator
  Source:            redhat-operators
  Source Namespace:  openshift-marketplace
Status:
  Catalog Health:
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              certified-operators
      Namespace:         openshift-marketplace
      Resource Version:  31559
      UID:               4802d418-60a6-4b76-b239-22aa3e5143e4
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              community-operators
      Namespace:         openshift-marketplace
      Resource Version:  38125
      UID:               5a18af75-a682-4f2b-af66-2ae1447b1dcf
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              redhat-marketplace
      Namespace:         openshift-marketplace
      Resource Version:  38766
      UID:               54336309-010f-4f2a-a03b-b2238c9b82a6
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              redhat-operators
      Namespace:         openshift-marketplace
      Resource Version:  38775
      UID:               b88f5213-8b54-4dc4-b48f-3e5d096b48ff
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
  Conditions:
    Last Transition Time:   2021-12-22T04:28:11Z
    Message:                all available catalogsources are healthy
    Reason:                 AllCatalogSourcesHealthy
    Status:                 False
    Type:                   CatalogSourcesUnhealthy
  Current CSV:              odf-operator.v4.9.0
  Install Plan Generation:  1
  Install Plan Ref:
    API Version:        operators.coreos.com/v1alpha1
    Kind:               InstallPlan
    Name:               install-w268s
    Namespace:          openshift-storage
    Resource Version:   25299
    UID:                159fb987-59fb-4378-b0c9-40494442d380
  Installed CSV:  odf-operator.v4.9.0
  Installplan:
    API Version:  operators.coreos.com/v1alpha1
    Kind:         InstallPlan
    Name:         install-w268s
    Uuid:         159fb987-59fb-4378-b0c9-40494442d380
  Last Updated:   2021-12-22T04:28:11Z
  State:          AtLatestKnown
Events:           <none>

Job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2676/consoleFull

must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-128vu1cs33-ua/j-128vu1cs33-ua_20211222T034509/logs/failed_testcase_ocs_logs_1640147166/test_upgrade_ocs_logs/

> cluster is alive for debugging
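For reference, the upgrade is requested by moving the odf-operator Subscription to the stable-4.10 channel, which is what the subscription output above already shows. Below is a minimal sketch of the Subscription after the channel bump; the field values are copied from the describe output above, while how exactly ocs-ci applies this change is an assumption on my part:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.10          # bumped from stable-4.9 to request the 4.10 upgrade
  name: odf-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace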
As of now, the dependencies.yaml in the odf-operator lists only ocs-operator 4.10, which is causing this problem. When we try to upgrade odf-operator from 4.9 to 4.10, OLM cannot perform the upgrade because odf-operator 4.10 cannot run with ocs-operator 4.9 due to its dependencies.yaml. To get out of this situation we need to allow ocs-operator 4.9 through 4.10 in the dependencies.yaml of the odf-operator, as sketched below. Moving this to the build team since dependencies.yaml is handled by them.

@branto Can you please also add 4.9 to the dependencies.yaml? I remember we faced some difficulties when we had it in the initial builds of 4.10 and removed it to solve that issue. Let's do it again and run the tests in debug mode so that the setup won't get destroyed automatically.
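For illustration, a minimal sketch of what a widened dependency could look like, assuming the standard OLM bundle dependencies.yaml format (type olm.package with a semver range); the exact range shipped in the build is the build team's call:

dependencies:
- type: olm.package
  value:
    packageName: ocs-operator
    # widened so odf-operator 4.10 can be installed while ocs-operator is still at 4.9
    version: ">=4.9.0 <=4.10.0"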
Upgrade failed: even though the odf-operator CSV succeeded, the mcg and ocs-operator CSVs are missing.

> csvs after upgrade (missing mcg and ocs operator)
NAME                   DISPLAY                     VERSION   REPLACES              PHASE
odf-operator.v4.10.0   OpenShift Data Foundation   4.10.0    odf-operator.v4.9.1   Succeeded

> subscriptions
NAME                                                              PACKAGE        SOURCE             CHANNEL
mcg-operator-stable-4.9-redhat-operators-openshift-marketplace    mcg-operator   redhat-operators   stable-4.10
ocs-operator-stable-4.9-redhat-operators-openshift-marketplace    ocs-operator   redhat-operators   stable-4.10
odf-operator                                                      odf-operator   redhat-operators   stable-4.10

> install plans
NAME            CSV                    APPROVAL    APPROVED
install-7dbnx   odf-operator.v4.9.1    Automatic   true
install-hv59z   odf-operator.v4.10.0   Automatic   true

> storagesystem yaml
status:
  conditions:
  - lastHeartbeatTime: "2022-01-12T00:15:25Z"
    lastTransitionTime: "2022-01-11T23:36:53Z"
    message: Reconcile is in progress
    reason: Reconciling
    status: "False"
    type: Available
  - lastHeartbeatTime: "2022-01-12T00:15:25Z"
    lastTransitionTime: "2022-01-11T23:36:53Z"
    message: Reconcile is in progress
    reason: Reconciling
    status: "True"
    type: Progressing
  - lastHeartbeatTime: "2022-01-12T00:15:25Z"
    lastTransitionTime: "2022-01-11T23:02:30Z"
    message: StorageSystem CR is valid
    reason: Valid
    status: "False"
    type: StorageSystemInvalid
  - lastHeartbeatTime: "2022-01-12T00:15:25Z"
    lastTransitionTime: "2022-01-11T23:36:54Z"
    message: InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0
    reason: NotReady
    status: "False"
    type: VendorCsvReady
  - lastHeartbeatTime: "2022-01-11T23:02:30Z"
    lastTransitionTime: "2022-01-11T23:02:30Z"
    reason: Found
    status: "True"
    type: VendorSystemPresent
> odf operator log

2022-01-11T23:58:45.259671211Z 2022-01-11T23:58:45.259Z ERROR controller-runtime.manager.controller.storagesystem Reconciler error {"reconciler group": "odf.openshift.io", "reconciler kind": "StorageSystem", "name": "ocs-storagecluster-storagesystem", "namespace": "openshift-storage", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0", "errorCauses": [{"error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}, {"error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}]}
2022-01-11T23:58:45.259671211Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-11T23:58:45.259671211Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-11T23:58:45.259671211Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-11T23:58:45.259671211Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:24.919494632Z 2022-01-12T00:15:24.919Z ERROR controller-runtime.manager.controller.subscription Reconciler error {"reconciler group": "operators.coreos.com", "reconciler kind": "Subscription", "name": "odf-operator", "namespace": "openshift-storage", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0", "errorCauses": [{"error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}, {"error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}]}
2022-01-12T00:15:24.919494632Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-12T00:15:24.919494632Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-12T00:15:24.919494632Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-12T00:15:24.919494632Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:25.260432607Z 2022-01-12T00:15:25.260Z INFO controllers.StorageSystem storagesystem instance found {"instance": "openshift-storage/ocs-storagecluster-storagesystem"}
2022-01-12T00:15:25.265186695Z 2022-01-12T00:15:25.265Z INFO controllers.StorageSystem Updating quickstarts {"instance": "openshift-storage/ocs-storagecluster-storagesystem", "Name": "getting-started-odf", "Namespace": ""}
2022-01-12T00:15:25.269798107Z 2022-01-12T00:15:25.269Z INFO controllers.StorageSystem Updating quickstarts {"instance": "openshift-storage/ocs-storagecluster-storagesystem", "Name": "odf-configuration", "Namespace": ""}
2022-01-12T00:15:25.270183063Z 2022-01-12T00:15:25.270Z ERROR controllers.StorageSystem failed to validate CSV {"instance": "openshift-storage/ocs-storagecluster-storagesystem", "ClusterServiceVersion": "mcg-operator.v4.10.0", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}
2022-01-12T00:15:25.270183063Z github.com/red-hat-data-services/odf-operator/controllers.(*StorageSystemReconciler).reconcile
2022-01-12T00:15:25.270183063Z 	/remote-source/app/controllers/storagesystem_controller.go:163
2022-01-12T00:15:25.270183063Z github.com/red-hat-data-services/odf-operator/controllers.(*StorageSystemReconciler).Reconcile
2022-01-12T00:15:25.270183063Z 	/remote-source/app/controllers/storagesystem_controller.go:87
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:25.270183063Z 2022-01-12T00:15:25.270Z ERROR controllers.StorageSystem failed to validate CSV {"instance": "openshift-storage/ocs-storagecluster-storagesystem", "ClusterServiceVersion": "ocs-operator.v4.10.0", "error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}
(In reply to Vijay Avuthu from comment #9)
> > odf operator log
> [...]

must gather logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-138vu1cs33-ua/j-138vu1cs33-ua_20220111T223532/logs/failed_testcase_ocs_logs_1641943067/test_upgrade_ocs_logs/

job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2874/consoleFull
Root cause: while the odf-operator is being upgraded, its CSV gets deleted and replaced with the new one. Because the odf-operator CSV was the owner of the ocs-operator and mcg-operator resources, the garbage collector deletes them along with it. To prevent the ocs-operator and mcg-operator from being deleted, we need to remove the odf-operator CSV as the owner and instead set the odf-operator Subscription as the owner used for garbage collection.

PR: https://github.com/red-hat-storage/odf-operator/pull/166
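To make the intent concrete, here is a minimal sketch of the idea, assuming the ownership is expressed through the standard Kubernetes metadata.ownerReferences field on the vendor resources; the exact objects and code changed are in the PR above, and the UID below is a placeholder:

# Sketch only, not the exact diff from the PR: the garbage-collection owner of the
# ocs-operator/mcg-operator resources is switched from the odf-operator CSV (which is
# deleted and replaced on every upgrade) to the odf-operator Subscription (which
# persists across upgrades).
metadata:
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription          # previously: kind: ClusterServiceVersion (odf-operator.vX.Y.Z)
    name: odf-operator
    uid: <subscription-uid>     # placeholder for the Subscription's actual UID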
I see that Vijay tried to re-trigger and it passed here:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/3035/testReport/tests.ecosystem.upgrade/

To be sure, I am trying once more with the latest build here:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-upgrade-ocs-auto/130/
We are hitting 2 separate issues after upgrading and have raised bugs for them:

https://bugzilla.redhat.com/show_bug.cgi?id=2043513
https://bugzilla.redhat.com/show_bug.cgi?id=2043510

Since all CSVs are upgraded, I am marking this bug as verified; the other issues will be tracked as separate bugs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372