Bug 2034805 - upgrade not started for ODF 4.10
Summary: upgrade not started for ODF 4.10
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.10.0
Assignee: Nitin Goyal
QA Contact: Vijay Avuthu
URL:
Whiteboard:
Depends On:
Blocks: 2041522
Reported: 2021-12-22 08:23 UTC by Vijay Avuthu
Modified: 2023-08-09 17:00 UTC (History)

Fixed In Version: 4.10.0-113
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2041522 (view as bug list)
Environment:
Last Closed: 2022-04-13 18:50:46 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2022:1372 0 None None None 2022-04-13 18:53:23 UTC

Description Vijay Avuthu 2021-12-22 08:23:39 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

upgrade not started for ODF 4.10


Version of all relevant components (if applicable):

upgrade from: ocs-operator.v4.9.0
upgrade to: ocs-registry:4.10.0-50


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Not able to upgrade

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
2/2

Can this issue reproduce from the UI?
Not tried

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Run the test_upgrade test case using ocs-ci
2. Check whether the upgrade has started


Actual results:

upgrade is not started


Expected results:

upgrade should succeed


Additional info:

> $ oc get csv
NAME                  DISPLAY                       VERSION   REPLACES   PHASE
mcg-operator.v4.9.0   NooBaa Operator               4.9.0                Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0                Succeeded
odf-operator.v4.9.0   OpenShift Data Foundation     4.9.0                Succeeded

> $ oc get subscription
NAME                                                             PACKAGE        SOURCE             CHANNEL
mcg-operator-stable-4.9-redhat-operators-openshift-marketplace   mcg-operator   redhat-operators   stable-4.9
ocs-operator-stable-4.9-redhat-operators-openshift-marketplace   ocs-operator   redhat-operators   stable-4.9
odf-operator                                                     odf-operator   redhat-operators   stable-4.10

> $ oc describe subscriptions.operators.coreos.com odf-operator
Name:         odf-operator
Namespace:    openshift-storage
Labels:       operators.coreos.com/odf-operator.openshift-storage=
Annotations:  <none>
API Version:  operators.coreos.com/v1alpha1
Kind:         Subscription

Spec:
  Channel:           stable-4.10
  Name:              odf-operator
  Source:            redhat-operators
  Source Namespace:  openshift-marketplace
Status:
  Catalog Health:
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              certified-operators
      Namespace:         openshift-marketplace
      Resource Version:  31559
      UID:               4802d418-60a6-4b76-b239-22aa3e5143e4
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              community-operators
      Namespace:         openshift-marketplace
      Resource Version:  38125
      UID:               5a18af75-a682-4f2b-af66-2ae1447b1dcf
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              redhat-marketplace
      Namespace:         openshift-marketplace
      Resource Version:  38766
      UID:               54336309-010f-4f2a-a03b-b2238c9b82a6
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
    Catalog Source Ref:
      API Version:       operators.coreos.com/v1alpha1
      Kind:              CatalogSource
      Name:              redhat-operators
      Namespace:         openshift-marketplace
      Resource Version:  38775
      UID:               b88f5213-8b54-4dc4-b48f-3e5d096b48ff
    Healthy:             true
    Last Updated:        2021-12-22T04:28:11Z
  Conditions:
    Last Transition Time:   2021-12-22T04:28:11Z
    Message:                all available catalogsources are healthy
    Reason:                 AllCatalogSourcesHealthy
    Status:                 False
    Type:                   CatalogSourcesUnhealthy
  Current CSV:              odf-operator.v4.9.0
  Install Plan Generation:  1
  Install Plan Ref:
    API Version:       operators.coreos.com/v1alpha1
    Kind:              InstallPlan
    Name:              install-w268s
    Namespace:         openshift-storage
    Resource Version:  25299
    UID:               159fb987-59fb-4378-b0c9-40494442d380
  Installed CSV:       odf-operator.v4.9.0
  Installplan:
    API Version:  operators.coreos.com/v1alpha1
    Kind:         InstallPlan
    Name:         install-w268s
    Uuid:         159fb987-59fb-4378-b0c9-40494442d380
  Last Updated:   2021-12-22T04:28:11Z
  State:          AtLatestKnown
Events:           <none>


Job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2676/consoleFull

must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-128vu1cs33-ua/j-128vu1cs33-ua_20211222T034509/logs/failed_testcase_ocs_logs_1640147166/test_upgrade_ocs_logs/

> cluster is alive for debugging

Comment 5 Nitin Goyal 2022-01-05 04:32:29 UTC
As of now, dependencies.yaml in the odf-operator lists only ocs-operator 4.10, which is causing this problem.

When we try to upgrade the odf-operator from 4.9 to 4.10, OLM cannot perform the upgrade because, per its dependencies.yaml, odf-operator 4.10 cannot run with ocs-operator 4.9. To get out of this situation, we need to widen the ocs-operator dependency in the odf-operator's dependencies.yaml to cover 4.9 through 4.10.

Moving it to the build team, as dependencies.yaml is handled by them.

@branto Can you please add 4.9 to the dependencies.yaml as well? I remember we faced some difficulties when we had it in the initial 4.10 builds, and we removed it to solve that issue. Let's add it again and run the tests in debug mode so that the setup isn't destroyed automatically.
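
For readers following along, the constraint described in this comment can be illustrated with a small sketch. This is not OLM's resolver: the `parse` and `satisfies` helpers and the two range tuples are hypothetical stand-ins for what a version range in dependencies.yaml expresses, and the exact range used in the eventual build is not stated in this bug.

```python
def parse(version):
    """Turn '4.9.0' into (4, 9, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed, low, high):
    """Return True if low <= installed <= high (inclusive range)."""
    return parse(low) <= parse(installed) <= parse(high)


# Hypothetical ranges, for illustration only.
ONLY_4_10 = ("4.10.0", "4.10.0")         # state described in comment 5
FROM_4_9_TO_4_10 = ("4.9.0", "4.10.0")   # the proposed widening

installed_ocs = "4.9.0"

# With the narrow range, the installed ocs-operator 4.9.0 is rejected,
# so odf-operator 4.10 cannot be resolved alongside it.
blocked = not satisfies(installed_ocs, *ONLY_4_10)

# With the widened range, the installed version is acceptable.
allowed = satisfies(installed_ocs, *FROM_4_9_TO_4_10)

print("blocked with 4.10-only range:", blocked)
print("allowed with 4.9-4.10 range:", allowed)
```

Comparing parsed tuples rather than raw strings matters here: as strings, "4.10.0" sorts before "4.9.0", which would invert the result.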

Comment 8 Vijay Avuthu 2022-01-12 08:32:15 UTC
The upgrade still fails: the odf-operator CSV shows Succeeded, but the mcg-operator and ocs-operator CSVs are missing.

> csvs after upgrade (mcg and ocs operator are missing)

NAME                   DISPLAY                     VERSION   REPLACES              PHASE
odf-operator.v4.10.0   OpenShift Data Foundation   4.10.0    odf-operator.v4.9.1   Succeeded

> subscriptions

NAME                                                             PACKAGE        SOURCE             CHANNEL
mcg-operator-stable-4.9-redhat-operators-openshift-marketplace   mcg-operator   redhat-operators   stable-4.10
ocs-operator-stable-4.9-redhat-operators-openshift-marketplace   ocs-operator   redhat-operators   stable-4.10
odf-operator                                                     odf-operator   redhat-operators   stable-4.10


> install plans

NAME            CSV                    APPROVAL    APPROVED
install-7dbnx   odf-operator.v4.9.1    Automatic   true
install-hv59z   odf-operator.v4.10.0   Automatic   true

> storagesystem yaml

 status:
    conditions:
    - lastHeartbeatTime: "2022-01-12T00:15:25Z"
      lastTransitionTime: "2022-01-11T23:36:53Z"
      message: Reconcile is in progress
      reason: Reconciling
      status: "False"
      type: Available
    - lastHeartbeatTime: "2022-01-12T00:15:25Z"
      lastTransitionTime: "2022-01-11T23:36:53Z"
      message: Reconcile is in progress
      reason: Reconciling
      status: "True"
      type: Progressing
    - lastHeartbeatTime: "2022-01-12T00:15:25Z"
      lastTransitionTime: "2022-01-11T23:02:30Z"
      message: StorageSystem CR is valid
      reason: Valid
      status: "False"
      type: StorageSystemInvalid
    - lastHeartbeatTime: "2022-01-12T00:15:25Z"
      lastTransitionTime: "2022-01-11T23:36:54Z"
      message: InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0
      reason: NotReady
      status: "False"
      type: VendorCsvReady
    - lastHeartbeatTime: "2022-01-11T23:02:30Z"
      lastTransitionTime: "2022-01-11T23:02:30Z"
      reason: Found
      status: "True"
      type: VendorSystemPresent

Comment 9 Vijay Avuthu 2022-01-12 08:35:30 UTC
> odf operator log

2022-01-11T23:58:45.259671211Z 2022-01-11T23:58:45.259Z	ERROR	controller-runtime.manager.controller.storagesystem	Reconciler error	{"reconciler group": "odf.openshift.io", "reconciler kind": "StorageSystem", "name": "ocs-storagecluster-storagesystem", "namespace": "openshift-storage", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0", "errorCauses": [{"error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}, {"error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}]}
2022-01-11T23:58:45.259671211Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-11T23:58:45.259671211Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-11T23:58:45.259671211Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-11T23:58:45.259671211Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:24.919494632Z 2022-01-12T00:15:24.919Z	ERROR	controller-runtime.manager.controller.subscription	Reconciler error	{"reconciler group": "operators.coreos.com", "reconciler kind": "Subscription", "name": "odf-operator", "namespace": "openshift-storage", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0; InstallPlan not found for CSV ocs-operator.v4.10.0", "errorCauses": [{"error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}, {"error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}]}
2022-01-12T00:15:24.919494632Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-12T00:15:24.919494632Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-12T00:15:24.919494632Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-12T00:15:24.919494632Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:25.260432607Z 2022-01-12T00:15:25.260Z	INFO	controllers.StorageSystem	storagesystem instance found	{"instance": "openshift-storage/ocs-storagecluster-storagesystem"}
2022-01-12T00:15:25.265186695Z 2022-01-12T00:15:25.265Z	INFO	controllers.StorageSystem	Updating quickstarts	{"instance": "openshift-storage/ocs-storagecluster-storagesystem", "Name": "getting-started-odf", "Namespace": ""}
2022-01-12T00:15:25.269798107Z 2022-01-12T00:15:25.269Z	INFO	controllers.StorageSystem	Updating quickstarts	{"instance": "openshift-storage/ocs-storagecluster-storagesystem", "Name": "odf-configuration", "Namespace": ""}
2022-01-12T00:15:25.270183063Z 2022-01-12T00:15:25.270Z	ERROR	controllers.StorageSystem	failed to validate CSV	{"instance": "openshift-storage/ocs-storagecluster-storagesystem", "ClusterServiceVersion": "mcg-operator.v4.10.0", "error": "InstallPlan not found for CSV mcg-operator.v4.10.0"}
2022-01-12T00:15:25.270183063Z github.com/red-hat-data-services/odf-operator/controllers.(*StorageSystemReconciler).reconcile
2022-01-12T00:15:25.270183063Z 	/remote-source/app/controllers/storagesystem_controller.go:163
2022-01-12T00:15:25.270183063Z github.com/red-hat-data-services/odf-operator/controllers.(*StorageSystemReconciler).Reconcile
2022-01-12T00:15:25.270183063Z 	/remote-source/app/controllers/storagesystem_controller.go:87
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
2022-01-12T00:15:25.270183063Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2022-01-12T00:15:25.270183063Z 	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
2022-01-12T00:15:25.270183063Z 2022-01-12T00:15:25.270Z	ERROR	controllers.StorageSystem	failed to validate CSV	{"instance": "openshift-storage/ocs-storagecluster-storagesystem", "ClusterServiceVersion": "ocs-operator.v4.10.0", "error": "InstallPlan not found for CSV ocs-operator.v4.10.0"}

Comment 10 Vijay Avuthu 2022-01-12 09:29:29 UTC
(In reply to Vijay Avuthu from comment #9)


must gather logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-138vu1cs33-ua/j-138vu1cs33-ua_20220111T223532/logs/failed_testcase_ocs_logs_1641943067/test_upgrade_ocs_logs/
job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2874/consoleFull

Comment 11 Nitin Goyal 2022-01-17 15:33:41 UTC
Root cause:

While upgrading odf-operator, the odf-operator CSV gets deleted and replaced
with the new one. This causes ocs-operator and mcg-operator to be deleted,
because the odf-operator CSV was their owner.

To prevent ocs-operator and mcg-operator from being deleted, we need to remove
the odf-operator CSV as an owner and add the odf-operator subscription as the
owner for the garbage collector.

PR: https://github.com/red-hat-storage/odf-operator/pull/166
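
The ownership change described in this comment can be illustrated with a toy model of owner-reference garbage collection. This is a sketch under stated assumptions, not the code from the PR: the `Obj` class, `delete_with_gc`, `build`, and the object names are all hypothetical, and the only behaviour modelled is that a dependent is removed once none of its owners exist.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    owners: list = field(default_factory=list)  # names of owning objects


def delete_with_gc(cluster, name):
    """Delete `name`, then remove every object whose owners are all gone."""
    cluster = {k: v for k, v in cluster.items() if k != name}
    changed = True
    while changed:
        changed = False
        for obj in list(cluster.values()):
            if obj.owners and not any(o in cluster for o in obj.owners):
                del cluster[obj.name]
                changed = True
    return cluster


def build(owner):
    """Toy cluster in which ocs-operator and mcg-operator share one owner."""
    objs = [
        Obj("csv/odf-operator.v4.9.0"),
        Obj("subscription/odf-operator"),
        Obj("ocs-operator", owners=[owner]),
        Obj("mcg-operator", owners=[owner]),
    ]
    return {o.name: o for o in objs}


# The upgrade replaces the old odf-operator CSV, modelled here as a delete.
old_csv = "csv/odf-operator.v4.9.0"

# Owner is the CSV: the dependents are collected along with it.
owned_by_csv = delete_with_gc(build(old_csv), old_csv)

# Owner is the subscription: it outlives the CSV, so the dependents stay.
owned_by_sub = delete_with_gc(build("subscription/odf-operator"), old_csv)

print(sorted(owned_by_csv))
print(sorted(owned_by_sub))
```

The point of the sketch is that the subscription persists across the CSV replacement, which is why it is the safer owner for objects that must survive an upgrade.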

Comment 17 Vijay Avuthu 2022-01-25 02:47:56 UTC
We are hitting two separate issues during and after the upgrade and have raised bugs for them:

https://bugzilla.redhat.com/show_bug.cgi?id=2043513

https://bugzilla.redhat.com/show_bug.cgi?id=2043510

Since all CSVs are upgraded, marking this bug as verified; we will track the other issues as separate bugs.

Comment 22 errata-xmlrpc 2022-04-13 18:50:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372

