Bug 2231074
| Summary: | Upgrade from 4.12.z to 4.13.z fails if StorageCluster is not created | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | umanga <uchapaga> |
| Component: | ocs-operator | Assignee: | Malay Kumar Parida <mparida> |
| Status: | CLOSED ERRATA | QA Contact: | Oded <oviner> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.13 | CC: | mparida, odf-bz-bot |
| Target Milestone: | --- | | |
| Target Release: | ODF 4.14.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 4.14.0-128 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| | 2235571 (view as bug list) | Environment: | |
| Last Closed: | 2023-11-08 18:53:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2235571 | | |
Description
umanga
2023-08-10 13:24:03 UTC
The same issue can also be seen when upgrading from 4.13 to 4.14 without a storagecluster. The root cause is that the ocsinitialization controller only creates the ocs-operator-config configmap (from which the rook-ceph-operator pod takes env values for configuration); keeping it updated is the job of the storagecluster controller. When no storagecluster exists and an upgrade happens, the configmap is not updated, but the new rook operator looks for the new key it needs for configuration. The rook-ceph-operator pod therefore fails, and so does the upgrade.

The solution is to change how the configmap is handled: when no storagecluster is present, the ocsinitialization controller should own the configmap and keep it updated; when a storagecluster is present, it should do nothing and leave the configmap to the storagecluster controller.

Why should the ODF operator be upgraded if the storage cluster is not installed? We can delete ODF and re-install the new operator.

Hi Oded, although it is true that they can just delete the whole thing and re-install the new version, this was a genuine bug in our code that should not have been there. The failure to upgrade might cause confusion and unnecessary support cases, so it is better just to fix it.

Hi Malay, can you check the test procedure?

1. Deploy an OCP 4.14 cluster without ODF
2. Install ODF 4.13 without installing a StorageCluster [ quay.io/rhceph-dev/ocs-registry:4.13.3-5 ]
3. Upgrade to ODF 4.14 [ quay.io/rhceph-dev/ocs-registry:4.14.0-135 ]

Yes Oded, this is the way to test it.

Bug fixed.

1. Install OCP 4.14 (4.14.0-0.nightly-2023-09-15-233408)

2. Install the ODF 4.13 [4.13.2-3] operator via the UI

```
$ oc get csv -A
NAMESPACE                              NAME                                    DISPLAY                       VERSION         REPLACES                                PHASE
openshift-operator-lifecycle-manager   packageserver                           Package Server                0.0.1-snapshot                                          Succeeded
openshift-storage                      mcg-operator.v4.13.2-rhodf              NooBaa Operator               4.13.2-rhodf    mcg-operator.v4.13.1-rhodf              Succeeded
openshift-storage                      ocs-operator.v4.13.2-rhodf              OpenShift Container Storage   4.13.2-rhodf    ocs-operator.v4.13.1-rhodf              Succeeded
openshift-storage                      odf-csi-addons-operator.v4.13.2-rhodf   CSI Addons                    4.13.2-rhodf    odf-csi-addons-operator.v4.13.1-rhodf   Succeeded
openshift-storage                      odf-operator.v4.13.2-rhodf              OpenShift Data Foundation     4.13.2-rhodf    odf-operator.v4.13.1-rhodf              Succeeded
```

3. Upgrade to ODF 4.14

a. Disable the default source redhat-operators:

```
$ oc patch operatorhub.config.openshift.io/cluster -p='{"spec":{"sources":[{"disabled":true,"name":"redhat-operators"}]}}' --type=merge
operatorhub.config.openshift.io/cluster patched
```

b. Change the channel in the odf-operator subscription [stable-4.13 -> stable-4.14]:

```
$ oc edit subscription odf-operator -n openshift-storage
```

c. Create a CatalogSource with the quay.io/rhceph-dev/ocs-registry:latest-stable-4.14 image:

```
$ oc create -f CatalogSource.yaml
catalogsource.operators.coreos.com/redhat-operators created

$ cat CatalogSource.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: redhat-operators
  namespace: openshift-marketplace
  labels:
    ocs-operator-internal: "true"
spec:
  displayName: Openshift Container Storage
  icon:
    base64data: ""
    mediatype: ""
  image: quay.io/rhceph-dev/ocs-registry:latest-stable-4.14
  publisher: Red Hat
  sourceType: grpc
  priority: 100
  # If the registry image still has the same tag (latest-stable-4.6, or for stage testing)
  # we need to have this updateStrategy, otherwise we will not see newly pushed content.
  updateStrategy:
    registryPoll:
      interval: 15m

$ oc get CatalogSource redhat-operators -n openshift-marketplace
NAME               DISPLAY                       TYPE   PUBLISHER   AGE
redhat-operators   Openshift Container Storage   grpc   Red Hat     3m15s
```

d. Apply icsp.yaml:

```
$ podman run --entrypoint cat quay.io/rhceph-dev/ocs-registry:latest-stable-4.14 /icsp.yaml | oc apply -f -
imagecontentsourcepolicy.operator.openshift.io/df-repo created
```

4. Check the CSVs:

```
$ oc get csv -A
NAMESPACE                              NAME                                         DISPLAY                       VERSION             REPLACES                                PHASE
openshift-operator-lifecycle-manager   packageserver                                Package Server                0.0.1-snapshot                                              Succeeded
openshift-storage                      mcg-operator.v4.14.0-135.stable              NooBaa Operator               4.14.0-135.stable   mcg-operator.v4.13.2-rhodf              Succeeded
openshift-storage                      ocs-operator.v4.14.0-135.stable              OpenShift Container Storage   4.14.0-135.stable   ocs-operator.v4.13.2-rhodf              Succeeded
openshift-storage                      odf-csi-addons-operator.v4.14.0-135.stable   CSI Addons                    4.14.0-135.stable   odf-csi-addons-operator.v4.13.2-rhodf   Succeeded
openshift-storage                      odf-operator.v4.14.0-135.stable              OpenShift Data Foundation     4.14.0-135.stable   odf-operator.v4.13.2-rhodf              Succeeded
```

5. Check the pods in openshift-storage:

```
$ oc get pods -n openshift-storage
NAME                                               READY   STATUS    RESTARTS      AGE
csi-addons-controller-manager-5f9c677f6-b2kln      2/2     Running   0             4m1s
noobaa-operator-d9ddd977f-rcf6r                    2/2     Running   0             4m24s
ocs-metrics-exporter-7b5bf57957-ht66f              1/1     Running   0             4m44s
ocs-operator-568fbbb9c4-j86hk                      1/1     Running   0             4m44s
odf-console-d656466b5-vxszp                        1/1     Running   0             22m
odf-operator-controller-manager-66c899649b-g6rkb   2/2     Running   1 (10m ago)   22m
rook-ceph-operator-587cc6f966-sr8qt                1/1     Running   0             4m19s
```

For more info: https://docs.google.com/document/d/1lf4Enu-efynTMl0N77xxHQeVTfnSis0mX1L3i6vNa_4/edit

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832
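As background for the fix described in this bug, the sketch below illustrates, in Go against controller-runtime, the ConfigMap ownership handoff: the OCSInitialization controller creates or updates the ocs-operator-config ConfigMap only while no StorageCluster exists, and backs off once one is present so the StorageCluster controller can keep it updated. This is a minimal sketch, not the actual ocs-operator code: the helper names (reconcileOCSOperatorConfig, defaultOCSOperatorConfigData) and the empty default data map are assumptions made for illustration.

```go
// Package ocsinitialization: illustrative sketch of the ocs-operator-config
// ownership handoff described in this bug. Not the real ocs-operator code.
package ocsinitialization

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

const (
	namespace     = "openshift-storage"
	configMapName = "ocs-operator-config" // read by the rook-ceph-operator for its env configuration
)

// reconcileOCSOperatorConfig would run from the OCSInitialization controller.
// If a StorageCluster exists, it leaves the ConfigMap alone (the StorageCluster
// controller owns it). If none exists, e.g. the operator was upgraded before any
// StorageCluster was created, it creates or updates the ConfigMap itself so the
// new rook-ceph-operator finds the keys it expects and the upgrade can complete.
func reconcileOCSOperatorConfig(ctx context.Context, c client.Client) error {
	storageClusters := &unstructured.UnstructuredList{}
	storageClusters.SetGroupVersionKind(schema.GroupVersionKind{
		Group: "ocs.openshift.io", Version: "v1", Kind: "StorageClusterList",
	})
	if err := c.List(ctx, storageClusters, client.InNamespace(namespace)); err != nil {
		return err
	}
	if len(storageClusters.Items) > 0 {
		// A StorageCluster is present; its controller keeps the ConfigMap updated.
		return nil
	}

	desired := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{Name: configMapName, Namespace: namespace},
		Data:       defaultOCSOperatorConfigData(),
	}

	existing := &corev1.ConfigMap{}
	err := c.Get(ctx, client.ObjectKey{Name: configMapName, Namespace: namespace}, existing)
	switch {
	case errors.IsNotFound(err):
		return c.Create(ctx, desired)
	case err != nil:
		return err
	default:
		existing.Data = desired.Data
		return c.Update(ctx, existing)
	}
}

// defaultOCSOperatorConfigData is a placeholder; the real keys are whatever the
// matching rook-ceph-operator release reads from ocs-operator-config.
func defaultOCSOperatorConfigData() map[string]string {
	return map[string]string{}
}
```

The design choice sketched here is a single writer per state: the OCSInitialization controller owns the ConfigMap only while no StorageCluster exists, which avoids the two controllers fighting over the same object once a StorageCluster is created.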