Bug 2231074

Summary: Upgrade from 4.12.z to 4.13.z fails if StorageCluster is not created
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: umanga <uchapaga>
Component: ocs-operator
Assignee: Malay Kumar Parida <mparida>
Status: NEW
QA Contact: Elad <ebenahar>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.13
CC: mparida, odf-bz-bot
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Embargoed:

Description umanga 2023-08-10 13:24:03 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

If ODF v4.12.z is installed but no StorageCluster has been created, upgrading to ODF v4.13.z does not succeed: the "rook-ceph-operator" pod is stuck in the "CreateContainerConfigError" state.

➜  ~ oc get pod/rook-ceph-operator-799f4557f8-z76dn
NAME                                  READY   STATUS                       RESTARTS   AGE
rook-ceph-operator-799f4557f8-z76dn   0/1     CreateContainerConfigError   0          85s
---

➜  ~ oc describe pod/rook-ceph-operator-799f4557f8-z76dn
Events:
  Type     Reason                 Age                 From               Message
  ----     ------                 ----                ----               -------
  Normal   Scheduled              110s                default-scheduler  Successfully assigned openshift-storage/rook-ceph-operator-799f4557f8-z76dn to t3-585mv-worker-0-b5rl7
  Normal   Pulled                 6s (x10 over 107s)  kubelet            Container image "icr.io/cpopen/rook-ceph-operator@sha256:70aebdc2b80283fc69f77acc7390667868939dea5839070673814b6351fda4d7" already present on machine
  Warning  Failed                 6s (x10 over 107s)  kubelet            Error: couldn't find key CSI_ENABLE_READ_AFFINITY in ConfigMap openshift-storage/ocs-operator-config
---

➜  ~ oc get cm ocs-operator-config -oyaml
apiVersion: v1
data:
  CSI_CLUSTER_NAME: 8a514d5d-f345-42bd-8fa7-54c37e9c9fe2
kind: ConfigMap
metadata:
  creationTimestamp: "2023-08-10T07:16:14Z"
  name: ocs-operator-config
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: OCSInitialization
    name: ocsinit
    uid: 6cdfa990-37e1-4596-b0e5-69baedafc0f3
  resourceVersion: "17531216"
  uid: 22e4fa9c-a8ca-40fa-8e92-c2c4b4f5119d
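
The kubelet error above points at a non-optional configMapKeyRef: the 4.13 rook-ceph-operator Deployment references the new key, but the ConfigMap left behind by 4.12 (never reconciled, since no StorageCluster exists) does not contain it. A sketch of the presumed env entry (not the exact manifest):

```yaml
# Sketch of the env reference in the 4.13 rook-ceph-operator Deployment
# (illustrative, not copied from the actual manifest).
env:
- name: CSI_ENABLE_READ_AFFINITY
  valueFrom:
    configMapKeyRef:
      name: ocs-operator-config
      key: CSI_ENABLE_READ_AFFINITY
      # without "optional: true", a missing key makes the kubelet
      # fail container creation with CreateContainerConfigError
```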


Version of all relevant components (if applicable):
ODF v4.13.z

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes

Is there any workaround available to the best of your knowledge?
Yes. Delete the "ocs-operator-config" ConfigMap.
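
A sketch of the workaround, assuming the OCSInitialization controller (the ConfigMap's owner, per the ownerReferences above) recreates the ConfigMap with the keys the 4.13 Deployment expects; the commands require cluster access:

```shell
# Workaround sketch: delete the stale ConfigMap so its owning controller
# recreates it with the keys the 4.13 operator Deployment references.
command -v oc >/dev/null 2>&1 || { echo "oc not available"; exit 0; }

oc delete configmap ocs-operator-config -n openshift-storage

# Verify the ConfigMap was recreated with the new key, then check the pod:
oc get configmap ocs-operator-config -n openshift-storage -o yaml
oc get pods -n openshift-storage -l app=rook-ceph-operator
```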

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1

Is this issue reproducible?

Yes

Can this issue be reproduced from the UI?

Yes

If this is a regression, please provide more details to justify this:

It is a regression, since it did not happen in previous upgrades. However, it is a corner case and a very minor issue that was never tested.

Steps to Reproduce:
1. Install ODF operator v4.12.z. Do not create a StorageCluster.
2. Upgrade to ODF operator v4.13.z.
3. Check the operator pod status.
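
The failure in the steps above can be modeled in a few lines (a hypothetical sketch, not actual kubelet or Rook code; `resolve_env` and the key lists are illustrative):

```python
# Minimal model of the failure: the kubelet refuses to create a container
# whose env references a ConfigMap key that does not exist, surfacing
# CreateContainerConfigError. (Illustrative only.)

def resolve_env(required_keys, configmap_data):
    """Resolve configMapKeyRef-style env vars; raise if any key is
    missing, as the kubelet does for non-optional references."""
    missing = [k for k in required_keys if k not in configmap_data]
    if missing:
        raise KeyError(f"couldn't find key {missing[0]} in ConfigMap "
                       "openshift-storage/ocs-operator-config")
    return {k: configmap_data[k] for k in required_keys}

# ConfigMap data as left behind by the 4.12 install (no StorageCluster
# ever triggered the reconcile that would add the newer keys)
cm_412 = {"CSI_CLUSTER_NAME": "8a514d5d-f345-42bd-8fa7-54c37e9c9fe2"}

# The 4.13 operator Deployment additionally references this key
required_413 = ["CSI_CLUSTER_NAME", "CSI_ENABLE_READ_AFFINITY"]

try:
    resolve_env(required_413, cm_412)
except KeyError as e:
    print("CreateContainerConfigError:", e)
```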


Actual results:
The rook-ceph-operator pod is stuck in "CreateContainerConfigError", blocking the upgrade.

Expected results:
Upgrade should complete without issue.

Additional info: