Bug 2211343 - [MCG-Only]: upgrade failed from 4.12 to 4.13 due to missing CSI_ENABLE_READ_AFFINITY in ConfigMap openshift-storage/ocs-operator-config
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.13.0
Assignee: Malay Kumar parida
QA Contact: Oded
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-05-31 07:26 UTC by Vijay Avuthu
Modified: 2023-08-09 17:00 UTC
CC: 3 users

Fixed In Version: 4.13.0-214
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-21 15:25:39 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage ocs-operator pull 2065 0 None open Always call the ensureOCSOperatorConfig func irrespective of the mode 2023-05-31 08:20:03 UTC
Github red-hat-storage ocs-operator pull 2066 0 None open Bug 2211343:[release-4.13] Always call the ensureOCSOperatorConfig func irrespective of the mode 2023-05-31 11:40:08 UTC
Red Hat Product Errata RHBA-2023:3742 0 None None None 2023-06-21 15:25:52 UTC

Description Vijay Avuthu 2023-05-31 07:26:59 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

MCG-Only deployment.

Upgrade from 4.12 to 4.13 failed due to a missing CSI_ENABLE_READ_AFFINITY key in ConfigMap openshift-storage/ocs-operator-config.


Version of all relevant components (if applicable):

Initial deployment versions:

openshift installer (4.12.0-0.nightly-2023-05-29-223551)
odf-operator.v4.12.3-rhodf

Then OCP was upgraded to 4.13.0-0.nightly-2023-05-30-074322 successfully.
Then an ODF upgrade to ocs-registry:4.13.0-207 was attempted, which failed.


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes, reproduced 2/2 times

Can this issue be reproduced from the UI?
Not Tried

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF/OCP 4.12 (MCG-Only), then upgrade OCP to 4.13, and then upgrade ODF to 4.13
2. Check that all CSVs are upgraded to 4.13


Actual results:

$ oc get csv
NAME                                         DISPLAY                       VERSION             REPLACES                                PHASE
mcg-operator.v4.13.0-207.stable              NooBaa Operator               4.13.0-207.stable   mcg-operator.v4.12.3-rhodf              Succeeded
ocs-operator.v4.12.3-rhodf                   OpenShift Container Storage   4.12.3-rhodf        ocs-operator.v4.12.2-rhodf              Replacing
ocs-operator.v4.13.0-207.stable              OpenShift Container Storage   4.13.0-207.stable   ocs-operator.v4.12.3-rhodf              Failed
odf-csi-addons-operator.v4.13.0-207.stable   CSI Addons                    4.13.0-207.stable   odf-csi-addons-operator.v4.12.3-rhodf   Succeeded
odf-operator.v4.13.0-207.stable              OpenShift Data Foundation     4.13.0-207.stable   odf-operator.v4.12.3-rhodf              Succeeded



Expected results:
All CSVs should upgrade to 4.13

Additional info:

$ oc describe csv ocs-operator.v4.13.0-207.stable  
Name:         ocs-operator.v4.13.0-207.stable
Namespace:    openshift-storage
Labels:       full_version=4.13.0-207
              operatorframework.io/arch.amd64=supported
              operatorframework.io/arch.ppc64le=supported
              operatorframework.io/arch.s390x=supported
              operators.coreos.com/ocs-operator.openshift-storage=
Annotations:  alm-examples:
                

Events:
  Type     Reason               Age                From                        Message
  ----     ------               ----               ----                        -------
  Normal   RequirementsUnknown  72m                operator-lifecycle-manager  requirements not yet checked
  Normal   RequirementsNotMet   72m                operator-lifecycle-manager  one or more requirements couldn't be found
  Normal   InstallWaiting       72m                operator-lifecycle-manager  installing: waiting for deployment ocs-operator to become ready: deployment "ocs-operator" not available: Deployment does not have minimum availability.
  Normal   NeedsReinstall       67m                operator-lifecycle-manager  installing: waiting for deployment rook-ceph-operator to become ready: deployment "rook-ceph-operator" not available: Deployment does not have minimum availability.
  Normal   AllRequirementsMet   67m (x3 over 72m)  operator-lifecycle-manager  all requirements found, attempting install
  Normal   InstallSucceeded     67m (x2 over 72m)  operator-lifecycle-manager  waiting for install components to report healthy
  Normal   InstallWaiting       67m (x3 over 71m)  operator-lifecycle-manager  installing: waiting for deployment rook-ceph-operator to become ready: deployment "rook-ceph-operator" not available: Deployment does not have minimum availability.
  Warning  InstallCheckFailed   62m (x3 over 67m)  operator-lifecycle-manager  install timeout
  Warning  InstallCheckFailed   62m (x2 over 62m)  operator-lifecycle-manager  install failed: deployment rook-ceph-operator not ready before timeout: deployment "rook-ceph-operator" exceeded its progress deadline


$ oc get pods
NAME                                               READY   STATUS                       RESTARTS   AGE
csi-addons-controller-manager-c44ff597-d7cm8       2/2     Running                      0          74m
noobaa-core-0                                      1/1     Running                      0          72m
noobaa-db-pg-0                                     1/1     Running                      0          73m
noobaa-default-backing-store-noobaa-pod-729181a7   1/1     Running                      0          72m
noobaa-endpoint-64955bc688-55c8s                   1/1     Running                      0          73m
noobaa-operator-76799d976d-dtmn4                   1/1     Running                      0          73m
ocs-metrics-exporter-f68bf4cd6-9kpzw               1/1     Running                      0          73m
ocs-operator-5c9fb4759b-zsmgj                      1/1     Running                      0          73m
odf-console-6cfd48f9bd-7tn6j                       1/1     Running                      0          75m
odf-operator-controller-manager-787f679865-j42d6   2/2     Running                      0          75m
rook-ceph-operator-78fbd5f69c-mmtb6                0/1     CreateContainerConfigError   0          73m

$ oc get pod rook-ceph-operator-78fbd5f69c-mmtb6 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    alm-examples: |2-

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-31T06:05:04Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-31T06:05:04Z"
    message: 'containers with unready status: [rook-ceph-operator]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-31T06:05:04Z"
    message: 'containers with unready status: [rook-ceph-operator]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-31T06:05:04Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: quay.io/rhceph-dev/odf4-rook-ceph-rhel9-operator@sha256:b98046453da7b1104fca9116f28ecd1a5c2bb8074556510cd531548caaaf0786
    imageID: ""
    lastState: {}
    name: rook-ceph-operator
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: couldn't find key CSI_ENABLE_READ_AFFINITY in ConfigMap openshift-storage/ocs-operator-config
        reason: CreateContainerConfigError


job url: https://url.corp.redhat.com/7dc9442
must gather: https://url.corp.redhat.com/f8f48a0
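
For context on the CreateContainerConfigError above: the kubelet refuses to create a container whenever an environment variable references a ConfigMap key that does not exist and the reference is not marked optional, which matches the error shown in the rook-ceph-operator pod status. Below is a minimal Go sketch of that kind of reference; it only illustrates the Kubernetes API shape (corev1.EnvVar with a ConfigMapKeyRef) and does not reproduce the actual rook-ceph-operator Deployment spec.

package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// An env var wired to a ConfigMap key without Optional set: if
	// CSI_ENABLE_READ_AFFINITY is missing from ocs-operator-config,
	// the kubelet reports CreateContainerConfigError with
	// "couldn't find key CSI_ENABLE_READ_AFFINITY in ConfigMap ...".
	env := corev1.EnvVar{
		Name: "CSI_ENABLE_READ_AFFINITY",
		ValueFrom: &corev1.EnvVarSource{
			ConfigMapKeyRef: &corev1.ConfigMapKeySelector{
				LocalObjectReference: corev1.LocalObjectReference{Name: "ocs-operator-config"},
				Key:                  "CSI_ENABLE_READ_AFFINITY",
				// Optional is left nil (false), so a missing key is a hard error.
			},
		},
	}

	out, _ := json.MarshalIndent(env, "", "  ")
	fmt.Println(string(out))
}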

Comment 4 Malay Kumar parida 2023-05-31 08:05:39 UTC
We call the ensureOCSOperatorConfig function from cephcluster.go during CephCluster creation. But when NooBaa is standalone (MCG-only), no CephCluster is created, so the function that ensures reconciliation of the ocs-operator-config ConfigMap is never called. We have to move that call into reconcile.go so it is always invoked irrespective of the deployment mode.
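
The fix direction described above, as a minimal self-contained Go sketch; every type and helper name other than ensureOCSOperatorConfig is an illustrative placeholder and does not reproduce the actual ocs-operator code:

package main

// Sketch: move the ocs-operator-config ConfigMap reconciliation out of the
// CephCluster path and into the top-level reconcile, so MCG-only (NooBaa
// standalone) clusters also receive new keys such as
// CSI_ENABLE_READ_AFFINITY during upgrade.

type StorageCluster struct {
	// MCGOnly marks a NooBaa-standalone deployment that never creates a CephCluster.
	MCGOnly bool
}

type reconciler struct{}

// ensureOCSOperatorConfig stands in for the real function of the same name;
// it would create or update the ocs-operator-config ConfigMap with every key
// the rook-ceph-operator Deployment expects.
func (r *reconciler) ensureOCSOperatorConfig(sc *StorageCluster) error { return nil }

// ensureCephCluster stands in for the CephCluster creation path, which was
// previously the only caller of ensureOCSOperatorConfig.
func (r *reconciler) ensureCephCluster(sc *StorageCluster) error { return nil }

func (r *reconciler) reconcile(sc *StorageCluster) error {
	// Always refresh the operator ConfigMap first, irrespective of the mode.
	if err := r.ensureOCSOperatorConfig(sc); err != nil {
		return err
	}
	// Only the Ceph-backed path creates a CephCluster.
	if !sc.MCGOnly {
		return r.ensureCephCluster(sc)
	}
	return nil
}

func main() {
	_ = (&reconciler{}).reconcile(&StorageCluster{MCGOnly: true})
}

This matches the direction of the linked PRs (red-hat-storage/ocs-operator #2065 and the release-4.13 backport #2066), which always call ensureOCSOperatorConfig irrespective of the mode.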

Comment 8 Vijay Avuthu 2023-06-07 15:31:00 UTC
Upgrade passed from ocs-operator.v4.12.3-rhodf to ocs-registry:4.13.0-214.

2023-06-07 20:17:46  14:47:46 - MainThread - ocs_ci.utility.utils - INFO  - Executing command: oc --kubeconfig /home/jenkins/current-cluster-dir/openshift-cluster-dir/auth/kubeconfig -n openshift-storage get csv ocs-operator.v4.13.0-rhodf -n openshift-storage -o yaml
2023-06-07 20:17:46  14:47:46 - MainThread - ocs_ci.ocs.ocp - INFO  - Resource ocs-operator.v4.13.0-rhodf is in phase: Succeeded!

jenkins job: https://url.corp.redhat.com/e010d64

logs: https://url.corp.redhat.com/6fcf347

BUILD ID: 4.13.0-214 RUN ID: 1686144272

Comment 10 errata-xmlrpc 2023-06-21 15:25:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742

