Bug 2107073

Summary: ROOK_CSI_ENABLE_CEPHFS is "false" after upgrading the provider cluster alone to ODF 4.11.0
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Jilju Joy <jijoy>
Component: ocs-operator
Assignee: Malay Kumar parida <mparida>
Status: CLOSED ERRATA
QA Contact: Jilju Joy <jijoy>
Severity: high
Priority: unspecified
Version: 4.11
CC: aeyal, jarrpa, kramdoss, madam, mparida, muagarwa, nberry, nigoyal, ocs-bugs, odf-bz-bot, omitrani, rcyriac, sostapov
Target Release: ODF 4.10.6
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Clone Of: 2107023
: 2110274 (view as bug list)
Last Closed: 2022-09-21 17:29:32 UTC

Description Jilju Joy 2022-07-14 09:29:53 UTC
+++ This bug was initially created as a clone of Bug #2107023 +++

Description of problem:
When the provider cluster alone is upgraded from ODF 4.10.4 to ODF 4.11.0, CephFS PVCs cannot be created on the consumer cluster because ROOK_CSI_ENABLE_CEPHFS in the consumer's 'rook-ceph-operator-config' configmap is set to "false".
The consumer cluster is still running ODF 4.10.4.

From consumer cluster:

$ oc get cm rook-ceph-operator-config -oyaml -nopenshift-storage | grep ROOK_CSI_ENABLE_CEPHFS
  ROOK_CSI_ENABLE_CEPHFS: "false"
        f:ROOK_CSI_ENABLE_CEPHFS: {}


Must-gather logs collected before upgrading the provider and consumer clusters from ODF 4.10.4 to 4.11.0-113:

Consumer http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-j13-c3/jijoy-j13-c3_20220713T081317/logs/testcases_1657705862/

Provider http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-j13-pr/jijoy-j13-pr_20220713T043423/logs/testcases_1657705913/


Must-gather logs collected after upgrading the provider cluster to ODF 4.11.0-113:
Consumer http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-j13-c3/jijoy-j13-c3_20220713T081317/logs/testcases_1657715380/

Provider http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-j13-pr/jijoy-j13-pr_20220713T043423/logs/testcases_1657715387/

==================================================================
Version-Release number of selected component (if applicable):
ODF 4.11.0-113 on provider cluster
ODF 4.10.4 on consumer cluster

OCP 4.10.20
ocs-osd-deployer.v2.0.3

======================================================================
How reproducible:
2/2

Steps to Reproduce:
1. Install provider and consumer cluster with ODF version 4.10.4.
(ocs-osd-deployer.v2.0.3)
2. Upgrade the provider cluster to ODF 4.11.0
3. Try to create CephFS PVC on the consumer cluster
4. Check the value of ROOK_CSI_ENABLE_CEPHFS in consumer cluster
$ oc get cm rook-ceph-operator-config -oyaml -nopenshift-storage | grep ROOK_CSI_ENABLE_CEPHFS
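Step 4 can be scripted. A minimal sketch follows, using a simulated copy of the configmap so it runs without a cluster (on a real consumer cluster the YAML would come from the `oc get cm` command above); it shows the value a healthy consumer cluster is expected to report:

```shell
# Simulated copy of the consumer configmap; on a live cluster use:
#   oc get cm rook-ceph-operator-config -oyaml -nopenshift-storage
cat > /tmp/rook-ceph-operator-config.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-operator-config
  namespace: openshift-storage
data:
  ROOK_CSI_ENABLE_CEPHFS: "true"
EOF

# Extract the quoted flag value. "true" is the expected (healthy) state;
# "false" is what this bug produces after the provider-only upgrade.
value=$(grep ROOK_CSI_ENABLE_CEPHFS /tmp/rook-ceph-operator-config.yaml | awk -F'"' '{print $2}')
echo "$value"
```

A CephFS PVC on the consumer (step 3) will only reach Bound when this flag is "true", since the flag controls whether the CephFS CSI driver is deployed at all.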


=====================================================================

Actual results:
Step 3. Cannot create CephFS PVC
Step 4. Value of ROOK_CSI_ENABLE_CEPHFS is "false"
$ oc get cm rook-ceph-operator-config -oyaml -nopenshift-storage | grep ROOK_CSI_ENABLE_CEPHFS
  ROOK_CSI_ENABLE_CEPHFS: "false"
        f:ROOK_CSI_ENABLE_CEPHFS: {}

======================================================================

Expected results:
Step 3. CephFS PVC should reach Bound state
Step 4. Value of ROOK_CSI_ENABLE_CEPHFS should be "true"
$ oc get cm rook-ceph-operator-config -oyaml -nopenshift-storage | grep ROOK_CSI_ENABLE_CEPHFS
  ROOK_CSI_ENABLE_CEPHFS: "true"
        f:ROOK_CSI_ENABLE_CEPHFS: {}


Additional info:

Comment 2 Mudit Agarwal 2022-07-19 13:49:42 UTC
Not a 4.11 blocker

Comment 3 Malay Kumar parida 2022-07-25 04:24:36 UTC
This issue was earlier happening when upgrading both the consumer and provider clusters from 4.10 to 4.11; it was tracked by https://bugzilla.redhat.com/show_bug.cgi?id=2096823. That fix was included in 4.11 but not backported to 4.10. It cannot be backported to 4.10 directly, because it depends on a couple of lines from another PR, https://github.com/red-hat-storage/ocs-operator/pull/1663.

So we need to backport https://github.com/red-hat-storage/ocs-operator/pull/1663 first,
then backport https://github.com/red-hat-storage/ocs-operator/pull/1710.

Comment 4 Malay Kumar parida 2022-07-25 04:28:52 UTC
After this is complete, the customer will have to upgrade to the latest 4.10 z-stream first, before upgrading to any 4.11 version.
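The upgrade-order requirement can be checked from the installed operator version before attempting the 4.11 upgrade. A minimal sketch, with the CSV name simulated as a sample string (on a live cluster it would come from `oc get csv -n openshift-storage`); the minimum version `4.10.6` is taken from this BZ's target release, and `sort -V` is assumed to be GNU version sort:

```shell
# On a live cluster the installed CSV would come from something like:
#   oc get csv -n openshift-storage -o name
# Here we simulate that output with a sample CSV name.
csv="odf-operator.v4.10.4"

# Strip the prefix to get the semantic version, then compare it against
# the first 4.10 z-stream that contains the backported fix (4.10.6).
version="${csv#odf-operator.v}"
minimum="4.10.6"
lowest=$(printf '%s\n%s\n' "$version" "$minimum" | sort -V | head -n1)

if [ "$lowest" = "$version" ] && [ "$version" != "$minimum" ]; then
  echo "upgrade to the latest 4.10 z-stream first"
else
  echo "ok to upgrade to 4.11"
fi
```

With the sample value 4.10.4 the check reports that a z-stream upgrade is needed first, matching the guidance above.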

Comment 5 Malay Kumar parida 2022-07-25 11:40:24 UTC
Update: an automated cherry-pick was not possible due to further merge conflicts, so the required changes are being backported manually.

Comment 6 Mudit Agarwal 2022-07-26 06:10:31 UTC
We just need the fix in 4.10.z, it is already fixed in 4.11 via BZ #2096823

Comment 7 Mudit Agarwal 2022-07-26 06:11:21 UTC
*** Bug 2110274 has been marked as a duplicate of this bug. ***

Comment 12 Malay Kumar parida 2022-08-08 06:37:14 UTC
*** Bug 2107023 has been marked as a duplicate of this bug. ***

Comment 20 Malay Kumar parida 2022-09-09 17:21:51 UTC
Yes, I think it's good enough.

Comment 27 errata-xmlrpc 2022-09-21 17:29:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.10.6 Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6675