Bug 2107206

Summary: ODF upgrade from 4.9 to 4.10 with RHCS 5.0.4 external will result in inability to create new RWX PVs
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: gsternag <gsternag>
Component: ocs-operator
Assignee: Jose A. Rivera <jrivera>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Martin Bukatovic <mbukatov>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.10
CC: bkunal, jrivera, madam, muagarwa, nigoyal, ocs-bugs, odf-bz-bot, sostapov
Target Milestone: ---
Flags: bkunal: needinfo? (gsternag)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-12 11:10:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description gsternag@redhat.com 2022-07-14 13:56:34 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Updating the ODF operator from 4.9 to 4.10 with an external RHCS cluster does not check the external Ceph version. After the update, creating new RWX PVs fails.

Version of all relevant components (if applicable):
ODF 4.10
RHCS 5.0.4

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
Possibly: upgrading RHCS to the latest release (5.1.2), but that upgrade itself fails (BZ #2107203).

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install an RHCS 5.0.4 cluster, or use an existing one, as the external cluster for ODF.
2. Upgrade the ODF Operator from 4.9 to 4.10.
3. Try to create an RWX (CephFS-based) PV.
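
Step 3 can be scripted for repeatable testing. The following is a minimal sketch, not taken from the report: it builds a PersistentVolumeClaim manifest that requests ReadWriteMany access and prints it as JSON. The claim name "rwx-test", the namespace, and the size are hypothetical placeholders, and "ocs-external" is the storage class name as written in this report; substitute the CephFS storage class actually present on the cluster.

```python
import json


def rwx_pvc_manifest(name, storage_class, size="1Gi", namespace="default"):
    """Build a PersistentVolumeClaim manifest requesting ReadWriteMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteMany is the RWX access mode served by CephFS-backed classes.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Placeholder values (assumptions): adjust the name and storage class to the cluster.
print(json.dumps(rwx_pvc_manifest("rwx-test", "ocs-external"), indent=2))
```

The printed JSON can be piped to `oc apply -f -`. If the problem reproduces, the claim should remain in Pending instead of becoming Bound.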


Actual results:
PV creation fails; the PVC stays stuck in Pending (storageclass ocs-external).

Expected results:
The RWX PV is created and the PVC is bound.

Additional info:
The same thing happened during the ODF 4.7 -> 4.8 upgrade with RHCS 4.1 installed. Upgrading RHCS to 4.1z4 solved the issue.

Comment 2 gsternag@redhat.com 2022-07-15 11:10:02 UTC
Updating RHCS from 5.0.4 to 5.1.2 fixed the problem. What remains are the following expectations:
a) We document this properly in the ODF release notes or the ODF admin guide.
b) When updating an existing ODF external deployment, the ODF Operator must check which Ceph release is installed before it installs the update. If the Ceph release is unsupported, the operator should say so and refuse to install the new ODF Operator version. Otherwise the user ends up with an unusable OpenShift storage environment, which is definitely not desirable.
c) We remove this strong binding between ODF versions and Ceph versions. No customer will want to upgrade their Ceph cluster every time an ODF update is available. It's just not feasible from an operational PoV unless that Ceph cluster is used solely for OpenShift workloads. You wouldn't want to have to upgrade your AWS EBS versions or NetApp firmware either.
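
The pre-upgrade check requested in (b) could look roughly like the sketch below. It is only an illustration of the idea, not the operator's actual logic: the function names are hypothetical, and the minimum of "5.1.2" is simply the release that worked in this report, not a documented support matrix. It handles plain dotted releases only; z-stream strings such as "4.1z4" would need extra parsing.

```python
def parse_version(text):
    """Turn a dotted release string such as "5.0.4" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def check_external_ceph(installed, minimum):
    """Return (ok, message) for an installed external RHCS release.

    Both arguments are dotted release strings. The minimum is supplied by
    the caller; this sketch does not know any real support matrix.
    """
    ok = parse_version(installed) >= parse_version(minimum)
    if ok:
        message = f"external RHCS {installed} meets the minimum {minimum}"
    else:
        message = (
            f"external RHCS {installed} is older than the minimum {minimum}; "
            "refusing to upgrade"
        )
    return ok, message


# Values from this report; "5.1.2" as a threshold is an assumption.
for release in ("5.0.4", "5.1.2"):
    ok, message = check_external_ceph(release, "5.1.2")
    print(ok, message)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "5.10.0" sorting before "5.9.9".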

Comment 3 Bipin Kunal 2022-07-15 16:19:04 UTC
Hi Gerald,

Good to know that you don't see the issue with 5.1.2.

I confirmed with the QE team that we have automated upgrade tests in external mode and that we run all tier 1 tests post upgrade. As part of the tier 1 tests, we create RWX PVs, and we did not observe any issue.

Today I tested it as well: with RHCS-5.0.4 in place, I upgraded ODF-4.9.9 to ODF-4.10.4 and did not observe any issue either. Maybe I was just lucky.

If you are able to reproduce the issue consistently, I would appreciate it if you could provide more details on the sequence of steps executed and the errors you see during RWX PV creation, along with an ODF must-gather.

-Bipin Kunal
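
One way to gather part of the requested detail is to list the RWX claims that are stuck in Pending. The sketch below is an illustration under stated assumptions, not a supported tool: it expects the parsed JSON of a PVC list (the shape produced by `oc get pvc -A -o json`), and the sample data, including the claim names, is invented for the demonstration.

```python
def pending_rwx_pvcs(pvc_list):
    """Return (namespace, name, storage class) for each Pending RWX claim."""
    found = []
    for item in pvc_list.get("items", []):
        spec = item.get("spec", {})
        phase = item.get("status", {}).get("phase")
        if phase == "Pending" and "ReadWriteMany" in spec.get("accessModes", []):
            meta = item.get("metadata", {})
            found.append(
                (meta.get("namespace"), meta.get("name"), spec.get("storageClassName"))
            )
    return found


# Invented sample data in the shape of a Kubernetes PVC list.
sample = {
    "items": [
        {
            "metadata": {"namespace": "demo", "name": "rwx-test"},
            "spec": {"accessModes": ["ReadWriteMany"], "storageClassName": "ocs-external"},
            "status": {"phase": "Pending"},
        },
        {
            "metadata": {"namespace": "demo", "name": "rwo-ok"},
            "spec": {"accessModes": ["ReadWriteOnce"], "storageClassName": "other"},
            "status": {"phase": "Bound"},
        },
    ]
}

for namespace, name, storage_class in pending_rwx_pvcs(sample):
    print(f"{namespace}/{name} (storageClassName={storage_class})")
```

For each claim reported, `oc describe pvc <name> -n <namespace>` shows the provisioning events, which usually carry the actual error text.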

Comment 4 Nitin Goyal 2022-10-12 11:10:18 UTC
I am closing this bug as there has been no traffic on it for a few months. Please reopen it if you feel otherwise.