Description of problem (please be as detailed as possible and provide log snippets):
ODF operator update from version 4.9 to 4.10 with external RHCS does not check the external Ceph version and then fails to create new RWX PVs.

Version of all relevant components (if applicable):
ODF 4.10
RHCS 5.0.4

Does this issue impact your ability to continue to work with the product (please explain in detail what the user impact is)?
Yes

Is there any workaround available to the best of your knowledge?
Maybe: upgrade RHCS to the latest release (5.1.2), but that upgrade fails (BZ #2107203).

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Install or use an RHCS 5.0.4 cluster for use with ODF.
2. Upgrade the ODF Operator from 4.9 to 4.10.
3. Try to create an RWX (CephFS-based) PV.

Actual results:
Fails to create PV

Expected results:
PVC stuck in pending - storageclass ocs-external

Additional info:
The same thing happened during the ODF 4.7 -> 4.8 upgrade with RHCS 4.1 installed. Upgrading to 4.1z4 solved the issue.
Updating RHCS from 5.0.4 to 5.1.2 fixed the problem. What remains are the following expectations:

a) We document this properly in the ODF release notes or the ODF admin guide.

b) For an update of an existing ODF external deployment, the ODF Operator must check which Ceph release is installed before it actually installs. If it is an unsupported Ceph release, it should report that and refuse to install the new ODF Operator. Otherwise, the user ends up with an unusable OpenShift storage environment, which is definitely not desirable.

c) We remove this strong binding between ODF versions and Ceph versions. No customer will want to upgrade their Ceph cluster every time an ODF update is available. It's just not feasible from an operational PoV unless that Ceph cluster is used solely for OpenShift workloads. You wouldn't want to have to upgrade your AWS EBS versions or NetApp firmware either.
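To make the request in b) concrete, here is a rough sketch of the kind of pre-upgrade gate meant there. It is not how the ODF Operator works today. The function names are hypothetical, and the minimum version passed in is only an illustration taken from this report (5.0.4 failed, 5.1.2 worked), not a documented support boundary. It only handles plain dotted version strings such as "5.0.4"; z-stream names like "4.1z4" would need extra parsing.

```python
# Hypothetical pre-upgrade gate; names and thresholds are illustrative only.


def parse_version(text):
    """Turn a dotted version such as '5.0.4' into (5, 0, 4).

    Comparing tuples of ints avoids string-comparison mistakes
    (for example '5.10.0' sorting before '5.9.9').
    """
    return tuple(int(part) for part in text.strip().split("."))


def check_external_ceph(installed, minimum):
    """Return (ok, message) describing whether the upgrade may proceed."""
    if parse_version(installed) < parse_version(minimum):
        return False, (
            f"External RHCS {installed} is older than the required {minimum}; "
            "refusing to upgrade the ODF Operator."
        )
    return True, f"External RHCS {installed} satisfies the minimum {minimum}."


if __name__ == "__main__":
    # The minimum here is an assumption for illustration, not a documented value.
    ok, message = check_external_ceph("5.0.4", "5.1.2")
    print(message)
```

With a check like this, the upgrade from 4.9 to 4.10 would stop with a clear message instead of leaving PVCs stuck in pending afterwards.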
Hi Gerald,

Good to know that you don't see the issue with 5.1.2. I confirmed with the QE team that we have automated upgrade tests in external mode and that we run all tier 1 tests post upgrade. As part of the tier 1 tests, we create RWX PVs, and we did not observe any issue. Today I tested it as well: with RHCS 5.0.4 in place, I upgraded ODF 4.9.9 to ODF 4.10.4 and did not observe any issue either. Maybe I was just lucky.

If you are able to reproduce the issue consistently, I would appreciate more details on the sequence of steps being executed and the errors you see during RWX PV creation, along with an ODF must-gather.

-Bipin Kunal
I am closing this bug as there has been no traffic on it for a few months. Please reopen it if you feel otherwise.