Description of problem (please be as detailed as possible and provide log snippets):
RGW pod did not get created when OCS was deployed using arbiter mode

Version of all relevant components (if applicable):
OCP version: 4.7.0-0.nightly-2021-01-22-134922
OCS version: ocs-operator.v4.7.0-238.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy OCP over BM
2. Deploy OCS with arbiter mode enabled
3.

Actual results:
No RGW pods were created

Expected results:
RGW pod should be created

Additional info:
Snippet from the rook-operator pod log:

ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2021-01-25 18:41:43.202269 I | op-mon: parsing mon endpoints: d=172.30.22.175:6789,e=172.30.66.131:6789,a=172.30.188.243:6789,b=172.30.4.167:6789,c=172.30.236.10:6789
2021-01-25 18:41:43.202395 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2021-01-25 18:41:43.215974 I | op-mon: parsing mon endpoints: d=172.30.22.175:6789,e=172.30.66.131:6789,a=172.30.188.243:6789,b=172.30.4.167:6789,c=172.30.236.10:6789
2021-01-25 18:41:43.216064 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
This is caused by the OCS operator not setting a replication size of 4 in the CephObjectStore CR. The operator log shows:

2021-01-25T08:08:01.951969805Z 2021-01-25 08:08:01.951903 E | ceph-object-controller: failed to reconcile invalid object store "ocs-storagecluster-cephobjectstore" arguments: invalid metadata pool spec: pools in a stretch cluster must have replication size 4

This should be covered by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1914203
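For reference, the validation in the log above means the operator has to render a CephObjectStore whose metadata and data pools both use replication size 4 on a stretch (arbiter) cluster. A minimal sketch of such a CR, using the Rook v1 CephObjectStore spec (values other than `size: 4` are illustrative, not the exact spec the OCS operator generates):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: ocs-storagecluster-cephobjectstore
  namespace: openshift-storage
spec:
  # Stretch clusters require replication size 4 for all pools,
  # so both the metadata and data pools must set it explicitly.
  metadataPool:
    replicated:
      size: 4
  dataPool:
    replicated:
      size: 4
  gateway:
    port: 80
    instances: 1
```

With the fix, the OCS operator is expected to populate these pool specs itself when arbiter mode is enabled, so no manual CR edit should be needed.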
Since the fix is simple and the feature is an MVP, giving devel_ack+
Acking again after change in component.
*** Bug 1926618 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041
Adding the AutomationTriaged keyword. A specific test case is not needed because this will be covered by the OCS install verification step when deploying an arbiter cluster.