Bug 1920202

Summary: RGW pod did not get created when OCS was deployed using arbiter mode
Product: [Red Hat Storage] Red Hat OpenShift Container Storage
Reporter: Pratik Surve <prsurve>
Component: ocs-operator
Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA
QA Contact: Pratik Surve <prsurve>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.7
CC: alayani, ebenahar, jarrpa, jijoy, jthottan, madam, mbukatov, muagarwa, ocs-bugs, sostapov, tnielsen
Target Milestone: ---
Keywords: AutomationTriaged
Target Release: OCS 4.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 4.7.0-262.ci
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-19 09:18:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Pratik Surve 2021-01-25 18:44:26 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
RGW pod did not get created when OCS was deployed using arbiter mode

Version of all relevant components (if applicable):
OCP version:- 4.7.0-0.nightly-2021-01-22-134922
OCS version:- ocs-operator.v4.7.0-238.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue reproduce from the UI?
yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy OCP on bare metal (BM)
2. Deploy OCS with arbiter mode enabled


Actual results:
No RGW pods were created

Expected results:
RGW pod should be created

Additional info:

Snippet from the rook-operator pod log:

ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2021-01-25 18:41:43.202269 I | op-mon: parsing mon endpoints: d=172.30.22.175:6789,e=172.30.66.131:6789,a=172.30.188.243:6789,b=172.30.4.167:6789,c=172.30.236.10:6789
2021-01-25 18:41:43.202395 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2021-01-25 18:41:43.215974 I | op-mon: parsing mon endpoints: d=172.30.22.175:6789,e=172.30.66.131:6789,a=172.30.188.243:6789,b=172.30.4.167:6789,c=172.30.236.10:6789
2021-01-25 18:41:43.216064 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
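
For reading the op-mon lines above: the endpoint list is a comma-separated series of name=host:port pairs. The following is a minimal Python sketch for splitting such a line when triaging logs; parse_mon_endpoints is a hypothetical helper written for illustration and is not part of Rook or the OCS operator.

```python
from typing import Dict, Tuple


def parse_mon_endpoints(raw: str) -> Dict[str, Tuple[str, int]]:
    """Split 'name=host:port,...' into {name: (host, port)}.

    Hypothetical helper for log triage; not taken from Rook.
    """
    endpoints = {}
    for item in raw.split(","):
        name, _, address = item.strip().partition("=")
        host, _, port = address.rpartition(":")
        endpoints[name] = (host, int(port))
    return endpoints


# Endpoint list copied from the operator log above.
raw = (
    "d=172.30.22.175:6789,e=172.30.66.131:6789,a=172.30.188.243:6789,"
    "b=172.30.4.167:6789,c=172.30.236.10:6789"
)
mons = parse_mon_endpoints(raw)
print(sorted(mons))  # ['a', 'b', 'c', 'd', 'e']
```

The log lists five mons (a through e), all on port 6789.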

Comment 5 Travis Nielsen 2021-02-01 23:21:21 UTC
This is caused by the replication size of 4 not being set in the CephObjectStore CR by the OCS operator.

The operator log shows:
2021-01-25T08:08:01.951969805Z 2021-01-25 08:08:01.951903 E | ceph-object-controller: failed to reconcile invalid object store "ocs-storagecluster-cephobjectstore" arguments: invalid metadata pool spec: pools in a stretch cluster must have replication size 4

This should be covered with the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1914203
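
The check behind the error message above can be illustrated with a short Python sketch. This is not Rook's actual validation code: PoolSpec and validate_pool are hypothetical names, and the only facts taken from this report are the required replication size of 4 and the wording of the error.

```python
from dataclasses import dataclass

# Replication size named in the operator error message.
STRETCH_REPLICA_SIZE = 4


@dataclass
class PoolSpec:
    """Hypothetical stand-in for a pool spec; not Rook's actual type."""
    name: str
    replicated_size: int


def validate_pool(pool: PoolSpec, stretch_cluster: bool) -> None:
    """Raise ValueError if a stretch-cluster pool is not replicated 4 ways."""
    if stretch_cluster and pool.replicated_size != STRETCH_REPLICA_SIZE:
        raise ValueError(
            f"invalid {pool.name} pool spec: pools in a stretch cluster "
            f"must have replication size {STRETCH_REPLICA_SIZE}"
        )


# A size-3 metadata pool is rejected in stretch mode, mirroring the log.
try:
    validate_pool(PoolSpec("metadata", 3), stretch_cluster=True)
    rejected = False
except ValueError as err:
    rejected = True
    message = str(err)

# The same pool with size 4 passes.
validate_pool(PoolSpec("metadata", 4), stretch_cluster=True)
```

Under this sketch, a CephObjectStore CR whose pools are left at a size other than 4 fails validation before any RGW pod is created, which matches the symptom reported here.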

Comment 6 Jose A. Rivera 2021-02-03 14:49:20 UTC
Since the fix is simple and the feature is an MVP, giving devel_ack+

Comment 7 Martin Bukatovic 2021-02-03 21:48:58 UTC
Acking again after change in component.

Comment 12 Mudit Agarwal 2021-02-09 13:50:19 UTC
*** Bug 1926618 has been marked as a duplicate of this bug. ***

Comment 16 errata-xmlrpc 2021-05-19 09:18:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041

Comment 17 Jilju Joy 2021-09-09 09:48:23 UTC
Adding AutomationTriaged keyword. A specific test case is not needed because this is covered by the OCS install verification step when deploying an arbiter cluster.