Bug 2052996

Summary: ODF deployment fails using RHCS in external mode due to cephobjectstoreuser
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Sonia Garudi <sgarudi>
Component: rook
Assignee: Sébastien Han <shan>
Status: CLOSED ERRATA
QA Contact: Vijay Avuthu <vavuthu>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.10
CC: madam, muagarwa, ocs-bugs, odf-bz-bot, sagrawal, shan
Target Milestone: ---
Target Release: ODF 4.10.0
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: 4.10.0-160
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-04-13 18:53:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Sonia Garudi 2022-02-10 12:09:44 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
ODF deployment using an external RHCS cluster fails. I am using ocs-ci for deployment.

[root@sonia-3957-bastion-0 ~]# oc -n openshift-storage get cephcluster ocs-external-storagecluster-cephcluster
NAME                                      DATADIRHOSTPATH   MONCOUNT   AGE     PHASE       MESSAGE                          HEALTH      EXTERNAL
ocs-external-storagecluster-cephcluster                                4h45m   Connected   Cluster connected successfully   HEALTH_OK   true
[root@sonia-3957-bastion-0 ~]# oc -n openshift-storage get storagecluster
NAME                          AGE     PHASE   EXTERNAL   CREATED AT             VERSION
ocs-external-storagecluster   4h45m   Ready   true       2022-02-10T07:20:59Z   4.10.0

The CephObjectStoreUser is in 'ReconcileFailed' state.

[root@sonia-3957-bastion-0 ~]# oc -n openshift-storage get cephobjectstoreuser/noobaa-ceph-objectstore-user -oyaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  creationTimestamp: "2022-02-10T07:21:19Z"
  finalizers:
  - cephobjectstoreuser.ceph.rook.io
  generation: 1
  name: noobaa-ceph-objectstore-user
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: noobaa.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: NooBaa
    name: noobaa
    uid: cf6b7b28-243e-4c88-9f27-4302a6119cd1
  resourceVersion: "344731"
  uid: 2b4025bc-0d1e-4ed4-a8ba-11bb0bcd9abc
spec:
  displayName: my display name
  store: ocs-external-storagecluster-cephobjectstore
status:
  phase: ReconcileFailed


Found the following traces in the rook-ceph-operator-* logs:
2022-02-10 07:21:25.910954 I | op-mon: parsing mon endpoints: sonia-rhcs-node1=9.30.180.75:6789
2022-02-10 07:21:26.511061 I | ceph-spec: detecting the ceph image version for image ...
2022-02-10 07:21:26.511154 E | ceph-object-controller: failed to reconcile CephObjectStore "openshift-storage/ocs-external-storagecluster-cephobjectstore". failed to detect running and desired ceph version: failed to detect ceph image version: failed to set up ceph version job: Rook image [quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:9b996e67b34a516d96c5f22762f166f0396be11f90ca298b5cfc78b9f4171d59] and run image [] must be specified
2022-02-10 07:21:27.311475 I | op-mon: parsing mon endpoints: sonia-rhcs-node1=9.30.180.75:6789
2022-02-10 07:21:27.911528 I | ceph-object-store-user-controller: CephObjectStore "ocs-external-storagecluster-cephobjectstore" found
2022-02-10 07:21:27.911571 I | ceph-object-store-user-controller: CephObjectStore "ocs-external-storagecluster-cephobjectstore" found
2022-02-10 07:21:27.911631 I | ceph-object-store-user-controller: creating ceph object user "noobaa-ceph-objectstore-user" in namespace "openshift-storage"
2022-02-10 07:21:27.921848 E | ceph-object-store-user-controller: failed to reconcile failed to create/update object store user "noobaa-ceph-objectstore-user": failed to get details from ceph object user "noobaa-ceph-objectstore-user": Get "http://rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc:8000/admin/user?display-name=my%20display%20name&format=json&max-buckets=1000&uid=noobaa-ceph-objectstore-user": dial tcp: lookup rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc on 172.30.0.10:53: no such host
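The failing lookup in the last log line points at the root cause: the operator is trying to reach the RGW admin API through an in-cluster Service DNS name (`*.openshift-storage.svc`), but in external mode the RGW runs on the external RHCS cluster, so no such Service exists and the cluster resolver (172.30.0.10:53) returns "no such host". A minimal offline sketch of how to read this from the log line (the parsing below is illustrative, not part of any tooling; it simply extracts the hostname the operator tried to resolve):

```shell
# Fragment copied verbatim from the operator log above.
line='lookup rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc on 172.30.0.10:53: no such host'

# Pull out the hostname that failed to resolve.
host=$(printf '%s\n' "$line" | sed -n 's/^lookup \([^ ]*\) on.*/\1/p')
echo "$host"

# A *.svc suffix means the operator targeted a cluster-internal Service,
# which cannot exist for an RGW running on the external RHCS cluster.
case "$host" in
  *.svc) echo "in-cluster Service name" ;;
esac
```

This matches the Fixed In Version above (4.10.0-160): the resolution is for the operator to stop assuming an in-cluster RGW Service when the object store is external.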


Version of all relevant components (if applicable):


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 4 Sébastien Han 2022-02-14 16:25:27 UTC
*** Bug 2052995 has been marked as a duplicate of this bug. ***

Comment 8 Vijay Avuthu 2022-03-22 10:47:58 UTC
Verified:
=========

openshift installer (4.10.0-0.nightly-2022-03-19-230512)

ocs-registry:4.10.0-199

Deployment job is successful: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/3585/consoleFull

and noobaa-ceph-objectstore-user is in the Ready state:

apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  creationTimestamp: "2022-03-22T05:23:26Z"
  finalizers:
  - cephobjectstoreuser.ceph.rook.io
  generation: 1
  name: noobaa-ceph-objectstore-user
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: noobaa.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: NooBaa
    name: noobaa
    uid: fe479fb5-1424-4f6a-9e0e-2e2b919046d7
  resourceVersion: "33069"
  uid: 27046980-82f9-48af-b1ff-16fdc23864be
spec:
  displayName: my display name
  store: ocs-external-storagecluster-cephobjectstore
status:
  info:
    secretName: rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user
  phase: Ready
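To script the same verification (as CI would), one can gate on `status.phase` and on the credentials Secret name published under `status.info.secretName`. A hedged sketch: on a live cluster this would be `oc ... -o jsonpath='{.status.phase}'`; the version below runs offline against a saved copy of the YAML shown above so it can be checked without a cluster:

```shell
# On a live cluster (not runnable here), the equivalent query would be:
#   oc -n openshift-storage get cephobjectstoreuser/noobaa-ceph-objectstore-user \
#     -o jsonpath='{.status.phase}'

# Offline: the status block copied from the verified resource above.
yaml='status:
  info:
    secretName: rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user
  phase: Ready'

# Extract the fields with sed; no jq/yq dependency.
phase=$(printf '%s\n' "$yaml" | sed -n 's/^ *phase: //p')
secret=$(printf '%s\n' "$yaml" | sed -n 's/^ *secretName: //p')

echo "phase=$phase"
# Ready plus a published secretName is what distinguishes the fixed state
# from the original ReconcileFailed state, which had no status.info at all.
[ "$phase" = "Ready" ] && [ -n "$secret" ] && echo "user is healthy"
```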

Comment 10 errata-xmlrpc 2022-04-13 18:53:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372