Bug 2062891

Summary: [ODF MS] Cephcluster in Error state "failed to get external ceph mon" with 4.10.0-184 (healthchecker client not found in provider)
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Neha Berry <nberry>
Component: ocs-operator
Assignee: Subham Rai <srai>
Status: CLOSED CURRENTRELEASE
QA Contact: suchita <sgatfane>
Severity: urgent
Priority: unspecified
Version: 4.10
CC: ebenahar, madam, muagarwa, ocs-bugs, odf-bz-bot, rperiyas, sgatfane, sostapov
Target Milestone: ---
Keywords: AutomationBackLog, Regression
Target Release: ODF 4.10.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 4.10.0-194
Doc Type: No Doc Update
Story Points: ---
Last Closed: 2022-04-21 09:12:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---

Comment 6 suchita 2022-03-23 03:13:29 UTC
Verified on ocs-operator.v4.10.0, full version "4.10.0-197"
OCP version 4.9.23

--------------
======= CSV =======
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.0                      NooBaa Operator               4.10.0                                                      Succeeded
ocs-operator.v4.10.0                      OpenShift Container Storage   4.10.0                                                      Succeeded
ocs-osd-deployer.v2.0.0                   OCS OSD Deployer              2.0.0                                                       Succeeded
odf-csi-addons-operator.v4.10.0           CSI Addons                    4.10.0                                                      Succeeded
odf-operator.v4.10.0                      OpenShift Data Foundation     4.10.0                                                      Succeeded
ose-prometheus-operator.4.8.0             Prometheus Operator           4.8.0                                                       Succeeded
route-monitor-operator.v0.1.406-54ff884   Route Monitor Operator        0.1.406-54ff884   route-monitor-operator.v0.1.404-e29b74b   Succeeded
--------------
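For reference, a CSV listing like the one above can be gathered with a command along these lines (the exact invocation isn't recorded in this comment; the openshift-storage namespace is an assumption):

$ oc get csv -n openshift-storage
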
======= ceph versions =======
{
    "mon": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 9
    }
}
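
A report in this shape is "ceph versions" output; on an ODF cluster it is typically collected from the rook-ceph toolbox pod, roughly as below (the toolbox pod label and the namespace are assumptions):

$ oc -n openshift-storage rsh $(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name) ceph versions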


======= storagecluster =======
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   19h   Ready              2022-03-22T07:21:56Z   
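
A phase listing like this comes from the StorageCluster CR; a minimal sketch, assuming the default ODF namespace:

$ oc get storagecluster -n openshift-storage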
--------------
======= cephcluster =======
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          19h   Ready   Cluster created successfully   HEALTH_OK   
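
Likewise for the CephCluster CR, whose PHASE/HEALTH columns are shown above; namespace again assumed:

$ oc get cephcluster -n openshift-storage
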
======= cluster health status =======
HEALTH_OK
Wed Mar 23 03:07:00 AM UTC 2022
======= ceph osd tree =======
ID   CLASS  WEIGHT   TYPE NAME                               STATUS  REWEIGHT  PRI-AFF
 -1         6.00000  root default                                                     
 -5         6.00000      region us-east-2                                             
-10         2.00000          zone us-east-2a                                          
 -9         2.00000              host default-1-data-0mmppx                           
  1    ssd  2.00000                  osd.1                       up   1.00000  1.00000
 -4         2.00000          zone us-east-2b                                          
 -3         2.00000              host default-2-data-0wv59j                           
  0    ssd  2.00000                  osd.0                       up   1.00000  1.00000
-14         2.00000          zone us-east-2c                                          
-13         2.00000              host default-0-data-0nj4jf                           
  2    ssd  2.00000                  osd.2                       up   1.00000  1.00000
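
The tree above is plain "ceph osd tree" output (run from the toolbox pod, as with "ceph versions" earlier) and confirms one OSD per us-east-2 availability zone:

$ oc -n openshift-storage rsh $(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name) ceph osd tree
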
======= ceph status =======
Wed Mar 23 03:07:15 AM UTC 2022
  cluster:
    id:     43c6f12b-96ef-4f35-a0aa-b388247ab0c6
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 19h)
    mgr: a(active, since 19h)
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 19h), 3 in (since 19h)
 
  data:
    volumes: 1/1 healthy
    pools:   5 pools, 129 pgs
    objects: 80.37k objects, 288 GiB
    usage:   590 GiB used, 5.4 TiB / 6 TiB avail
    pgs:     129 active+clean
 
  io:
    client:   7.3 KiB/s rd, 2.3 MiB/s wr, 2 op/s rd, 339 op/s wr
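
Since the original failure was "healthchecker client not found in provider", the client's presence can also be checked directly via Ceph auth from the toolbox; a sketch (the client.healthchecker name is inferred from the bug summary and may differ on a given cluster):

$ oc -n openshift-storage rsh $(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name) ceph auth get client.healthchecker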

Comment 7 suchita 2022-03-23 03:18:34 UTC
Referring to the outputs in comment #6, the cephcluster and storagecluster are both in Ready state on the provider cluster, and consumer onboarding is successful (see the StorageConsumer check sketched below).
Moving this BZ to VERIFIED status.
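
The consumer-onboarding result can be spot-checked on the provider via the StorageConsumer CRs; a minimal sketch, assuming the StorageConsumer API (ocs.openshift.io) available in ODF 4.10 provider mode and the usual namespace:

$ oc get storageconsumers -n openshift-storage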