Bug 1486987

Summary: [GSS] Imported my Ceph cluster into the Storage Console, however the OSDs count is showing zero. The host the OSDs are on are showing up in the Console but not the OSDs itself.
Product: [Red Hat Storage] Red Hat Storage Console Reporter: liuwei <wliu>
Component: Dashboard Assignee: Neha Gupta <negupta>
Status: CLOSED EOL QA Contact: sds-qe-bugs
Severity: low Docs Contact:
Priority: unspecified    
Version: 2 CC: flucifre, mkarnik, nthomas, sankarshan, shtripat
Target Milestone: ---   
Target Release: 3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
error shown none

Description liuwei 2017-08-31 01:59:16 UTC
Created attachment 1320358
error shown

Description of problem:

I have imported my Ceph cluster into the Storage Console, but the OSD count shows zero. The hosts that the OSDs run on appear in the Console, but the OSDs themselves do not.

My cluster runs entirely in VMs: I have 3 MONs and 6 OSDs, each on a different VM. The cluster imported successfully with no errors, so I am not sure where the issue is.


Version-Release number of selected component (if applicable):
rhscon-ceph-0.0.43-1.el7scon.x86_64                         Wed Aug 16 22:48:17 2017
rhscon-core-0.0.45-1.el7scon.x86_64                         Wed Aug 16 22:48:03 2017
rhscon-core-selinux-0.0.45-1.el7scon.noarch                 Mon Aug  7 11:00:09 2017
rhscon-ui-0.0.60-1.el7scon.noarch                           Wed Aug 16 22:48:10 2017

How reproducible:
100%

Steps to Reproduce:

Actual results:
The OSD count shows zero.

Expected results:

The OSD count is shown correctly.

Additional info:

The storage console node logs the following error messages:

Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.539+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.0 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.54+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.2 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.541+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.3 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.542+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.7 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.543+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.4 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.543+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.6 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.544+02:00 ERROR    monitoring.go:487 updateOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error updating the osd details of osd.5 of cluster ceph.Err not found
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.545+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.0 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.545+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.0.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.0 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.546+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.2 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.546+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.2.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.2 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.546+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.3 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.546+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.3.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.3 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.547+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.7 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.547+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.7.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.7 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.547+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.4 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.547+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.4.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.4 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.548+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.6 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.548+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.6.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.6 in cluster ceph
Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.548+02:00 ERROR    util.go:265 AnalyseThresholdBreach]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.5 in cluster ceph

Aug 27 08:29:51 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:51.548+02:00 ERROR    monitoring.go:504 FetchOSDStats]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Failed to analyse threshold breach for osd utilization of osd.5.Error skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a - Error fetching the id for osd.5 in cluster ceph
Aug 27 08:29:53 dn-ceph-storage-console skyring: #033[31m2017-08-27T08:29:53.231+02:00 ERROR    monitoring.go:126 func1]#033[0m skyring:07ed83ea-799e-4599-8dd0-e42ad6b3c30a-
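
To get a quick list of the OSDs that skyring failed to update, the log lines above can be filtered with a short script. This is only a minimal sketch and is not part of skyring: the function name `failed_osds` and the regular expression are assumptions derived solely from the message format shown in this report.

```python
import re

# Assumed pattern, based only on the updateOSDStats messages quoted above.
OSD_UPDATE_ERROR = re.compile(r"Error updating the osd details of (osd\.\d+) of cluster")


def failed_osds(log_lines):
    """Return the distinct OSD names found in updateOSDStats errors, ordered by id."""
    names = set()
    for line in log_lines:
        match = OSD_UPDATE_ERROR.search(line)
        if match:
            names.add(match.group(1))
    # Sort numerically on the part after "osd." so osd.10 follows osd.9.
    return sorted(names, key=lambda name: int(name.split(".")[1]))


if __name__ == "__main__":
    import sys

    # Usage: python failed_osds.py < /var/log/messages
    for osd in failed_osds(sys.stdin):
        print(osd)
```

Comparing this list with the OSDs that the cluster itself reports shows whether every OSD is affected or only some of them.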


The result of the database check is shown in the file Screenshot_from_2017-08-30_13-49-28.png.

Comment 3 Shubhendu Tripathi 2017-11-24 05:48:34 UTC
I suspect a name mismatch between the hostnames that salt reports to skyring and the server names that calamari reports.

Because of this, the OSDs might not be synchronized properly in the system.

Please check whether the hostnames listed in the responses of the calamari APIs `api/v2/cluster/<fsid>/server` and `api/v2/cluster/<fsid>/osd/<osd_id>` match the ones listed in the skyring hosts list.
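
Once both name lists have been collected by hand, the comparison could be done with something like the sketch below. It is a hypothetical helper, not part of skyring or calamari: the names `short_name` and `compare_hostnames` are invented for illustration, and the script makes no assumption about the JSON layout of the API responses, since it only takes plain lists of names as input.

```python
def short_name(host):
    """Return the lowercase host part before the first dot."""
    return host.strip().lower().split(".", 1)[0]


def compare_hostnames(calamari_names, skyring_names):
    """Classify each calamari name against the skyring hosts list.

    A name either matches exactly, matches only after reducing both sides
    to the short host name (for example FQDN vs. short name), or is absent.
    """
    skyring_exact = set(skyring_names)
    skyring_by_short = {short_name(name): name for name in skyring_names}
    report = {"match": [], "differs_in_form": [], "missing_in_skyring": []}
    for name in calamari_names:
        if name in skyring_exact:
            report["match"].append(name)
        elif short_name(name) in skyring_by_short:
            report["differs_in_form"].append((name, skyring_by_short[short_name(name)]))
        else:
            report["missing_in_skyring"].append(name)
    return report


if __name__ == "__main__":
    # Placeholder names for illustration only; substitute the real lists.
    calamari = ["osd1.example.com", "osd2", "osd3"]
    skyring = ["osd1", "osd2"]
    print(compare_hostnames(calamari, skyring))
```

Any entry under `differs_in_form` or `missing_in_skyring` would be a candidate for the suspected mismatch.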

Comment 7 Shubhendu Tripathi 2018-11-19 05:43:47 UTC
This product is now EOL.