Created attachment 1723179 [details]
Snippet of the device list

Description of problem:
[5.0] Ceph-Dashboard - Device health status is not listed under the Hosts section in the 5.0 dashboard.

Version-Release number of selected component (if applicable):
[root@magna094 ubuntu]# ./cephadm version
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-96803-20201013192445
ceph version 16.0.0-6275.el8cp (d1e0606106224ac333f1c245150d7484cb626841) pacific (dev)

How reproducible:

Steps to Reproduce:
1. Install a 5.0 cluster.
2. Launch the dashboard URL.
3. Go to the Hosts section, click on a specific host, and check its attributes (devices, device health, daemons, inventory, etc.).
4. Observe that the device and the device state of health show "unknown", and device health displays "failed to retrieve SMART data".

The cephadm CLI shows the details correctly with the "ceph orch device ls" command (a CLI check is sketched after this description).

Actual results:
Device state of health shows "unknown" and device health displays "failed to retrieve SMART data".

Expected results:
The dashboard should reflect the same values as the CLI output.

Additional info:
Dashboard URL: https://magna094.ceph.redhat.com:8443/#/login (admin / admin123)
Bootstrap node: magna094 (root/q)
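For reference, a minimal sketch of the CLI-side check used to compare against the dashboard, assuming a cephadm shell on the bootstrap node; the hostname magna094 is the one from this report:

```
# Enter a shell with the ceph CLI and admin keyring (cephadm-deployed cluster)
cephadm shell

# Orchestrator view of the devices and their health, in JSON form
ceph orch device ls --format json

# Plain listing, limited to the host being checked in the dashboard
ceph orch device ls magna094
```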
Created attachment 1723180 [details] unknown state
@Ernesto, the ask was to confirm with the dashboard team whether they are using the command below to get information from devices. There is no action item from QE, I guess. I checked with Juan on the same as well. Is the dashboard using: "ceph orch device ls --format json"?
I think we need to use the same source of information independently of the tool (dashboard or CLI) that is providing the output. In this case, what we want is the list of devices on the cluster hosts and the state of health of those devices.

This is done in the orchestrator using the command:
```
# ceph orch device ls
```
which in the background uses:
```
# ceph-volume inventory
```

In the dashboard (copying what Preethi says):
"""
3. Go to hosts section and click on specific host and check for its attributes like device, device health, daemons, inventory and others
4. Observe that device and device state of health is showing "unknown" and device health displays "failed to retrieve SMART data"
"""

As you said, Ernesto, in the dashboard we are using:
```
# ceph daemon osd.<id> smart <devid>
```

So the first thing to do is to decide which tool is best suited to provide information about the storage devices on the cluster hosts, and use the same one in the dashboard and in the orchestrator CLI. I think we should probably rely on ceph-volume to get this information, because with "ceph daemon osd.<id>" you cannot retrieve information for a device unless an OSD has been created on it. So I think the better solution is to change the command used in the dashboard to "ceph orch device ls --format json" (see the comparison of the two sources sketched below).
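To make the mismatch between the two sources concrete, a hedged sketch of how each one is queried; osd.0 and <devid> are placeholders, not values from this cluster:

```
# Source used by the orchestrator CLI: per-host device inventory
ceph orch device ls --format json

# The ceph-volume scan that backs the orchestrator output (run on the host itself)
ceph-volume inventory

# Source currently used by the dashboard: SMART data via the daemon admin socket.
# This has to be run on the node hosting the OSD, and it only works for devices
# that already back an OSD, which is why other devices show "unknown" in the UI.
ceph daemon osd.0 smart <devid>
```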
I still think the same: the information about devices from the dashboard and from the CLI must be the same, and it must come from the same source. In my view, the right approach is to always use the command:

# ceph orch device ls --format json
@JuanMi, the dashboard is currently using this command for retrieving the SMART data: 'ceph daemon <svc_type>.<svc_id> smart <device_id>'. This command runs fine for svc_type="osd" but fails for svc_type="mon". Moreover, "ceph orch device ls --format json" does not provide the relevant SMART data attributes. Maybe there is some other way to fetch SMART data from ceph-volume.
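For what it's worth, one possible alternative source of SMART data, sketched here as an assumption rather than as the agreed fix, is the mgr devicehealth module, which scrapes and stores SMART metrics without needing a daemon admin socket:

```
# Devices known to the cluster, with daemon mapping and life expectancy
ceph device ls

# Stored SMART scrapes for a given device id (as listed by "ceph device ls")
ceph device get-health-metrics <devid>

# Trigger a fresh scrape if nothing has been stored yet
ceph device scrape-health-metrics <devid>
```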
https://github.com/ceph/ceph/pull/40494 missed the v16.2.0 release upstream. Since this is low severity, I recommend we re-target this to RHCS 5.1.
*** Bug 2025699 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1174