Bug 1779336

Summary: OCS Monitoring is missing ceph_rbd_* metrics
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Martin Bukatovic <mbukatov>
Component: ceph-monitoring
Assignee: Anmol Sachan <asachan>
Status: CLOSED WONTFIX
QA Contact: Oded <oviner>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.2
CC: afrahman, asachan, bniver, etamir, madam, muagarwa, nthomas, ocs-bugs, odf-bz-bot, owasserm, pcuzner, ratamir, shan, smordech, uchapaga
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 4.6.0
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-08 06:31:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1850947
Bug Blocks: 1916331
Attachments:
 * ceph_rbd_metrics (flags: none)
 * ceph_rbd_read_bytes not found (flags: none)
 * example of the rbd perf tool (flags: none)

Description Martin Bukatovic 2019-12-03 19:06:53 UTC
Description of problem
======================

The OCP Prometheus instance in an OCS cluster doesn't provide RBD performance
metrics (ceph_rbd_*).

Version-Release number of selected component
============================================

cluster channel: stable-4.2
cluster version: 4.2.0-0.nightly-2019-12-02-165545
cluster image: registry.svc.ci.openshift.org/ocp/release@sha256:159ba31ba77bc5ae78d09caca06eada94483b27f4f8e8afafdfa036b42bd3d80

storage namespace openshift-cluster-storage-operator
image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1f2819a25c4f1945d3a2a63e64b89a4f1f095cbb34e39829785e966151340fdc
 * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1f2819a25c4f1945d3a2a63e64b89a4f1f095cbb34e39829785e966151340fdc

storage namespace openshift-storage
image quay.io/rhceph-dev/cephcsi:4.2-225.72ac53b6.release_4.2
 * quay.io/rhceph-dev/cephcsi@sha256:69950f3c8bdf6889e85deafdbc8d07e63910337435d2f5d6a1ba674d08f85c51
image quay.io/openshift/origin-csi-node-driver-registrar:4.2
 * quay.io/openshift/origin-csi-node-driver-registrar@sha256:6671a4a02a9bf5abfa58087b6d2fea430278d7bc5017aab6b88a84203c0dad06
image quay.io/openshift/origin-csi-external-attacher:4.2
 * quay.io/openshift/origin-csi-external-attacher@sha256:dafeaa49292fce721c1ed763efd1f45b02a9c4f78d7087a854067c5e2718d93c
image quay.io/openshift/origin-csi-external-provisioner:4.2
 * quay.io/openshift/origin-csi-external-provisioner@sha256:dbe8b5e1bebfed1e8f68be1d968ff0356270a0311c21f3920e0f3bec3b4d97ea
image quay.io/openshift/origin-csi-external-snapshotter:4.2
 * quay.io/openshift/origin-csi-external-snapshotter@sha256:910f54893ac1ae42b80735b0def153ee62aa7a73d5090d2955afc007b663ec79
image quay.io/rhceph-dev/mcg-core:5.2.11-20.59b35a4f0.5.2
 * quay.io/rhceph-dev/mcg-core@sha256:3f9da7525a751837750cfe8fd49d8707090c7a1c4e257c23b882184e5f97f76e
image registry.access.redhat.com/rhscl/mongodb-36-rhel7:latest
 * registry.access.redhat.com/rhscl/mongodb-36-rhel7@sha256:c845fa631fafab54addb20e33af2a8bc2a693af07ffd9a9fa1bc2715d39b40ca
image quay.io/rhceph-dev/mcg-operator:2.0.9-35.db6a79b.2.0
 * quay.io/rhceph-dev/mcg-operator@sha256:e065c52e0132450477223f4262a495db84de0963ed639457c961e52c36618f35
image quay.io/rhceph-dev/ocs-operator:4.2-259.e29a26e.release_4.2
 * quay.io/rhceph-dev/ocs-operator@sha256:339ad733e127e7eb182e46cfd74200615c42bd4f1d44331e2f13479b8ccfad8d
image quay.io/rhceph-dev/rook-ceph:4.2-257.9a1d8592.ocs_4.2
 * quay.io/rhceph-dev/rook-ceph@sha256:1ae7998e77ccea9a07a39da0d196326d9e485d05bd24df48c088a740dd219410
image quay.io/rhceph-dev/rhceph:4-119.20191128.ci.0
 * quay.io/rhceph-dev/rhceph@sha256:2bbb7704bae17b8a20c6ff1c240d2b1e83cd185fff2aa80ccc85968365cbb4e0

Note: the rhceph:4-119.20191128.ci.0 image comes with Nautilus, as can be seen in
the output of the ceph version command:

```
# ceph --version
ceph version 14.2.4-55.el8cp (4f4ac92930e1b5abf963781f77361a7dd4cffafa) nautilus (stable)
```

How reproducible
================

100%

Steps to Reproduce
==================

1. Install OCP/OCS cluster (I did this via red-hat-storage/ocs-ci, using
   downstream OCS images, ocs-ci commit 89a1698)
2. Log in as kubeadmin to the OCP Console
3. Go to Monitoring -> Metrics page
4. Type e.g. ceph_rbd_write_ops into the query and run it (an API-based query
   sketch follows below)
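
For reference, a minimal sketch of running the same query against the cluster
monitoring API instead of the console. This is an assumption-laden example: it
presumes a logged-in `oc` session and the thanos-querier route in
openshift-monitoring (older releases expose prometheus-k8s instead).

```
# Hedged sketch: query the cluster Prometheus/Thanos API for the metric.
# The route name (thanos-querier; prometheus-k8s on older releases) is assumed.
TOKEN=$(oc whoami -t)
HOST=$(oc -n openshift-monitoring get route thanos-querier -o jsonpath='{.spec.host}')
curl -sk -H "Authorization: Bearer ${TOKEN}" \
  "https://${HOST}/api/v1/query" --data-urlencode 'query=ceph_rbd_write_ops'
```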

Actual results
==============

There are no results; the metric is not defined.

Expected results
================

Values are provided.

Additional info
===============

RBD Performance metrics (ceph_rbd_*) are available since Nautilus:
https://ceph.com/rbd/new-in-nautilus-rbd-performance-monitoring/

https://jira.coreos.com/browse/KNIP-634?focusedCommentId=117344&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-117344

Comment 2 Michael Adam 2019-12-09 17:32:28 UTC
This certainly needs to be fixed, but I doubt it's critical enough to block 4.2.0 GA.

@Nishanth, proposing to move to 4.3 (or 4.2.z).

Comment 3 Nishanth Thomas 2019-12-10 07:40:18 UTC
Moved out to 4.4

Comment 4 Yaniv Kaul 2019-12-10 08:18:41 UTC
(In reply to Nishanth Thomas from comment #3)
> Moved out to 4.4

Why?

Comment 5 Paul Cuzner 2019-12-11 05:04:02 UTC
I think the pools need to be added to the prometheus module's rbd_stats_pools setting. Once this is done and the module is disabled/enabled, the data should be visible.

Be aware though that if anything is relying on the prometheus port being there (a liveness check, for example), it could in theory look to k8s like the mgr is down - so if this happens, you know why!
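
For reference, a minimal sketch of the disable/enable toggle suggested here
(comment 6 below notes that the restart turned out not to be required):

```
# Hedged sketch: bounce the mgr prometheus module after changing its settings.
# Per comment 6, this step is not actually needed.
ceph mgr module disable prometheus
ceph mgr module enable prometheus
```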

Comment 6 Paul Cuzner 2019-12-11 05:45:53 UTC
Just checked the process on a local machine. Prometheus doesn't need to be restarted - but the rbd_stats_pools setting needs to be updated/defined.

Once the pool is defined,
e.g. ceph config set mgr mgr/prometheus/rbd_stats_pools rbd

you'll see stats like this:
ceph_rbd_write_bytes{pool="rbd",namespace="",image="testdisk"} 0.0
# HELP ceph_rbd_read_bytes RBD image bytes read
# TYPE ceph_rbd_read_bytes counter
ceph_rbd_read_bytes{pool="rbd",namespace="",image="testdisk"} 0.0
# HELP ceph_rbd_write_latency_sum RBD image writes latency (msec) Total
# TYPE ceph_rbd_write_latency_sum counter
ceph_rbd_write_latency_sum{pool="rbd",namespace="",image="testdisk"} 0.0
# HELP ceph_rbd_write_latency_count RBD image writes latency (msec) Count
# TYPE ceph_rbd_write_latency_count counter
ceph_rbd_write_latency_count{pool="rbd",namespace="",image="testdisk"} 0.0
# HELP ceph_rbd_read_latency_sum RBD image reads latency (msec) Total
# TYPE ceph_rbd_read_latency_sum counter
ceph_rbd_read_latency_sum{pool="rbd",namespace="",image="testdisk"} 0.0
# HELP ceph_rbd_read_latency_count RBD image reads latency (msec) Count
# TYPE ceph_rbd_read_latency_count counter
ceph_rbd_read_latency_count{pool="rbd",namespace="",image="testdisk"} 0.0

In addition to rbd_stats_pools (a space- or comma-separated list of pools), there is also rbd_stats_pools_refresh_interval (which IIRC defaults to 5 minutes).
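
A minimal sketch combining the two options mentioned above; the interval is in
seconds, and 300 is assumed here to match the roughly 5-minute default:

```
# Hedged sketch: enable per-image RBD stats for a pool and tune how often the
# image list is refreshed (seconds; 300 assumed as the default).
ceph config set mgr mgr/prometheus/rbd_stats_pools "rbd"
ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 300
```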

Given the dependency on defining the pool, for OCS this would probably need to be tied into the rook-ceph workflow. For example, when a storage class is created on the rook-block provider, Rook would need to update the rbd_stats_pools list - and obviously the reverse is also true.

Alternatively, maybe we could change the code to report on pools that have the rbd application enabled, and backport? That might be simpler.

However, if we're expecting thousands of RBDs, this could put load on the mgr and on the Prometheus instance's storage too.

Given the above, I don't think this is a 4.2 thing - more like an RFE for 4.3 or 4.4

Comment 7 Nishanth Thomas 2019-12-11 06:08:26 UTC
(In reply to Yaniv Kaul from comment #4)
> (In reply to Nishanth Thomas from comment #3)
> > Moved out to 4.4
> 
> Why?

It's more of an RFE; hence I think it's better to handle this in 4.4, as the 4.3 window is short.
I wanted to check if there is urgency to get this done for 4.3; in that case we can prioritize it. None of the dashboard features are waiting on this.

Comment 8 Yaniv Kaul 2019-12-19 08:03:22 UTC
(In reply to Paul Cuzner from comment #6)
> Just checked the process on a local machine. prometheus doesn't need to be
> restarted - but the rbd_stat_pool needs to be updated/defined
> 
> Once the pool is defined
> e.g. ceph config set mgr mgr/prometheus/rbd_stats_pools rbd
> 
> you'll see stats like this;
> ceph_rbd_write_bytes{pool="rbd",namespace="",image="testdisk"} 0.0
> # HELP ceph_rbd_read_bytes RBD image bytes read
> # TYPE ceph_rbd_read_bytes counter
> ceph_rbd_read_bytes{pool="rbd",namespace="",image="testdisk"} 0.0
> # HELP ceph_rbd_write_latency_sum RBD image writes latency (msec) Total
> # TYPE ceph_rbd_write_latency_sum counter
> ceph_rbd_write_latency_sum{pool="rbd",namespace="",image="testdisk"} 0.0
> # HELP ceph_rbd_write_latency_count RBD image writes latency (msec) Count
> # TYPE ceph_rbd_write_latency_count counter
> ceph_rbd_write_latency_count{pool="rbd",namespace="",image="testdisk"} 0.0
> # HELP ceph_rbd_read_latency_sum RBD image reads latency (msec) Total
> # TYPE ceph_rbd_read_latency_sum counter
> ceph_rbd_read_latency_sum{pool="rbd",namespace="",image="testdisk"} 0.0
> # HELP ceph_rbd_read_latency_count RBD image reads latency (msec) Count
> # TYPE ceph_rbd_read_latency_count counter
> ceph_rbd_read_latency_count{pool="rbd",namespace="",image="testdisk"} 0.0
> 
> in addition to rbd_stats_pools (space or comma separated list of pools)
> there is also rbd_stats_pools_refresh_interval (which IIRC defaults to 5mins)
> 
> Given the dependency on defining the pool, for OCS this would probably need
> to be tied into the rook-ceph work-flow. For example, storageclass created
> on rook-block provider, rook would need to update the rbd_stats_pools list -
> and obviously the reverse is also true. 
> 
> Alternatively, maybe we could change the code to report on pools that have
> the rbd application enabled, and backport? Might be simpler
> 
> however, if we're expecting 000's of rbd's this could put load on the mgr
> and the prometheus instance storage too.

Yes, we do expect that number eventually.

> 
> Given the above, I don't think this is a 4.2 thing - more like an RFE for
> 4.3 or 4.4

Is there an open upstream issue on Rook to get this done?

Comment 12 Eran Tamir 2020-04-01 15:02:53 UTC
@nishanth,  I think it's valuable in Prometheus, mainly for debugging. 
Based on the feedback we will get, we will consider adding it to the dashboards.

Comment 13 Yaniv Kaul 2020-05-17 08:10:28 UTC
(In reply to Eran Tamir from comment #12)
> @nishanth,  I think it's valuable in Prometheus, mainly for debugging. 
> Based on the feedback we will get, we will consider adding it to the
> dashboards.

So what's the priority here? Obviously it's not 4.4 material.

Comment 14 Yaniv Kaul 2020-06-01 07:21:02 UTC
(In reply to Eran Tamir from comment #12)
> @nishanth,  I think it's valuable in Prometheus, mainly for debugging. 
> Based on the feedback we will get, we will consider adding it to the
> dashboards.

Closing for the time being, till we get some feedback.

Comment 15 Martin Bukatovic 2020-06-01 18:43:01 UTC
(In reply to Yaniv Kaul from comment #14)
> (In reply to Eran Tamir from comment #12)
> > @nishanth,  I think it's valuable in Prometheus, mainly for debugging. 
> > Based on the feedback we will get, we will consider adding it to the
> > dashboards.
> 
> Closing for the time being, till we get some feedback.

As I note in comment 10, this feature should be considered during a redesign of
the current way the OCS PV dashboard reports storage utilization. I suggest
keeping this open until this redesign is actually planned.

If that happened and I just missed it, I'm sorry for the trouble. In such a case,
please reference it here in a comment.

Comment 16 Sébastien Han 2020-06-09 13:20:18 UTC
From a Rook perspective, we can add the "ceph config set mgr mgr/prometheus/rbd_stats_pools rbd" step as part of pool creation (with the CephBlockPool CRD).
Paul, is the CLI identical for the rbd_stats_pools_refresh_interval option? Is it also pool-based?

Thanks.

Comment 17 Paul Cuzner 2020-07-13 05:40:06 UTC
Seb, the setting is a comma- or space-separated string containing all pools, so you'll need to get the current value and append to it. The refresh interval is the timer for when the scraped pool data is refreshed; the rbd stats themselves are gathered at every collection interval.
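
A minimal sketch of the "get current and append" flow described above, assuming
`ceph config get` returns the current value; the pool name replicapool is
hypothetical:

```
# Hedged sketch: append a pool to the existing comma-separated list.
# Pool name "replicapool" is hypothetical.
CURRENT=$(ceph config get mgr mgr/prometheus/rbd_stats_pools)
ceph config set mgr mgr/prometheus/rbd_stats_pools "${CURRENT:+${CURRENT},}replicapool"
```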

However, although this is relatively easy to enable, I'd be wary of turning it on by default until we understand the impact it has on the mgr with thousands of PVs. As I said earlier in this thread, we also don't expose it in the dashboard, so adding this overhead has limited benefit.

If the main goal here is debugging, would the "rbd top" commands from the rbd_support module be an alternative? It's auto-enabled anyway.
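
For reference, a sketch of the rbd_support-based alternative mentioned here; the
pool name is hypothetical:

```
# Hedged sketch: "rbd top"-style views from the rbd_support mgr module.
# Pool name "replicapool" is hypothetical.
rbd perf image iostat replicapool
rbd perf image iotop replicapool
```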

Comment 18 Sébastien Han 2020-07-15 08:18:18 UTC
(In reply to Paul Cuzner from comment #17)
> Seb, the setting is a comma or space separated string, containing all pools
> - so you'll need to get current and append. The refresh interval is the
> timer for when the scrape pool data is refreshed, the rbd stats are gathered
> at every collection interval.
> 
> However, although this is relatively easy to enable I'd be wary of turning
> it on by default until we understand the impact it has on the mgr for 1000's
> of PV's. As I said earlier in this thread we also don't expose it in the
> dashboard - so adding this overhead has limited benefit.

That's the approach that has been taken; it's disabled by default on CephBlockPool creation.

> 
> If the main goal here is debug, would the "rbd top" commands from the
> rbd_support module be an alternate? It's autoenabled anyway.

Thanks!
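
For reference, a minimal sketch of the per-pool opt-in described in comments
16-18, assuming Rook exposes it through an `enableRBDStats` field on the
CephBlockPool spec (field name, pool name, and replication settings are
illustrative, not confirmed by this bug):

```
# Hedged sketch: opt one pool into RBD per-image stats at creation time.
# Assumes the CephBlockPool field `enableRBDStats` (off by default per
# comment 18); name and replication settings are illustrative.
cat <<EOF | oc apply -f -
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: openshift-storage
spec:
  replicated:
    size: 3
  enableRBDStats: true
EOF
```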

Comment 24 Oded 2020-09-30 10:12:08 UTC
There are no "ceph_rbd*" metrics.

Setup:
Provider: VMware
OCP Version: 4.6.0-0.nightly-2020-09-29-170625
OCS Version: ocs-operator.v4.6.0-101.ci

sh-4.4# ceph versions
{
    "mon": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 2
    },
    "rgw": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 2
    },
    "overall": {
        "ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)": 11
    }
}


Test Process:
1. Install OCP+OCS cluster
2. Log in as kubeadmin to the OCP Console
3. Go to Monitoring -> Metrics page
4. Type e.g. ceph_rbd into the query and run it

There are no "ceph_rbd*" metrics (screenshots attached).

Comment 25 Oded 2020-09-30 10:14:18 UTC
Created attachment 1717817 [details]
ceph_rbd_metrics

Comment 26 Oded 2020-09-30 10:34:32 UTC
Created attachment 1717825 [details]
ceph_rbd_read_bytes not found

Comment 27 umanga 2020-10-06 14:15:13 UTC
Even though everything is configured as expected and was tested before, the RBD metrics are indeed missing.
We will need someone with Ceph expertise to identify what went wrong on the ceph-mgr side.

Comment 29 Paul Cuzner 2020-10-07 04:46:14 UTC
Created attachment 1719558 [details]
example of the rbd perf tool

Comment 31 Sébastien Han 2020-10-07 07:45:14 UTC
Must-gather logs?

Is the config correctly applied on the pool?
Can someone query the mgr metrics directly?
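
A minimal sketch of querying the mgr metrics directly, assuming the Rook toolbox
deployment is present and the default mgr service name and port:

```
# Hedged sketch: scrape the ceph-mgr Prometheus endpoint directly, bypassing
# the OCP monitoring stack. Toolbox deployment, mgr service name, and port
# 9283 are assumed Rook/OCS defaults.
oc -n openshift-storage exec deploy/rook-ceph-tools -- \
  curl -s http://rook-ceph-mgr.openshift-storage.svc:9283/metrics | grep '^ceph_rbd_'
```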

Comment 32 Eran Tamir 2020-10-08 13:05:34 UTC
@Shiri can you please add all dashboard BZs as a reference here?

Comment 36 Martin Bukatovic 2021-01-20 22:37:29 UTC
Reopening based on comment https://bugzilla.redhat.com/show_bug.cgi?id=1916331#c3 from Travis.