Bug 2237412 - [4.13] ceph_rbd_* metrics are missing
Summary: [4.13] ceph_rbd_* metrics are missing
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Juan Miguel Olmo
QA Contact: Daniel Osypenko
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-09-05 12:02 UTC by Daniel Osypenko
Modified: 2023-09-13 07:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-11 15:03:48 UTC
Embargoed:



Description Daniel Osypenko 2023-09-05 12:02:25 UTC
This bug was initially created as a copy of Bug #2227781

I am copying this bug because: 
The issue reproduces consistently: test_ceph_rbd_metrics_available fails with the same list of missing metrics:
'ceph_rbd_write_ops',
'ceph_rbd_read_ops',
'ceph_rbd_write_bytes',
'ceph_rbd_read_bytes',
'ceph_rbd_write_latency_sum',
'ceph_rbd_write_latency_count'

must-gather: https://url.corp.redhat.com/5656d5a

Description of problem (please be as detailed as possible and provide log
snippets):
There are missing rbd metrics:

ceph_rbd_write_ops
ceph_rbd_read_ops
ceph_rbd_write_bytes
ceph_rbd_read_bytes
ceph_rbd_write_latency_sum
ceph_rbd_write_latency_count


Version of all relevant components (if applicable):

Original bz:
ODF 4.13.1-9
OCP 4.13
-----
Cloned (new) bz
openshift client (4.13.0-0.nightly-2023-09-01-215139)
OCS ocs-operator.v4.13.2-rhodf
configuration:
OCS4-13-Downstream-OCP4-13-VSPHERE6-UPI-1AZ-RHCOS-EXTERNAL-3M-3W-OCS-deployment


Steps to Reproduce:
1. Install OCP/ODF cluster
2. After installation, check whether Prometheus provides values for ceph_rbd_* metrics listed above.

Actual results:
OCP Prometheus provides no values for any of the ceph_rbd_* metrics listed above.

Expected results:
OCP Prometheus provides values for all ceph_rbd_* metrics listed above.
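
The check in step 2 can be sketched roughly as follows. This is a minimal illustration, not the code of test_ceph_rbd_metrics_available: it assumes the standard Prometheus HTTP API instant-query endpoint (/api/v1/query) and bearer-token authentication. The helper names query_prometheus and find_missing_metrics, as well as the base URL and token you would pass in, are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Metrics reported as missing in this bug.
RBD_METRICS = [
    "ceph_rbd_write_ops",
    "ceph_rbd_read_ops",
    "ceph_rbd_write_bytes",
    "ceph_rbd_read_bytes",
    "ceph_rbd_write_latency_sum",
    "ceph_rbd_write_latency_count",
]


def query_prometheus(base_url, token, promql):
    """Run an instant query against the Prometheus HTTP API and return the parsed JSON.

    base_url and token are assumptions: supply the Prometheus route and a
    bearer token that is valid for your cluster.
    """
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_missing_metrics(run_query, metrics=RBD_METRICS):
    """Return the metric names for which the query fails or yields no samples.

    run_query is any callable that takes a PromQL string and returns the
    decoded JSON body of a Prometheus instant query.
    """
    missing = []
    for name in metrics:
        body = run_query(name)
        samples = body.get("data", {}).get("result", [])
        if body.get("status") != "success" or not samples:
            missing.append(name)
    return missing
```

Against a live cluster this might be called as find_missing_metrics(lambda q: query_prometheus(base_url, token, q)); an empty list would correspond to the expected result above.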

Additional info:
This was discovered as part of regression runs:
https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/12972/594265/594266/594269/log?item0Params=filter.eq.hasStats%3Dtrue%26filter.eq.hasChildren%3Dfalse%26filter.in.issueType%3Dti001%252Cti_1h7tquhpjupuu%252Cti_u7ukrfvrt1yu%252Cti_qxkzvw4t6ipf%252Cti_1h7u8s8jf8tvb
https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/12961/593770/593771/593774/log?item0Params=filter.eq.hasStats%3Dtrue%26filter.eq.hasChildren%3Dfalse%26filter.in.issueType%3Dti001%252Cti_1h7tquhpjupuu%252Cti_u7ukrfvrt1yu%252Cti_qxkzvw4t6ipf%252Cti_1h7u8s8jf8tvb

https://bugzilla.redhat.com/show_bug.cgi?id=1779336 was closed as WONTFIX, but this test used to pass with previous versions, in which those metrics were present (e.g. https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/465/13057/597442/597443/597446/log).

Comment 4 Daniel Osypenko 2023-09-11 15:03:48 UTC
The RBD metrics were not enabled. Running the following command resolved it:
ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
Closing as not a bug.
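
For scripted setups, the same setting could be applied and read back along these lines. This is only a sketch, assuming the ceph CLI is on PATH with admin access to the Ceph cluster; the function names are hypothetical. The arguments are passed as a list, so the "*" reaches ceph literally instead of being expanded by a shell.

```python
import subprocess

# Config key taken from the command in the comment above.
RBD_STATS_KEY = "mgr/prometheus/rbd_stats_pools"


def rbd_stats_set_command(pools="*"):
    """Build the argv that sets which pools export RBD stats ("*" means all pools)."""
    return ["ceph", "config", "set", "mgr", RBD_STATS_KEY, pools]


def rbd_stats_get_command():
    """Build the argv that reads the current value back."""
    return ["ceph", "config", "get", "mgr", RBD_STATS_KEY]


def enable_rbd_stats(pools="*", run=subprocess.run):
    """Apply the setting, then return the value ceph reports for it.

    run is injectable so the function can be exercised without a cluster.
    """
    run(rbd_stats_set_command(pools), check=True)
    result = run(rbd_stats_get_command(), check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

After the setting is applied, it may take a scrape interval or two before the ceph_rbd_* series show up in Prometheus.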

