Bug 2256899
| Summary: | Duplicate metrics in ocs-metrics-exporter | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Divyansh Kamboj <dkamboj> |
| Component: | ceph-monitoring | Assignee: | Divyansh Kamboj <dkamboj> |
| Status: | CLOSED ERRATA | QA Contact: | Filip Balák <fbalak> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.14 | CC: | ebenahar, muagarwa, nthomas, odf-bz-bot |
| Target Milestone: | --- | ||
| Target Release: | ODF 4.16.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.15.0-123 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2024-07-17 13:11:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Divyansh Kamboj
2024-01-05 04:50:45 UTC
After enabling rbd mirroring with the command:
oc patch StorageCluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/mirroring", "value": {"enabled": true} }]'
I see new metrics being added, but I don't see the required metrics listed in the PR description (https://github.com/red-hat-storage/ocs-operator/pull/2380):
ceph_rbd_mirror_snapshot_image_local_timestamp
ceph_rbd_mirror_snapshot_image_remote_timestamp
ceph_rbd_mirror_snapshot_image_last_sync_bytes
Were they changed in ODF 4.15? How can the system state in which those metrics are available be reproduced?
Tested with odf 4.15.0-134
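A minimal sketch of how the presence of the three metrics could be checked, assuming a Prometheus text-format scrape is available on standard input. The function name `missing_mirror_metrics` is hypothetical and is not part of ODF or ocs-metrics-exporter; how the scrape is obtained (endpoint, port, authentication) is left open because this report does not state it.

```shell
#!/bin/sh
# Hypothetical helper (not part of ODF or ocs-metrics-exporter):
# reads Prometheus text-format metrics on stdin and prints the name
# of every expected rbd mirror metric that has no sample line.
missing_mirror_metrics() {
    scrape=$(cat)
    for metric in \
        ceph_rbd_mirror_snapshot_image_local_timestamp \
        ceph_rbd_mirror_snapshot_image_remote_timestamp \
        ceph_rbd_mirror_snapshot_image_last_sync_bytes
    do
        # A sample line starts with the metric name followed by '{' or a
        # space; '# HELP' and '# TYPE' comment lines do not match.
        if ! printf '%s\n' "$scrape" | grep -q "^${metric}[{ ]"; then
            printf '%s\n' "$metric"
        fi
    done
}
```

Piping a saved scrape into the function (for example `missing_mirror_metrics < scrape.txt`, where `scrape.txt` is a placeholder file name) prints nothing when all three metrics have at least one sample, and otherwise lists the missing names one per line.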
Metrics ceph_rbd_mirror_snapshot_image_local_timestamp, ceph_rbd_mirror_snapshot_image_remote_timestamp, and ceph_rbd_mirror_snapshot_image_last_sync_bytes are not available after rbd mirroring is enabled.
--> ASSIGNED
Tested with odf 4.15.0-146
The metrics only show up when images are created and start syncing with the other cluster. Moving it back to QA after discussion with Filip.
Metrics ceph_rbd_mirror_snapshot_image_local_timestamp, ceph_rbd_mirror_snapshot_image_remote_timestamp, and ceph_rbd_mirror_snapshot_image_last_sync_bytes are not available after rbd mirroring is enabled and syncing between 2 clusters starts in a Regional DR setup.
--> ASSIGNED
Workloads were run on the synced clusters, but those metrics never appeared. If this is not sufficient to reproduce the issue, please provide a reproducer.
Tested with odf 4.15.0-150
$ oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage -o json
{
    "apiVersion": "ceph.rook.io/v1",
    "kind": "CephBlockPool",
    "metadata": {
        "creationTimestamp": "2024-03-01T12:07:44Z",
        "finalizers": [
            "cephblockpool.ceph.rook.io"
        ],
        "generation": 2,
        "name": "ocs-storagecluster-cephblockpool",
        "namespace": "openshift-storage",
        "ownerReferences": [
            {
                "apiVersion": "ocs.openshift.io/v1",
                "blockOwnerDeletion": true,
                "controller": true,
                "kind": "StorageCluster",
                "name": "ocs-storagecluster",
                "uid": "79fc34f3-e3bf-450e-8375-9b654f04c58b"
            }
        ],
        "resourceVersion": "3653985",
        "uid": "77caacc8-30ff-494c-8daf-996797da41c5"
    },
    "spec": {
        "enableRBDStats": true,
        "erasureCoded": {
            "codingChunks": 0,
            "dataChunks": 0
        },
        "failureDomain": "rack",
        "mirroring": {
            "enabled": true,
            "mode": "image",
            "peers": {
                "secretNames": [
                    "d2ea70d369f4dbd0ba3120d8033d016ec672f3a"
                ]
            }
        },
        "quotas": {},
        "replicated": {
            "replicasPerFailureDomain": 1,
            "size": 3,
            "targetSizeRatio": 0.49
        },
        "statusCheck": {
            "mirror": {}
        }
    },
    "status": {
        "info": {
            "rbdMirrorBootstrapPeerSecretName": "pool-peer-token-ocs-storagecluster-cephblockpool"
        },
        "mirroringInfo": {
            "lastChanged": "2024-03-04T10:09:12Z",
            "lastChecked": "2024-03-04T10:10:12Z",
            "mode": "image",
            "peers": [
                {
                    "client_name": "client.rbd-mirror-peer",
                    "direction": "rx-tx",
                    "mirror_uuid": "8277bd89-9f38-4356-a9cd-482a473eba2b",
                    "site_name": "4ee435a6-8a04-4b1c-9fc8-a131945a0f18",
                    "uuid": "d96d875e-2636-4d4c-befe-5235e6254060"
                }
            ],
            "site_name": "06eb0bad-f9b7-4c40-ba60-a2e87418dbf5"
        },
        "mirroringStatus": {
            "lastChecked": "2024-03-04T10:10:12Z",
            "summary": {
                "daemon_health": "OK",
                "health": "OK",
                "image_health": "OK",
                "states": {
                    "replaying": 1
                }
            }
        },
        "observedGeneration": 2,
        "phase": "Ready",
        "snapshotScheduleStatus": {}
    }
}
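The part of this output that matters for the metrics question is `status.mirroringStatus.summary`, which here reports healthy mirroring with one image replaying. A small sketch for printing only that field follows; the function name `pool_mirror_summary` and the `OC` override variable are hypothetical conveniences, and the JSONPath is taken from the field names in the output above.

```shell
#!/bin/sh
# Sketch: print only the mirroring summary of a CephBlockPool.
# pool_mirror_summary and the OC variable are hypothetical names;
# OC lets the function be dry-run without a cluster (e.g. OC=echo).
pool_mirror_summary() {
    pool=$1        # e.g. ocs-storagecluster-cephblockpool
    namespace=$2   # e.g. openshift-storage
    "${OC:-oc}" get cephblockpool "$pool" -n "$namespace" \
        -o jsonpath='{.status.mirroringStatus.summary}'
}
```

Called as `pool_mirror_summary ocs-storagecluster-cephblockpool openshift-storage`, it runs the same `oc get` as above but limits the output to the health and state counts.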
Not a 4.15.0 blocker.
@fbalak I tested it out on a 4.15 cluster and could see the values. I will give it another go to see whether I hit the issue you're facing.
In ODF 4.16.0-96, the metrics from the listed PRs are available, and the metrics that should be removed are not available, as expected. Marking as VERIFIED, as this was taken out of 4.15 and is targeted for the 4.16 release.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2024:4591