Bug 2256899 - Duplicate metrics in ocs-metrics-exporter
Summary: Duplicate metrics in ocs-metrics-exporter
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.16.0
Assignee: Divyansh Kamboj
QA Contact: Filip Balák
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-01-05 04:50 UTC by Divyansh Kamboj
Modified: 2024-07-17 13:11 UTC

Fixed In Version: 4.15.0-123
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-07-17 13:11:53 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | red-hat-storage ocs-operator pull 2380 | 0 | None | open | remove rbd-mirror metrics already exported by ceph | 2024-01-22 08:00:09 UTC
Github | red-hat-storage ocs-operator pull 2416 | 0 | None | open | Bug 2256899: [release-4.15] remove rbd-mirror metrics already exported by ceph | 2024-01-23 08:02:56 UTC
Red Hat Product Errata | RHSA-2024:4591 | 0 | None | None | None | 2024-07-17 13:11:55 UTC

Description Divyansh Kamboj 2024-01-05 04:50:45 UTC
Many of the metrics exported by ocs-metrics-exporter are also provided by ceph-exporter.

For example:

ocs_rbd_mirror_image_primary_snapshot_timestamp and ceph_rbd_mirror_snapshot_image_local_timestamp provide the same information
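
One way to spot such overlaps is to compare the metric names in the exporters' scrape output against a list of known equivalents. The sketch below is illustrative only: the `EQUIVALENTS` mapping is a hypothetical helper containing just the pair named above, and it assumes the scrape output is available as Prometheus text-format data.

```python
# Hypothetical mapping of ocs-metrics-exporter names to their ceph-exporter
# equivalents. Only the pair named in this report is listed.
EQUIVALENTS = {
    "ocs_rbd_mirror_image_primary_snapshot_timestamp":
        "ceph_rbd_mirror_snapshot_image_local_timestamp",
}


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or at the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def duplicated_pairs(exposition_text, equivalents=EQUIVALENTS):
    """Return (ocs_name, ceph_name) pairs for which both metrics are exported."""
    present = metric_names(exposition_text)
    return sorted(
        (ocs, ceph) for ocs, ceph in equivalents.items()
        if ocs in present and ceph in present
    )
```

An empty result from `duplicated_pairs` means that, for the listed pairs, at most one side is still being exported.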

Comment 6 Filip Balák 2024-02-06 12:35:58 UTC
After enabling rbd mirroring with the command:

oc patch StorageCluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/mirroring", "value": {"enabled": true} }]'
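
Shell quoting of the JSON patch is easy to get wrong, so a script can build the patch with a JSON serializer instead. This is a minimal sketch, not part of the original report: the function name `mirroring_patch_args` is hypothetical, and the defaults simply mirror the resource name and namespace used in the command above.

```python
import json


def mirroring_patch_args(enabled=True,
                         name="ocs-storagecluster",
                         namespace="openshift-storage"):
    """Build the argument list for the `oc patch` command shown above."""
    patch = [{
        "op": "replace",
        "path": "/spec/mirroring",
        "value": {"enabled": enabled},
    }]
    return [
        "oc", "patch", "StorageCluster", name,
        "-n", namespace,
        "--type", "json",
        # json.dumps produces valid JSON regardless of shell quoting rules.
        "--patch", json.dumps(patch),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine where `oc` is logged in to the cluster.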

I see new metrics being added, but I don't see the required metrics listed in the PR description (https://github.com/red-hat-storage/ocs-operator/pull/2380):

ceph_rbd_mirror_snapshot_image_local_timestamp
ceph_rbd_mirror_snapshot_image_remote_timestamp
ceph_rbd_mirror_snapshot_image_last_sync_bytes

Were they changed in ODF 4.15? How can the system state in which those metrics are available be reproduced?

Tested with odf 4.15.0-134
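
A check like this can be scripted against the Prometheus HTTP API (`/api/v1/query`). The sketch below is an assumption-laden illustration: the base URL is a placeholder, authentication (for example a bearer token) is left out, and the helper names `query_url` and `missing_metrics` are hypothetical. It builds an instant query that matches the three metric names and reports which of them are absent from the response.

```python
import json
from urllib.parse import urlencode

# The three metrics expected from the PR description.
EXPECTED = [
    "ceph_rbd_mirror_snapshot_image_local_timestamp",
    "ceph_rbd_mirror_snapshot_image_remote_timestamp",
    "ceph_rbd_mirror_snapshot_image_last_sync_bytes",
]


def query_url(base_url, names=EXPECTED):
    """Build an instant-query URL that selects the given metric names."""
    selector = '{__name__=~"%s"}' % "|".join(names)
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": selector})


def missing_metrics(response_body, names=EXPECTED):
    """Return the expected names that do not appear in a query response."""
    data = json.loads(response_body)
    found = {sample["metric"].get("__name__") for sample in data["data"]["result"]}
    return [name for name in names if name not in found]
```

An empty list from `missing_metrics` means every expected metric has at least one series; a non-empty list names the ones still missing.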

Comment 7 Filip Balák 2024-02-20 18:16:06 UTC
Metrics ceph_rbd_mirror_snapshot_image_local_timestamp, ceph_rbd_mirror_snapshot_image_remote_timestamp, and ceph_rbd_mirror_snapshot_image_last_sync_bytes are not available after rbd mirroring is enabled. --> ASSIGNED

Tested with odf 4.15.0-146

Comment 8 Divyansh Kamboj 2024-02-21 12:58:34 UTC
The metrics only show up once images are created and start syncing with the other cluster. Moving it back to QA after discussion with Filip.

Comment 9 Filip Balák 2024-03-04 10:18:34 UTC
Metrics ceph_rbd_mirror_snapshot_image_local_timestamp, ceph_rbd_mirror_snapshot_image_remote_timestamp, and ceph_rbd_mirror_snapshot_image_last_sync_bytes are not available after rbd mirroring is enabled and syncing between 2 clusters starts in Regional DR setup. --> ASSIGNED

Workloads have been run on the synced clusters, but those metrics never appeared. If this is not sufficient to reproduce the issue, then please provide a reproducer.

Tested with odf 4.15.0-150

$ oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage -o json
{
    "apiVersion": "ceph.rook.io/v1",
    "kind": "CephBlockPool",
    "metadata": {
        "creationTimestamp": "2024-03-01T12:07:44Z",
        "finalizers": [
            "cephblockpool.ceph.rook.io"
        ],
        "generation": 2,
        "name": "ocs-storagecluster-cephblockpool",
        "namespace": "openshift-storage",
        "ownerReferences": [
            {
                "apiVersion": "ocs.openshift.io/v1",
                "blockOwnerDeletion": true,
                "controller": true,
                "kind": "StorageCluster",
                "name": "ocs-storagecluster",
                "uid": "79fc34f3-e3bf-450e-8375-9b654f04c58b"
            }
        ],
        "resourceVersion": "3653985",
        "uid": "77caacc8-30ff-494c-8daf-996797da41c5"
    },
    "spec": {
        "enableRBDStats": true,
        "erasureCoded": {
            "codingChunks": 0,
            "dataChunks": 0
        },
        "failureDomain": "rack",
        "mirroring": {
            "enabled": true,
            "mode": "image",
            "peers": {
                "secretNames": [
                    "d2ea70d369f4dbd0ba3120d8033d016ec672f3a"
                ]
            }
        },
        "quotas": {},
        "replicated": {
            "replicasPerFailureDomain": 1,
            "size": 3,
            "targetSizeRatio": 0.49
        },
        "statusCheck": {
            "mirror": {}
        }
    },
    "status": {
        "info": {
            "rbdMirrorBootstrapPeerSecretName": "pool-peer-token-ocs-storagecluster-cephblockpool"
        },
        "mirroringInfo": {
            "lastChanged": "2024-03-04T10:09:12Z",
            "lastChecked": "2024-03-04T10:10:12Z",
            "mode": "image",
            "peers": [
                {
                    "client_name": "client.rbd-mirror-peer",
                    "direction": "rx-tx",
                    "mirror_uuid": "8277bd89-9f38-4356-a9cd-482a473eba2b",
                    "site_name": "4ee435a6-8a04-4b1c-9fc8-a131945a0f18",
                    "uuid": "d96d875e-2636-4d4c-befe-5235e6254060"
                }
            ],
            "site_name": "06eb0bad-f9b7-4c40-ba60-a2e87418dbf5"
        },
        "mirroringStatus": {
            "lastChecked": "2024-03-04T10:10:12Z",
            "summary": {
                "daemon_health": "OK",
                "health": "OK",
                "image_health": "OK",
                "states": {
                    "replaying": 1
                }
            }
        },
        "observedGeneration": 2,
        "phase": "Ready",
        "snapshotScheduleStatus": {}
    }
}
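
The mirroring-related fields in output like the above can also be extracted programmatically when comparing cluster states. This is a minimal sketch that uses only the field names visible in the JSON above; the function name `mirroring_summary` is hypothetical, and missing fields fall back to `None`, `False`, or `0`.

```python
import json


def mirroring_summary(cephblockpool_json):
    """Summarize the mirroring state from `oc get cephblockpool -o json` output."""
    pool = json.loads(cephblockpool_json)
    spec = pool.get("spec", {}).get("mirroring", {})
    summary = (
        pool.get("status", {})
        .get("mirroringStatus", {})
        .get("summary", {})
    )
    return {
        "enabled": spec.get("enabled", False),
        "mode": spec.get("mode"),
        "health": summary.get("health"),
        "daemon_health": summary.get("daemon_health"),
        "image_health": summary.get("image_health"),
        # Number of images currently in the "replaying" state.
        "replaying": summary.get("states", {}).get("replaying", 0),
    }
```

For the pool shown above, this would report mirroring as enabled in `image` mode with all health fields `OK` and one image replaying.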

Comment 10 Mudit Agarwal 2024-03-04 10:28:38 UTC
Not a 4.15.0 blocker

Comment 15 Divyansh Kamboj 2024-04-03 10:51:15 UTC
@fbalak I tested it on a 4.15 cluster and could see the values. I will try again to see whether I hit the issue you're facing.

Comment 17 Filip Balák 2024-05-14 10:07:05 UTC
In ODF 4.16.0-96, the metrics from the listed PRs are available and the metrics that should be removed are no longer available, as expected. Marking as VERIFIED, as this was taken out of 4.15 and is targeted for the 4.16 release.

Comment 20 errata-xmlrpc 2024-07-17 13:11:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4591

