Bug 2298621 - [RFE] multisite sync observability: tracking sync deltas over time (in Grafana)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Dashboard
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 8.0
Assignee: Ankush Behl
QA Contact: Chaithra
Docs Contact: Akash Raj
URL:
Whiteboard:
: 2161734 (view as bug list)
Depends On:
Blocks: 2317218
Reported: 2024-07-18 08:10 UTC by Ankush Behl
Modified: 2025-03-26 04:25 UTC (History)

Fixed In Version: ceph-19.1.0-3
Doc Type: Enhancement
Doc Text:
.New RGW Sync overview dashboard in Grafana
With this release, you can track replication differences over time, per shard, from the new RGW Sync overview dashboard in Grafana.
Clone Of:
Environment:
Last Closed: 2024-11-25 09:03:09 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-9347 0 None None None 2024-07-18 08:15:49 UTC
Red Hat Issue Tracker RHCSDASH-1530 0 None None None 2024-07-18 08:15:53 UTC
Red Hat Product Errata RHBA-2024:10216 0 None None None 2024-11-25 09:03:15 UTC

Description Ankush Behl 2024-07-18 08:10:19 UTC
This bug was initially created as a copy of Bug #2247183

Description of problem:

Currently, there is no easy way for an administrator to check the sync replication status between zones.  

Goal:

multisite sync observability: tracking sync deltas over time

Our proposed feature will increase the observability of RGW multisite sync operations. It will provide administrators with real-time information about the replication health between zones, so that they can assess whether the pending sync replication work is converging as expected or diverging. If it diverges and grows beyond a certain threshold, an alert can be configured in Alertmanager to fire a warning.
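
As a rough illustration of the threshold check described above, here is a minimal Python sketch. It is not the alert rule shipped with the product: the function name should_warn, the unit (seconds of pending sync delta), and the hold_samples parameter are all assumptions made for this example. It only warns when the delta stays above the threshold for several consecutive samples, which avoids firing on a single spike.

```python
from typing import Sequence


def should_warn(deltas_s: Sequence[float],
                threshold_s: float,
                hold_samples: int = 3) -> bool:
    """Return True when the newest `hold_samples` readings all exceed the threshold.

    deltas_s     -- hypothetical pending sync deltas in seconds, oldest first
    threshold_s  -- hypothetical warning threshold in seconds
    hold_samples -- how many consecutive readings must be above the threshold
    """
    if hold_samples <= 0 or len(deltas_s) < hold_samples:
        # Not enough data to decide, so stay quiet.
        return False
    recent = deltas_s[-hold_samples:]
    return all(delta > threshold_s for delta in recent)
```

For example, should_warn([10, 400, 500, 600], threshold_s=300) warns, while a series that dips back under the threshold within the last three samples does not.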

To present this information to the user, we will use Prometheus to gather the data and create a Grafana dashboard. Each data point on the graph will represent the oldest incremental change not yet applied, as reported by the sync status command, so that the graph shows how this value changes over time.
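
To make the data flow more concrete, the following Python sketch renders per-shard deltas in the Prometheus text exposition format. Everything specific in it is an assumption for illustration only: the metric name rgw_sync_delta_seconds, the label names source_zone and shard_id, and the sample values do not come from this bug and are not the metrics exposed by the actual implementation. Label values are assumed not to need escaping.

```python
from typing import Dict, Iterable, Tuple

Sample = Tuple[Dict[str, str], float]


def render_gauge(name: str, help_text: str, samples: Iterable[Sample]) -> str:
    """Render one gauge metric in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort labels so the output is stable between runs.
        label_str = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values: seconds behind for two shards replicated from zone-b.
    print(render_gauge(
        "rgw_sync_delta_seconds",
        "Age of the oldest incremental change not yet applied (hypothetical metric).",
        [
            ({"source_zone": "zone-b", "shard_id": "7"}, 12.5),
            ({"source_zone": "zone-b", "shard_id": "9"}, 0.0),
        ],
    ))
```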

The Grafana dashboard will display a slope to help us assess if the pending sync deltas are reducing or increasing over time. The ‘deltas’ will be sent from all the zones replicated in the zone group to Prometheus via the node-exporter.
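
The slope mentioned above can be sketched as an ordinary least-squares fit over the sampled deltas. This is only an illustration of the idea, not how the dashboard computes it; the function names delta_slope and trend, the units, and the tolerance parameter are assumptions made for this example.

```python
from typing import Sequence


def delta_slope(times_s: Sequence[float], deltas_s: Sequence[float]) -> float:
    """Least-squares slope of the pending sync delta against sample time."""
    n = len(times_s)
    if n != len(deltas_s) or n < 2:
        raise ValueError("need at least two (time, delta) pairs of equal length")
    mean_t = sum(times_s) / n
    mean_d = sum(deltas_s) / n
    var_t = sum((t - mean_t) ** 2 for t in times_s)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(times_s, deltas_s))
    return cov / var_t


def trend(slope: float, tolerance: float = 0.0) -> str:
    """Classify the slope: a shrinking delta is converging, a growing one diverging."""
    if slope > tolerance:
        return "diverging"
    if slope < -tolerance:
        return "converging"
    return "steady"
```

With samples taken every 60 seconds and deltas of 300, 240, 180, and 120 seconds, the slope is -1.0, meaning the backlog shrinks by one second per second, and trend reports "converging".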

Ideally, follow-up work will bring sync information with per-bucket granularity into Prometheus. That would match the granularity of the bucket sync policy, which lets users enable or disable bucket sync through the S3 API.

Comment 1 Storage PM bot 2024-07-18 08:10:27 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 12 errata-xmlrpc 2024-11-25 09:03:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216

Comment 13 Red Hat Bugzilla 2025-03-26 04:25:43 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

