Bug 1813976

Summary: [RFE] Add support for overriding the read-from-replica policy
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Jason Dillaman <jdillama>
Component: RBD
Assignee: Ilya Dryomov <idryomov>
Status: CLOSED ERRATA
QA Contact: Gopi <gpatta>
Severity: medium
Docs Contact: Amrita <asakthiv>
Priority: unspecified
Version: 5.0
CC: asakthiv, bengland, ceph-eng-bugs, gfarnum, gpatta, hmunjulu, idryomov, kdreyer, rmandyam, vereddy
Target Milestone: ---
Keywords: FutureFeature
Target Release: 5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-16.0.0-8633.el8cp
Doc Type: Enhancement
Doc Text:
.Overriding the read-from-replica policy in librbd clients is supported
Previously, there was no way to limit inter-DC/AZ network traffic: when a cluster is stretched across data centers, the primary OSD may sit behind a higher-latency, higher-cost link than other OSDs in the PG. With this release, the `rbd_read_from_replica_policy` configuration option is available and can be used to send reads to a random OSD, or to the closest OSD in the PG as determined by the CRUSH map and the client's location in the CRUSH hierarchy. The option can be set per-image, per-pool, or globally. See the link:{block-dev-guide}#block-device-input-output-options_block[_Block device input and output options_] section in the _{storage-product} Block Device Guide_ for more information.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-30 08:23:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1929671, 1959686

Description Jason Dillaman 2020-03-16 15:47:08 UTC
librbd-based clients can now set the `rbd_read_from_replica_policy` configuration option to "default" (read from the PG's primary OSD), "balance" (send the read to a random OSD in the PG), or "localize" (send the read to the closest OSD as defined by the CRUSH map and the librbd client's `crush_location` config option). The RBD configuration option can be set globally, per-pool, or per-image. The `crush_location` option should be set via `ceph.conf` on a per-node basis.
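As an illustrative sketch of the three scopes described above (the pool name `rbd`, the image name `test-image`, and the CRUSH bucket values are hypothetical examples, not taken from this bug), the policy could be applied like this, with the client location defined in `ceph.conf` on each node:

```shell
# Per-node client location, set in /etc/ceph/ceph.conf on each client host
# (bucket names/values below are hypothetical examples):
#   [client]
#   crush_location = host=node1 datacenter=dc1

# Globally, via the cluster configuration database for all RBD clients:
ceph config set client rbd_read_from_replica_policy localize

# Per-pool override (pool "rbd" is an example name):
rbd config pool set rbd rbd_read_from_replica_policy balance

# Per-image override (image "rbd/test-image" is an example name):
rbd config image set rbd/test-image rbd_read_from_replica_policy localize
```

The per-image setting takes precedence over the per-pool setting, which in turn overrides the global value, so the policy can be narrowed only where cross-AZ reads are a concern.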

This feature is useful for stretch clusters where the PG's primary OSD might be across a higher-cost link as compared to other OSDs in the PG.

Comment 1 RHEL Program Management 2020-03-16 15:47:13 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 2 Ben England 2020-12-02 21:36:58 UTC
This feature is also very useful for OCS in public clouds such as AWS, where by default OCS PGs are spread across "availability zones" (AZs) with higher latency between AZs than within them.

Comment 5 Gopi 2021-03-05 05:05:09 UTC
The feature is working as expected, hence moving this bug to the verified state.

Comment 9 Amrita 2021-06-09 06:51:37 UTC
LGTM, Ilya.
I just added the `Previously...` sentence before `With this release` to follow our doc standards.

Comment 11 errata-xmlrpc 2021-08-30 08:23:52 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement) and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294