Bug 1800382

Summary: Support 2-site Stretch Clusters in RADOS
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Greg Farnum <gfarnum>
Component: RADOS Assignee: Greg Farnum <gfarnum>
Status: CLOSED ERRATA QA Contact: Manohar Murthy <mmurthy>
Severity: high Docs Contact:
Priority: high    
Version: 5.0 CC: asakthiv, assingh, ceph-eng-bugs, ceph-qe-bugs, dzafman, jdurgin, kchai, nojha, pdhiran, rmandyam, tserlin, vereddy, vumrao
Target Milestone: --- Keywords: FutureFeature
Target Release: 4.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-14.2.11-76.el8cp, ceph-14.2.11-76.el7cp Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 1874628 Environment:
Last Closed: 2021-01-12 14:55:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1874628    

Description Greg Farnum 2020-02-07 01:03:47 UTC
We want to support multiple-site stretch clusters in RADOS, targeting in particular configurations of 2 main sites plus a tiebreaker monitor. This means:
* improving the monitors so they can handle netsplits and still make forward progress (without going into infinite election loops)
* being able to identify when one site has disappeared, and keeping the data available out of the surviving site.

We will do this by implementing monitor heartbeating and a new election algorithm that recognizes and handles netsplits, and by implementing a "stretch mode" that recognizes multi-site operation when doing OSD peering and requires set members from both sites (until the monitors declare a site dead and we go into single-site mode).
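
To illustrate the peering rule described above, here is a minimal sketch in Python. It is not the Ceph implementation: the names Osd, can_go_active, all_sites and dead_sites are hypothetical and exist only for this example. It assumes each OSD is tagged with the site it lives in, and that the monitors publish the set of sites they have declared dead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Osd:
    """Hypothetical stand-in for an OSD, tagged with its site."""
    osd_id: int
    site: str


def can_go_active(acting, all_sites, dead_sites=frozenset()):
    """Return True if the candidate set may serve I/O in stretch mode.

    Assumption for this sketch: a set is acceptable only if it holds at
    least one OSD from every site the monitors have not declared dead.
    """
    required = set(all_sites) - set(dead_sites)
    present = {osd.site for osd in acting}
    # With no live sites left there is nothing to serve from.
    return bool(required) and required <= present


sites = {"site-a", "site-b"}
only_a = [Osd(0, "site-a"), Osd(1, "site-a")]
both = [Osd(0, "site-a"), Osd(2, "site-b")]

print(can_go_active(both, sites))                # both sites represented
print(can_go_active(only_a, sites))              # site-b missing, not declared dead
print(can_go_active(only_a, sites, {"site-b"}))  # single-site mode
```

The point of the sketch is the transition: a set drawn from one site is rejected while both sites are considered alive, and becomes acceptable only once the monitors have declared the other site dead.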

Comment 1 RHEL Program Management 2020-02-07 01:03:52 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 7 Neha Ojha 2020-09-22 00:47:35 UTC
*** Bug 1874628 has been marked as a duplicate of this bug. ***

Comment 15 Veera Raghava Reddy 2020-11-16 12:39:58 UTC
Stretch cluster testing is carried out in detail by QE.

Comment 17 Veera Raghava Reddy 2020-11-30 06:49:41 UTC
Ranjini, Greg,
Currently we are providing the stretch cluster feature for OCS and not opening it up for standalone RHCS customers. So I think this BZ should not be added to the Errata?

Comment 23 errata-xmlrpc 2021-01-12 14:55:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0081