Bug 1367186 - [RFE] mirroring to multiple secondaries from a single primary
Summary: [RFE] mirroring to multiple secondaries from a single primary
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 3.2
Assignee: Jason Dillaman
QA Contact: Vasishta
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1416922 1494421 1629656
Reported: 2016-08-15 19:40 UTC by Federico Lucifredi
Modified: 2019-01-03 19:01 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
.Support for RBD mirroring to multiple secondary clusters
Mirroring RADOS Block Devices (RBD) from one primary cluster to multiple secondary clusters is now fully supported.
Clone Of:
Environment:
Last Closed: 2019-01-03 19:01:20 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 17028 0 None None None 2016-08-15 19:55:29 UTC
Red Hat Product Errata RHBA-2019:0020 0 None None None 2019-01-03 19:01:49 UTC

Description Federico Lucifredi 2016-08-15 19:40:20 UTC
Description of problem:

Currently, we support use of RBD mirroring with a single secondary site.

We have an outstanding support exception for multiple secondaries where synchronization is one-way (meaning only one site is primary for all images). 

We want to test this configuration as we are supporting it for an important customer.

Comment 2 Jason Dillaman 2017-01-04 21:03:05 UTC
This is a QE BZ for creating test cases for this scenario.

Comment 4 Harish NV Rao 2017-01-18 11:02:57 UTC
@Federico and Jason, we have a few questions:
1) How many secondary sites are supported?
2) Is the delay between the primary and each secondary site the same or different? Please let us know the delay specification.
3) Should this be tested with RHEL OSP? If so, which version of OSP, and which specific OSP use cases need to be tested?
4) How does this work process-wise? Do all secondary sites share one process on the master, or is there a new process for each secondary site? In the former case, how does the primary distribute time and resources among the sites?
5) How is an image determined to be in a synced state? Is it when all secondary sites have synced, or something else?
6) Should this also be tested on Ubuntu?

Comment 5 Harish NV Rao 2017-01-18 11:03:30 UTC
@Jason, please check comment 4

Comment 6 Jason Dillaman 2017-01-18 14:05:28 UTC
@Harish:

1) While there is no technical upper limit imposed, the use cases I've seen so far are aimed at 2 secondary sites. The OpenStack group has even discussed a ring topology where each region (and Ceph cluster) uses a unique set of pools (i.e. region 1 has r1_images, region 2 has r2_images, ...), and mirroring is configured bidirectionally between two sites in a pair-wise fashion (r1 r1_images <-> r2 r1_images, r2 r2_images <-> r3 r2_images, and r3 r3_images <-> r1 r3_images).
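
As an illustration only, the pair-wise ring layout described above can be enumerated with a few lines of Python. This is a sketch under assumptions: the function name `ring_mirror_pairs` and the region labels are hypothetical, and it only lists the pairs; it does not configure anything.

```python
def ring_mirror_pairs(regions):
    """Return (pool, site_a, site_b) tuples for a pair-wise ring.

    Each region owns a pool named "<region>_images" and mirrors it
    bidirectionally with the next region, wrapping around at the end.
    """
    if len(regions) < 2:
        raise ValueError("a ring needs at least two regions")
    pairs = []
    for i, region in enumerate(regions):
        neighbor = regions[(i + 1) % len(regions)]
        pairs.append((f"{region}_images", region, neighbor))
    return pairs


pairs = ring_mirror_pairs(["r1", "r2", "r3"])
for pool, site_a, site_b in pairs:
    print(f"{site_a} {pool} <-> {site_b} {pool}")
# prints:
# r1 r1_images <-> r2 r1_images
# r2 r2_images <-> r3 r2_images
# r3 r3_images <-> r1 r3_images
```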

2) Not quite sure what you are asking here, but the two secondaries do not need to have the same latency to the primary site. However, if a secondary's replication throughput is lower than the rate of IO injected into the image's journal, the journal will grow.
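
To make the journal-growth point concrete, here is a rough back-of-the-envelope sketch. It assumes constant rates and a simple linear model, which real workloads will not follow exactly; the function name, site names, and all numbers are made up for illustration.

```python
def journal_backlog(write_rate, replay_rate, seconds, start=0.0):
    """Estimate journal data not yet replayed by one secondary.

    write_rate  -- IO injected into the image's journal (MiB/s)
    replay_rate -- throughput of the link to the secondary (MiB/s)
    The backlog grows only while write_rate exceeds replay_rate
    and can never be negative.
    """
    growth = (write_rate - replay_rate) * seconds
    return max(0.0, start + growth)


write_rate = 100.0                           # hypothetical primary write load
links = {"site-b": 120.0, "site-c": 80.0}    # hypothetical per-site throughput

for site, replay_rate in links.items():
    backlog = journal_backlog(write_rate, replay_rate, seconds=3600)
    print(f"{site}: {backlog:.0f} MiB of journal not yet replayed after 1 hour")
```

In this made-up example the faster link keeps up, while the slower one falls behind by 20 MiB every second for as long as the write load continues.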

3) OpenStack Ocata is gaining Cinder integration for enabling RBD mirroring on a per-image basis.

4) rbd-mirror works as a pull operation, so the daemon runs on the non-primary site(s) and pulls data from the primary site.
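
Because of the pull model, the peer relationship is set up on each non-primary cluster and points back at the primary. The sketch below only builds and prints the `rbd` command lines for such a one-way setup; it does not run them. The cluster names (`site-a`, `site-b`, `site-c`), the pool name, and the client name are hypothetical, and pool-mode mirroring is assumed.

```python
def one_way_peer_commands(pool, primary, secondaries, client="client.admin"):
    """Build rbd command lines for one primary and several secondaries.

    Mirroring is enabled on the pool in every cluster, and the peer is
    added only on the secondaries, since rbd-mirror pulls from the primary.
    """
    cmds = [["rbd", "--cluster", primary, "mirror", "pool", "enable", pool, "pool"]]
    for site in secondaries:
        cmds.append(["rbd", "--cluster", site, "mirror", "pool", "enable", pool, "pool"])
        cmds.append(["rbd", "--cluster", site, "mirror", "pool", "peer", "add",
                     pool, f"{client}@{primary}"])
    return cmds


cmds = one_way_peer_commands("data", "site-a", ["site-b", "site-c"])
for cmd in cmds:
    print(" ".join(cmd))
```

The rbd-mirror daemon itself would then run on site-b and site-c only; how it is deployed depends on the installation and is outside this sketch.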

5) With rbd-mirroring, there isn't a "synced" state since the goal is just to provide consistency. If the link is fast enough and the primary image isn't being written to, they will be "synced". The best way to create a sync point is to create a snapshot on the primary image -- when the snapshot appears on the non-primary site, they are "synced" up to the creation of the snapshot.
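
The snapshot-as-sync-point idea can be expressed as a small check per secondary. This is a sketch only: it assumes the snapshot names of the image have already been collected from each cluster (for example with `rbd snap ls`), and the function name, site names, and snapshot names are hypothetical.

```python
def latest_sync_point(primary_snaps, secondary_snaps):
    """Return the newest primary snapshot that has appeared on a secondary.

    primary_snaps must be in creation order (oldest first). The secondary
    is "synced" up to the creation of the returned snapshot; None means
    that no sync point has arrived yet.
    """
    present = set(secondary_snaps)
    for name in reversed(primary_snaps):
        if name in present:
            return name
    return None


primary = ["sync-1", "sync-2", "sync-3"]
secondaries = {
    "site-b": ["sync-1", "sync-2", "sync-3"],
    "site-c": ["sync-1", "sync-2"],
}
status = {site: latest_sync_point(primary, snaps) for site, snaps in secondaries.items()}
print(status)
```

With two secondaries, each site is evaluated independently, so one site can be synced up to a later snapshot than the other.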

Comment 7 Federico Lucifredi 2017-01-20 18:23:15 UTC
1) let's start with supporting two secondaries.

2) the latency will nearly always be different by virtue of different geographic distances.

3) not until OSP 11. Test Ceph only this time around.

6) do run a few tests on Ubuntu for "smoke testing", but there should not be any platform differences of note here.

Comment 8 Harish NV Rao 2017-01-24 09:18:37 UTC
(In reply to Federico Lucifredi from comment #7)
> 1) let's start with supporting two secondaries.

That means there will not be any mirroring happening between the two secondary sites. Right?

Please let us know.

Comment 9 Harish NV Rao 2017-01-24 14:41:23 UTC
(In reply to Harish NV Rao from comment #8)
> (In reply to Federico Lucifredi from comment #7)
> > 1) let's start with supporting two secondaries.
> 
> That means there will not be any mirroring happening between the two
> secondary sites. Right?

To be more specific, does the following scope sound ok?
"There will be one primary and two secondary sites, with the secondary sites configured PURELY as backup sites. That is, no mirroring will be established between secondary sites, and secondary sites will not host any pool or image for which they are 'primary'."

Comment 11 Federico Lucifredi 2017-01-24 21:27:58 UTC
That is correct - mirroring is from a primary to a secondary, and not between secondaries.
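
For test planning, the scope confirmed here (one primary, secondaries used purely as backups, no mirroring between secondaries) can be written down as a simple check. This is a hypothetical sketch: the data layout (a pool-to-primary mapping and a set of directed mirror links) and the function name are assumptions made for illustration, not anything rbd-mirror provides.

```python
def in_test_scope(pool_primaries, mirror_links, primary, secondaries):
    """Check a planned topology against the agreed one-way scope.

    pool_primaries -- dict mapping pool name to the site that is primary for it
    mirror_links   -- set of (source_site, destination_site) pairs
    """
    secondaries = set(secondaries)
    # Secondaries must not be primary for any pool or image.
    if any(site != primary for site in pool_primaries.values()):
        return False
    # Every link must go from the primary to a secondary,
    # which also rules out mirroring between secondaries.
    return all(src == primary and dst in secondaries for src, dst in mirror_links)


links = {("A", "B"), ("A", "C")}
print(in_test_scope({"data1": "A"}, links, "A", ["B", "C"]))                  # True
print(in_test_scope({"data1": "A"}, links | {("B", "C")}, "A", ["B", "C"]))   # False
```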

Comment 12 Rachana Patel 2017-01-25 18:29:57 UTC
(In reply to Federico Lucifredi from comment #11)
> That is correct - mirroring is from a primary to a secondary, and not
> between secondaries.

Please let us know whether the following is a valid/expected use case:
Pool data1: Site A is primary; Site B and Site C are secondary sites for mirroring.
At the same time, for pool Data2: Site B is primary; Site C and Site A are secondary sites.

Comment 15 Federico Lucifredi 2017-02-21 00:34:16 UTC
Rachana, the use case described in #12 is not currently supported, but it will be in future releases.

One primary with one secondary is the key use case for mirroring at the moment. One primary with multiple secondaries is supposed to work, and would be interesting to test. The intersecting primary/secondary pools in #12 are interesting but not a testing priority.

Comment 35 Vasishta 2018-11-09 06:22:48 UTC
All planned test cases completed successfully (no blockers).
Moving BZ to VERIFIED state.

Regards,
Vasishta shastry
QE, Ceph

Comment 37 errata-xmlrpc 2019-01-03 19:01:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020

