User Story: As a storage admin, I want to know that RBD mirroring for multiple secondary sites with one-way synchronization (i.e. one site is primary for all images) is production ready. Content Plan Reference: https://docs.google.com/document/d/1Nxnh6XxpTiDO2TANEw5pvXZ0nYUwf36zTaqxCm0014w/edit#heading=h.nh8311opzbco
A small refinement to the example commands: in steps 3 and 4 of 'Configuring Image Mirroring' (one-way mirroring), the example command in the second code block of each step needs the pool name 'data' appended.
Hi John, I think we need to address the disaster recovery scenario for multi-secondary setups in section "4.7. Recovering from a Disaster". I've copied you some info in a mail thread; please let me know if you need anything else. Moving back to ASSIGNED state. Regards, Vasishta Shastry QE, Ceph
(In reply to Vasishta from comment #10)
> Hi John,
>
> I think we need to address disaster recovery scenario for multi-secondary
> scenario in section "4.7. Recovering from a Disaster".
>
> I've copied you some info in a mail thread, please let me know if you need
> anything else, Moving back to ASSIGNED state.
>
>
> Regards,
> Vasishta Shastry
> QE, Ceph

Thanks Vasishta. I tried to come up with some instructions for how to do this based on the information in the email. [1] I substituted cluster names "local" for DC1 and "remote" for DC2. Please review them and provide suggestions/corrections as needed.

1) Add the new primary (remote) as peer on the original primary (local).

   $ rbd mirror pool peer add data client.remote@remote --cluster local

   ^^ I am not sure the "order" of this command is correct. It's not intuitive to me. Is it right?

2) Demote the image on local if it's still listed as primary

   a. To get status of a mirrored image:

      rbd mirror image status <pool-name>/<image-name>

      Example

      To get the status of the image2 image in the data pool:

      $ rbd mirror image status data/image2
      image2:
        global_id: 2c928338-4a86-458b-9381-e68158da8970
        state: up+replaying
        description: replaying, master_position=[object_number=6, tag_tid=2, entry_tid=22598], mirror_position=[object_number=6, tag_tid=2, entry_tid=29598], entries_behind_master=0
        last_update: 2016-04-28 18:47:39

   b. If the state is not up+replaying, demote the image to non-primary:

      ^^ Is up+replaying equivalent to "primary?" If so, what would it say if it wasn't primary?

      rbd mirror image demote <pool-name>/<image-name>

      Example

      To demote the image2 image in the data pool:

      $ rbd mirror image demote data/image2

3) Initiate a full resync from the new primary (remote)

   To request a resynchronization to the primary image:

   rbd mirror image resync <pool-name>/<image-name>

   Example

   To request resynchronization of the image2 image in the data pool:

   $ rbd mirror image resync data/image2

4) Once the resync is complete, demote the image on remote and promote it on local.

   a. Demote the image to non-primary:

      rbd mirror image demote <pool-name>/<image-name>

      Example

      To demote the image2 image in the data pool:

      $ rbd mirror image demote data/image2

   b. Promote the image to primary:

      rbd mirror image promote <pool-name>/<image-name>

      Example

      To promote the image2 image in the data pool:

      $ rbd mirror image promote data/image2

[1] "You won't be able to perform a traditional failback. Instead, after the failover from DC1 to DC2 or DC3, you would add the new primary DC (DC2 or 3) as peer on DC1, demote the image on DC1 (if it's still listed as primary), and initiate a full resync from the new primary DC (DC2 or 3). Once the resync is complete, you can demote the image on DC2 or 3 and promote it on DC1." - Jason Dillaman
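For reference, the failback steps above could be sketched as a small shell script that names the cluster on every command. This is only a sketch under assumptions that this bug does not confirm: the cluster names "local" and "remote", the pool "data", and the image "image2" come from the examples above, while the `run` wrapper, the `DRY_RUN` variable, and the two function names are hypothetical helpers introduced here. Which cluster each command should target is my reading of the quoted email, not a verified procedure.

```shell
#!/bin/sh
# Sketch of the multi-secondary failback sequence with an explicit --cluster.
# Assumed names: clusters "local" (original primary) and "remote" (new primary),
# pool "data", image "image2". DRY_RUN=1 (the default) only prints the commands.
POOL="${POOL:-data}"
IMAGE="${IMAGE:-image2}"
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-3: peer, demote, and request a full resync on the original primary.
failback_prepare() {
    # 1) Add the new primary (remote) as a peer on the original primary (local).
    run rbd mirror pool peer add "$POOL" client.remote@remote --cluster local
    # 2) Demote the image on local if it is still listed as primary.
    run rbd mirror image demote "$POOL/$IMAGE" --cluster local
    # 3) Request a full resync of the local copy from the new primary.
    run rbd mirror image resync "$POOL/$IMAGE" --cluster local
}

# Step 4: run this only after the resync has completed.
failback_finish() {
    # 4a) Demote the image on the new primary (remote).
    run rbd mirror image demote "$POOL/$IMAGE" --cluster remote
    # 4b) Promote the image on the original primary (local).
    run rbd mirror image promote "$POOL/$IMAGE" --cluster local
}
```

Calling `failback_prepare` and then `failback_finish` with the default `DRY_RUN=1` prints the five commands in order without touching any cluster. The sequence is split into two functions because step 4 must wait for the resync; the sketch does not poll for completion itself.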
(In reply to John Brier from comment #11)

Hi John,

(i) All the steps you have come up with are for failback. Can we add a note in the Failover description regarding what a user with multiple secondaries needs to do?

(ii) Prior to these steps we need to ask users to get the rbd-mirror daemon up on local. For that we need to ask users to follow Steps 2, 5, and 6 on local as well. Is it okay to add a note?

>
> 1) Add the new primary (remote) as peer on the original primary (local).
>
> $ rbd mirror pool peer add data client.remote@remote --cluster local
>
> ^^ I am not sure the "order" of this command is correct. It's not intuitive
> to me. Is it right?

The order worked for me, so I think it is okay.

>
> 2) Demote the image on local if it's still listed as primary
>
> a. To get status of a mirrored image:
>
> rbd mirror image status <pool-name>/<image-name>
>
> Example
>
> To get the status of the image2 image in the data pool:
>
> $ rbd mirror image status data/image2
> image2:
> global_id: 2c928338-4a86-458b-9381-e68158da8970
> state: up+replaying
> description: replaying, master_position=[object_number=6, tag_tid=2,
> entry_tid=22598], mirror_position=[object_number=6, tag_tid=2,
> entry_tid=29598], entries_behind_master=0
> last_update: 2016-04-28 18:47:39
>
>
> b. If the state is not up+replaying, demote the image to non-primary:
>
> ^^ Is up+replaying equivalent to "primary?" If so, what would it say if it
> wasn't primary?
>

To check whether an image is primary or not, I think the appropriate way would be to check 'rbd info <image-spec>' ('$ rbd info data/image2' in this case). up+replaying is not equivalent to primary.

> rbd mirror image demote <pool-name>/<image-name>
>
> Example
>
> To demote the image2 image in the data pool:
>
> $ rbd mirror image demote data/image2
>
> 3) initiate a full resync from the new primary (remote)
>
> To request a resynchronization to the primary image:
>
> rbd mirror image resync <pool-name>/<image-name>
>
> Example
>
> To request resynchronization of the image2 image in the data pool:
>
> $ rbd mirror image resync data/image2
>
> 4) Once the resync is complete, demote the image on remote and promote it on
> local.
>
> a. Demote the image to non-primary:
>
> rbd mirror image demote <pool-name>/<image-name>
>
> Example
>
> To demote the image2 image in the data pool:
>
> $ rbd mirror image demote data/image2
>
> b. Promote the image to primary:
>
> rbd mirror image promote <pool-name>/<image-name>
>
> Example
>
> To promote the image2 image in the data pool:
>
> $ rbd mirror image promote data/image2

(iii) After these steps we need to ask users to resync the image on the 2nd secondary (the one which was not promoted during failover).

(iv) In the steps you have formulated, can we add cluster details (local or primary)? I think it will avoid confusion for users.

Please let me know if you need more info on any of my requests/answers from (i) to (iv).

Regards,
Vasishta Shastry
QE, Ceph
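The 'rbd info' check suggested above could be scripted roughly as follows. This is a hedged sketch, not a tested procedure: it assumes that 'rbd info' prints a line of the form "mirroring primary: true" or "mirroring primary: false" for a mirrored image, and the helper name `is_primary` is hypothetical.

```shell
#!/bin/sh
# is_primary: read `rbd info` output on stdin and succeed only if it contains
# a "mirroring primary: true" line. The exact output format is an assumption.
is_primary() {
    grep -Eq '^[[:space:]]*mirroring primary:[[:space:]]*true[[:space:]]*$'
}

# Possible usage (cluster name "local" is taken from the steps above):
#   if rbd info data/image2 --cluster local | is_primary; then
#       rbd mirror image demote data/image2 --cluster local
#   fi
```

The helper reads from stdin rather than calling rbd itself, so it can be tried against saved 'rbd info' output before being pointed at a live cluster.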
*** Bug 1416136 has been marked as a duplicate of this bug. ***
Looks good to me. Thank you John and Jason. Moving to VERIFIED state. Regards, Vasishta Shastry QE, Ceph
Published on the customer portal as part of the RHCS 3.2 GA on 3rd Jan 2019