Bug 1495521 - Rbd-mirror: Re-sync request against the "master" cluster failed to delete the image and sync the image from the "slave" cluster.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD-Mirror
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 3.0
Assignee: Jason Dillaman
QA Contact: Parikshith
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-26 09:02 UTC by Parikshith
Modified: 2017-12-05 23:45 UTC (History)

Fixed In Version: RHEL: ceph-12.2.1-7.el7cp Ubuntu: ceph_12.2.1-10redhat1xenial
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-05 23:45:31 UTC
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker 21559 | 0 | None | None | None | 2017-09-26 18:00:22 UTC
Red Hat Product Errata RHBA-2017:3387 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 3.0 bug fix and enhancement update | 2017-12-06 03:03:45 UTC

Description Parikshith 2017-09-26 09:02:10 UTC
Description of problem:


Version-Release number of selected component (if applicable):
12.2.0-2.el7cp

How reproducible:


Steps to Reproduce:
1. Wrote a file to the image on master, with a 10-minute delay set on slave.
2. Before the delay expired, the master went down abruptly (stopped the mirror service).
3. The data written on master had not yet been synced to slave.

image-2:
  global_id:   97551f32-2ac3-4df7-93a4-95a73d93b3e8
  state:       up+replaying
  description: replaying, master_position=[object_number=7, tag_tid=2, entry_tid=25599], mirror_position=[], entries_behind_master=25602
  last_update: 2017-09-23 16:15:02

4. Force-promoted the slave to become primary.

5. After 10 minutes, the data was synced to slave (entries_behind_master became 0).
    During this time the image was in the 'up+replaying' state; after the delay, the state changed to 'up+stopped' (description: force promoted).

image-2:
  global_id:   97551f32-2ac3-4df7-93a4-95a73d93b3e8
  state:       up+replaying
  description: replaying, master_position=[object_number=7, tag_tid=2, entry_tid=25599], mirror_position=[object_number=7, tag_tid=2, entry_tid=25599], entries_behind_master=0
  last_update: 2017-09-23 16:25:54
  
6. Brought the master back up, demoted it, and ran a re-sync.

$rbd mirror image status data/image-2 --cluster master
image-2:
  global_id:   f84a899f-909e-4e23-9428-ee31c5ca14fa
  state:       up+replaying
  description: replaying, master_position=[object_number=3, tag_tid=3, entry_tid=3], mirror_position=[object_number=3, tag_tid=3, entry_tid=3], entries_behind_master=0
  last_update: 2017-09-23 16:27:58

7. Checked the size of the images on both clusters (a weird snapshot was created).

Master:
$sudo rbd du -p data --cluster master
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME                                                                                          PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421       1024M 152M
image-2                                                                                             1024M    0
<TOTAL>                                                                                             1024M 152M

slave:
$sudo rbd du -p data --cluster slave
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME                                                                                          PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421       1024M    0
image-2                                                                                             1024M    0
<TOTAL>                                                                                             1024M    0
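
The steps above do not show the commands used for the force promote (step 4) or the demote and re-sync (step 6). A minimal sketch of what they presumably looked like, assuming pool `data`, image `image-2`, and the cluster names `master` and `slave` seen in the outputs. The function names are hypothetical helpers, and `entries_behind` only extracts the `entries_behind_master` value from the status text:

```shell
#!/bin/sh
# Sketch only: pool, image, and cluster names are taken from the outputs above;
# the exact commands run by the reporter are an assumption.
POOL=data
IMAGE=image-2

# Print entries_behind_master from `rbd mirror image status` text on stdin.
entries_behind() {
    sed -n 's/.*entries_behind_master=\([0-9][0-9]*\).*/\1/p'
}

# Step 4 (assumed): force-promote the image on the slave cluster.
force_promote_slave() {
    rbd mirror image promote --force "$POOL/$IMAGE" --cluster slave
}

# Step 6 (assumed): demote the image on the recovered master,
# then request a re-sync of that image on the master.
demote_and_resync_master() {
    rbd mirror image demote "$POOL/$IMAGE" --cluster master
    rbd mirror image resync "$POOL/$IMAGE" --cluster master
}
```

For example, `rbd mirror image status data/image-2 --cluster master | entries_behind` would print the current lag as a plain number.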

Actual results:
After the re-sync on the "master" cluster, it failed to delete the image and sync the image from the "slave" cluster.

Expected results:
Primary and secondary images should be the same size.
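
One way to check this expectation is to compare the USED value of the `<TOTAL>` row in the `rbd du` output from each cluster. This is only a sketch: the helper names are hypothetical, and it assumes one row per line with each size printed as a single token (such as `152M`), as in the output above.

```shell
#!/bin/sh
# Print the USED value of the <TOTAL> row from `rbd du` text on stdin.
# Assumes each size is a single whitespace-free token, e.g. 152M.
du_total_used() {
    awk '$1 == "<TOTAL>" { print $3 }'
}

# Succeed only when both `rbd du` outputs (passed as strings) report
# the same, non-empty total USED value.
same_total_used() {
    a=$(printf '%s\n' "$1" | du_total_used)
    b=$(printf '%s\n' "$2" | du_total_used)
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With the outputs captured as `m=$(sudo rbd du -p data --cluster master)` and `s=$(sudo rbd du -p data --cluster slave)`, the call `same_total_used "$m" "$s"` would fail for the results reported here (152M versus 0).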

Additional info:

Comment 2 Jason Dillaman 2017-09-26 20:04:23 UTC
Upstream master branch PR: https://github.com/ceph/ceph/pull/17979

Comment 13 errata-xmlrpc 2017-12-05 23:45:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387

