Description of problem:

Version-Release number of selected component (if applicable):
12.2.0-2.el7cp

How reproducible:

Steps to Reproduce:
1. A file is written to master, with a delay set on slave (10 mins).
2. Before the delay expires, the master goes down abruptly (stopped the mirror service).
3. The data written on master is not yet synced to slave.

image-2:
  global_id:   97551f32-2ac3-4df7-93a4-95a73d93b3e8
  state:       up+replaying
  description: replaying, master_position=[object_number=7, tag_tid=2, entry_tid=25599], mirror_position=[], entries_behind_master=25602
  last_update: 2017-09-23 16:15:02

4. Force promoted the slave to become primary.
5. After 10 minutes, the data got synced on slave (entries_behind_master became 0). During this time it was in 'up+replaying' state; after the delay the state changed to 'up+stopped' (description: force promoted).

image-2:
  global_id:   97551f32-2ac3-4df7-93a4-95a73d93b3e8
  state:       up+replaying
  description: replaying, master_position=[object_number=7, tag_tid=2, entry_tid=25599], mirror_position=[object_number=7, tag_tid=2, entry_tid=25599], entries_behind_master=0
  last_update: 2017-09-23 16:25:54

6. Brought back the master, demoted it and did a re-sync.

$ rbd mirror image status data/image-2 --cluster master
image-2:
  global_id:   f84a899f-909e-4e23-9428-ee31c5ca14fa
  state:       up+replaying
  description: replaying, master_position=[object_number=3, tag_tid=3, entry_tid=3], mirror_position=[object_number=3, tag_tid=3, entry_tid=3], entries_behind_master=0
  last_update: 2017-09-23 16:27:58

7. Checked the size of the images on both clusters (a weird snapshot was created).

Master:
$ sudo rbd du -p data --cluster master
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME                                                                                          PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421       1024M 152M
image-2                                                                                             1024M    0
<TOTAL>                                                                                             1024M 152M

Slave:
$ sudo rbd du -p data --cluster slave
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME                                                                                          PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421       1024M    0
image-2                                                                                             1024M    0
<TOTAL>                                                                                             1024M    0

Actual results:
After the re-sync, the "master" cluster failed to delete the image and sync the image from the "slave" cluster.

Expected results:
Primary and secondary images should have the same size.

Additional info:
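The size check in step 7 can be scripted. The following is only a minimal sketch, not part of the original report: it assumes the `rbd du` text layout shown above (three whitespace-separated columns, sizes such as `1024M` or `0`), and the helper names `parse_size`, `parse_du` and `used_mismatches` are hypothetical. Other Ceph releases may print sizes differently, in which case the parser would need adjusting.

```python
# Sketch: compare the USED column of two `rbd du` outputs captured as text.
# Assumes the three-column layout shown in this report; helper names are hypothetical.

MASTER_DU = """\
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421 1024M 152M
image-2 1024M 0
<TOTAL> 1024M 152M
"""

SLAVE_DU = """\
warning: fast-diff map is not enabled for image-2. operation may be slow.
NAME PROVISIONED USED
image-2@.rbd-mirror.bc392583-5662-4721-8726-55573351bd8f.a96731e9-d9a0-4c7b-8b96-e844e8502421 1024M 0
image-2 1024M 0
<TOTAL> 1024M 0
"""

# Binary multipliers are an assumption about how rbd abbreviates sizes.
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '1024M' or '0' to a number of bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(float(text[:-1]) * UNITS[suffix])
    return int(float(text))


def parse_du(output):
    """Return {name: used_bytes} for every data row of an `rbd du` listing."""
    used = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines, the warning line and the column header.
        if len(parts) != 3 or parts[0] == "NAME":
            continue
        name, _provisioned, used_col = parts
        used[name] = parse_size(used_col)
    return used


def used_mismatches(master_output, slave_output):
    """Return {name: (master_used, slave_used)} for rows whose USED values differ."""
    master = parse_du(master_output)
    slave = parse_du(slave_output)
    return {
        name: (master.get(name), slave.get(name))
        for name in sorted(set(master) | set(slave))
        if master.get(name) != slave.get(name)
    }


if __name__ == "__main__":
    for name, (m, s) in used_mismatches(MASTER_DU, SLAVE_DU).items():
        print(f"{name}: master={m} slave={s}")
```

Fed with the two listings from this report, the comparison flags the `.rbd-mirror` snapshot row and the `<TOTAL>` row, which are the rows where the USED column differs between the clusters.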
Upstream master branch PR: https://github.com/ceph/ceph/pull/17979
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387