Description of problem:
=======================
If files are deleted from the slave volume after the session between the master and slave volumes has been deleted, those files will never sync again after the session is recreated. This is because we maintain information on the master about the files that have already been synced.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.6.0.53-1.el6rhs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create and start a geo-rep session between the master and slave volumes.
2. Create data on the master volume.
3. Let geo-rep sync the data to the slave volume.
4. Once the data is synced to the slave volume, stop and delete the session between master and slave.
5. Delete the files from the slave volume.
6. Re-create and start the session between the master and slave volumes.
7. The files that were deleted from the slave volume do not get synced from the master.

A minimal CLI sketch of the reproduction follows below.
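For reference, a minimal sketch of the reproduction with the gluster CLI; the volume names (mastervol, slavevol), slave host (slavenode), and mount point (/mnt/slave) are placeholders not taken from this report, and passwordless SSH between the nodes is assumed to be set up already:

  # step 1: create and start the session
  gluster volume geo-replication mastervol slavenode::slavevol create push-pem
  gluster volume geo-replication mastervol slavenode::slavevol start

  # step 4: after the initial sync completes, stop and delete the session
  gluster volume geo-replication mastervol slavenode::slavevol stop
  gluster volume geo-replication mastervol slavenode::slavevol delete

  # steps 5-6: remove the files on the slave mount, then recreate and restart
  rm -rf /mnt/slave/*
  gluster volume geo-replication mastervol slavenode::slavevol create push-pem force
  gluster volume geo-replication mastervol slavenode::slavevol start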
As part of the geo-rep delete command, we should remove the stime xattrs from the master brick roots, so that on re-creation syncing starts from the beginning.
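For illustration, a hedged sketch of inspecting that sync-time marker on a master brick root; the brick path is a placeholder, and the xattr key format (trusted.glusterfs.<master-vol-uuid>.<slave-vol-uuid>.stime) is an assumption based on the geo-rep design, not something stated in this report:

  # dump any stime xattrs stored on the brick root (example brick path)
  getfattr -d -m 'trusted.glusterfs.*.stime' -e hex /bricks/brick0/mastervol
  # expected output shape (hypothetical):
  # trusted.glusterfs.<master-vol-uuid>.<slave-vol-uuid>.stime=0x...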
Milind, please consider this scenario while working on the stime reset patch: https://bugzilla.redhat.com/show_bug.cgi?id=1329675#c2
Patch http://review.gluster.org/14051 has been posted upstream (mainline) for review.
Added a new option to the delete command to reset the sync time (reset-sync-time).
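With the new option, stopping and deleting the session while clearing the stored sync time looks like this (volume and host names are placeholders, as above):

  gluster volume geo-replication mastervol slavenode::slavevol stop
  gluster volume geo-replication mastervol slavenode::slavevol delete reset-sync-time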
Upstream mainline: http://review.gluster.org/14051
Upstream 3.8: http://review.gluster.org/14953

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.
Verified with build: glusterfs-geo-replication-3.8.4-13.el7rhgs.x86_64

It worked for the data that was initially synced via changelog, but failed for the data that was synced via xsync.

Steps Tested:
=============
1. Create the master and slave cluster/volume.
2. Create a geo-rep session between master and slave.
3. Create some data on the master:
   crefi -T 10 -n 10 --multi -d 5 -b 5 --random --max=5K --min=1K --f=create /mnt/master/
   AND,
   mkdir data; cd data; for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1; done
4. Let the data be synced to the slave.
5. Stop and delete the geo-rep session using reset-sync-time.
6. Remove the data created by crefi from the slave mount.
7. Append data on the master to the files in data.
8. Recreate the geo-rep session using force.
9. Start the geo-rep session.
   Files get synced to the slave properly and arequal matches.
10. Stop and delete the geo-rep session again using reset-sync-time.
11. Remove the complete data from the slave (rm -rf *).
12. Recreate the geo-rep session using force.
13. Start the geo-rep session.
    Only the root directories are synced; no subdirectories/files get synced.

Master:
=======
[root@dj ~]# ./scripts/arequal-checksum -p /mnt/master/

Entry counts
Regular files   : 3821
Directories     : 264
Symbolic links  : 0
Other           : 0
Total           : 4085

Metadata checksums
Regular files   : 489009
Directories     : 3e9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 8960ba9adedccfccf73a8f5024a4d980
Directories     : 4a40163964221b39
Symbolic links  : 0
Other           : 0
Total           : 341a23f39e5a0d75
[root@dj ~]#

Slave:
======
[root@dj ~]# ls -lR /mnt/slave/
/mnt/slave/:
total 44
drwxr-xr-x. 2 root root 4096 Feb 13 22:25 data
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread0
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread1
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread2
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread3
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread4
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread5
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread6
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread7
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread8
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread9

/mnt/slave/data:
total 0

/mnt/slave/thread0:
total 0

/mnt/slave/thread1:
total 0

/mnt/slave/thread2:
total 0

/mnt/slave/thread3:
total 0

/mnt/slave/thread4:
total 0

/mnt/slave/thread5:
total 0

/mnt/slave/thread6:
total 0

/mnt/slave/thread7:
total 0

/mnt/slave/thread8:
total 0

/mnt/slave/thread9:
total 0
[root@dj ~]#

Since it is not syncing, moving the bug back to the assigned state.
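To narrow down failures like this one, a way to see which crawl each worker used (changelog vs. hybrid/xsync) is the geo-rep status detail output; a hedged sketch, with the same placeholder volume and host names as above:

  # the CRAWL STATUS column distinguishes Changelog Crawl, History Crawl, and Hybrid Crawl
  gluster volume geo-replication mastervol slavenode::slavevol status detail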
Upstream patch: https://review.gluster.org/#/c/16629
Upstream Patch:
https://review.gluster.org/#/c/16629/ (master)
https://review.gluster.org/#/c/16641/ (3.8)
https://review.gluster.org/#/c/16642/ (3.9)
https://review.gluster.org/#/c/16644/ (3.10)

Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/97943/
Verified with build: glusterfs-geo-replication-3.8.4-15.el7rhgs.x86_64

The scenario mentioned in comment 10 works; moving this bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html