Bug 1205162
| Summary: | [georep]: If a geo-rep session is recreated, existing files that were deleted from the slave do not get synced again from the master | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | rhgs-3.0 | CC: | amukherj, avishwan, chrisw, csaba, khiremat, mchangir, nlevinki, rcyriac, sankarshan, sarumuga |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.2.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-15 | Doc Type: | Bug Fix |
| Doc Text: | When a geo-replication session was deleted, the sync time attribute on the root directory of the brick was not reset to zero. This meant that when a new geo-replication session was created, the stale sync time attribute caused the sync process to ignore all files created up until the stale sync time and to start syncing from that time. A new reset-sync-time option has been added to the session delete command so that administrators can reset the sync time attribute to zero if required. | Story Points: | --- |
| Clone Of: | | | |
| : | 1311926 (view as bug list) | Environment: | |
| Last Closed: | 2017-03-23 05:21:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1422760 | | |
| Bug Blocks: | 1311926, 1351522, 1351530, 1357772, 1357773 | | |
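For illustration, a minimal usage sketch of the reset-sync-time option described in the Doc Text above. The volume name mastervol, slave host slavehost, and slave volume slavevol are hypothetical, and a session has to be stopped before it can be deleted:

# Stop the session, then delete it and clear the per-brick sync-time markers
gluster volume geo-replication mastervol slavehost::slavevol stop
gluster volume geo-replication mastervol slavehost::slavevol delete reset-sync-time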
Description
Rahul Hinduja
2015-03-24 11:07:12 UTC
As part of the geo-rep delete command, we should remove the stime xattrs from the master brick roots, so that on re-creation the session starts syncing from the beginning. Milind, please consider this scenario while working on the stime reset patch: https://bugzilla.redhat.com/show_bug.cgi?id=1329675#c2

Patch http://review.gluster.org/14051 has been posted upstream (mainline) for review. It adds a new option to the delete command to reset the sync time (reset-sync-time).

Upstream mainline: http://review.gluster.org/14051
Upstream 3.8: http://review.gluster.org/14953

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.

Verified with build: glusterfs-geo-replication-3.8.4-13.el7rhgs.x86_64
It worked for the data that was initially synced via changelog, but failed for the data that was synced via xsync.
Steps Tested:
=============
1. Create Master and Slave cluster/volume
2. Create geo-rep session between master and slave
3. Create some data on master:
crefi -T 10 -n 10 --multi -d 5 -b 5 --random --max=5K --min=1K --f=create /mnt/master/
AND,
mkdir data; cd data ; for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; done
4. Let the data be synced to slave.
5. Stop and delete the geo-rep session using reset-sync-time (see the command sketch after these steps)
6. Remove the data created by crefi from the slave mount
7. Append data on the master to the files under the data directory
8. Recreate geo-rep session using force
9. Start the geo-rep session
Files get synced properly to the slave and the arequal checksums match.
10. Stop and delete the geo-rep session again using reset-sync-time
11. Remove all of the data from the slave (rm -rf *)
12. Recreate geo-rep session using force
13. Start the geo-rep session
Only the root-level directories are synced; no subdirectories or files get synced.
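The session teardown and recreation in steps 5 and 10–13 correspond roughly to the following commands (a sketch; the volume name mastervol, slave host slavehost, and slave volume slavevol are hypothetical):

# Stop the running session, then delete it and clear the per-brick stime markers
gluster volume geo-replication mastervol slavehost::slavevol stop
gluster volume geo-replication mastervol slavehost::slavevol delete reset-sync-time

# Recreate the session (force, since the slave already contains data) and start it again
gluster volume geo-replication mastervol slavehost::slavevol create push-pem force
gluster volume geo-replication mastervol slavehost::slavevol start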
Master:
=======
[root@dj ~]# ./scripts/arequal-checksum -p /mnt/master/
Entry counts
Regular files : 3821
Directories : 264
Symbolic links : 0
Other : 0
Total : 4085
Metadata checksums
Regular files : 489009
Directories : 3e9
Symbolic links : 3e9
Other : 3e9
Checksums
Regular files : 8960ba9adedccfccf73a8f5024a4d980
Directories : 4a40163964221b39
Symbolic links : 0
Other : 0
Total : 341a23f39e5a0d75
[root@dj ~]#
Slave:
======
[root@dj ~]# ls -lR /mnt/slave/
/mnt/slave/:
total 44
drwxr-xr-x. 2 root root 4096 Feb 13 22:25 data
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread0
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread1
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread2
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread3
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread4
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread5
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread6
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread7
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread8
drwxr-xr-x. 2 root root 4096 Feb 13 22:19 thread9
/mnt/slave/data:
total 0
/mnt/slave/thread0:
total 0
/mnt/slave/thread1:
total 0
/mnt/slave/thread2:
total 0
/mnt/slave/thread3:
total 0
/mnt/slave/thread4:
total 0
/mnt/slave/thread5:
total 0
/mnt/slave/thread6:
total 0
/mnt/slave/thread7:
total 0
/mnt/slave/thread8:
total 0
/mnt/slave/thread9:
total 0
[root@dj ~]#
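For reference, the comparison above can be reproduced by running the same arequal script on both mounts (mount points as used in this report):

./scripts/arequal-checksum -p /mnt/master/ > /tmp/master.arequal
./scripts/arequal-checksum -p /mnt/slave/  > /tmp/slave.arequal
diff /tmp/master.arequal /tmp/slave.arequal    # no output means master and slave match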
Since it is not syncing, moving the bug back to the assigned state.
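For context, the sync position that geo-replication resumes from is recorded as an stime extended attribute on each master brick root, and the reset-sync-time delete is expected to clear it. A sketch of how to inspect it (the brick path is hypothetical, and the exact xattr name encodes the master and slave volume UUIDs, so it differs per setup):

# Dump all trusted.* xattrs of a master brick root in hex (run as root on the brick host)
getfattr -d -m . -e hex /rhgs/brick1/mastervol
# Look for an entry of the form:
#   trusted.glusterfs.<master-vol-uuid>.<slave-vol-uuid>.stime=0x...
# After "delete reset-sync-time", this marker should no longer point at a stale
# timestamp, so a recreated session crawls the volume from the beginning.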
Upstream Patch:
https://review.gluster.org/#/c/16629/ (master)
https://review.gluster.org/#/c/16641/ (3.8)
https://review.gluster.org/#/c/16642/ (3.9)
https://review.gluster.org/#/c/16644/ (3.10)

Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/97943/

Verified with build: glusterfs-geo-replication-3.8.4-15.el7rhgs.x86_64

The scenario mentioned in comment 10 works; moving this bug to the verified state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html