Bug 1125918

Summary: Dist-geo-rep: geo-rep fails to do the first history crawl after the volume is restored from a snapshot.
Product: [Community] GlusterFS Reporter: Kotresh HR <khiremat>
Component: geo-replication Assignee: Kotresh HR <khiremat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: mainline CC: aavati, avishwan, bugs, csaba, david.macdonald, gluster-bugs, khiremat, nlevinki, nsathyan, ssamanta
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.7.0beta1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1108588 Environment:
Last Closed: 2015-05-14 17:26:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1108588    
Bug Blocks:    

Description Kotresh HR 2014-08-01 10:49:38 UTC
+++ This bug was initially created as a clone of Bug #1108588 +++

Description of problem: geo-rep fails to do the first history crawl after the volume is restored from a snapshot.

Log snippet from change.log in the working_dir:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-06-12 07:08:33.017140] I [gf-changelog.c:186:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-764717cae0147062b6b692f6b5d13a72.sock (brick: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3)
[2014-06-12 07:08:33.017203] W [gf-changelog.c:196:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
[2014-06-12 07:08:33.017277] I [gf-changelog.c:216:gf_changelog_notification_init] 0-glusterfs: connection successful
[2014-06-12 07:08:44.070462] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556924
[2014-06-12 07:08:59.098946] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556939
[2014-06-12 07:09:14.152214] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556954
[2014-06-12 07:09:29.169742] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556969
[2014-06-12 07:09:44.177457] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556984
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

which should actually look like:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-06-12 08:58:37.878532] I [gf-changelog.c:186:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-764717cae0147062b6b692f6b5d13a72.sock (brick: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3)
[2014-06-12 08:58:37.878598] W [gf-changelog.c:196:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
[2014-06-12 08:58:37.878709] I [gf-changelog.c:216:gf_changelog_notification_init] 0-glusterfs: connection successful
[2014-06-12 08:58:38.098239] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402553166, MAX: 1402554141, TOTAL CHANGELOGS: 64
[2014-06-12 08:58:38.098341] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402554151, MAX: 1402556735, TOTAL CHANGELOGS: 173
[2014-06-12 08:58:38.098381] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402556756, MAX: 1402563505, TOTAL CHANGELOGS: 450
[2014-06-12 08:58:38.098476] I [gf-history-changelog.c:864:gf_history_changelog] 0-glusterfs: FINAL: from: 1402563460, to: 1402563505, changes: 4
[2014-06-12 08:58:38.099505] D [gf-history-changelog.c:286:gf_history_changelog_scan] 0-glusterfs: hist_done 1, is_last_scan: 0
[2014-06-12 08:58:39.099858] D [gf-history-changelog.c:286:gf_history_changelog_scan] 0-glusterfs: hist_done 0, is_last_scan: 1
[2014-06-12 08:58:40.475710] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402563520
[2014-06-12 08:58:55.765644] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402563535

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>



Version-Release number of selected component (if applicable):


How reproducible: Didn't try to reproduce.


Steps to Reproduce:
1. create and start a geo-rep relationship between master and slave.
2. create 10K files on master, "crefi -T 10 -n 10 --multi -b 10 -d 10 --random --min=1K --max=10K /mnt/master/"
3. let it complete the creation, and pause geo-rep before it syncs all the data.
4. Take the snapshot on the slave first and then on the master, and resume geo-rep.
5. create some more data on master and let it sync to slave.
6. stop geo-rep and also master and slave volumes.
7. restore the snapshotted volumes.
8. start the master and slave volumes and also start geo-rep.
9. check if geo-rep used the history crawl to sync data.


Actual results: geo-rep failed to do the first history crawl after the snapshots were restored.


Expected results: geo-rep should do the first history crawl after the snapshots are restored.


Additional info:

Comment 1 Kotresh HR 2014-08-01 10:50:51 UTC
Cause: 
On geo-rep start, when stime is requested, -1 is returned because stime turns out to be less than the recorded first geo-rep start time; as a result, no history crawl is attempted and a complete xsync crawl is done instead. The geo-rep start time is recorded the very first time a geo-rep session is started for that master with any of its slaves, and it is fetched by doing a getxattr for 'trusted.glusterfs.volume-mark' on the master's gluster root.
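
The decision itself boils down to a timestamp comparison. Below is a minimal C sketch of that rule; the names (choose_crawl_mode, volmark_time) are hypothetical and this is not the actual geo-rep worker code, it only illustrates that an stime older than the recorded geo-rep start time forces a full xsync instead of a history crawl.

#include <time.h>

typedef enum { CRAWL_HISTORY, CRAWL_XSYNC } crawl_mode_t;

/* stime        : time up to which changes have already been synced
 * volmark_time : "first geo-rep start time", i.e. the timestamp reported
 *                through trusted.glusterfs.volume-mark (marker.tstamp)  */
static crawl_mode_t
choose_crawl_mode (time_t stime, time_t volmark_time)
{
        /* An stime older than the recorded geo-rep start time looks like
         * "never synced", so fall back to a full filesystem crawl (xsync);
         * otherwise the changelogs since stime can be replayed through
         * the history API. */
        if (stime < volmark_time)
                return CRAWL_XSYNC;

        return CRAWL_HISTORY;
}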

Analysis:
Apart from the 'trusted.glusterfs.volume-mark.<uuid>' xattr, the 'trusted.glusterfs.volume-mark' xattr is also maintained virtually. Virtual in the sense that the xattr is not actually set on the gluster root; instead, marker generates it on the fly whenever it is requested.

'trusted.glusterfs.volume-mark' holds a combination of 'node uuid', 'return value', 'geo-rep start time', and other fields as its value.

On geo-rep start, marker creates the 'marker.tstamp' file if indexing is turned on.
On a getxattr for the key 'trusted.glusterfs.volume-mark' on the gluster root, marker
stats the 'marker.tstamp' file and returns its modification time along with the other fields.
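
As a rough illustration, a getxattr for this key could be served as sketched below. The struct layout and the helper name are assumptions for illustration only, not the marker xlator's actual format; the point is that the 'geo-rep start time' it reports is nothing more than a timestamp read off marker.tstamp with stat().

#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Illustrative value layout for trusted.glusterfs.volume-mark; the real
 * on-wire format produced by the marker xlator differs. */
struct volume_mark {
        uint8_t  uuid[16];      /* node uuid                */
        int32_t  retval;        /* return value             */
        uint32_t sec;           /* geo-rep start time (sec) */
        uint32_t usec;
};

static int
fill_volume_mark (const char *tstamp_path, const uint8_t *node_uuid,
                  struct volume_mark *mark)
{
        struct stat st;

        if (stat (tstamp_path, &st) != 0)
                return -1;

        memcpy (mark->uuid, node_uuid, 16);
        mark->retval = 0;

        /* The timestamp of marker.tstamp is what encodes the first
         * geo-rep start time; the fix discussed below makes this depend
         * on mtime (which can be preserved) instead of ctime. */
        mark->sec  = (uint32_t) st.st_mtime;
        mark->usec = 0;

        return 0;
}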

On snapshot creation, the brick volfiles are freshly created, and hence the marker.tstamp file is too. So the marker.tstamp file records the snapshot creation time instead of the first geo-rep start time. On geo-rep start after the master volume is restored, when stime is requested, it is found to be less than this newly reset "first geo-rep start time", so -1 is returned and the history crawl is skipped.

Comment 2 Anand Avati 2014-08-01 10:52:42 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 3 Anand Avati 2014-08-04 08:05:00 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#2) for review on master by Kotresh HR (khiremat)

Comment 4 Anand Avati 2014-08-04 10:12:37 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#3) for review on master by Kotresh HR (khiremat)

Comment 5 Anand Avati 2014-08-04 10:32:17 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: keep marker.tstamp's mtime unchangeable during snapshot.) posted (#4) for review on master by Kotresh HR (khiremat)

Comment 6 Anand Avati 2014-08-04 10:49:02 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#5) for review on master by Kotresh HR (khiremat)

Comment 7 Anand Avati 2014-08-04 12:46:25 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#6) for review on master by Kotresh HR (khiremat)

Comment 8 Anand Avati 2014-08-04 13:05:27 UTC
REVIEW: http://review.gluster.org/8401 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#7) for review on master by Kotresh HR (khiremat)

Comment 9 Anand Avati 2014-08-04 14:51:18 UTC
COMMIT: http://review.gluster.org/8401 committed in master by Vijay Bellur (vbellur) 
------
commit 0e8c537d6f48857b0f3c0ef10ce1c4458e303be8
Author: Kotresh H R <khiremat>
Date:   Fri Aug 1 16:12:38 2014 +0530

    feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.
    
    Problem:
    
    Geo-replication does a full xsync crawl after snapshot
    restoration of slave and master. It does not do a history crawl.
    
    Analysis:
    
    Marker creates the 'marker.tstamp' file when geo-rep is started
    for the first time. The virtual extended attribute
    'trusted.glusterfs.volume-mark' is maintained, and whenever
    it is queried on the gluster mount point, marker fills it on
    the fly and returns the combination of uuid, ctime of
    marker.tstamp and others. So the ctime of marker.tstamp, in other
    words the 'volume-mark', marks the geo-rep start time when the
    session is freshly created.
    
    From the above, after the first filesystem crawl (xsync) is
    done during the first geo-rep start, stime should never be less
    than 'volume-mark'. So whenever stime is less than volume-mark,
    a full filesystem crawl (xsync) is done.
    
    Root Cause:
    
    When a snapshot is restored, the marker.tstamp file is freshly
    created, losing the timestamps it was originally created with.
    
    Solution:
    
    1. Change is made to depend on mtime instead of ctime.
    2. mtime and atime of marker.tstamp are restored back when
       a snapshot is created and restored.
    
    Change-Id: I4891b112f4aedc50cfae402832c50c5145807d7a
    BUG: 1125918
    Signed-off-by: Kotresh H R <khiremat>
    Reviewed-on: http://review.gluster.org/8401
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
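
Solution point 2 amounts to capturing the file's timestamps before the brick contents are recreated and putting them back afterwards, e.g. with stat() and utimes(). The helper below is only a minimal sketch under that assumption (hypothetical name, not the actual glusterd snapshot code).

#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: carry marker.tstamp's atime/mtime over to a freshly
 * created copy so 'volume-mark' keeps reporting the original geo-rep start
 * time after snapshot create/restore. */
static int
preserve_tstamp_times (const char *orig_tstamp, const char *new_tstamp)
{
        struct stat    st;
        struct timeval tv[2];

        if (stat (orig_tstamp, &st) != 0)
                return -1;

        tv[0].tv_sec  = st.st_atime;    /* access time       */
        tv[0].tv_usec = 0;
        tv[1].tv_sec  = st.st_mtime;    /* modification time */
        tv[1].tv_usec = 0;

        /* Put the original timestamps back on the new file. */
        return utimes (new_tstamp, tv);
}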

Comment 10 Niels de Vos 2015-05-14 17:26:06 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
