Bug 1108588

Summary: Dist-geo-rep: geo-rep fails to do the first history crawl after the volume is restored from a snapshot.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Vijaykumar Koppad <vkoppad>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Bhaskar Bandari <bbandari>
Severity: high
Docs Contact:
Priority: high
Version: rhgs-3.0
CC: aavati, avishwan, bbandari, csaba, david.macdonald, khiremat, nlevinki, nsathyan, ssamanta, vagarwal
Target Milestone: ---
Target Release: RHGS 3.0.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.6.0.27-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1125918 (view as bug list)
Environment:
Last Closed: 2014-09-22 19:41:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1125918

Description Vijaykumar Koppad 2014-06-12 09:47:14 UTC
Description of problem: geo-rep fails to do the first history crawl after the volume is restored from a snapshot.

Log snippet from change.log in the working_dir:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-06-12 07:08:33.017140] I [gf-changelog.c:186:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-764717cae0147062b6b692f6b5d13a72.sock (brick: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3)
[2014-06-12 07:08:33.017203] W [gf-changelog.c:196:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
[2014-06-12 07:08:33.017277] I [gf-changelog.c:216:gf_changelog_notification_init] 0-glusterfs: connection successful
[2014-06-12 07:08:44.070462] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556924
[2014-06-12 07:08:59.098946] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556939
[2014-06-12 07:09:14.152214] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556954
[2014-06-12 07:09:29.169742] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556969
[2014-06-12 07:09:44.177457] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402556984
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

which should actually look like the following:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-06-12 08:58:37.878532] I [gf-changelog.c:186:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-764717cae0147062b6b692f6b5d13a72.sock (brick: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3)
[2014-06-12 08:58:37.878598] W [gf-changelog.c:196:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
[2014-06-12 08:58:37.878709] I [gf-changelog.c:216:gf_changelog_notification_init] 0-glusterfs: connection successful
[2014-06-12 08:58:38.098239] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402553166, MAX: 1402554141, TOTAL CHANGELOGS: 64
[2014-06-12 08:58:38.098341] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402554151, MAX: 1402556735, TOTAL CHANGELOGS: 173
[2014-06-12 08:58:38.098381] I [gf-history-changelog.c:722:gf_changelog_extract_min_max] 0-glusterfs: MIN: 1402556756, MAX: 1402563505, TOTAL CHANGELOGS: 450
[2014-06-12 08:58:38.098476] I [gf-history-changelog.c:864:gf_history_changelog] 0-glusterfs: FINAL: from: 1402563460, to: 1402563505, changes: 4
[2014-06-12 08:58:38.099505] D [gf-history-changelog.c:286:gf_history_changelog_scan] 0-glusterfs: hist_done 1, is_last_scan: 0
[2014-06-12 08:58:39.099858] D [gf-history-changelog.c:286:gf_history_changelog_scan] 0-glusterfs: hist_done 0, is_last_scan: 1
[2014-06-12 08:58:40.475710] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402563520
[2014-06-12 08:58:55.765644] D [gf-changelog-process.c:597:gf_changelog_ext_change] 0-glusterfs: processing changelog: /var/run/gluster/snaps/fa41455574c54d53b708e3394d3bae2c/brick3/master_b3/.glusterfs/changelogs/CHANGELOG.1402563535

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>



Version-Release number of selected component (if applicable): glusterfs-3.6.0.15-1.el6rhs


How reproducible: Didn't try to reproduce.


Steps to Reproduce:
1. Create and start a geo-rep session between the master and slave volumes.
2. Create 10K files on the master: "crefi -T 10 -n 10 --multi -b 10 -d 10 --random --min=1K --max=10K /mnt/master/"
3. Let the file creation complete, and pause geo-rep before it syncs all the data.
4. Take a snapshot on the slave first and then on the master, and resume geo-rep.
5. Create some more data on the master and let it sync to the slave.
6. Stop geo-rep and also the master and slave volumes.
7. Restore the snapped volumes.
8. Start the master and slave volumes and start geo-rep again.
9. Check whether geo-rep used the history crawl to sync the data (a hedged CLI sketch of these steps follows below).
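
For reference, a minimal CLI sketch of the steps above (not taken from the original report): the volume names (master, slave), the slave host, and the snapshot names are placeholders, and the snapshot/restore ordering follows the steps as written.

gluster volume geo-replication master slavehost::slave create push-pem
gluster volume geo-replication master slavehost::slave start
crefi -T 10 -n 10 --multi -b 10 -d 10 --random --min=1K --max=10K /mnt/master/
gluster volume geo-replication master slavehost::slave pause
gluster snapshot create snap_slave slave          # snapshot the slave first
gluster snapshot create snap_master master        # then the master
gluster volume geo-replication master slavehost::slave resume
# ... create more data on the master and let it sync to the slave ...
gluster volume geo-replication master slavehost::slave stop
gluster volume stop master
gluster volume stop slave
gluster snapshot restore snap_master
gluster snapshot restore snap_slave
gluster volume start master
gluster volume start slave
gluster volume geo-replication master slavehost::slave start
# then check the geo-rep and changelog logs to see whether a history crawl ran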


Actual results: geo-rep failed to do the first history crawl after the snapshots were restored.


Expected results: geo-rep should do the first history crawl after the snapshots are restored.


Additional info:

Comment 2 Aravinda VK 2014-06-12 11:00:45 UTC
Please provide the xattr dump of the brick roots (backend) taken before the snapshot and again after the restore.

Comment 3 Vijaykumar Koppad 2014-07-10 13:02:44 UTC
This is happening even with the build glusterfs-3.6.0.24-1.el6rhs.

xattrs of the snapped bricks before restoration. 
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
[root@redlake geo-automation]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick1/master_b1/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick1/master_b1/
trusted.afr.master-client-0=0x000000000000000000000000
trusted.afr.master-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30eb00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd36780006bfa4
trusted.glusterfs.volume-id=0x7f96ee01b5f7483aa9e6c3feefce95c8

[root@redlake geo-automation]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick5/master_b5/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick5/master_b5/
trusted.afr.master-client-4=0x000000000000000000000000
trusted.afr.master-client-5=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30ed00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd367800058eb5
trusted.glusterfs.volume-id=0x7f96ee01b5f7483aa9e6c3feefce95c8

[root@redlake geo-automation]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick9/master_b9/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick9/master_b9/
trusted.afr.master-client-8=0x000000000000000000000000
trusted.afr.master-client-9=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30eb00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd367800053ebf
trusted.glusterfs.volume-id=0x7f96ee01b5f7483aa9e6c3feefce95c8

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


There is what looks like a huge difference between stime and xtime (a quick way to decode these hex values is sketched below).
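
As a side note (not part of the original comment): the first four bytes of both the stime and xtime values above hold the epoch seconds, so the gap can be checked with plain shell arithmetic. The remaining four bytes are assumed here to be the sub-second part and are ignored; the values below are copied from the brick1 dump above.

stime_hex=53bd30eb   # seconds word of the ...stime xattr for brick1
xtime_hex=53bd3678   # seconds word of the ...xtime xattr for brick1
date -u -d @$((16#$stime_hex))   # stime as a human-readable timestamp
date -u -d @$((16#$xtime_hex))   # xtime as a human-readable timestamp
echo "xtime - stime = $(( 16#$xtime_hex - 16#$stime_hex )) seconds"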

Comment 4 Aravinda VK 2014-07-11 06:11:42 UTC
As requested earlier, please provide the xattr dump after the restore as well.

Comment 5 Vijaykumar Koppad 2014-07-11 06:29:05 UTC
xattr dump after restoring the same snapshot covered in comment 3:

[root@redlake ~]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick1/master_b1/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick1/master_b1/
trusted.afr.master-client-0=0x000000000000000000000000
trusted.afr.master-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30eb00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd36780006bfa4
trusted.glusterfs.volume-id=0xe79069169db44aadbf52b744cfd52794

[root@redlake ~]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick5/master_b5/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick5/master_b5/
trusted.afr.master-client-4=0x000000000000000000000000
trusted.afr.master-client-5=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30ed00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd367800058eb5
trusted.glusterfs.volume-id=0xe79069169db44aadbf52b744cfd52794

[root@redlake ~]# getfattr -d -m . -e hex /var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick9/master_b9/
getfattr: Removing leading '/' from absolute path names
# file: var/run/gluster/snaps/7f96ee01b5f7483aa9e6c3feefce95c8/brick9/master_b9/
trusted.afr.master-client-8=0x000000000000000000000000
trusted.afr.master-client-9=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.d52240de-e128-4e1f-9ef3-1cab1d657a61.stime=0x53bd30eb00000000
trusted.glusterfs.e7906916-9db4-4aad-bf52-b744cfd52794.xtime=0x53bd367800053ebf
trusted.glusterfs.volume-id=0xe79069169db44aadbf52b744cfd52794

Comment 6 Kotresh HR 2014-07-28 05:31:07 UTC
After much analysis and debugging, I figured out the following.

Cause: 
On geo-rep start, the stime query returns -1 because the check against the first geo-rep start time fails (the recorded stime turns out to be older than the reported start time); hence the history crawl is not attempted and a complete xsync crawl is done instead. The geo-rep start time is recorded the very first time a geo-rep session is started for that master with any of its slaves, and it is fetched by doing a getxattr of 'trusted.glusterfs.volume-mark' on the master's gluster root.

Analysis:
Apart from the 'trusted.glusterfs.volume-mark.<uuid>' xattr, a 'trusted.glusterfs.volume-mark' xattr is also maintained virtually. Virtual in the sense that the xattr is not actually set on the gluster root; instead, marker synthesizes it when requested.

'trusted.glusterfs.volume-mark' holds a combination of the node UUID, a return value, the geo-rep start time, and other fields as its value.

On geo-rep start, marker creates the 'marker.tstamp' file if indexing is turned on. On a getxattr of the key 'trusted.glusterfs.volume-mark' on the gluster root, marker stats the 'marker.tstamp' file and returns its modification time along with the other fields.

On snapshot creation, the brick volfiles are freshly created and hence the marker.tstamp file as well, so marker.tstamp records the snapshot creation time instead of the first geo-rep start time. On geo-rep start after the master volume is restored, the xtime query compares xtime against this newer timestamp, the check fails, and -1 is returned.
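
A quick way to observe this on a test setup (a hedged sketch, not from the original comment; it assumes the default glusterd working directory and placeholder volume/snapshot names) is to compare the mtime of the marker.tstamp files against the stime/xtime seconds decoded in comment 3:

# marker.tstamp of the original volume; its mtime is what volume-mark
# reports as the first geo-rep start time
stat -c '%Y %y %n' /var/lib/glusterd/vols/master/marker.tstamp
# after snapshot/restore, a freshly generated marker.tstamp appears under the
# snapshot's volfile directory (exact location may vary by release)
find /var/lib/glusterd/snaps -name marker.tstamp -exec stat -c '%Y %y %n' {} \;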

Comment 7 Kotresh HR 2014-08-04 12:57:11 UTC
Upstream Patch:
http://review.gluster.org/#/c/8401/

Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/30264/

Comment 9 Vijaykumar Koppad 2014-08-06 11:24:33 UTC
Verified on the build glusterfs-3.6.0.27-1.

Comment 13 errata-xmlrpc 2014-09-22 19:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html