Bug 1279306

Summary: Dist-geo-rep : checkpoint doesn't reach even though all the files have been synced through hybrid crawl.
Product: [Community] GlusterFS
Component: geo-replication
Version: 3.7.5
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Hardware: x86_64
OS: Linux
Reporter: Aravinda VK <avishwan>
Assignee: Aravinda VK <avishwan>
CC: annair, avishwan, bugs, chrisw, csaba, david.macdonald, rhinduja, vkoppad, vshankar
Whiteboard: checkpoint
Fixed In Version: glusterfs-3.7.7
Doc Type: Bug Fix
Clone Of: 1247536
Last Closed: 2016-02-15 06:25:24 UTC
Type: Bug
Bug Depends On: 1044645, 1064309, 1247536, 1285196
Bug Blocks: 1202842, 1223636

Description Aravinda VK 2015-11-09 05:57:02 UTC
+++ This bug was initially created as a clone of Bug #1247536 +++

+++ This bug was initially created as a clone of Bug #1044645 +++

Description of problem: geo-rep status never shows the checkpoint as reached, even though all the files have been synced through hybrid crawl.


Version-Release number of selected component (if applicable): glusterfs-3.4.0.51geo-1

How reproducible: Did not try to reproduce it, but it appears to be consistently reproducible.


Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.
2. Stop geo-rep.
3. Create some data on the master.
4. Set the checkpoint.
5. Start geo-rep.
6. Wait for geo-rep to sync the data.
7. Check the geo-rep status to see whether the checkpoint has been reached.
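
A minimal sketch of steps 2, 4, 5 and 7 as a dry-run helper. The volume name `master` and slave `10.70.46.101::slave` are borrowed from the status output later in this report and are only placeholders; the helper names are hypothetical. Session creation (step 1), data creation (step 3) and the wait (step 6) are left to the operator.

```python
import subprocess

MASTER_VOL = "master"            # placeholder: master volume name from the later status output
SLAVE = "10.70.46.101::slave"    # placeholder: slave host::volume from the later status output


def georep_cmd(*action):
    """Build the argv for one `gluster volume geo-replication` sub-command."""
    return ["gluster", "volume", "geo-replication", MASTER_VOL, SLAVE, *action]


def reproduction_commands():
    """Commands for steps 2, 4, 5 and 7; steps 1, 3 and 6 are not covered."""
    return [
        georep_cmd("stop"),                         # step 2
        # step 3: create files on a master mount at this point
        georep_cmd("config", "checkpoint", "now"),  # step 4
        georep_cmd("start"),                        # step 5
        georep_cmd("status", "detail"),             # step 7
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(reproduction_commands())
```

By default the helper only prints the commands, so it can be reviewed before anything is run against a real cluster.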

Actual results: The checkpoint is never reached.


Expected results: The checkpoint should be reached once all the files are synced.


--- Additional comment from Aravinda VK on 2013-12-20 03:05:42 EST ---

At the start of a hybrid crawl, the crawler stores the master's xtime in memory. After the crawl and sync complete, it sets that same xtime on the slave.

If files are created after the crawler has started, the checkpoint time will be later than the xtime saved in memory, so even after completion the status shows the checkpoint as not reached.

This is expected behavior. If we updated the slave with the latest xtime instead of the xtime stored in memory, there would be a chance of data loss.
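
The bookkeeping described in this comment can be illustrated with a toy model. This is not gsyncd code: the class, its method names and the numeric timestamps are hypothetical, and the rule "checkpoint is reached when the slave's xtime is at or past the checkpoint time" is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class HybridCrawlModel:
    """Toy model of the xtime bookkeeping described above; not gsyncd code."""
    slave_xtime: float = 0.0        # last xtime recorded for the slave
    crawl_start_xtime: float = 0.0  # master xtime remembered in memory

    def start_crawl(self, master_xtime):
        # The crawler remembers the master's xtime as it was when the crawl began.
        self.crawl_start_xtime = master_xtime

    def finish_crawl(self):
        # After the sync, the slave is stamped with the remembered xtime,
        # not with the master's latest xtime.
        self.slave_xtime = self.crawl_start_xtime

    def checkpoint_reached(self, checkpoint_time):
        # Assumed rule: reached once the slave xtime is at or past the checkpoint.
        return self.slave_xtime >= checkpoint_time


if __name__ == "__main__":
    model = HybridCrawlModel()
    model.start_crawl(master_xtime=100.0)  # crawl begins
    model.finish_crawl()                   # everything the crawl saw is synced
    print(model.checkpoint_reached(checkpoint_time=130.0))  # checkpoint set after crawl start
```

In this model a checkpoint set later than the remembered xtime stays unreached after the crawl finishes, which matches the symptom reported here.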

--- Additional comment from Venky Shankar on 2013-12-20 03:41:07 EST ---

Vijaykumar,

Was I/O done on the mount after the checkpoint was set? If yes, then isn't this the expected behaviour?

--- Additional comment from Rahul Hinduja on 2015-07-07 06:58:30 EDT ---

Verified with build: glusterfs-3.7.1-7.el6rhs.x86_64

Tried both the below scenarios:

a. Have files present before creating the geo-rep session, so that HYBRID CRAWL is used
b. Change change_detector to xsync to force HYBRID CRAWL

In both cases, Last Synced is not updated. In the first case Last Synced is N/A, and in the second case it shows the time the last changelog was synced.

As a result, in Hybrid Crawl the CHECKPOINT COMPLETED field always remains No, even when the files have been synced to the slave. Moving this bug to the Assigned state.

--- Additional comment from Rahul Hinduja on 2015-07-07 07:00:25 EDT ---

For Scenario A:
===============

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status detail
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED    ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    N/A            0        3567    7797    0           2015-07-07 15:39:13    No                      N/A                          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    N/A            0        3611    7845    0           2015-07-07 15:39:13    No                      N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    N/A            0        3441    7611    0           2015-07-07 15:39:13    No                      N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    N/A            0        3550    7726    0           2015-07-07 15:39:13    No                      N/A                          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
[root@georep1 scripts]# 



[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status detail
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED    ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    N/A            0        7798    0       0           2015-07-07 15:39:13    No                      N/A                          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    N/A            0        7847    0       0           2015-07-07 15:39:13    No                      N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl       N/A            0        3441    0       0           2015-07-07 15:39:13    No                      N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl       N/A            0        3550    0       0           2015-07-07 15:39:13    No                      N/A                          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A            N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
[root@georep1 scripts]# 



[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status detail
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED            ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-07 15:39:27    0        0       0       0           2015-07-07 15:39:13    Yes                     2015-07-07 15:53:41          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-07 15:39:27    0        0       0       0           2015-07-07 15:39:13    Yes                     2015-07-07 15:52:54          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-07 15:39:33    0        0       0       0           2015-07-07 15:39:13    Yes                     2015-07-07 15:53:12          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Changelog Crawl    2015-07-07 15:39:33    0        0       0       0           2015-07-07 15:39:13    Yes                     2015-07-07 15:53:14          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A                N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
[root@georep1 scripts]# 


--- Additional comment from Rahul Hinduja on 2015-07-07 07:02:30 EDT ---

For Scenario B:
===============

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config change_detector
changelog
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config change_detector xsync
geo-replication config updated successfully
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave config change_detector
xsync
[root@georep1 scripts]#


[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status detail
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED            ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:27    0        154     0       0           2015-07-07 16:16:07    No                      N/A                          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:27    0        160     0       0           2015-07-07 16:16:07    No                      N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:33    0        156     0       0           2015-07-07 16:16:07    No                      N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:33    0        179     0       0           2015-07-07 16:16:07    No                      N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
[root@georep1 scripts]#

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.101::slave status detail
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED            ENTRY    DATA    META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME   
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:27    0        0       0       0           2015-07-07 16:16:07    No                      N/A                          
georep1        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:27    0        0       0       0           2015-07-07 16:16:07    No                      N/A                          
georep2        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep3        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:33    0        0       0       0           2015-07-07 16:16:07    No                      N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.101    Active     Hybrid Crawl    2015-07-07 15:39:33    0        0       0       0           2015-07-07 16:16:07    No                      N/A                          
georep4        master        /rhs/brick1/b1    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
georep4        master        /rhs/brick2/b2    root          10.70.46.101::slave    10.70.46.103    Passive    N/A             N/A                    N/A      N/A     N/A     N/A         N/A                    N/A                     N/A                          
[root@georep1 scripts]#
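
The symptom in the last table (nothing left to sync, yet CHECKPOINT COMPLETED is still No) can be detected mechanically. The following is a rough sketch that parses `status detail` output like the tables above. It assumes columns are separated by two or more spaces and that the ENTRY, DATA and META columns count pending operations; the function names are hypothetical.

```python
import re

PENDING = ("ENTRY", "DATA", "META")  # assumed to be pending-operation counters


def parse_status_detail(text):
    """Parse `status detail` output into one dict per brick row."""
    rows, header = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or set(line) == {"-"} or line.startswith("["):
            continue  # skip blank lines, the dashed rule, and shell prompts
        cells = re.split(r"\s{2,}", line)
        if header is None:
            header = cells  # first non-skipped line is the column header
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def stuck_checkpoints(rows):
    """Active bricks with nothing pending whose checkpoint is still 'No'."""
    return [
        (r["MASTER NODE"], r["MASTER BRICK"])
        for r in rows
        if r["STATUS"] == "Active"
        and all(r[c] == "0" for c in PENDING)
        and r["CHECKPOINT COMPLETED"] == "No"
    ]
```

Fed the last table above, `stuck_checkpoints` would list the four Active bricks on georep1 and georep3, since their counters are all 0 while the checkpoint is still reported as No.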

--- Additional comment from Anand Avati on 2015-07-28 05:26:01 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#1) for review on master by Aravinda VK (avishwan)

--- Additional comment from Anand Avati on 2015-08-05 00:52:30 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#2) for review on master by Aravinda VK (avishwan)

--- Additional comment from Anand Avati on 2015-08-12 05:57:41 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#3) for review on master by Aravinda VK (avishwan)

--- Additional comment from Anand Avati on 2015-08-19 01:59:19 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#4) for review on master by Aravinda VK (avishwan)

--- Additional comment from Anand Avati on 2015-08-26 01:56:59 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#5) for review on master by Aravinda VK (avishwan)

--- Additional comment from Vijay Bellur on 2015-09-08 13:17:45 EDT ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#6) for review on master by Aravinda VK (avishwan)

--- Additional comment from Vijay Bellur on 2015-11-03 15:23:31 EST ---

REVIEW: http://review.gluster.org/11771 (geo-rep: Update last_synced_time in XSync) posted (#7) for review on master by Jeff Darcy (jdarcy)

Comment 1 Aravinda VK 2015-11-09 06:01:25 UTC
Patch sent for review
http://review.gluster.org/#/c/12545/

Comment 2 Kaushal 2016-04-19 07:47:28 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.7, please open a new bug report.

glusterfs-3.7.7 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-February/025292.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user