Bug 1364420 - [RFE] History Crawl performance improvement
Summary: [RFE] History Crawl performance improvement
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Aravinda VK
QA Contact:
URL:
Whiteboard:
Duplicates: 1365119
Depends On:
Blocks: 1364421 1365119 1374153 1503170
Reported: 2016-08-05 10:16 UTC by Aravinda VK
Modified: 2017-10-17 13:25 UTC

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1364421 1374153 1503170
Environment:
Last Closed: 2017-03-06 17:21:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Aravinda VK 2016-08-05 10:16:38 UTC
Description of problem:
If the backlog of History changelogs to be processed is large, Geo-rep takes a long time to reach the current state because of rsync retries.

The issue and a possible solution are discussed upstream:
http://www.gluster.org/pipermail/gluster-devel/2016-August/050372.html

Comment 1 Aravinda VK 2016-08-08 13:00:13 UTC
*** Bug 1365119 has been marked as a duplicate of this bug. ***

Comment 2 Vijay Bellur 2016-08-08 13:01:47 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#1) for review on master by Aravinda VK (avishwan)

Comment 3 Vijay Bellur 2016-08-10 14:10:05 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#2) for review on master by Aravinda VK (avishwan)

Comment 4 Vijay Bellur 2016-08-11 09:03:28 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#3) for review on master by Aravinda VK (avishwan)

Comment 5 Vijay Bellur 2016-08-11 09:54:40 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#4) for review on master by Aravinda VK (avishwan)

Comment 6 Vijay Bellur 2016-08-18 14:56:49 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#5) for review on master by Aravinda VK (avishwan)

Comment 7 Vijay Bellur 2016-08-20 13:53:37 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#6) for review on master by Aravinda VK (avishwan)

Comment 8 Vijay Bellur 2016-08-20 16:05:27 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#7) for review on master by Aravinda VK (avishwan)

Comment 9 Worker Ant 2016-08-24 05:48:06 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#8) for review on master by Aravinda VK (avishwan)

Comment 10 Worker Ant 2016-08-24 16:55:24 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#9) for review on master by Aravinda VK (avishwan)

Comment 11 Worker Ant 2016-08-25 12:06:27 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#10) for review on master by Aravinda VK (avishwan)

Comment 12 Worker Ant 2016-08-26 06:06:54 UTC
REVIEW: http://review.gluster.org/15110 (geo-rep: Post process Data and Meta Changelogs) posted (#11) for review on master by Aravinda VK (avishwan)

Comment 13 Worker Ant 2016-08-26 17:46:03 UTC
COMMIT: http://review.gluster.org/15110 committed in master by Aravinda VK (avishwan) 
------
commit 6c283f107b646405936520e2549510115bf2ef64
Author: Aravinda VK <avishwan>
Date:   Mon Aug 8 17:02:37 2016 +0530

    geo-rep: Post process Data and Meta Changelogs
    
    With this patch, Data and Meta GFIDs are post-processed. If a Changelog has
    an UNLINK entry for a GFID, that GFID is removed from the Data and Meta GFID
    lists (only if stat on the GFID returns ENOENT on the Master).
    
    While processing Changelogs,
    
    - Collect all the data and meta operations in a temporary database
    - Delete all Data and Meta GFIDs which are already unlinked as per the Changelogs
      (delete only if stat on the GFID returns ENOENT)
    - Process all Entry operations as usual
    - Process data and meta operations in batches (fetched from the Db in batches)
    - Data sync is again batched based on the number of changelogs (default: 1 day
      of changelogs). Once the sync is complete, update the last Changelog's time as
      the last_synced time as usual.
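
The unlink filter in the list above can be sketched in Python as follows. This is a minimal illustration, not the gsyncd code: the function names `gfid_exists` and `drop_unlinked_gfids`, and the use of plain path strings for GFIDs, are assumptions made for this example. Note that the temporary database part of this approach was removed again by the follow-up patch in Comment 18.

```python
import errno
import os


def gfid_exists(gfid_path):
    """Return False only when stat on the path fails with ENOENT.

    Hypothetical helper: gfid_path stands in for however a GFID is
    resolved to a path on the Master.
    """
    try:
        os.lstat(gfid_path)
        return True
    except OSError as e:
        if e.errno == errno.ENOENT:
            return False
        raise


def drop_unlinked_gfids(gfids, unlinked, exists=gfid_exists):
    """Drop GFIDs that have an UNLINK entry AND no longer exist on the Master.

    A GFID that was unlinked but still resolves (for example, it has another
    hard link) is kept, matching the "only if stat is ENOENT" condition.
    """
    return [g for g in gfids if not (g in unlinked and not exists(g))]
```

Passing `exists` as a parameter keeps the filter testable without touching a real brick.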
    
    Additionally, entry_stime is maintained on the Brick root, and Entry ops are
    ignored if the changelog suffix time is less than entry_stime. Data stime can
    be greater than entry_stime only when a passive worker updates stime by itself
    from the mount point stime. In this case, entry_stime = data_stime is used.
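
The entry_stime rule in the paragraph above can be sketched as below. This is an illustration under stated assumptions, not the actual implementation: the function names are hypothetical, stime values are simplified to integer seconds, and changelog file names are assumed to end in `.<timestamp>` (for example `CHANGELOG.1470391234`).

```python
def changelog_suffix_time(changelog_name):
    """Extract the numeric suffix from a name like 'CHANGELOG.1470391234'.

    Assumption: the suffix after the last dot is an integer timestamp.
    """
    return int(changelog_name.rsplit(".", 1)[1])


def effective_entry_stime(entry_stime, data_stime):
    """If data stime is ahead of entry_stime, use data stime instead."""
    return data_stime if data_stime > entry_stime else entry_stime


def should_process_entry_ops(changelog_name, entry_stime):
    """Entry ops are ignored when the suffix time is less than entry_stime."""
    return changelog_suffix_time(changelog_name) >= entry_stime
```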
    
    New configurations:
    
    max-rsync-retries - Default value is 10
    max-data-changelogs-in-batch - Max number of changelogs to be considered in a
    batch for syncing. Default value is 5760 (4 changelogs per min * 60 min *
    24 hours)
    max-history-changelogs-in-batch - Max number of history changelogs to be
    processed at once. Default value is 86400 (4 changelogs per min * 60 min * 24
    hours * 15 days)
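
The default values above follow from the stated rate of 4 changelogs per minute. The short Python sketch below reproduces that arithmetic and shows how a history backlog would split into data-sync batches. The constant names and `split_into_batches` are hypothetical and exist only for this example.

```python
CHANGELOGS_PER_MIN = 4  # rate used in the commit message arithmetic

# 4 changelogs per min * 60 min * 24 hours
MAX_DATA_CHANGELOGS_IN_BATCH = CHANGELOGS_PER_MIN * 60 * 24

# one day of changelogs * 15 days
MAX_HISTORY_CHANGELOGS_IN_BATCH = MAX_DATA_CHANGELOGS_IN_BATCH * 15


def split_into_batches(changelogs, batch_size):
    """Split a list of changelogs into consecutive batches of at most batch_size."""
    return [changelogs[i:i + batch_size]
            for i in range(0, len(changelogs), batch_size)]
```

With these defaults, a full history batch of 86400 changelogs is synced as 15 data batches of 5760 changelogs each.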
    
    BUG: 1364420
    Change-Id: I7b665895bf4806035c2a8573d361257cbadbea17
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/15110
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Kotresh HR <khiremat>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 14 Worker Ant 2016-08-31 10:12:11 UTC
REVIEW: http://review.gluster.org/15371 (geo-rep: Fix History post process) posted (#1) for review on master by Aravinda VK (avishwan)

Comment 15 Aravinda VK 2016-08-31 11:41:50 UTC
The performance gain is not as large as expected. Moving the bug back to POST to send a fix that removes the db changes.

Comment 16 Worker Ant 2016-09-01 06:04:52 UTC
REVIEW: http://review.gluster.org/15371 (geo-rep: Fix History post process) posted (#2) for review on master by Aravinda VK (avishwan)

Comment 17 Worker Ant 2016-09-05 05:54:20 UTC
REVIEW: http://review.gluster.org/15371 (geo-rep: Fix History post process) posted (#3) for review on master by Aravinda VK (avishwan)

Comment 18 Worker Ant 2016-09-08 06:11:22 UTC
COMMIT: http://review.gluster.org/15371 committed in master by Aravinda VK (avishwan) 
------
commit 5de500cd0116796ff797099c60d33258bd48ce3c
Author: Aravinda VK <avishwan>
Date:   Wed Aug 31 11:53:06 2016 +0530

    geo-rep: Fix History post process
    
    This patch removes the changelogsdb part of post processing, since it
    did not give as much of a performance advantage as expected.
    
    Entry stime and other logging improvements are retained.
    
    BUG: 1364420
    Change-Id: Ib99d23f09d96c14bc28225b47d9134260f5551bf
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/15371
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Kotresh HR <khiremat>
    Smoke: Gluster Build System <jenkins.org>

Comment 19 Shyamsundar 2017-03-06 17:21:32 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

