Bug 1266875 - geo-replication: [RFE] Geo-replication + Tiering
Summary: geo-replication: [RFE] Geo-replication + Tiering
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Saravanakumar
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1267185 1275173
Reported: 2015-09-28 10:45 UTC by Saravanakumar
Modified: 2016-06-16 13:38 UTC (History)

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1267185 1275173 (view as bug list)
Environment:
Last Closed: 2016-06-16 13:38:14 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Saravanakumar 2015-09-28 10:45:21 UTC
Description of problem:

Race conditions can occur while geo-replication processes changelogs on a tiered volume, where rebalance can move files frequently between the hot and cold tiers.

Also, rebalance operations are internal operations and should not be recorded in the changelog (for example, the deletion and creation of a file while rebalance is carried out).

Version-Release number of selected component (if applicable):


How reproducible:

Following is one such example:

==================================================
Brick1              Brick2
==================================================
Create file         (file moved due to rebalance)
                    Data file
                    Delete file
==================================================

If the Brick2 changelogs are processed first, followed by the Brick1 changelogs, the file may end up being created.
But we expect the file to be deleted, since the delete is the last operation.
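
The ordering problem in the example above can be sketched as a small simulation. This is only an illustration under simplifying assumptions: the tuple format, the operation names and the `replay` function are hypothetical and are not part of geo-replication.

```python
def replay(batches):
    """Apply each brick's batch of operations in the given order.

    Returns the set of file names that exist after all batches are replayed.
    """
    present = set()
    for batch in batches:
        for op, name in batch:
            if op == "CREATE":
                present.add(name)
            elif op == "DELETE":
                present.discard(name)
            # DATA does not change whether the file exists.
    return present


# Hypothetical per-brick changelogs matching the table above.
brick1 = [("CREATE", "f")]
brick2 = [("DATA", "f"), ("DELETE", "f")]

in_order = replay([brick1, brick2])      # Brick1 first, then Brick2
out_of_order = replay([brick2, brick1])  # Brick2 first, then Brick1

print(in_order)      # set()
print(out_of_order)  # {'f'}
```

When Brick2 is replayed before Brick1, the file survives even though the last real operation was a delete.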

Steps to Reproduce:
1.
2.
3.

Actual results:
Files are not synced properly.

Expected results:
Files are synced as expected.

Additional info:

Comment 1 Saravanakumar 2015-10-09 15:39:16 UTC
Patches :
http://review.gluster.org/#/c/12239/
http://review.gluster.org/#/c/12326/

Ignoring cold brick namespace related fops is pending.

Comment 2 Vijay Bellur 2015-10-14 06:28:13 UTC
REVIEW: http://review.gluster.org/12355 (geo-rep: Avoid cold tier bricks during ENTRY operation) posted (#1) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 3 Vijay Bellur 2015-10-14 14:32:36 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#2) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 4 Vijay Bellur 2015-10-15 06:28:36 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#3) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 5 Vijay Bellur 2015-10-15 10:25:07 UTC
REVIEW: http://review.gluster.org/12239 (geo-rep: ignore recording tiering rebalance fops) posted (#4) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 6 Vijay Bellur 2015-10-15 10:36:55 UTC
REVIEW: http://review.gluster.org/12355 (geo-rep: Avoid cold tier bricks during ENTRY operation) posted (#2) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 7 Vijay Bellur 2015-10-16 05:41:17 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#4) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 8 Vijay Bellur 2015-10-16 05:44:17 UTC
REVIEW: http://review.gluster.org/12239 (geo-rep: ignore recording tiering rebalance fops) posted (#5) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 9 Vijay Bellur 2015-10-16 08:50:09 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#5) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 10 Vijay Bellur 2015-10-16 09:52:05 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#6) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 11 Vijay Bellur 2015-10-19 05:05:34 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#7) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 12 Vijay Bellur 2015-10-19 06:41:37 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#8) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 13 Vijay Bellur 2015-10-22 11:49:32 UTC
REVIEW: http://review.gluster.org/12239 (features/changelog: ignore recording tiering rebalance fops) posted (#6) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 14 Vijay Bellur 2015-10-23 06:03:36 UTC
REVIEW: http://review.gluster.org/12239 (features/changelog: ignore recording tiering rebalance fops) posted (#7) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 15 Vijay Bellur 2015-10-23 06:32:08 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: Add data operation if mknod with tier attribute) posted (#1) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 16 Vijay Bellur 2015-10-23 06:34:18 UTC
REVIEW: http://review.gluster.org/12326 (geo-rep: Add data operation if mknod with tier attribute) posted (#9) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 17 Vijay Bellur 2015-10-23 06:35:55 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: Add data operation if mknod with tier attribute) posted (#2) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 18 Vijay Bellur 2015-10-23 10:55:18 UTC
REVIEW: http://review.gluster.org/12355 (geo-rep: Avoid cold tier bricks during ENTRY operation) posted (#3) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 19 Vijay Bellur 2015-10-23 10:58:10 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: Add data operation if mknod with tier attribute) posted (#3) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 20 Vijay Bellur 2015-10-23 11:01:54 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: Add data operation if mknod with tier attribute) posted (#4) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 21 Vijay Bellur 2015-10-26 05:35:01 UTC
REVIEW: http://review.gluster.org/12239 (features/changelog: ignore recording tiering rebalance fops) posted (#8) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 22 Vijay Bellur 2015-10-26 05:35:25 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: Add data operation if mknod with tier attribute) posted (#5) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 23 Vijay Bellur 2015-10-26 07:21:24 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: record mknod if tier-dht linkto is set) posted (#6) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 24 Vijay Bellur 2015-10-26 08:30:05 UTC
REVIEW: http://review.gluster.org/12425 (features/changelog: ignore recording tiering rebalance fops) posted (#1) for review on release-3.7 by Saravanakumar Arumugam (sarumuga)

Comment 25 Vijay Bellur 2015-10-26 10:34:06 UTC
COMMIT: http://review.gluster.org/12326 committed in master by Venky Shankar (vshankar) 
------
commit ffc39c9d8807464b5c78959bc43dc12b22f5a37b
Author: Saravanakumar Arumugam <sarumuga>
Date:   Fri Oct 9 20:29:30 2015 +0530

    geo-rep: Add data operation if mknod with tier attribute
    
    This is a series of patches which aims to fix geo-replication
    in a Tiering Volume.
    
    Problem:
    Consider, a file is placed in volume initially and then hot tier is
    attached. During any operation on the file, due to lookup a linkto
    file is created in hot tier.
    
    Now, any namespace operation carried out on the file is recorded in
    both cold and hot tier.
    There is a room for races when both changelogs are replayed.
    
    Solution:
    So, We are going to replay (namespace related)operations
    only in the hot tier.
    
    Why?
    a. If the file is directly placed in Hot tier, all fops will be
    recorded in HOT tier.
    
    b. If  the file is already present in Cold tier, and if any fop is
    carried out, it creates linkto file in Hot tier.
    Now, operations like UNLINK, RENAME are captured in Hot tier(by means of linkto file).
    
    This way, we can get both tier's operation in HOT tier itself.
    
    But, We may miss initial Data sync immediately after creating the
    file as it is only recording MKNOD. So, if MKNOD encountered
    with sticky bit set, queue DATA operation for the corresponding gfid.
    (This is addressed here in this patch)
    
    So, If tier-gfid linkto is set, we need to record the corresponding
    MKNOD. Earlier this was avoided as it was set as INTERNAL fop.
    (This changelog related changes are addressed in the patch:
     - http://review.gluster.org/12417)
    
    Change-Id: I2fa84cfa2b0f86506c3d15d484138ab9651e4f83
    BUG: 1266875
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/12326
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Aravinda VK <avishwan>
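
The "queue DATA when MKNOD carries the sticky bit" idea from the commit message above might look roughly like the following. This is a minimal sketch, not the patch itself: the `(op, gfid, mode)` record layout and the `expand_mknod` helper are assumptions made for illustration.

```python
import stat


def expand_mknod(records):
    """Return (op, gfid) pairs, adding a DATA entry after each sticky-bit MKNOD.

    `records` is a list of hypothetical (op, gfid, mode) tuples; `mode` may be
    None for operations that carry no mode.
    """
    out = []
    for op, gfid, mode in records:
        out.append((op, gfid))
        # A MKNOD with the sticky bit set is treated as a tier-related create,
        # so a DATA sync is queued for the same gfid.
        if op == "MKNOD" and mode is not None and mode & stat.S_ISVTX:
            out.append(("DATA", gfid))
    return out


records = [
    ("MKNOD", "gfid-1", 0o101000),   # regular file with sticky bit set
    ("MKNOD", "gfid-2", 0o100644),   # regular file, no sticky bit
    ("UNLINK", "gfid-3", None),
]
queued = expand_mknod(records)
```

Only `gfid-1` gets an extra DATA entry here, since it is the only MKNOD whose mode has the sticky bit.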

Comment 26 Vijay Bellur 2015-10-26 11:39:25 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: record mknod if tier-dht linkto is set) posted (#7) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 27 Vijay Bellur 2015-10-26 11:56:30 UTC
REVIEW: http://review.gluster.org/12427 (geo-rep: Add data operation if mknod with tier attribute) posted (#1) for review on release-3.7 by Saravanakumar Arumugam (sarumuga)

Comment 28 Vijay Bellur 2015-10-26 12:00:28 UTC
COMMIT: http://review.gluster.org/12355 committed in master by Venky Shankar (vshankar) 
------
commit 6188b5fcebc56b3d8af1956beeec9988f3e8f268
Author: Saravanakumar Arumugam <sarumuga>
Date:   Wed Oct 14 11:49:49 2015 +0530

    geo-rep: Avoid cold tier bricks during ENTRY operation
    
    This is a series of patch which aims to fix geo-replication
    in a Tiering Volume.
    
    Problem:
    Consider, a file is placed in volume initially and then hot tier is
    attached. During any operation on the file, due to lookup a linkto
    file is created in hot tier.
    
    Now, any namespace operation carried out on the file is recorded in
    both cold and hot tier.
    There is a room for races when both changelogs are replayed.
    
    Solution:
    So, We are going to replay (namespace related)operations
    only in the hot tier.
    
    Why?
    a. If the file is directly placed in Hot tier , all fops will be
    recorded in HOT tier.
    b. If  the file is already present in Cold tier, and if any fop is
    carried out, it creates linkto file in Hot tier.
    Now, operations like UNLINK, RENAME are captured in Hot
    tier(by means of linkto file).
    
    This way, we can get both tier's operation in HOT tier itself.
    
    Now, once the file is demoted to COLD tier, any namespace operation
    carried out on the cold tier can be avoided as we directly RECORD
    the same in HOT tier.
    
    How?
    1. Check whether the brick is cold tier and skip ENTRY operation.
    2. Also, if it is cold tier brick, use Xsync(which is used during initial run).
       This will help in getting all cold tier bricks changes using File System crawl
       and helps in avoiding races with hot tier brick(which can happen
       if historychangelog used in cold tier brick).
    
    Dependent patches:
    1. http://review.gluster.org/12239
    2. http://review.gluster.org/12326
    
    Change-Id: I7692b1dbb8813a7e253451bca02f8f09a5782dde
    BUG: 1266875
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/12355
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Aravinda VK <avishwan>
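
The two "How?" steps in the commit message above can be sketched as follows. All names here (`ENTRY_OPS`, `ops_to_replay`, `pick_crawl`, the string labels and the tuple format) are hypothetical and chosen only to illustrate the approach; they are not taken from the geo-replication code.

```python
# Namespace (ENTRY) operations, as an illustrative and possibly incomplete set.
ENTRY_OPS = {"CREATE", "MKNOD", "MKDIR", "UNLINK", "RMDIR",
             "RENAME", "LINK", "SYMLINK"}


def ops_to_replay(records, is_cold_tier):
    """Step 1: drop ENTRY operations when the brick belongs to the cold tier."""
    if not is_cold_tier:
        return list(records)
    return [rec for rec in records if rec[0] not in ENTRY_OPS]


def pick_crawl(is_cold_tier):
    """Step 2: use a file system crawl (Xsync) for cold tier bricks."""
    return "xsync" if is_cold_tier else "history_changelog"


records = [("MKNOD", "gfid-1"), ("DATA", "gfid-1"), ("UNLINK", "gfid-1")]
hot = ops_to_replay(records, is_cold_tier=False)
cold = ops_to_replay(records, is_cold_tier=True)
```

With this filter, the hot tier brick replays every record, while the cold tier brick contributes only the non-namespace operations.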

Comment 29 Vijay Bellur 2015-10-26 12:01:06 UTC
REVIEW: http://review.gluster.org/12428 (features/changelog: record mknod if tier-dht linkto is set) posted (#1) for review on release-3.7 by Saravanakumar Arumugam (sarumuga)

Comment 30 Vijay Bellur 2015-10-26 12:03:31 UTC
REVIEW: http://review.gluster.org/12429 (geo-rep: Avoid cold tier bricks during ENTRY operation) posted (#1) for review on release-3.7 by Saravanakumar Arumugam (sarumuga)

Comment 31 Vijay Bellur 2015-10-26 14:09:30 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: record mknod if tier-dht linkto is set) posted (#8) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 32 Vijay Bellur 2015-10-27 03:23:20 UTC
REVIEW: http://review.gluster.org/12239 (features/changelog: ignore recording tiering rebalance fops) posted (#9) for review on master by Venky Shankar (vshankar)

Comment 33 Vijay Bellur 2015-10-27 05:55:33 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: record mknod if tier-dht linkto is set) posted (#9) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 34 Vijay Bellur 2015-10-27 07:13:30 UTC
REVIEW: http://review.gluster.org/12417 (features/changelog: record mknod if tier-dht linkto is set) posted (#10) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 35 Vijay Bellur 2015-10-27 07:52:36 UTC
REVIEW: http://review.gluster.org/12239 (features/changelog: ignore recording tiering rebalance fops) posted (#10) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 36 Vijay Bellur 2015-10-27 17:13:42 UTC
COMMIT: http://review.gluster.org/12239 committed in master by Venky Shankar (vshankar) 
------
commit 66b0caa639a179cfd699616d1fcae01c26ae6425
Author: Saravanakumar Arumugam <sarumuga>
Date:   Mon Sep 28 16:31:54 2015 +0530

    features/changelog: ignore recording tiering rebalance fops
    
    Recording of tiering rebalance process's fops like Creation
    and Deletion of file must be avoided.
    Ignore the fops using corresponding pid.
    
    Change-Id: Ifdc7765598d04d033f93e6339e9b188f7566cb65
    BUG: 1266875
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/12239
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Venky Shankar <vshankar>
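
The pid-based filtering described in the commit message above could be pictured like this. The actual change lives in the changelog translator, which is written in C; the Python below is only a sketch, and the constant name and its value are placeholders rather than the real identifiers.

```python
# Placeholder pid used by the tiering rebalance process (assumed value).
TIER_REBALANCE_PID = -10


def should_record(client_pid):
    """Record a fop unless it was issued by the tiering rebalance process."""
    return client_pid != TIER_REBALANCE_PID


# (fop, client pid) pairs; 4242 stands in for an ordinary client.
fops = [("CREATE", 4242), ("CREATE", TIER_REBALANCE_PID),
        ("UNLINK", TIER_REBALANCE_PID), ("UNLINK", 4242)]
recorded = [fop for fop, pid in fops if should_record(pid)]
```

The create and unlink issued by the rebalance process are dropped, so only the client-initiated operations reach the changelog.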

Comment 37 Vijay Bellur 2015-10-27 17:14:33 UTC
COMMIT: http://review.gluster.org/12417 committed in master by Venky Shankar (vshankar) 
------
commit 17e95cb81776650a2f68be00298c4f85b41e4242
Author: Saravanakumar Arumugam <sarumuga>
Date:   Fri Oct 23 11:57:42 2015 +0530

    features/changelog: record mknod if tier-dht linkto is set
    
    This is a series of patches which aims to fix geo-replication
    in a Tiering Volume.
    
    Problem:
    Consider, a file is placed in volume initially and then hot tier is
    attached. During any operation on the file, due to lookup a linkto
    file is created in hot tier.
    
    Now, any namespace operation carried out on the file is recorded in
    both cold and hot tier.
    There is a room for races when both changelogs are replayed.
    
    Solution:
    So, We are going to replay (namespace related)operations
    only in the hot tier.
    
    Why?
    a. If the file is directly placed in Hot tier, all fops will be
    recorded in HOT tier.
    
    b. If  the file is already present in Cold tier, and if any fop is
    carried out, it creates linkto file in Hot tier.
    Now, operations like UNLINK, RENAME are captured in Hot tier(by means of linkto file).
    
    This way, we can get both tier's operation in HOT tier itself.
    
    But, We may miss initial Data sync immediately after creating the
    file as it is only recording MKNOD. So, if MKNOD encountered
    with sticky bit set, queue DATA operation for the corresponding gfid.
    ( This geo-rep related changes are addressed in this patch: http://review.gluster.org/12326/ )
    
    So, If tier-dht linkto is set, we need to record the corresponding
    MKNOD. Earlier this was avoided as it was set as INTERNAL fop.
    (This is addressed here in this patch)
    
    Change-Id: I25514fe3e25f68592a8d6361507f8c8a4fcb70b1
    BUG: 1266875
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/12417
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Kotresh HR <khiremat>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>
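
The exception described in the commit message above (an internal MKNOD is still recorded when the tier linkto attribute is set) can be sketched as a single decision function. The key name, the `xdata` dictionary and the function are assumptions for illustration only; the real change is in the C changelog translator.

```python
# Hypothetical key marking a tier linkto file in the fop's extra data.
TIER_LINKTO_KEY = "tier-dht-linkto"


def should_record_mknod(is_internal, xdata):
    """Decide whether a MKNOD is written to the changelog.

    Normal MKNODs are always recorded. Internal ones are skipped unless the
    tier linkto key is present, in which case they are recorded after all.
    """
    if not is_internal:
        return True
    return TIER_LINKTO_KEY in xdata


plain = should_record_mknod(False, {})
internal = should_record_mknod(True, {})
internal_linkto = should_record_mknod(True, {TIER_LINKTO_KEY: "hot-subvol"})
```

Under these assumptions, only the internal MKNOD without the linkto key is skipped.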

Comment 38 Niels de Vos 2016-06-16 13:38:14 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

