Bug 1292084 - [georep+tiering]: Geo-replication sync is broken if cold tier is EC
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: tiering
Version: mainline
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Satish Mohan
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1291195
Blocks: 1293309
Reported: 2015-12-16 12:40 UTC by Gaurav Kumar Garg
Modified: 2016-06-16 13:50 UTC

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1291195
Environment:
Last Closed: 2016-06-16 13:50:45 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Comment 1 Vijay Bellur 2015-12-16 12:44:04 UTC
REVIEW: http://review.gluster.org/12982 (cli/xml: display correct xml output of tier volume) posted (#1) for review on master by Gaurav Kumar Garg (ggarg)

Comment 2 Kotresh HR 2015-12-17 12:29:14 UTC
Description of problem:
=======================

If the cold tier is Distributed-Disperse (2x{4+2}) and the hot tier is Distributed-Replicate (2x2), the volume has 4 subvolumes in total. However, only 3 lock files are created under shared storage, so only 3 bricks (one from each of 3 subvolumes) acquire a lock and participate in syncing. The remaining subvolume never participates in syncing.

But if both the cold and hot tiers are Distributed-Replicate (2x2), 4 lock files are created and all 4 subvolumes participate in syncing.
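
The subvolume arithmetic above can be checked with a short sketch. It assumes the cold tier has 12 bricks (2 subvolumes of 4+2) and the hot tier has 4 bricks (2 subvolumes of 2), and that one lock file is expected per subvolume, as the description implies. The `TierLayout` class and its field names are hypothetical and are not taken from geo-rep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLayout:
    """Hypothetical description of one tier (not a geo-rep structure)."""
    brick_count: int        # total bricks in the tier
    bricks_per_subvol: int  # replica count, or data + redundancy for disperse

    def subvol_count(self) -> int:
        if self.brick_count % self.bricks_per_subvol != 0:
            raise ValueError("brick count is not a multiple of the subvolume size")
        return self.brick_count // self.bricks_per_subvol


def expected_lock_files(cold: TierLayout, hot: TierLayout) -> int:
    """One lock file is expected per subvolume, counted across both tiers."""
    return cold.subvol_count() + hot.subvol_count()


# Cold tier 2x{4+2} = 12 bricks, hot tier 2x2 = 4 bricks.
cold_ec = TierLayout(brick_count=12, bricks_per_subvol=6)
hot_rep = TierLayout(brick_count=4, bricks_per_subvol=2)
print(expected_lock_files(cold_ec, hot_rep))
```

For the layout in this report the sketch yields 4, which is the number of lock files the reporter expected instead of the 3 that were observed.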

I suspect an issue with how the XML output for the volume is generated.

A) If the cold tier is Distributed-Disperse and the hot tier is Distributed-Replicate, the XML output wrongly shows the hot tier as Replicate. Example:

            <hotBrickType>Replicate</hotBrickType>
            <numberOfBricks>0 x 6 = 4</numberOfBricks>

B) If the cold tier and hot tier are both Distributed-Replicate, the XML output correctly shows the hot tier as Distributed-Replicate. Example:

            <hotBrickType>Distributed-Replicate</hotBrickType>
            <numberOfBricks>2 x 2 = 4</numberOfBricks>

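One way to spot the malformed output in case A is to check that the `numberOfBricks` expression is self-consistent. The sketch below parses the two fragments quoted above. The wrapping `<hotBricks>` element is added here only so that each fragment parses on its own and is an assumption about the surrounding XML; `brick_expr_is_consistent` is a hypothetical helper, not part of the gluster CLI or geo-rep.

```python
import re
import xml.etree.ElementTree as ET

# Fragments from cases A and B, wrapped in an assumed parent element.
CASE_A = """<hotBricks>
  <hotBrickType>Replicate</hotBrickType>
  <numberOfBricks>0 x 6 = 4</numberOfBricks>
</hotBricks>"""

CASE_B = """<hotBricks>
  <hotBrickType>Distributed-Replicate</hotBrickType>
  <numberOfBricks>2 x 2 = 4</numberOfBricks>
</hotBricks>"""


def brick_expr_is_consistent(fragment: str) -> bool:
    """Return True if numberOfBricks has the form 'a x b = c' with a * b == c."""
    root = ET.fromstring(fragment)
    text = root.findtext("numberOfBricks", default="")
    match = re.fullmatch(r"\s*(\d+)\s*x\s*(\d+)\s*=\s*(\d+)\s*", text)
    if match is None:
        return False
    subvols, per_subvol, total = map(int, match.groups())
    return subvols * per_subvol == total


print(brick_expr_is_consistent(CASE_A))
print(brick_expr_is_consistent(CASE_B))
```

The check only understands the simple `a x b = c` form shown in this report, so it rejects any other notation rather than trying to interpret it.
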
Version-Release number of selected component (if applicable):
=============================================================
mainline


How reproducible:
=================
2/2

Steps to Reproduce:
===================
1. Create master and slave cluster
2. Create Master volume (Cold Tier as distributed-disperse and Hot tier as Distributed-Replicate)
3. Create Slave volume (Distributed-Replicate)
4. Create and Start geo-rep session between master and slave

Actual results:
===============

Only a brick from one subvolume in the hot tier becomes ACTIVE


Expected results:
================

One brick from each subvolume in the hot tier should become ACTIVE
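
The expected result can be phrased as a check over worker status. The sketch below assumes the status is available as (subvolume number, status) pairs; that input shape, the function name, and the sample data are hypothetical and serve only to illustrate the condition described in this report.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def subvols_missing_single_active(
    workers: Iterable[Tuple[int, str]], subvol_count: int
) -> List[int]:
    """Return subvolume numbers (1-based) that do not have exactly one ACTIVE worker."""
    active = Counter(subvol for subvol, status in workers if status == "ACTIVE")
    return [sv for sv in range(1, subvol_count + 1) if active[sv] != 1]


# Illustrative data only: four subvolumes, the last one never becomes ACTIVE.
observed = [
    (1, "ACTIVE"), (1, "PASSIVE"),
    (2, "ACTIVE"), (2, "PASSIVE"),
    (3, "ACTIVE"), (3, "PASSIVE"),
    (4, "PASSIVE"), (4, "PASSIVE"),
]
print(subvols_missing_single_active(observed, subvol_count=4))
```

An empty list corresponds to the expected result; a non-empty list names the subvolumes whose files would not be synced.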

Comment 3 Vijay Bellur 2015-12-17 12:32:29 UTC
REVIEW: http://review.gluster.org/12994 (geo-rep: Fix getting subvol number) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 4 Kotresh HR 2015-12-17 12:34:37 UTC
Two patches are needed to fix this issue, one in cli xml and the other in geo-rep.

1. cli xml: http://review.gluster.org/12982
2. Geo-rep: http://review.gluster.org/12994

Comment 5 Vijay Bellur 2015-12-17 12:34:54 UTC
REVIEW: http://review.gluster.org/12982 (cli/xml: display correct xml output of tier volume) posted (#2) for review on master by Gaurav Kumar Garg (ggarg)

Comment 6 Vijay Bellur 2015-12-17 13:41:30 UTC
REVIEW: http://review.gluster.org/12994 (geo-rep: Fix getting subvol number) posted (#2) for review on master by Kotresh HR (khiremat)

Comment 7 Vijay Bellur 2015-12-18 08:54:59 UTC
REVIEW: http://review.gluster.org/12994 (geo-rep: Fix getting subvol number) posted (#3) for review on master by Kotresh HR (khiremat)

Comment 8 Vijay Bellur 2015-12-21 06:58:49 UTC
REVIEW: http://review.gluster.org/12982 (cli/xml: display correct xml output of tier volume) posted (#3) for review on master by Gaurav Kumar Garg (ggarg)

Comment 9 Vijay Bellur 2015-12-21 08:28:32 UTC
COMMIT: http://review.gluster.org/12994 committed in master by Venky Shankar (vshankar) 
------
commit d677e195cb85bef28fcd9e2f45e487c9ea792311
Author: Kotresh HR <khiremat>
Date:   Thu Dec 17 12:39:30 2015 +0530

    geo-rep: Fix getting subvol number
    
    Fix getting subvol number if the volume
    type is tier. If the volume type was tier,
    the subvol number was calculated incorrectly
    and hence few of workers didn't become ACTIVE
    resulting in files not being replicated from
    corresponding brick. This patch addresses
    the same.
    
    Change-Id: Ic10ad7f09a0fa91b4bf2aa361dea3bd48be74853
    BUG: 1292084
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/12994
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Aravinda VK <avishwan>
    Tested-by: Gluster Build System <jenkins.com>

Comment 10 Vijay Bellur 2015-12-21 11:46:10 UTC
COMMIT: http://review.gluster.org/12982 committed in master by Atin Mukherjee (amukherj) 
------
commit b0e126d0edf10946701c2fd4f0f1cf8c7b07eda1
Author: Gaurav Kumar Garg <garg.gaurav52>
Date:   Wed Dec 16 18:04:55 2015 +0530

    cli/xml: display correct xml output of tier volume
    
    Currently When hot tier type is distributed-replicate and cold tier
    type is disperse volume then #gluster volume info --xml command is
    not giving its correct output. In case of HOT tier case its displaying
    wrong volume type.
    
    With this fix it will show correct xml output for tier volume
    irrespective of all the type of the volume's.
    
    Change-Id: If1de8d52d1e0ef3d0523163abed37b2b571715e8
    BUG: 1292084
    Signed-off-by: Gaurav Kumar Garg <ggarg>
    Reviewed-on: http://review.gluster.org/12982
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: mohammed rafi  kc <rkavunga>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Atin Mukherjee <amukherj>

Comment 11 Vijay Bellur 2015-12-22 07:06:33 UTC
REVIEW: http://review.gluster.org/13062 (geo-rep: Fix getting subvol count) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 12 Vijay Bellur 2015-12-22 15:06:49 UTC
COMMIT: http://review.gluster.org/13062 committed in master by Venky Shankar (vshankar) 
------
commit 074158e7081ff0118c719aac7cf1bcde92ee8f7d
Author: Kotresh HR <khiremat>
Date:   Tue Dec 22 12:29:32 2015 +0530

    geo-rep: Fix getting subvol count
    
    Tiering doesn't support disperse volume as hot tier,
    hence xml output doesn't give 'hotdisperseCount'.
    Remove the usage of 'hotdisperseCount' in geo-rep
    and return 0 instead.
    
    Change-Id: I736e29257de085a25e38eb02959caad3465ebcda
    BUG: 1292084
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/13062
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vivek
    Reviewed-by: Aravinda VK <avishwan>
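
According to the commit message above, the patch stops using `hotdisperseCount` and returns 0. The following is not that patch; it is only a sketch of why reading an element that the XML does not contain needs care. The `<volume>` fragment and the `int_or_zero` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical volume fragment with no <hotdisperseCount> element,
# matching the situation the commit message describes.
VOLUME_XML = "<volume><name>master</name></volume>"


def int_or_zero(volume: ET.Element, tag: str) -> int:
    """Return the integer text of `tag`, or 0 when the element is absent or empty."""
    text = volume.findtext(tag)
    if text is None or not text.strip():
        return 0
    return int(text)


volume = ET.fromstring(VOLUME_XML)
print(int_or_zero(volume, "hotdisperseCount"))
```

Without the `None` check, `int()` would raise a `TypeError` for the missing element, which is the kind of failure a fixed return value of 0 avoids.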

Comment 15 Niels de Vos 2016-06-16 13:50:45 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
