Bug 1285488

Summary: [geo-rep]: Recommended Shared volume use on geo-replication is broken
Product: [Community] GlusterFS Reporter: Kotresh HR <khiremat>
Component: geo-replication    Assignee: Kotresh HR <khiremat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: mainline    CC: bugs, byarlaga, chrisw, csaba, khiremat, nlevinki, rhinduja, storage-qa-internal
Target Milestone: ---    Keywords: Regression, ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.8rc2 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1285295
: 1287456    Environment:
Last Closed: 2016-06-16 13:46:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1285295    
Bug Blocks: 1287456    

Description Kotresh HR 2015-11-25 17:55:26 UTC
+++ This bug was initially created as a clone of Bug #1285295 +++

Description of problem:
=======================

Using a shared (meta) volume for geo-replication is recommended so that only one worker from each replica subvolume becomes ACTIVE and participates in syncing. Instead, all the bricks in a subvolume become ACTIVE, as shown below:

[root@dhcp37-165 ~]# gluster volume geo-replication master 10.70.37.99::slave status
 
MASTER NODE                          MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                 SLAVE NODE      STATUS    CRAWL STATUS       LAST_SYNCED                  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b1    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19          
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b7    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19          
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b5    root          10.70.37.99::slave    10.70.37.112    Active    Changelog Crawl    2015-11-25 14:09:27          
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b3    root          10.70.37.99::slave    10.70.37.162    Active    Changelog Crawl    2015-11-25 14:09:27          
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b4    root          10.70.37.99::slave    10.70.37.87     Active    Changelog Crawl    2015-11-25 14:51:50          
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b6    root          10.70.37.99::slave    10.70.37.88     Active    Changelog Crawl    2015-11-25 14:51:47          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b2    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:50          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b8    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:48          
[root@dhcp37-165 ~]# 

[root@dhcp37-165 geo-rep]# ls
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_1.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_2.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_3.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_4.lock
[root@dhcp37-165 geo-rep]# 


[root@dhcp37-165 syncdaemon]# gluster volume geo-replication master 10.70.37.99::slave config use_meta_volume
true
[root@dhcp37-165 syncdaemon]# 
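
For reference, here is how the lock files listed above are meant to be used. This is a minimal Python sketch, not gsyncd source; the mount point is illustrative and the helper name is made up. Each worker takes a non-blocking fcntl lock on its subvolume's lock file on the shared volume; only the worker that wins the lock becomes ACTIVE, while the others should see EAGAIN and stay PASSIVE.

    # Minimal sketch (not gsyncd code): per-subvolume ACTIVE/PASSIVE election
    # via a non-blocking fcntl lock on the meta-volume lock file shown above.
    import errno
    import fcntl
    import os

    def try_become_active(lock_path):
        """Return the locked fd if this worker should be ACTIVE, else None."""
        fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd                    # keep the fd open for as long as ACTIVE
        except (IOError, OSError) as e:
            if e.errno in (errno.EAGAIN, errno.EACCES):
                os.close(fd)             # another worker in this subvolume won
                return None
            raise

    # Illustrative shared-storage mount point; the file name matches the
    # subvol_1 lock file listed above.
    lock_file = ("/var/run/gluster/shared_storage/geo-rep/"
                 "cbe0236c-db59-48eb-b3eb-2e436a505e11_"
                 "32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_1.lock")
    print("ACTIVE" if try_become_active(lock_file) is not None else "PASSIVE")

With the fd close/fcntl problem described in comment 6 below, a lock taken this way can be dropped silently, which is why more than one worker per subvolume ends up ACTIVE.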


Version-Release number of selected component (if applicable):
=============================================================



How reproducible:
=================

1/1


Steps to Reproduce:
===================
1. Create the master and slave clusters
2. Create the master and slave volumes
3. Enable shared storage on the master cluster: "gluster v set all cluster.enable-shared-storage enable"
4. Mount the master volume and create some data
5. Create a geo-rep session between the master and slave volumes
6. Enable the meta volume (set the use_meta_volume geo-rep config option to true)
7. Start the geo-rep session between the master and slave


Actual results:
===============

All bricks in a subvolume become ACTIVE.

Expected results:
=================

Only one brick from each subvolume should become ACTIVE.

Comment 1 Vijay Bellur 2015-11-25 17:57:33 UTC
REVIEW: http://review.gluster.org/12752 (geo-rep: fd close and fcntl issue) posted (#1) for review on master by Kotresh HR (khiremat)

Comment 2 Vijay Bellur 2015-11-26 05:34:05 UTC
REVIEW: http://review.gluster.org/12752 (geo-rep: fd close and fcntl issue) posted (#2) for review on master by Kotresh HR (khiremat)

Comment 3 Vijay Bellur 2015-11-27 07:11:22 UTC
REVIEW: http://review.gluster.org/12752 (geo-rep: fd close and fcntl issue) posted (#3) for review on master by Kotresh HR (khiremat)

Comment 4 Vijay Bellur 2015-11-27 09:17:07 UTC
REVIEW: http://review.gluster.org/12752 (geo-rep: fd close and fcntl issue) posted (#4) for review on master by Kotresh HR (khiremat)

Comment 5 Vijay Bellur 2015-11-30 06:10:14 UTC
REVIEW: http://review.gluster.org/12752 (geo-rep: fd close and fcntl issue) posted (#5) for review on master by Kotresh HR (khiremat)

Comment 6 Vijay Bellur 2015-12-02 06:42:19 UTC
COMMIT: http://review.gluster.org/12752 committed in master by Venky Shankar (vshankar) 
------
commit 189a2302822f20ec9741002ebc9b18ea80c2837f
Author: Aravinda VK <avishwan>
Date:   Wed Nov 25 18:34:29 2015 +0530

    geo-rep: fd close and fcntl issue
    
    When any open fd of a file on which an fcntl lock is held is
    closed, all fcntl locks on that file are released, even though
    the lock was taken on another fd of the same file that is still
    open. This sometimes causes both replica workers to become
    ACTIVE. This patch fixes that issue.
    
    Change-Id: I1e203ab0e29442275338276deb56d09e5679329c
    BUG: 1285488
    Original-Author: Aravinda VK <avishwan>
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/12752
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Milind Changire <mchangir>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Venky Shankar <vshankar>
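
To illustrate the POSIX behaviour the commit message describes, here is a standalone Python sketch (demo only, not gsyncd code; the path is made up): fcntl record locks belong to the process and are released as soon as any fd referring to the locked file is closed, even if the fd the lock was taken on is still open, so a second worker can then acquire the subvolume lock and also go ACTIVE.

    # Demo only: closing any fd of a locked file drops the process's fcntl lock,
    # even though the lock was taken on a different fd that is still open.
    import fcntl
    import os

    path = "/tmp/fcntl_close_demo.lock"            # illustrative path

    fd_lock = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    fcntl.lockf(fd_lock, fcntl.LOCK_EX | fcntl.LOCK_NB)   # lock held on fd_lock

    fd_other = os.open(path, os.O_RDWR)            # second fd to the same file
    os.close(fd_other)                             # silently releases the lock above

    pid = os.fork()
    if pid == 0:
        # The child plays the "other" worker. With the lock already dropped it
        # succeeds and would wrongly become ACTIVE too; remove the os.close()
        # above and it gets EAGAIN and stays PASSIVE as intended.
        fd_child = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(fd_child, fcntl.LOCK_EX | fcntl.LOCK_NB)
            print("child: got the lock -> would become ACTIVE as well")
        except (IOError, OSError):
            print("child: lock busy -> stays PASSIVE")
        os._exit(0)

    os.waitpid(pid, 0)
    os.close(fd_lock)
    os.unlink(path)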

Comment 7 Niels de Vos 2016-06-16 13:46:40 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user