Bug 1285295

Summary: [geo-rep]: Recommended Shared volume use on geo-replication is broken in latest build
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rahul Hinduja <rhinduja>
Component: geo-replication
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Rahul Hinduja <rhinduja>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: asrivast, avishwan, byarlaga, chrisw, csaba, khiremat, nlevinki, rcyriac, sankarshan
Target Milestone: ---
Keywords: Regression, ZStream
Target Release: RHGS 3.1.2
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.7.5-9
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Cloned To: 1285488 (view as bug list)
Environment:
Last Closed: 2016-03-01 05:58:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1224928, 1260783, 1285488, 1287456

Description Rahul Hinduja 2015-11-25 11:25:22 UTC
Description of problem:
=======================

Using a shared (meta) volume for geo-replication is recommended so that only one worker from each subvolume becomes ACTIVE and participates in syncing. With the latest build, all the bricks in a subvolume become ACTIVE, as shown below:

[root@dhcp37-165 ~]# gluster volume geo-replication master 10.70.37.99::slave status
 
MASTER NODE                          MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                 SLAVE NODE      STATUS    CRAWL STATUS       LAST_SYNCED                  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b1    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19          
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b7    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19          
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b5    root          10.70.37.99::slave    10.70.37.112    Active    Changelog Crawl    2015-11-25 14:09:27          
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b3    root          10.70.37.99::slave    10.70.37.162    Active    Changelog Crawl    2015-11-25 14:09:27          
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b4    root          10.70.37.99::slave    10.70.37.87     Active    Changelog Crawl    2015-11-25 14:51:50          
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b6    root          10.70.37.99::slave    10.70.37.88     Active    Changelog Crawl    2015-11-25 14:51:47          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b2    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:50          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b8    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:48          
[root@dhcp37-165 ~]# 

[root@dhcp37-165 geo-rep]# ls
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_1.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_2.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_3.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_4.lock
[root@dhcp37-165 geo-rep]# 
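
For context, the intended design is that every worker in a replica subvolume races for the corresponding lock file above on the shared meta volume, and only the winner goes ACTIVE. A minimal standalone sketch of that election, assuming POSIX advisory (fcntl) locks; the function name and path are illustrative, not the actual gsyncd code:

import fcntl
import os

def try_become_active(lock_path):
    # Try a non-blocking exclusive lock on the shared per-subvolume lock
    # file; only one worker across the subvolume can hold it at a time.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd    # winner: keep the fd open, closing it drops the lock
    except OSError:
        os.close(fd)
        return None  # loser: stay PASSIVE and retry later

# In gsyncd the path would be the subvol lock file on the mounted meta
# volume (like the *_subvol_1.lock files listed above).
fd = try_become_active("/tmp/subvol_1.lock")
print("ACTIVE" if fd is not None else "PASSIVE")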


[root@dhcp37-165 syncdaemon]# gluster volume geo-replication master 10.70.37.99::slave config use_meta_volume
true
[root@dhcp37-165 syncdaemon]# 


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-7.el7rhgs.x86_64


How reproducible:
=================

1/1


Steps to Reproduce:
===================
1. Create the master and slave clusters.
2. Create the master and slave volumes.
3. Enable the shared storage volume on the master cluster: "gluster v set all cluster.enable-shared-storage enable"
4. Mount the master volume and create some data.
5. Create a geo-rep session between the master and slave volumes.
6. Enable the meta volume for the session (use_meta_volume).
7. Start the geo-rep session between master and slave (a scripted sketch of these steps follows).
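
For reference, a rough scripted form of steps 3 and 5-7, assuming the gluster CLI is available on the master node and using the volume names from this report; the create invocation ("push-pem") is a typical one, not copied from the reporter's exact commands:

import subprocess

def gluster(*args):
    # Thin wrapper over the gluster CLI; raises if a command fails.
    subprocess.run(("gluster",) + args, check=True)

gluster("volume", "set", "all", "cluster.enable-shared-storage", "enable")
gluster("volume", "geo-replication", "master", "10.70.37.99::slave",
        "create", "push-pem")
gluster("volume", "geo-replication", "master", "10.70.37.99::slave",
        "config", "use_meta_volume", "true")
gluster("volume", "geo-replication", "master", "10.70.37.99::slave", "start")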


Actual results:
===============

All bricks in a subvolume become ACTIVE.

Expected results:
=================

Only one brick from each subvolume should become ACTIVE.

Comment 3 Kotresh HR 2015-11-26 06:18:53 UTC
Patch sent upstream:
http://review.gluster.org/#/c/12752/

Comment 4 Kotresh HR 2015-11-26 06:19:36 UTC
The workaround to go ahead with the testing is to comment out the following lines
in /usr/libexec/master.py:

471             # Close the previously acquired lock so that
472             # fd will not leak. Reset fd to None
473             if gconf.mgmt_lock_fd:
474                 os.close(gconf.mgmt_lock_fd)
475                 gconf.mgmt_lock_fd = None
476
477             # Save latest FD for future use
478             gconf.mgmt_lock_fd = fd


484             # When previously Active becomes Passive, Close the
485             # fd of previously acquired lock
486             if gconf.mgmt_lock_fd:
487                 os.close(gconf.mgmt_lock_fd)
488                 gconf.mgmt_lock_fd = None
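
Why commenting these lines helps: with POSIX fcntl locks, closing any descriptor that refers to a file drops all of the process's locks on that file, even a lock taken through a different, still-open descriptor. So closing the old fd here silently releases the freshly acquired subvolume lock, and workers on other nodes can then acquire it and also go ACTIVE. A minimal standalone demonstration of that semantics (not gsyncd code; the /tmp path is illustrative):

import fcntl
import os

LOCK_PATH = "/tmp/demo_subvol.lock"  # stand-in for the meta-volume lock file
open(LOCK_PATH, "a").close()         # make sure the file exists

fd1 = os.open(LOCK_PATH, os.O_RDWR)
fd2 = os.open(LOCK_PATH, os.O_RDWR)              # second fd, same file
fcntl.lockf(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)  # lock acquired via fd1
os.close(fd2)  # drops the fd1 lock too: fcntl locks are per process+file

if os.fork() == 0:
    # Child process: if the lock were still held it should be refused,
    # but the parent's lock is already gone.
    fd3 = os.open(LOCK_PATH, os.O_RDWR)
    try:
        fcntl.lockf(fd3, fcntl.LOCK_EX | fcntl.LOCK_NB)
        print("child got the lock -> a second worker would also go ACTIVE")
    except OSError:
        print("child refused -> only one ACTIVE worker")
    os._exit(0)
os.wait()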

Comment 5 Aravinda VK 2015-12-02 09:46:45 UTC
Downstream patch https://code.engineering.redhat.com/gerrit/#/c/62769/

Comment 6 Rahul Hinduja 2015-12-04 09:19:09 UTC
Verified with build: glusterfs-geo-replication-3.7.5-9.el7rhgs.x86_64

One brick from each subvolume becomes ACTIVE. Moving bug to verified state.

Comment 9 errata-xmlrpc 2016-03-01 05:58:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html