Description of problem:
=======================
Using a shared (meta) volume for geo-replication is recommended so that only one worker per subvolume becomes ACTIVE and participates in syncing. With the latest build, all the bricks in a subvolume become ACTIVE:

[root@dhcp37-165 ~]# gluster volume geo-replication master 10.70.37.99::slave status

MASTER NODE                          MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                 SLAVE NODE      STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b1    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b7    root          10.70.37.99::slave    10.70.37.99     Active    Changelog Crawl    2015-11-25 14:09:19
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b5    root          10.70.37.99::slave    10.70.37.112    Active    Changelog Crawl    2015-11-25 14:09:27
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b3    root          10.70.37.99::slave    10.70.37.162    Active    Changelog Crawl    2015-11-25 14:09:27
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b4    root          10.70.37.99::slave    10.70.37.87     Active    Changelog Crawl    2015-11-25 14:51:50
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b6    root          10.70.37.99::slave    10.70.37.88     Active    Changelog Crawl    2015-11-25 14:51:47
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b2    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:50
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b8    root          10.70.37.99::slave    10.70.37.199    Active    Changelog Crawl    2015-11-25 14:51:48
[root@dhcp37-165 ~]#

The per-subvolume lock files are present on the meta volume:

[root@dhcp37-165 geo-rep]# ls
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_1.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_2.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_3.lock
cbe0236c-db59-48eb-b3eb-2e436a505e11_32530124-055f-4dd8-a7cc-d8c8ebeb91bb_subvol_4.lock
[root@dhcp37-165 geo-rep]#

The meta volume is enabled for the session:

[root@dhcp37-165 syncdaemon]# gluster volume geo-replication master 10.70.37.99::slave config use_meta_volume
true
[root@dhcp37-165 syncdaemon]#

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.5-7.el7rhgs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create the master and slave clusters.
2. Create the master and slave volumes.
3. Enable shared storage: "gluster v set all cluster.enable-shared-storage enable"
4. Mount the master volume and create some data.
5. Create a geo-replication session between the master and slave volumes.
6. Enable the meta volume (use_meta_volume).
7. Start the geo-replication session between the master and slave.

Actual results:
===============
All bricks in a subvolume become ACTIVE.

Expected results:
=================
Only one brick from each subvolume should become ACTIVE.
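For context: each geo-replication worker decides ACTIVE vs. PASSIVE by trying to take an exclusive POSIX lock on its subvolume's lock file on the meta volume (the files listed above). A minimal standalone sketch of that election scheme, with an illustrative helper name and a /tmp path rather than the actual syncdaemon code:

    import fcntl
    import os

    def try_become_active(lock_path):
        """Return the lock fd if this worker won the election, else None."""
        fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            # Non-blocking exclusive lock: only one worker per subvolume
            # should succeed and become ACTIVE; the rest stay PASSIVE.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except (IOError, OSError):
            os.close(fd)
            return None

    # In practice the path is the meta-volume lock file shown in the
    # listing above; /tmp is used here so the sketch runs standalone.
    fd = try_become_active("/tmp/subvol_1.lock")
    print("Active" if fd is not None else "Passive")

The winning worker must keep its lock fd open for as long as it stays ACTIVE; if the lock is ever released while the worker still reports ACTIVE, workers on the other replicas of the subvolume can also acquire it, which matches the status output above.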
Patch sent upstream: http://review.gluster.org/#/c/12752/
The workaround to go ahead with the testing is to comment out the following lines in /usr/libexec/master.py:

471                 # Close the previously acquired lock so that
472                 # fd will not leak. Reset fd to None
473                 if gconf.mgmt_lock_fd:
474                     os.close(gconf.mgmt_lock_fd)
475                     gconf.mgmt_lock_fd = None
476
477                 # Save latest FD for future use
478                 gconf.mgmt_lock_fd = fd

484                 # When previously Active becomes Passive, Close the
485                 # fd of previously acquired lock
486                 if gconf.mgmt_lock_fd:
487                     os.close(gconf.mgmt_lock_fd)
488                     gconf.mgmt_lock_fd = None
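Those os.close() calls matter because POSIX fcntl record locks belong to the process, not to the descriptor: closing any fd a process holds on a file releases all of that process's locks on the file, even a lock taken through a different fd. So closing the old mgmt_lock_fd after re-acquiring the lock on a new fd to the same lock file silently drops the freshly acquired lock, letting other workers also go ACTIVE. A standalone demonstration of that semantics (illustrative /tmp path, not syncdaemon code):

    import fcntl
    import os

    path = "/tmp/subvol_1.lock"   # stand-in for a meta-volume lock file

    fd_old = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    fd_new = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

    # Take the "ACTIVE" lock through the new descriptor.
    fcntl.lockf(fd_new, fcntl.LOCK_EX | fcntl.LOCK_NB)

    # Closing the old descriptor releases the process's locks on the
    # file, including the one just taken via fd_new.
    os.close(fd_old)

    # At this point another process can lock the file and also
    # become ACTIVE.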
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/62769/
Verified with build: glusterfs-geo-replication-3.7.5-9.el7rhgs.x86_64

One brick from each subvolume becomes ACTIVE. Moving the bug to the Verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html