Bug 1234898 - [geo-rep]: Feature fan-out fails with the use of meta volume config
Summary: [geo-rep]: Feature fan-out fails with the use of meta volume config
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On: 1234419 1234882
Blocks:
 
Reported: 2015-06-23 13:11 UTC by Kotresh HR
Modified: 2015-07-30 09:48 UTC
CC: 9 users

Fixed In Version: glusterfs-3.7.3
Doc Type: Bug Fix
Doc Text:
Clone Of: 1234882
Environment:
Last Closed: 2015-07-30 09:48:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Kotresh HR 2015-06-23 13:11:00 UTC
+++ This bug was initially created as a clone of Bug #1234882 +++

+++ This bug was initially created as a clone of Bug #1234419 +++

Description of problem:
=======================

When geo-rep sessions are created from one master volume to two slave volumes (fan-out), all bricks of one of the slave sessions become PASSIVE. This happens only when the use_meta_volume config is set to true.

Slave volumes: slave1 and slave2


Creating geo-rep sessions between the master volume and the slave volumes (slave1, slave2):

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave1 create push-pem force
Creating geo-replication session between master & 10.70.46.154::slave1 has been successful
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave2 create push-pem force
Creating geo-replication session between master & 10.70.46.154::slave2 has been successful
[root@georep1 scripts]# 

Setting use_meta_volume for the slave1 and slave2 sessions:

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave1 config use_meta_volume true
geo-replication config updated successfully
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave2 config use_meta_volume true
geo-replication config updated successfully
[root@georep1 scripts]# 


Starting the geo-rep sessions for slave volumes slave1 and slave2:

[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave1 start
Starting geo-replication session between master & 10.70.46.154::slave1 has been successful
[root@georep1 scripts]#
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave2 start
Starting geo-replication session between master & 10.70.46.154::slave2 has been successful
[root@georep1 scripts]# 

Status:
=======
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave1 status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                   SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave1    10.70.46.101    Active     Changelog Crawl    2015-06-23 00:46:12          
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave1    10.70.46.101    Active     Changelog Crawl    2015-06-23 00:46:12          
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave1    10.70.46.154    Passive    N/A                N/A                          
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave1    10.70.46.154    Passive    N/A                N/A                          
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave1    10.70.46.103    Passive    N/A                N/A                          
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave1    10.70.46.103    Passive    N/A                N/A                          
[root@georep1 scripts]# 
[root@georep1 scripts]# 
[root@georep1 scripts]# gluster volume geo-replication master 10.70.46.154::slave2 status
 
MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                   SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED          
------------------------------------------------------------------------------------------------------------------------------------------
georep1        master        /rhs/brick1/b1    root          10.70.46.154::slave2    10.70.46.101    Passive    N/A             N/A                  
georep1        master        /rhs/brick2/b2    root          10.70.46.154::slave2    10.70.46.101    Passive    N/A             N/A                  
georep3        master        /rhs/brick1/b1    root          10.70.46.154::slave2    10.70.46.154    Passive    N/A             N/A                  
georep3        master        /rhs/brick2/b2    root          10.70.46.154::slave2    10.70.46.154    Passive    N/A             N/A                  
georep2        master        /rhs/brick1/b1    root          10.70.46.154::slave2    10.70.46.103    Passive    N/A             N/A                  
georep2        master        /rhs/brick2/b2    root          10.70.46.154::slave2    10.70.46.103    Passive    N/A             N/A                  
[root@georep1 scripts]# 


The second slave volume, slave2, has all of its bricks in Passive state, and hence data never syncs to slave2.

Meta volume bricks:

[root@georep1 scripts]# ls /var/run/gluster/ss_brick/geo-rep/
6f023fd5-49a5-4af7-a68a-b7071a8b9ff0_subvol_1.lock  6f023fd5-49a5-4af7-a68a-b7071a8b9ff0_subvol_2.lock
[root@georep1 scripts]# 



Version-Release number of selected component (if applicable):
==============================================================


How reproducible:
=================
1/1


Master:
=======

[root@georep1 scripts]# gluster volume info
 
Volume Name: gluster_shared_storage
Type: Replicate
Volume ID: 102b304d-494a-40cc-84e0-3eca89b3e559
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.46.97:/var/run/gluster/ss_brick
Brick2: 10.70.46.93:/var/run/gluster/ss_brick
Brick3: 10.70.46.96:/var/run/gluster/ss_brick
Options Reconfigured:
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
 
Volume Name: master
Type: Distributed-Replicate
Volume ID: 6f023fd5-49a5-4af7-a68a-b7071a8b9ff0
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.46.96:/rhs/brick1/b1
Brick2: 10.70.46.97:/rhs/brick1/b1
Brick3: 10.70.46.93:/rhs/brick1/b1
Brick4: 10.70.46.96:/rhs/brick2/b2
Brick5: 10.70.46.97:/rhs/brick2/b2
Brick6: 10.70.46.93:/rhs/brick2/b2
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@georep1 scripts]# 


Slave:
======

[root@georep4 scripts]# gluster volume info
 
Volume Name: slave1
Type: Replicate
Volume ID: fc1e64c2-2028-4977-844a-678f4cc31351
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.46.154:/rhs/brick1/b1
Brick2: 10.70.46.101:/rhs/brick1/b1
Brick3: 10.70.46.103:/rhs/brick1/b1
Options Reconfigured:
performance.readdir-ahead: on
 
Volume Name: slave2
Type: Replicate
Volume ID: 800f46c8-2708-48e5-9256-df8dbbdc5906
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.46.154:/rhs/brick2/b2
Brick2: 10.70.46.101:/rhs/brick2/b2
Brick3: 10.70.46.103:/rhs/brick2/b2
Options Reconfigured:
performance.readdir-ahead: on
[root@georep4 scripts]#

Comment 1 Anand Avati 2015-06-23 13:11:43 UTC
REVIEW: http://review.gluster.org/11366 (geo-rep: Fix geo-rep fanout setup with meta volume) posted (#2) for review on release-3.7 by Kotresh HR (khiremat@redhat.com)

Comment 2 Anand Avati 2015-06-25 17:27:34 UTC
COMMIT: http://review.gluster.org/11366 committed in release-3.7 by Venky Shankar (vshankar@redhat.com) 
------
commit 90f5a6cd669660785213a187c9fa7a587cd15257
Author: Kotresh HR <khiremat@redhat.com>
Date:   Tue Jun 23 18:28:56 2015 +0530

    geo-rep: Fix geo-rep fanout setup with meta volume
    
    Lock filename was formed with 'master volume id'
    and 'subvol number'. Hence multiple slaves try
    acquiring lock on same file and become PASSIVE
    ending up not syncing data. Using 'slave volume id'
    in lock filename will fix the issue making lock
    file unique across different slaves.
    
    Change-Id: I64c84670a5d9e1b0dfbdeb4479ee6b8e0c6b829e
    BUG: 1234898
    Reviewed-On: http://review.gluster.org/11367
    Signed-off-by: Kotresh HR <khiremat@redhat.com>
    Reviewed-on: http://review.gluster.org/11366
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
    Reviewed-by: Venky Shankar <vshankar@redhat.com>
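
The root cause in the commit message can be illustrated with a minimal sketch (function names are hypothetical; the real logic lives in geo-rep's worker code). Before the fix, the lock filename was built from the master volume id and subvol number only, so every fan-out session computed the same path and only one could win the lock; including the slave volume id makes each session's lock file unique. The volume ids below are the ones from the volume info output above:

```python
# Hypothetical sketch of the meta-volume lock-file naming, before and
# after the fix. Function names are illustrative, not the actual code.

def lock_name_before(master_volid, subvol):
    # Old scheme: master volume id + subvol number only. Every slave
    # session for the same master computes the same filename, so all
    # sessions contend for one lock and the losers stay PASSIVE.
    return "%s_subvol_%d.lock" % (master_volid, subvol)

def lock_name_after(master_volid, slave_volid, subvol):
    # New scheme: the slave volume id makes the name unique per slave,
    # so each fan-out session can hold its own Active lock.
    return "%s_%s_subvol_%d.lock" % (master_volid, slave_volid, subvol)

master = "6f023fd5-49a5-4af7-a68a-b7071a8b9ff0"   # master volume id
slave1 = "fc1e64c2-2028-4977-844a-678f4cc31351"   # slave1 volume id
slave2 = "800f46c8-2708-48e5-9256-df8dbbdc5906"   # slave2 volume id

# Before the fix: both slave sessions map to the identical lock file
# (matching the two files seen under /var/run/gluster/ss_brick/geo-rep/).
assert lock_name_before(master, 1) == lock_name_before(master, 1)

# After the fix: the lock files differ per slave, so both sessions
# can have Active workers.
assert lock_name_after(master, slave1, 1) != lock_name_after(master, slave2, 1)
```

This matches the observed symptom: the meta volume contained only two lock files, one per subvol of the master, with no slave identifier in the name, so the slave2 session could never acquire a lock.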

Comment 3 Kaushal 2015-07-30 09:48:55 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

