Bug 1287519 - [geo-rep+tiering]: symlinks are not getting synced to slave on tiered master setup
Summary: [geo-rep+tiering]: symlinks are not getting synced to slave on tiered master setup
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Saravanakumar
QA Contact:
URL:
Whiteboard:
Depends On: 1286637
Blocks: 1288027
 
Reported: 2015-12-02 09:32 UTC by Saravanakumar
Modified: 2016-06-16 13:47 UTC
CC: 10 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1286637
Clones: 1288027
Environment:
Last Closed: 2016-06-16 13:47:58 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Saravanakumar 2015-12-02 09:32:30 UTC
+++ This bug was initially created as a clone of Bug #1286637 +++

Description of problem:
=======================

A symlink created on the master volume, which is a tiered volume, is not synced to the slave volume. For example:

Files a, b, and c are created on the master volume:

[root@dj master]# touch a b c 
[root@dj master]# ls
a  b  c  etc.1  etc.2  etc.3  etc.4  etc.5
[root@dj master]#

The files get synced successfully to the slave:

[root@mia slave]# ls
a  b  c  etc.1  etc.2  etc.3  etc.4  etc.5
[root@mia slave]# 

Create a symlink on the master volume:

[root@dj master]# ln -s a d
[root@dj master]# ls -la
total 243
drwxr-xr-x.  9 root root   650 Nov 30  2015 .
drwxr-xr-x.  4 root root    31 Nov 30 03:10 ..
-rw-r--r--.  1 root root     0 Nov 30 03:49 a
-rw-r--r--.  1 root root     0 Nov 30 03:49 b
-rw-r--r--.  1 root root     0 Nov 30 03:49 c
lrwxrwxrwx.  1 root root     1 Nov 30  2015 d -> a
drwxr-xr-x. 80 root root 49152 Nov 30  2015 etc.1
drwxr-xr-x. 80 root root 49152 Nov 30  2015 etc.2
drwxr-xr-x. 80 root root 49152 Nov 30  2015 etc.3
drwxr-xr-x. 80 root root 49152 Nov 30  2015 etc.4
drwxr-xr-x. 80 root root 49152 Nov 30  2015 etc.5
drwxr-xr-x.  3 root root   144 Nov 30  2015 .trashcan
[root@dj master]# 


On the slave, it never gets synced:

[root@mia slave]# ls -la
total 168
drwxr-xr-x.  9 root root  8497 Nov 30  2015 .
drwxr-xr-x.  4 root root    31 Nov 30 08:40 ..
-rw-r--r--.  1 root root     0 Nov 30 03:49 a
-rw-r--r--.  1 root root     0 Nov 30 03:49 b
-rw-r--r--.  1 root root     0 Nov 30 03:49 c
drwxr-xr-x. 80 root root 32768 Nov 30  2015 etc.1
drwxr-xr-x. 80 root root 32768 Nov 30  2015 etc.2
drwxr-xr-x. 80 root root 32768 Nov 30  2015 etc.3
drwxr-xr-x. 80 root root 32768 Nov 30  2015 etc.4
drwxr-xr-x. 80 root root 32768 Nov 30  2015 etc.5
drwxr-xr-x.  3 root root    96 Nov 30  2015 .trashcan
[root@mia slave]# 


Hardlinks are synced properly:

On Master:
==========

[root@dj master]# touch {1..10}
[root@dj master]# for i in {1..10}; do ln $i hl.$i ; done 
[root@dj master]# ls
1   2  4  6  8  a  c  e      etc.2  etc.4  hl.1   hl.2  hl.4  hl.6  hl.8
10  3  5  7  9  b  d  etc.1  etc.3  etc.5  hl.10  hl.3  hl.5  hl.7  hl.9
[root@dj master]#


On Slave:
=========

[root@mia slave]# ls
1   2  4  6  8  a  c  etc.1  etc.3  etc.5  hl.10  hl.3  hl.5  hl.7  hl.9
10  3  5  7  9  b  e  etc.2  etc.4  hl.1   hl.2   hl.4  hl.6  hl.8
[root@mia slave]#

Version-Release number of selected component (if applicable):
=============================================================



How reproducible:
=================

1/1

Steps to Reproduce:
===================
1. Create and start a tiered master volume
2. Create a slave volume (Distributed-Replicate)
3. Create and start a geo-rep session between the master and slave volumes
4. Mount the master and slave volumes
5. Create a file on the master
6. Let it sync to the slave
7. Create a symlink on the master
8. Check the slave (see the command sketch below)
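
For reference, the steps above map to roughly the following commands (a minimal sketch; hostnames m1/m2/s1/s2, brick paths, and the exact tier layout are placeholders, not the setup from this report):

# 1. Create a volume and attach a hot tier to make it a Tier volume
gluster volume create master replica 2 m1:/rhs/brick1/ct-b1 m2:/rhs/brick1/ct-b2
gluster volume start master
gluster volume attach-tier master replica 2 m1:/rhs/brick3/hot-b1 m2:/rhs/brick3/hot-b2

# 2. Create and start the slave (Distributed-Replicate) volume on the slave cluster
gluster volume create slave replica 2 s1:/rhs/brick1/b1 s2:/rhs/brick1/b2 s1:/rhs/brick2/b3 s2:/rhs/brick2/b4
gluster volume start slave

# 3. Create and start the geo-rep session
gluster system:: execute gsec_create
gluster volume geo-replication master s1::slave create push-pem
gluster volume geo-replication master s1::slave start

# 4. Mount both volumes
mount -t glusterfs m1:/master /mnt/master
mount -t glusterfs s1:/slave /mnt/slave

# 5-6. Create a file on the master and confirm it syncs
touch /mnt/master/a
ls /mnt/slave            # 'a' appears after a sync interval

# 7-8. Create a symlink on the master and check the slave
ln -s a /mnt/master/d
ls -la /mnt/slave        # on a tiered master, 'd -> a' never appears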

Actual results:
===============

The symlink never gets synced to the slave, while hardlinks and newly created files are synced.


Expected results:
=================

The symlink should get synced too.

--- Additional comment from Rahul Hinduja on 2015-11-30 06:57:00 EST ---

sosreports are at: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1286637/

--- Additional comment from Rahul Hinduja on 2015-11-30 07:04:56 EST ---

Volume information:

[root@dhcp37-165 brick1]# gluster volume info 
 
Volume Name: gluster_shared_storage
Type: Replicate
Volume ID: 20b2d2d6-4562-4290-be46-c649f3160b75
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.37.133:/var/lib/glusterd/ss_brick
Brick2: 10.70.37.160:/var/lib/glusterd/ss_brick
Brick3: dhcp37-165.lab.eng.blr.redhat.com:/var/lib/glusterd/ss_brick
Options Reconfigured:
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
 
Volume Name: master
Type: Tier
Volume ID: 2dd3e0f5-25cb-4738-99ac-5da7a3d25412
Status: Started
Number of Bricks: 12
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.155:/rhs/brick3/hot-b4
Brick2: 10.70.37.110:/rhs/brick3/hot-b3
Brick3: 10.70.37.158:/rhs/brick3/hot-b2
Brick4: 10.70.37.160:/rhs/brick3/hot-b1
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 4 x 2 = 8
Brick5: 10.70.37.165:/rhs/brick1/ct-b1
Brick6: 10.70.37.133:/rhs/brick1/ct-b2
Brick7: 10.70.37.160:/rhs/brick1/ct-b3
Brick8: 10.70.37.158:/rhs/brick1/ct-b4
Brick9: 10.70.37.110:/rhs/brick1/ct-b5
Brick10: 10.70.37.155:/rhs/brick1/ct-b6
Brick11: 10.70.37.165:/rhs/brick2/ct-b7
Brick12: 10.70.37.133:/rhs/brick2/ct-b8
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
cluster.tier-mode: test
features.ctr-enabled: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@dhcp37-165 brick1]#  



Geo-Rep session:
================

[root@dhcp37-165 ct-b1]# gluster volume geo-replication master 10.70.37.99::slave status
 
MASTER NODE                          MASTER VOL    MASTER BRICK          SLAVE USER    SLAVE                 SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b1     root          10.70.37.99::slave    10.70.37.112    Active     Changelog Crawl    2015-11-30 16:41:30          
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b7     root          10.70.37.99::slave    10.70.37.112    Active     Changelog Crawl    2015-11-30 16:56:14          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b2     root          10.70.37.99::slave    10.70.37.88     Passive    N/A                N/A                          
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b8     root          10.70.37.99::slave    10.70.37.88     Passive    N/A                N/A                          
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick3/hot-b1    root          10.70.37.99::slave    10.70.37.87     Passive    N/A                N/A                          
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b3     root          10.70.37.99::slave    10.70.37.99     Passive    N/A                N/A                          
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick3/hot-b4    root          10.70.37.99::slave    10.70.37.99     Passive    N/A                N/A                          
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b6     root          10.70.37.99::slave    10.70.37.87     Active     Changelog Crawl    2015-11-30 16:56:22          
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick3/hot-b3    root          10.70.37.99::slave    10.70.37.199    Active     Changelog Crawl    2015-11-30 16:41:53          
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b5     root          10.70.37.99::slave    10.70.37.162    Passive    N/A                N/A                          
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick3/hot-b2    root          10.70.37.99::slave    10.70.37.162    Active     Changelog Crawl    2015-11-30 16:56:08          
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b4     root          10.70.37.99::slave    10.70.37.199    Active     Changelog Crawl    2015-11-30 16:41:54          
[root@dhcp37-165 ct-b1]#

--- Additional comment from Rahul Hinduja on 2015-11-30 07:13:30 EST ---

If a tiered volume is not involved in the geo-rep setup, symlinks get synced too. For example:

Create a geo-rep session between non-tiered volumes (both master and slave are Distributed-Replicate):

[root@dhcp37-165 scripts]# gluster volume geo-replication master 10.70.37.99::slave status
 
MASTER NODE                          MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                 SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED          
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b1    root          10.70.37.99::slave    10.70.37.99     Active     Changelog Crawl    N/A                  
dhcp37-165.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b7    root          10.70.37.99::slave    10.70.37.99     Active     Changelog Crawl    N/A                  
dhcp37-155.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b6    root          10.70.37.99::slave    10.70.37.88     Passive    N/A                N/A                  
dhcp37-160.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b3    root          10.70.37.99::slave    10.70.37.162    Passive    N/A                N/A                  
dhcp37-158.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b4    root          10.70.37.99::slave    10.70.37.87     Active     Changelog Crawl    N/A                  
dhcp37-110.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b5    root          10.70.37.99::slave    10.70.37.112    Active     Changelog Crawl    N/A                  
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick1/ct-b2    root          10.70.37.99::slave    10.70.37.199    Passive    N/A                N/A                  
dhcp37-133.lab.eng.blr.redhat.com    master        /rhs/brick2/ct-b8    root          10.70.37.99::slave    10.70.37.199    Passive    N/A                N/A                  
[root@dhcp37-165 scripts]# gluster volume info 
 
Volume Name: gluster_shared_storage
Type: Replicate
Volume ID: 162614e4-7f56-42c6-9e83-5800fd3f251e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.37.133:/var/lib/glusterd/ss_brick
Brick2: 10.70.37.160:/var/lib/glusterd/ss_brick
Brick3: dhcp37-165.lab.eng.blr.redhat.com:/var/lib/glusterd/ss_brick
Options Reconfigured:
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
 
Volume Name: master
Type: Distributed-Replicate
Volume ID: 8cf9d396-06a3-480f-8f00-3821c8e43dcb
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.37.165:/rhs/brick1/ct-b1
Brick2: 10.70.37.133:/rhs/brick1/ct-b2
Brick3: 10.70.37.160:/rhs/brick1/ct-b3
Brick4: 10.70.37.158:/rhs/brick1/ct-b4
Brick5: 10.70.37.110:/rhs/brick1/ct-b5
Brick6: 10.70.37.155:/rhs/brick1/ct-b6
Brick7: 10.70.37.165:/rhs/brick2/ct-b7
Brick8: 10.70.37.133:/rhs/brick2/ct-b8
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@dhcp37-165 scripts]# 


From Client:
============

Master:
=======

[root@dj master]# touch a
[root@dj master]# 
[root@dj master]# ls
a
[root@dj master]# ln -s a b
[root@dj master]# ls
a  b
[root@dj master]# ls -la
total 2
drwxr-xr-x. 4 root root 172 Nov 30  2015 .
drwxr-xr-x. 4 root root  31 Nov 30 04:55 ..
-rw-r--r--. 1 root root   0 Nov 30 04:56 a
lrwxrwxrwx. 1 root root   1 Nov 30  2015 b -> a
drwxr-xr-x. 3 root root  96 Nov 30  2015 .trashcan
[root@dj master]# 


Slave:
======

[root@mia slave]# ls
a  b
[root@mia slave]# ls -la
total 0
drwxr-xr-x. 4 root root 172 Nov 30  2015 .
drwxr-xr-x. 4 root root  31 Nov 30 10:25 ..
-rw-r--r--. 1 root root   0 Nov 30 04:56 a
lrwxrwxrwx. 1 root root   1 Nov 30  2015 b -> a
drwxr-xr-x. 3 root root  96 Nov 30  2015 .trashcan
[root@mia slave]#

[root@mia slave]# mount | grep fuse
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
10.70.37.165:/master on /mnt/master type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
10.70.37.99:/slave on /mnt/slave type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@mia slave]# 



Proposing as a blocker, since it breaks geo-rep's ability to sync symlinks when a tiered volume is involved.

--- Additional comment from Saravanakumar on 2015-11-30 09:32:50 EST ---

RCA:

The geo-replication changes were made under the assumption that the HOT tier is the HASH subvolume and the COLD tier is the cache, so all namespace operations were skipped on the COLD tier.

This is no longer the case: currently the COLD tier is the HASH subvolume and the HOT tier is the cache.

SYMLINK operations are carried out only on the COLD tier, and since the current implementation skips cold bricks, symlink creation operations are never replayed on the slave end.

The fix is to capture all namespace operations only from COLD bricks and skip them on HOT bricks.
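
Conceptually, the fix amounts to filtering changelog ENTRY (namespace) operations by the tier a brick belongs to. The sketch below illustrates that idea in Python (gsyncd's language); the names, data structures, and brick layout here are hypothetical, not the actual gsyncd code:

from dataclasses import dataclass

@dataclass
class ChangelogRecord:
    op_class: str   # "ENTRY" (namespace op), "DATA", or "META"
    op: str         # e.g. "CREATE", "SYMLINK", "UNLINK"
    path: str

# Hypothetical tier layout; in practice this would come from volume info.
HOT_BRICKS = {"/rhs/brick3/hot-b1", "/rhs/brick3/hot-b2"}

def should_replay(record: ChangelogRecord, brick_path: str) -> bool:
    # Namespace (ENTRY) ops such as SYMLINK are recorded on the HASH
    # subvolume, which is now the cold tier; replay them only from
    # cold bricks and skip hot bricks to avoid races and duplicates.
    if record.op_class == "ENTRY":
        return brick_path not in HOT_BRICKS
    return True  # DATA/META ops are handled as before

# A SYMLINK record seen while crawling a hot brick is skipped; the same
# record from a cold brick is replayed on the slave.
rec = ChangelogRecord("ENTRY", "SYMLINK", "/d")
print(should_replay(rec, "/rhs/brick3/hot-b1"))   # False -> skipped
print(should_replay(rec, "/rhs/brick1/ct-b1"))    # True  -> synced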

Comment 1 Vijay Bellur 2015-12-02 09:33:48 UTC
REVIEW: http://review.gluster.org/12844 (geo-rep: use cold tier bricks for namespace operations) posted (#3) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 2 Vijay Bellur 2015-12-02 12:29:17 UTC
REVIEW: http://review.gluster.org/12844 (geo-rep: use cold tier bricks for namespace operations) posted (#4) for review on master by Saravanakumar Arumugam (sarumuga)

Comment 3 Vijay Bellur 2015-12-03 10:34:57 UTC
COMMIT: http://review.gluster.org/12844 committed in master by Venky Shankar (vshankar) 
------
commit 93f31189ce8f6e2980a39b02568ed17088e0a667
Author: Saravanakumar Arumugam <sarumuga>
Date:   Wed Dec 2 14:26:47 2015 +0530

    geo-rep: use cold tier bricks for namespace operations
    
    Problem:
    symlinks are not getting synced to slave in a Tiering based volume.
    
    Solution:
    Now, symlinks are created directly in cold tier bricks (in the backend).
    
    Earlier, cold tier was avoided for namespace operations and only
    hot tier was used while processing changelogs.
    
    Now, cold tier is HASH subvolume in a Tiering volume.
    So, carry out namespace operation only in cold tier subvolume and
    avoid hot tier subvolume to avoid any races.
    
    Earlier, XSYNC was used(and changeloghistory avoided) during initial sync
    in order to avoid race while processing historychangelog in Hot tier.
    This is no longer required as there is no race from Hot tier.
    
    Also, avoid both live and history changelog ENTRY operations from Hot tier to avoid any race with cold tier.
    
    Change-Id: Ia8fbb7ae037f5b6cb683f36c0df5c3fc2894636e
    BUG: 1287519
    Signed-off-by: Saravanakumar Arumugam <sarumuga>
    Reviewed-on: http://review.gluster.org/12844
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    Reviewed-by: Venky Shankar <vshankar>
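
With the fix in place (glusterfs-3.8rc2 and later), the failure case can be re-checked with a quick smoke test (a sketch, reusing the mount points from the transcripts above):

# On the master mount: recreate the failing scenario
touch /mnt/master/a
ln -s a /mnt/master/d

# On the slave mount: the symlink should now appear after a sync interval
ls -la /mnt/slave        # expect: lrwxrwxrwx ... d -> a
readlink /mnt/slave/d    # expect: a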

Comment 4 Niels de Vos 2016-06-16 13:47:58 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

