Bug 1302979 - [georep+tiering]: Hardlink sync is broken if master volume is tiered
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.7.7
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Saravanakumar
Keywords: ZStream
Depends On: 1300682 1301032
Blocks: glusterfs-3.7.9
Reported: 2016-01-29 03:23 EST by Saravanakumar
Modified: 2016-04-19 03:21 EDT

See Also:
Fixed In Version: glusterfs-3.7.9
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1301032
Environment:
Last Closed: 2016-03-22 04:14:22 EDT
Type: Bug


Description Saravanakumar 2016-01-29 03:23:06 EST
Description of problem:
=======================

In a geo-replication setup where the master is a tiered volume and the slave is a non-tiered volume, some hardlinks are synced to the slave as T files (zero-byte files with only the sticky bit set), while others are synced properly.

For example:
============

Master volume hardlink files:

[root@dj scripts]# ls -l /mnt/vol0/thread0/level00/hardlink_to_files/
total 26
-rw-r--r--. 2 root root 10707 Jan 21 04:11 56a002ca%%IUUZUI3ZFV
-rw-r--r--. 2 root root 15094 Jan 21  2016 56a002ca%%VLS8G8PZ11
[root@dj scripts]# 

Synced to slave as:

[root@dj scripts]# ls -l /mnt/vol1/thread0/level00/hardlink_to_files/
total 11
-rw-r--r--. 1 root root 10707 Jan 21 04:11 56a002ca%%IUUZUI3ZFV
---------T. 1 root root     0 Jan 21  2016 56a002ca%%VLS8G8PZ11
[root@dj scripts]# 
[root@dj scripts]# cat /mnt/vol1/thread0/level00/hardlink_to_files/56a002ca%%VLS8G8PZ11
[root@dj scripts]# 
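
The ---------T entry in the slave listing above is a zero-byte regular file whose permission bits are the sticky bit alone. As a rough aid for finding such entries under a mount point, here is a minimal Python sketch. The helper names is_t_file and find_t_files are hypothetical and are not part of GlusterFS.

```python
import os
import stat


def is_t_file(st):
    """True for a zero-byte regular file whose permission bits are exactly the sticky bit."""
    return (
        stat.S_ISREG(st.st_mode)
        and stat.S_IMODE(st.st_mode) == stat.S_ISVTX
        and st.st_size == 0
    )


def find_t_files(root):
    """Walk `root` and return the sorted paths of every T file found (hypothetical helper)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are inspected themselves, not their targets.
            if is_t_file(os.lstat(path)):
                hits.append(path)
    return sorted(hits)
```

For example, find_t_files("/mnt/vol1/thread0") would list the affected files on the slave mount used in this report.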


For the file that is synced properly, the changelog records are as follows:
===========================================================================

Active Hot Tier changelog records in processed:
-----------------------------------------------

[root@dhcp37-165 .processed]# grep -i "03c926fd-4cc0-4d41-a8be-50efa70916a4" *
Binary file archive_201601.tar matches
CHANGELOG.1453329701:E 03c926fd-4cc0-4d41-a8be-50efa70916a4 CREATE 33188 0 0 84eae93e-208b-4890-aa09-1ee223d0f313%2F56a00268%25%25S5CGBPUMYP
CHANGELOG.1453329701:D 03c926fd-4cc0-4d41-a8be-50efa70916a4
CHANGELOG.1453329791:E 03c926fd-4cc0-4d41-a8be-50efa70916a4 LINK b8bb1da7-1d1b-4600-982d-d6243d6e952b%2F56a002ca%25%25IUUZUI3ZFV
[root@dhcp37-165 .processed]# 


Active Cold Tier changelog records in processed:
------------------------------------------------

[root@dhcp37-165 .processed]# grep -i "03c926fd-4cc0-4d41-a8be-50efa70916a4" *
Binary file archive_201601.tar matches
CHANGELOG.1453329693:E 03c926fd-4cc0-4d41-a8be-50efa70916a4 MKNOD 33280 0 0 84eae93e-208b-4890-aa09-1ee223d0f313%2F56a00268%25%25S5CGBPUMYP
CHANGELOG.1453329800:E 03c926fd-4cc0-4d41-a8be-50efa70916a4 MKNOD 33280 0 0 b8bb1da7-1d1b-4600-982d-d6243d6e952b%2F56a002ca%25%25IUUZUI3ZFV
[root@dhcp37-165 .processed]# 

For the file that is not synced properly, the changelog records are as follows:
===============================================================================

Active Hot Tier changelog records in processed:
-----------------------------------------------

[root@dhcp37-165 .processed]# grep -i "85f08137-c61a-4b81-8264-23e9510a203a" *
Binary file archive_201601.tar matches
CHANGELOG.1453329701:E 85f08137-c61a-4b81-8264-23e9510a203a CREATE 33188 0 0 84eae93e-208b-4890-aa09-1ee223d0f313%2F56a00268%25%25S0KZN4J05C
CHANGELOG.1453329701:D 85f08137-c61a-4b81-8264-23e9510a203a
CHANGELOG.1453329791:E 85f08137-c61a-4b81-8264-23e9510a203a LINK b8bb1da7-1d1b-4600-982d-d6243d6e952b%2F56a002ca%25%25VLS8G8PZ11
[root@dhcp37-165 .processed]# 

Active Cold Tier changelog records in processed:
------------------------------------------------

[root@dhcp37-165 .processed]# grep -i "85f08137-c61a-4b81-8264-23e9510a203a" *
Binary file archive_201601.tar matches
CHANGELOG.1453329693:E 85f08137-c61a-4b81-8264-23e9510a203a MKNOD 33280 0 0 84eae93e-208b-4890-aa09-1ee223d0f313%2F56a00268%25%25S0KZN4J05C
[root@dhcp37-165 .processed]# 

Another Active Cold Tier node recorded the following in processed:
------------------------------------------------------------------

[root@dhcp37-155 .processed]# grep -i "85f08137-c61a-4b81-8264-23e9510a203a" *
Binary file archive_201601.tar matches
CHANGELOG.1453329789:E 85f08137-c61a-4b81-8264-23e9510a203a MKNOD 33280 0 0 b8bb1da7-1d1b-4600-982d-d6243d6e952b%2F56a002ca%25%25VLS8G8PZ11
[root@dhcp37-155 .processed]# 
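
To read the records above more easily, note that the number after CREATE or MKNOD appears to be the file mode in decimal: 33188 is 0o100644 (a regular file, rw-r--r--) and 33280 is 0o101000 (a regular file with only the sticky bit set). The last field is URL-encoded, so %2F is "/" and %25 is "%". The following is a minimal Python sketch of a decoder for the grep output lines shown in this report. The function parse_entry is hypothetical, and the field layout is an assumption inferred from these samples, not a full changelog parser.

```python
import stat
from urllib.parse import unquote


def parse_entry(line):
    """Parse one `E` (entry) record as printed by grep in this report.

    Only the CREATE, MKNOD and LINK shapes seen above are handled;
    any other line (for example `D` records) yields None.
    """
    # Drop the "CHANGELOG.<timestamp>:" prefix that grep adds.
    _source, _, record = line.partition(":")
    fields = record.split()
    if not fields or fields[0] != "E":
        return None
    gfid, op = fields[1], fields[2]
    # Assumption: the field after CREATE/MKNOD is the decimal file mode.
    mode = int(fields[3]) if op in ("CREATE", "MKNOD") else None
    # The last field is "<parent-gfid>/<basename>", URL-encoded.
    pargfid, _, basename = unquote(fields[-1]).partition("/")
    return {
        "gfid": gfid,
        "op": op,
        "mode": mode,
        "pargfid": pargfid,
        "basename": basename,
        # True when the permission bits are exactly the sticky bit (a T file).
        "sticky_only": mode is not None and stat.S_IMODE(mode) == stat.S_ISVTX,
    }
```

Applied to the cold tier records, every MKNOD with mode 33280 comes back with sticky_only set, including the ones that name the hardlink.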


The above test was done with quick-read disabled on the slave to avoid a client-side crash.

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-16.el7rhgs.x86_64


How reproducible:
=================

Always


Steps Carried Out:
==================
1. Create the master and slave clusters.
2. Create the master as a tiered volume (HT: 2x2, CT: 2x(4+2)).
3. Create the slave volume (2x2) and disable quick-read.
4. Create a geo-replication session between master and slave.
5. Mount the master volume.
6. Create a set of data on the master volume; after a while, promotion and demotion should start.
7. Create hardlinks of the files.
8. Check that the hardlink entries are created on the slave.
9. Check the permissions of the hardlink files synced to the slave.
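
Steps 8 and 9 can be checked by hand with ls -l, as in the description. For larger data sets, a small script may be easier. The following is a minimal Python sketch that compares the regular files in one directory of the master mount against the same directory on the slave mount. The function compare_dirs is hypothetical, and it assumes both volumes are mounted on the machine where it runs.

```python
import os
import stat


def compare_dirs(master_dir, slave_dir):
    """Return (name, problem) tuples for regular files that differ on the slave."""
    problems = []
    for name in sorted(os.listdir(master_dir)):
        m = os.lstat(os.path.join(master_dir, name))
        if not stat.S_ISREG(m.st_mode):
            continue  # this sketch only looks at regular files
        try:
            s = os.lstat(os.path.join(slave_dir, name))
        except FileNotFoundError:
            problems.append((name, "missing on slave"))
            continue
        # A T file shows up as both a mode mismatch and a size mismatch.
        if stat.S_IMODE(m.st_mode) != stat.S_IMODE(s.st_mode):
            problems.append((name, "mode differs"))
        if m.st_size != s.st_size:
            problems.append((name, "size differs"))
    return problems
```

With the paths from the description, compare_dirs("/mnt/vol0/thread0/level00/hardlink_to_files", "/mnt/vol1/thread0/level00/hardlink_to_files") would be expected to flag the file that was synced as a T file.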

Actual results:
===============

Many hardlink files are synced to the slave as T files.


Expected results:
=================

Files should be synced as regular link files

--- Additional comment from Rahul Hinduja on 2016-01-21 08:40:36 EST ---

sosreports are @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1300682/

Master:
=======
[root@dhcp37-165 ~]# gluster volume info vol0
 
Volume Name: vol0
Type: Tier
Volume ID: 7e7f42ef-350f-4d7c-bd2b-9c8f5ea8b647
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.158:/rhs/brick3/master_tier3
Brick2: 10.70.37.160:/rhs/brick3/master_tier2
Brick3: 10.70.37.133:/rhs/brick3/master_tier1
Brick4: 10.70.37.165:/rhs/brick3/master_tier0
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.165:/rhs/brick1/master_brick0
Brick6: 10.70.37.133:/rhs/brick1/master_brick1
Brick7: 10.70.37.160:/rhs/brick1/master_brick2
Brick8: 10.70.37.158:/rhs/brick1/master_brick3
Brick9: 10.70.37.110:/rhs/brick1/master_brick4
Brick10: 10.70.37.155:/rhs/brick1/master_brick5
Brick11: 10.70.37.165:/rhs/brick2/master_brick6
Brick12: 10.70.37.133:/rhs/brick2/master_brick7
Brick13: 10.70.37.160:/rhs/brick2/master_brick8
Brick14: 10.70.37.158:/rhs/brick2/master_brick9
Brick15: 10.70.37.110:/rhs/brick2/master_brick10
Brick16: 10.70.37.155:/rhs/brick2/master_brick11
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
cluster.watermark-hi: 50
cluster.watermark-low: 5
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@dhcp37-165 ~]# 


Slave:
======
[root@dhcp37-99 ~]# gluster volume info 
 
Volume Name: vol1
Type: Distributed-Replicate
Volume ID: ee540d43-a2f4-45af-b1ab-a44bdd12a571
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.37.99:/rhs/brick1/ct-b1
Brick2: 10.70.37.88:/rhs/brick1/ct-b2
Brick3: 10.70.37.112:/rhs/brick1/ct-b3
Brick4: 10.70.37.199:/rhs/brick1/ct-b4
Brick5: 10.70.37.162:/rhs/brick1/ct-b5
Brick6: 10.70.37.87:/rhs/brick1/ct-b6
Brick7: 10.70.37.99:/rhs/brick2/ct-b7
Brick8: 10.70.37.88:/rhs/brick2/ct-b8
Options Reconfigured:
performance.quick-read: off
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@dhcp37-99 ~]#

--- Additional comment from RHEL Product and Program Management on 2016-01-21 08:54:05 EST ---

This request has been proposed as a blocker, but a release flag has
not been requested. Please set a release flag to ? to ensure we may
track this bug against the appropriate upcoming release, and reset
the blocker flag to ?.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-21 18:05:07 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Aravinda VK on 2016-01-22 04:23:18 EST ---

RCA:

Because tiering records an internal MKNOD, hardlinks are also recorded as MKNOD. Geo-replication therefore creates each hardlink on the slave as a new file instead of as a hardlink. The new file has the same GFID as the original file but is not hardlinked to it. When the sync happens through rsync, data may be copied to the original file alone.

--- Additional comment from Vijay Bellur on 2016-01-22 06:38:14 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#2) for review on master by Saravanakumar Arumugam (sarumuga@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-25 01:05:16 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#3) for review on master by Saravanakumar Arumugam (sarumuga@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-27 07:40:26 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#4) for review on master by Saravanakumar Arumugam (sarumuga@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-28 03:45:40 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#5) for review on master by Saravanakumar Arumugam (sarumuga@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-28 04:37:39 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#6) for review on master by Aravinda VK (avishwan@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-28 08:22:48 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#7) for review on master by Venky Shankar (vshankar@redhat.com)

--- Additional comment from Vijay Bellur on 2016-01-29 00:42:29 EST ---

REVIEW: http://review.gluster.org/13281 (geo-rep: Handle hardlink in Tiering based volume) posted (#8) for review on master by Saravanakumar Arumugam (sarumuga@redhat.com)
Comment 1 Vijay Bellur 2016-01-29 03:23:51 EST
REVIEW: http://review.gluster.org/13315 (geo-rep: Handle hardlink in Tiering based volume) posted (#1) for review on release-3.7 by Saravanakumar Arumugam (sarumuga@redhat.com)
Comment 2 Vijay Bellur 2016-02-02 05:37:22 EST
REVIEW: http://review.gluster.org/13315 (geo-rep: Handle hardlink in Tiering based volume) posted (#2) for review on release-3.7 by Saravanakumar Arumugam (sarumuga@redhat.com)
Comment 3 Vijay Bellur 2016-02-15 03:08:36 EST
REVIEW: http://review.gluster.org/13315 (geo-rep: Handle hardlink in Tiering based volume) posted (#3) for review on release-3.7 by Saravanakumar Arumugam (sarumuga@redhat.com)
Comment 4 Vijay Bellur 2016-02-17 04:06:51 EST
REVIEW: http://review.gluster.org/13315 (geo-rep: Handle hardlink in Tiering based volume) posted (#4) for review on release-3.7 by Saravanakumar Arumugam (sarumuga@redhat.com)
Comment 5 Vijay Bellur 2016-03-07 22:13:26 EST
COMMIT: http://review.gluster.org/13315 committed in release-3.7 by Vijay Bellur (vbellur@redhat.com) 
------
commit 478203fd9447cbc67c7cdc2980d6bdf4881984bf
Author: Saravanakumar Arumugam <sarumuga@redhat.com>
Date:   Fri Jan 22 16:58:13 2016 +0530

    geo-rep: Handle hardlink in Tiering based volume
    
    Problem:
    Hardlinks are synced as Sticky bit files to Slave in
    a Tiering based volume.
    In a Tiering based volume, cold tier is hashed subvolume
    and geo-rep captures all namespace operations in cold tier.
    
    While syncing a file and its corresponding hardlink, it is
    recorded as MKNOD in cold tier(for both) and
    We end up creating two different files in Slave.
    
    Solution:
    If MKNOD with Sticky bit set is present, record it as LINK.
    This way it will create a HARDLINK if source file exists (on slave),
    else it will create a new file.
    
    This way, Slave can create Hardlink file itself (instead
    of creating a new file) in case of hardlink.
    
    Change-Id: Ic50dc6e64df9ed01799c30539a33daace0abe6d4
    BUG: 1302979
    Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
    Signed-off-by: Aravinda VK <avishwan@redhat.com>
    Reviewed-on: http://review.gluster.org/13281
    Reviewed-on: http://review.gluster.org/13315
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
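
The solution described in the commit message above can be sketched roughly as follows. This is an illustrative Python model only, not the code from the patch: classify_entry and replay_link are hypothetical names, and the dictionary gfid_to_path is a stand-in for however the slave locates an existing file by GFID. The model runs against a local directory rather than a Gluster volume.

```python
import os
import stat


def classify_entry(op, mode):
    """Treat a MKNOD with the sticky bit set as a LINK; leave other operations alone."""
    if op == "MKNOD" and mode & stat.S_ISVTX:
        return "LINK"
    return op


def replay_link(gfid, target, gfid_to_path):
    """Model of the slave side: hardlink if the source exists, else create a new file.

    `gfid_to_path` is a hypothetical in-memory map from GFID to an existing path.
    """
    source = gfid_to_path.get(gfid)
    if source is not None and os.path.exists(source):
        os.link(source, target)  # the source file exists: create a real hardlink
        return "hardlink"
    # No source file yet: create a new empty file and remember it for later links.
    open(target, "x").close()
    gfid_to_path[gfid] = target
    return "new file"
```

In this model, the two cold tier MKNOD records for one GFID (mode 33280) both classify as LINK. The first replay creates the file and the second creates a hardlink to it, so the slave no longer ends up with two separate files.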
Comment 6 Kaushal 2016-04-19 03:21:27 EDT
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.9, please open a new bug report.

glusterfs-3.7.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
