Bug 1276141

Summary: Data Tiering: Tiering daemon is seeing each part of a file in a Disperse cold volume as a different file
Product: [Community] GlusterFS Reporter: Dan Lambright <dlambrig>
Component: tiering Assignee: Joseph Elwin Fernandes <josferna>
Status: CLOSED CURRENTRELEASE QA Contact: bugs <bugs>
Severity: medium Docs Contact:
Priority: high    
Version: mainline CC: bugs, dlambrig, josferna, nchilaka, sankarshan
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8rc2 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1262860 Environment:
Last Closed: 2016-06-16 13:41:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1262860, 1278389    
Bug Blocks:    

Description Dan Lambright 2015-10-28 20:40:09 UTC
+++ This bug was initially created as a clone of Bug #1262860 +++

Description of problem:
========================
On an EC cold volume, when files are promoted to or demoted from the hot tier,
the tier daemon appears to treat each copy or fragment of a file as a separate file. At least, that is what the counters show.

I had 3 files on a 2 x (4 + 2) = 12 EC cold volume.
When they were promoted to or demoted from a distrep hot tier,
the stats counted each file 6 times: 1 attempt was recorded as a success and the other 5 as failures.
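The arithmetic behind that description can be written out as a minimal sketch. It assumes one migration attempt per brick of the disperse set that holds a file, with exactly one attempt succeeding per file; the function name is illustrative and not part of GlusterFS. Real counters accumulate across promote/demote cycles, so they need not match these per-pass numbers.

```python
def expected_counts(num_files, data_bricks, redundancy_bricks):
    """Per-pass counts if every fragment of a file is treated as its own file.

    Assumption: each file has one fragment on every brick of its disperse
    set, and only one attempt per file is recorded as a success.
    """
    fragments_per_file = data_bricks + redundancy_bricks
    attempts = num_files * fragments_per_file
    successes = num_files
    failures = attempts - successes
    return attempts, successes, failures


# 3 files on a (4 + 2) disperse set, as in this report.
print(expected_counts(3, 4, 2))  # (18, 3, 15)
```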

Version-Release number of selected component (if applicable):
===========================================================
[root@zod glusterfs]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-api-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-debuginfo-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-server-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-cli-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-libs-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
[root@zod glusterfs]# gluster --version
glusterfs 3.7.4 built on Sep 12 2015 01:35:35
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@zod glusterfs]# 




Steps to Reproduce:
====================
1. Create an EC cold volume and attach a distrep hot tier.
2. Enable CTR, then set the promote/demote frequencies.
3. Create a file, wait for it to be demoted, and then get it promoted again.
Although it is a single file, the copy of the file on each brick is treated as a different file, and the promote/demote counters show them as failed.


Below is the case where I had 3 files (compare the promote/demote numbers with the failures)
====================================

[root@zod glusterfs]# gluster v rebal redhat status; gluster v tier redhat status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                9        0Bytes            31            22             0          in progress            1393.00
                                  yarrow                8        0Bytes            36            28             0          in progress            1393.00
volume rebalance: redhat: success: 

Node                 Promoted files       Demoted files        Status              
---------            ---------            ---------            ---------           
localhost            9                    0                    in progress         
yarrow               0                    8                    in progress         
volume rebalance: redhat: success: 


[root@zod glusterfs]# gluster v rebal redhat status; gluster v tier redhat status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                9        0Bytes            33            24             0          in progress            1701.00
                                  yarrow               10        0Bytes            38            28             0          in progress            1701.00
volume rebalance: redhat: success: 
Node                 Promoted files       Demoted files        Status              
---------            ---------            ---------            ---------           
localhost            9                    0                    in progress         
yarrow               0                    10                   in progress
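
The rebalance status rows pasted above can be compared mechanically. The following is a minimal sketch that assumes the column layout shown in this report; the regular expression and the function name are illustrative and not part of GlusterFS. In both snapshots above, each row's scanned count equals its rebalanced files plus its failures.

```python
import re

# One data row: node, rebalanced files, size, scanned, failures, skipped,
# status (may contain spaces, e.g. "in progress"), run time in seconds.
ROW = re.compile(
    r"^\s*(?P<node>\S+)\s+(?P<rebalanced>\d+)\s+(?P<size>\S+)\s+"
    r"(?P<scanned>\d+)\s+(?P<failures>\d+)\s+(?P<skipped>\d+)\s+"
    r"(?P<status>.+?)\s+(?P<runtime>\d+(?:\.\d+)?)\s*$"
)


def parse_rebalance_status(text):
    """Return one dict per data row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        row = m.groupdict()
        for key in ("rebalanced", "scanned", "failures", "skipped"):
            row[key] = int(row[key])
        row["runtime"] = float(row["runtime"])
        rows.append(row)
    return rows


# First snapshot from this report.
SAMPLE = """\
     Node Rebalanced-files     size  scanned  failures  skipped       status  run time in secs
---------      -----------  -------  -------  --------  -------  -----------  ----------------
localhost                9   0Bytes       31        22        0  in progress           1393.00
   yarrow                8   0Bytes       36        28        0  in progress           1393.00
volume rebalance: redhat: success:
"""

for row in parse_rebalance_status(SAMPLE):
    print(row["node"], row["rebalanced"], row["failures"], row["scanned"])
```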

--- Additional comment from nchilaka on 2015-09-14 09:42:37 EDT ---

volume rebalance: redhat: success: 
[root@zod glusterfs]# gluster v info redhat
 
Volume Name: redhat
Type: Tier
Volume ID: ec61f03a-b9c6-4a43-8aae-a1a3ca65e234
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/redhat_hot
Brick2: zod:/rhs/brick6/redhat_hot
Brick3: yarrow:/rhs/brick7/redhat_hot
Brick4: zod:/rhs/brick7/redhat_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/redhat
Brick6: yarrow:/rhs/brick1/redhat
Brick7: zod:/rhs/brick2/redhat
Brick8: yarrow:/rhs/brick2/redhat
Brick9: zod:/rhs/brick3/redhat
Brick10: yarrow:/rhs/brick3/redhat
Brick11: zod:/rhs/brick4/redhat
Brick12: yarrow:/rhs/brick4/redhat
Brick13: zod:/rhs/brick5/redhat
Brick14: yarrow:/rhs/brick5/redhat
Brick15: yarrow:/rhs/brick6/redhat
Brick16: zod:/rhs/brick6/redhat
Options Reconfigured:
cluster.tier-demote-frequency: 30
cluster.tier-promote-frequency: 50
features.ctr-enabled: on
performance.io-cache: off
performance.quick-read: off
performance.readdir-ahead: on

Comment 1 Dan Lambright 2015-10-28 20:41:56 UTC
Each erasure-coded chunk was being counted toward the promotion/demotion counters. Only the node responsible for migrating the file needs to bump the counter.
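
A toy model of the difference, offered as a sketch only: it is not the GlusterFS patch, and the Attempt record, its fields, and both function names are hypothetical. It contrasts counting every per-brick attempt with counting each file (keyed by GFID) once.

```python
from collections import namedtuple

# Hypothetical record of one migration attempt seen by the tier daemon.
Attempt = namedtuple("Attempt", ["gfid", "brick", "migrated"])


def count_per_fragment(attempts):
    """Buggy behaviour: every per-brick attempt bumps a counter."""
    ok = sum(1 for a in attempts if a.migrated)
    return ok, len(attempts) - ok


def count_per_file(attempts):
    """Intended behaviour: each file is counted once, keyed by GFID."""
    migrated_by_file = {}
    for a in attempts:
        migrated_by_file[a.gfid] = migrated_by_file.get(a.gfid, False) or a.migrated
    ok = sum(1 for done in migrated_by_file.values() if done)
    return ok, len(migrated_by_file) - ok


# 3 files, 6 bricks each; only the attempt on brick 0 performs the migration.
attempts = [
    Attempt(gfid=f"file-{f}", brick=b, migrated=(b == 0))
    for f in range(3)
    for b in range(6)
]
print(count_per_fragment(attempts))  # (3, 15)
print(count_per_file(attempts))      # (3, 0)
```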

Comment 2 Vijay Bellur 2015-10-29 06:47:27 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#1) for review on master by Joseph Fernandes

Comment 3 Vijay Bellur 2015-10-29 14:06:23 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#2) for review on master by Joseph Fernandes

Comment 4 Vijay Bellur 2015-10-29 14:08:27 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#3) for review on master by Joseph Fernandes

Comment 5 Vijay Bellur 2015-10-31 16:08:28 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#4) for review on master by Joseph Fernandes

Comment 6 Vijay Bellur 2015-10-31 17:34:15 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#5) for review on master by Joseph Fernandes

Comment 7 Vijay Bellur 2015-11-03 02:25:15 UTC
REVIEW: http://review.gluster.org/12453 (tier/dht: Ignoring replica for migration counting) posted (#6) for review on master by Dan Lambright (dlambrig)

Comment 8 Vijay Bellur 2015-11-26 08:14:29 UTC
REVIEW: http://review.gluster.org/12758 (tier/tier: Ignoring status of already migrated files) posted (#2) for review on master by Joseph Fernandes

Comment 9 Vijay Bellur 2015-11-26 08:21:01 UTC
REVIEW: http://review.gluster.org/12758 (tier/tier: Ignoring status of already migrated files) posted (#3) for review on master by Joseph Fernandes

Comment 10 Vijay Bellur 2015-11-26 08:22:27 UTC
REVIEW: http://review.gluster.org/12758 (tier/tier: Ignoring status of already migrated files) posted (#4) for review on master by Joseph Fernandes

Comment 11 Vijay Bellur 2015-12-04 04:52:47 UTC
COMMIT: http://review.gluster.org/12758 committed in master by Dan Lambright (dlambrig) 
------
commit be377d4bed954fc8cdbc515329882c1fd0f7ab37
Author: Joseph Fernandes <josferna>
Date:   Thu Nov 26 12:42:17 2015 +0530

    tier/tier: Ignoring status of already migrated files
    
    Ignore the status of already migrated files and in the
    process don't count.
    
    Change-Id: Idba6402508d51a4285ac96742c6edf797ee51b6a
    BUG: 1276141
    Signed-off-by: Joseph Fernandes <josferna>
    Reviewed-on: http://review.gluster.org/12758
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Dan Lambright <dlambrig>
    Tested-by: Dan Lambright <dlambrig>

Comment 12 Niels de Vos 2016-06-16 13:41:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user