Bug 1223299 - Data Tiering: Frequency counters not working
Summary: Data Tiering: Frequency counters not working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Bug Updates Notification Mailing List
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks: 1202842 1223636
 
Reported: 2015-05-20 09:43 UTC by Nag Pavan Chilakam
Modified: 2016-09-17 15:39 UTC
CC List: 4 users

Fixed In Version: 3.7.1-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-29 04:44:12 UTC
Embargoed:




Links:
Red Hat Product Errata RHSA-2015:1495 (SHIPPED_LIVE): Important: Red Hat Gluster Storage 3.1 update, last updated 2015-07-29 08:26:26 UTC

Description Nag Pavan Chilakam 2015-05-20 09:43:54 UTC
Description of problem:
========================
Frequency counters are not working on the latest downstream build.
I have set the following counters:
Options Reconfigured:
cluster.tier-demote-frequency: 10
features.record-counters: on
features.ctr-enabled: on

This means all files in the hot tier should be demoted to the cold tier after 10 seconds, but this did not happen even after an hour.
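A minimal sketch of how the expected behavior can be observed, assuming the brick paths from the setup transcript below (adjust for your own layout):

# With cluster.tier-demote-frequency set to 10, files written to the hot tier
# should reappear on a cold brick within roughly one demote cycle.
for i in $(seq 1 12); do
    date
    ls /ssdbricks_75G_1/vol1                    # hot-tier brick
    ls /brick_200G_1/vol1 /brick_200G_2/vol1    # cold-tier bricks
    sleep 10
done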


Version-Release number of selected component (if applicable):

[root@zod ~]# gluster --version
glusterfs 3.7.0 built on May 15 2015 01:33:40
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@zod ~]# rpm -qa|grep gluster
glusterfs-debuginfo-3.7.0-2.el7rhs.x86_64
glusterfs-geo-replication-3.7.0-2.el7rhs.x86_64
glusterfs-client-xlators-3.7.0-2.el7rhs.x86_64
glusterfs-cli-3.7.0-2.el7rhs.x86_64
glusterfs-libs-3.7.0-2.el7rhs.x86_64
glusterfs-api-3.7.0-2.el7rhs.x86_64
glusterfs-server-3.7.0-2.el7rhs.x86_64
glusterfs-resource-agents-3.7.0-2.el7rhs.noarch
glusterfs-rdma-3.7.0-2.el7rhs.x86_64
glusterfs-devel-3.7.0-2.el7rhs.x86_64
glusterfs-api-devel-3.7.0-2.el7rhs.x86_64
glusterfs-3.7.0-2.el7rhs.x86_64
glusterfs-fuse-3.7.0-2.el7rhs.x86_64
[root@zod ~]# 


Steps to Reproduce:
1. Create a tiered volume.
2. Add files; they will initially land on the hot tier.
3. Set the frequency options; with the demote frequency set, the data should move to the cold tier (a consolidated sketch of these steps follows).
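A consolidated, hedged sketch of the reproduction, using hypothetical hostnames (server1, server2) and brick paths in place of the real ones from the transcript below:

# Hypothetical names; mirrors the commands actually run below.
gluster volume create vol1 replica 2 \
    server1:/bricks/cold1 server2:/bricks/cold1 \
    server1:/bricks/cold2 server2:/bricks/cold2 force
gluster volume start vol1
gluster volume attach-tier vol1 server1:/bricks/hot1 server2:/bricks/hot1
# Write files through a client mount so they land on the hot tier, then:
gluster volume set vol1 features.ctr-enabled on
gluster volume set vol1 features.record-counters on
gluster volume set vol1 cluster.tier-demote-frequency 10   # integer seconds only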





[root@zod yum.repos.d]# gluster v create vol1 replica 2 10.70.35.144:/brick_200G_1/vol1 yarrow:/brick_200G_1/vol1 10.70.35.144:/brick_200G_2/vol1 yarrow:/brick_200G_2/vol1 force
volume create: vol1: success: please start the volume to access data
[root@zod yum.repos.d]# gluster v start vol1
gluster volume start: vol1: success
[root@zod yum.repos.d]# gluster v attach-tier vol1  10.70.35.144:/ssdbricks_75G_1/vol1 yarrow:/ssdbricks_75G_1/vol1
Attach tier is recommended only for testing purposes in this release. Do you want to continue? (y/n) y

volume attach-tier: success
volume rebalance: vol1: success: Rebalance on vol1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 9dd4b419-4436-493f-9553-c39cc52a6582

[root@zod yum.repos.d]# gluster v info vol1
 
Volume Name: vol1
Type: Tier
Volume ID: 23dba7de-a94a-49c2-80f1-6d97b0ab1309
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: yarrow:/ssdbricks_75G_1/vol1
Brick2: 10.70.35.144:/ssdbricks_75G_1/vol1
Cold Bricks:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: 10.70.35.144:/brick_200G_1/vol1
Brick4: yarrow:/brick_200G_1/vol1
Brick5: 10.70.35.144:/brick_200G_2/vol1
Brick6: yarrow:/brick_200G_2/vol1
Options Reconfigured:
performance.readdir-ahead: on
[root@zod yum.repos.d]# ls /brick_200G_1/vol1
[root@zod yum.repos.d]# ls /*br*/vol1
/brick_200G_1/vol1:

/brick_200G_2/vol1:

/ssdbricks_75G_1/vol1:
f4  f6  f7  f8  f9
[root@zod yum.repos.d]# rpm -qa|grep sql
sqlite-devel-3.7.17-4.el7.x86_64
sqlite-3.7.17-4.el7.x86_64
[root@zod yum.repos.d]# gluster v set vol1 features.ctr-enabled on
volume set: success
[root@zod yum.repos.d]#  gluster v set vol1  features.record-counters on
volume set: success
[root@zod yum.repos.d]#  gluster v set vol1 cluster.tier-demote-frequency 10s
volume set: failed: invalid number format "10s" in option "tier-demote-frequency"
[root@zod yum.repos.d]#  gluster v set vol1 cluster.tier-demote-frequency 10
volume set: success
[root@zod yum.repos.d]# date
Wed May 20 14:04:31 IST 2015
[root@zod yum.repos.d]# date
Wed May 20 14:06:33 IST 2015
[root@zod yum.repos.d]# date
Wed May 20 14:06:46 IST 2015
[root@zod yum.repos.d]#  ls /*br*/vol1
/brick_200G_1/vol1:

/brick_200G_2/vol1:

/ssdbricks_75G_1/vol1:
f4  f6  f7  f8  f9
[root@zod yum.repos.d]# gluster v info
 
Volume Name: vol1
Type: Tier
Volume ID: 23dba7de-a94a-49c2-80f1-6d97b0ab1309
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: yarrow:/ssdbricks_75G_1/vol1
Brick2: 10.70.35.144:/ssdbricks_75G_1/vol1
Cold Bricks:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: 10.70.35.144:/brick_200G_1/vol1
Brick4: yarrow:/brick_200G_1/vol1
Brick5: 10.70.35.144:/brick_200G_2/vol1
Brick6: yarrow:/brick_200G_2/vol1
Options Reconfigured:
cluster.tier-demote-frequency: 10
features.record-counters: on
features.ctr-enabled: on
performance.readdir-ahead: on
[root@zod yum.repos.d]#

Comment 2 Joseph Elwin Fernandes 2015-06-10 08:44:08 UTC
Please set cluster.tier-promote-frequency to 0
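For example, on the volume under test (the volume name is a placeholder):

gluster volume set vol1 cluster.tier-promote-frequency 0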

Comment 3 Triveni Rao 2015-06-12 04:10:33 UTC
This bug has been verified; no issues were found:


[root@rhsqa14-vm1 ~]# gluster v create moon replica 2 10.70.47.165:/rhs/brick1/m0 10.70.47.163:/rhs/brick1/m0 10.70.47.165:/rhs/brick2/m0 10.70.47.163:/rhs/brick2/m0
volume create: moon: success: please start the volume to access data
[root@rhsqa14-vm1 ~]# gluster v start moon
volume start: moon: success
[root@rhsqa14-vm1 ~]# gluster v info

Volume Name: moon
Type: Distributed-Replicate
Volume ID: 1c6bd29b-2168-4b69-bedd-8b26a89a6eb9
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp   
Bricks:
Brick1: 10.70.47.165:/rhs/brick1/m0
Brick2: 10.70.47.163:/rhs/brick1/m0
Brick3: 10.70.47.165:/rhs/brick2/m0
Brick4: 10.70.47.163:/rhs/brick2/m0
Options Reconfigured: 
performance.readdir-ahead: on
[root@rhsqa14-vm1 ~]# 

[root@rhsqa14-vm1 ~]# gluster v set moon cluster.tier-demote-frequency 20
volume set: success   
[root@rhsqa14-vm1 ~]# gluster v set moon features.record-counters on
volume set: success   
[root@rhsqa14-vm1 ~]# gluster v set moon features.ctr-enabled on
volume set: success   
[root@rhsqa14-vm1 ~]#

[root@rhsqa14-vm1 ~]# gluster v info

Volume Name: moon
Type: Distributed-Replicate
Volume ID: 1c6bd29b-2168-4b69-bedd-8b26a89a6eb9
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.165:/rhs/brick1/m0
Brick2: 10.70.47.163:/rhs/brick1/m0
Brick3: 10.70.47.165:/rhs/brick2/m0
Brick4: 10.70.47.163:/rhs/brick2/m0
Options Reconfigured:
features.ctr-enabled: on
features.record-counters: on
cluster.tier-demote-frequency: 20
performance.readdir-ahead: on
[root@rhsqa14-vm1 ~]#
[root@rhsqa14-vm1 ~]# gluster v attach-tier moon replica 2 10.70.47.165:/rhs/brick3/m0 10.70.47.163:/rhs/brick3/m0
Attach tier is recommended only for testing purposes in this release. Do you want to continue? (y/n) y
volume attach-tier: success
volume rebalance: moon: success: Rebalance on moon has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 07b230dd-3bf4-4c18-b21e-f50cb2254d29

[root@rhsqa14-vm1 ~]# gluster v info

Volume Name: moon
Type: Tier
Volume ID: 1c6bd29b-2168-4b69-bedd-8b26a89a6eb9
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: 10.70.47.163:/rhs/brick3/m0
Brick2: 10.70.47.165:/rhs/brick3/m0
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: 10.70.47.165:/rhs/brick1/m0
Brick4: 10.70.47.163:/rhs/brick1/m0
Brick5: 10.70.47.165:/rhs/brick2/m0
Brick6: 10.70.47.163:/rhs/brick2/m0
Options Reconfigured:
features.ctr-enabled: on
features.record-counters: on
cluster.tier-demote-frequency: 20
performance.readdir-ahead: on
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick1
total 4
drwxr-xr-x. 3 root root   15 Jun 11 23:55 .
drwxr-xr-x. 8 root root 4096 Jun 10 06:37 ..
drwxr-xr-x. 4 root root   78 Jun 12 00:00 m0
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick1/m0
total 20
drwxr-xr-x. 4 root root    78 Jun 12 00:00 .
drwxr-xr-x. 3 root root    15 Jun 11 23:55 ..
-rwxr-xr-x. 2 root root   244 Jun 12 00:00 fw.stop
drw-------. 8 root root   110 Jun 12 00:00 .glusterfs
-rwxr-xr-x. 2 root root 13496 Jun 12 00:00 rhs-system-init.sh
drwxr-xr-x. 3 root root    24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick2/m0
total 4
drwxr-xr-x. 4 root root  56 Jun 12 00:00 .
drwxr-xr-x. 3 root root  15 Jun 11 23:55 ..
drw-------. 7 root root 101 Jun 12 00:00 .glusterfs
-rwxr-xr-x. 2 root root 187 Jun 12 00:00 options.sh
drwxr-xr-x. 3 root root  24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick3/m0
total 0
drwxr-xr-x. 4 root root  39 Jun 12 00:00 .
drwxr-xr-x. 3 root root  15 Jun 11 23:58 ..
drw-------. 9 root root 119 Jun 12 00:00 .glusterfs
drwxr-xr-x. 3 root root  24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# 


[root@rhsqa14-vm1 ~]# ls -la /rhs/brick1/m0
total 20
drwxr-xr-x. 4 root root    78 Jun 12 00:00 .
drwxr-xr-x. 3 root root    15 Jun 11 23:55 ..
-rwxr-xr-x. 2 root root   244 Jun 12 00:00 fw.stop
drw-------. 8 root root   110 Jun 12 00:00 .glusterfs
-rwxr-xr-x. 2 root root 13496 Jun 12 00:00 rhs-system-init.sh
drwxr-xr-x. 3 root root    24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick3/m0
total 0
drwxr-xr-x.  4 root root  53 Jun 12 00:03 .
drwxr-xr-x.  3 root root  15 Jun 11 23:58 ..
drw-------. 10 root root 128 Jun 12 00:03 .glusterfs
drwxr-xr-x.  3 root root  24 Jun 11 23:58 .trashcan
-rw-r--r--.  2 root root   0 Jun 12 00:03 triveni
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick2/m0
total 4
drwxr-xr-x. 4 root root  56 Jun 12 00:00 .
drwxr-xr-x. 3 root root  15 Jun 11 23:55 ..
drw-------. 7 root root 101 Jun 12 00:00 .glusterfs
-rwxr-xr-x. 2 root root 187 Jun 12 00:00 options.sh
drwxr-xr-x. 3 root root  24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# date
Fri Jun 12 00:04:12 EDT 2015
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick1/m0
total 20
drwxr-xr-x. 4 root root    78 Jun 12 00:00 .
drwxr-xr-x. 3 root root    15 Jun 11 23:55 ..
-rwxr-xr-x. 2 root root   244 Jun 12 00:00 fw.stop
drw-------. 8 root root   110 Jun 12 00:00 .glusterfs
-rwxr-xr-x. 2 root root 13496 Jun 12 00:00 rhs-system-init.sh
drwxr-xr-x. 3 root root    24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick3/m0
total 0
drwxr-xr-x.  4 root root  39 Jun 12 00:04 .
drwxr-xr-x.  3 root root  15 Jun 11 23:58 ..
drw-------. 10 root root 128 Jun 12 00:04 .glusterfs
drwxr-xr-x.  3 root root  24 Jun 11 23:58 .trashcan
[root@rhsqa14-vm1 ~]# ls -la /rhs/brick2/m0
total 4
drwxr-xr-x. 4 root root  70 Jun 12 00:04 .
drwxr-xr-x. 3 root root  15 Jun 11 23:55 ..
drw-------. 8 root root 110 Jun 12 00:04 .glusterfs
-rwxr-xr-x. 2 root root 187 Jun 12 00:00 options.sh
drwxr-xr-x. 3 root root  24 Jun 11 23:58 .trashcan
-rw-r--r--. 2 root root   0 Jun 12 00:03 triveni
[root@rhsqa14-vm1 ~]#

Comment 4 Triveni Rao 2015-06-12 11:13:38 UTC
[root@rhsqa14-vm1 ~]# glusterfs --version
glusterfs 3.7.1 built on Jun  9 2015 02:31:54
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root@rhsqa14-vm1 ~]# rpm -qa | grep gluster
glusterfs-3.7.1-1.el6rhs.x86_64
glusterfs-cli-3.7.1-1.el6rhs.x86_64
glusterfs-libs-3.7.1-1.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-1.el6rhs.x86_64
glusterfs-fuse-3.7.1-1.el6rhs.x86_64
glusterfs-server-3.7.1-1.el6rhs.x86_64
glusterfs-api-3.7.1-1.el6rhs.x86_64
[root@rhsqa14-vm1 ~]#

Comment 5 Nag Pavan Chilakam 2015-06-15 12:20:38 UTC
The heat counters are not working as expected.
The files are only getting hashed: a link-to file (mode ---------T) is created on the target tier.
When a promote/demote happens, the file data itself is not moved.
Also, the tier rebalance status does not show any files being moved.
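The ---------T entries in the listings below are DHT link-to (pointer) files rather than real copies. A hedged way to confirm this on a brick, assuming the attr tools are installed on the brick host:

# A link-to file carries the trusted.glusterfs.dht.linkto xattr, naming the
# subvolume that actually holds the data; a real data file does not.
getfattr -e text -n trusted.glusterfs.dht.linkto /rhs/brick1/ecvol/newfile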

1) Created a regular 4+2 EC (disperse) volume, started it, and mounted it over both NFS and CIFS.
2) Pumped I/O to it.
3) Attached a 3-way replicated hot tier.
4) Set the frequency counters as below:
[root@rhsqa14-vm4 ~]# gluster v info ecvol
 
Volume Name: ecvol
Type: Tier
Volume ID: f25eecf7-a83e-478c-b73d-1d954d7a78fe
Status: Started
Number of Bricks: 12
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 3 = 6
Brick1: 10.70.46.2:/rhs/brick6/ecvol
Brick2: 10.70.47.159:/rhs/brick6/ecvol
Brick3: 10.70.46.2:/rhs/brick5/ecvol
Brick4: 10.70.47.159:/rhs/brick5/ecvol
Brick5: 10.70.46.2:/rhs/brick4/ecvol
Brick6: 10.70.47.159:/rhs/brick4/ecvol
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick7: 10.70.47.159:/rhs/brick1/ecvol
Brick8: 10.70.46.2:/rhs/brick1/ecvol
Brick9: 10.70.47.159:/rhs/brick2/ecvol
Brick10: 10.70.46.2:/rhs/brick2/ecvol
Brick11: 10.70.46.2:/rhs/brick3/ecvol
Brick12: 10.70.47.159:/rhs/brick3/ecvol
Options Reconfigured:
cluster.tier-promote-frequency: 6
cluster.tier-demote-frequency: 5
features.record-counters: on
features.ctr-enabled: on
performance.readdir-ahead: on



Now I created a file "newfile", which went to the hot tier. After about 6 seconds the file was hashed to the cold tier (as part of the demote) but was never completely moved there.
The tiering rebalance status also shows zero files rebalanced.
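A hedged way to cross-check this from the CLI, assuming the standard rebalance status query reports the tier migration counters on this build:

# Per-node file counts for the tier/rebalance process; here it reported zero.
gluster volume rebalance ecvol status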



[root@rhsqa14-vm4 ~]# ls -l /rhs/brick*/ecvol/*file*
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick1/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick2/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick3/ecvol/newfile
-rw-r-Sr-T. 2 root root 0 Jun 15 06:38 /rhs/brick4/ecvol/newfile
-rw-r--r--. 2 root root 0 Jun 15  2015 /rhs/brick5/ecvol/xfs_file
-rw-r--r--. 2 root root 0 Jun 15  2015 /rhs/brick6/ecvol/xfs_file
[root@rhsqa14-vm4 ~]# ls -l /rhs/brick*/ecvol/*file*
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick1/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick2/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick3/ecvol/newfile
-rw-r-Sr-T. 2 root root 0 Jun 15 06:41 /rhs/brick4/ecvol/newfile
-rw-r--r--. 2 root root 0 Jun 15  2015 /rhs/brick5/ecvol/xfs_file
-rw-r--r--. 2 root root 0 Jun 15  2015 /rhs/brick6/ecvol/xfs_file
[root@rhsqa14-vm4 ~]# ls -l /rhs/brick*/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick1/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick2/ecvol/newfile
---------T. 2 root root 0 Jun 15 06:39 /rhs/brick3/ecvol/newfile
-rw-r-Sr-T. 2 root root 0 Jun 15 06:41 /rhs/brick4/ecvol/newfile

Comment 8 errata-xmlrpc 2015-07-29 04:44:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

