Red Hat Bugzilla – Bug 1283961
Data Tiering: Change the default tiering values to optimize tiering settings
Last modified: 2016-09-17 11:35:34 EDT
Description of problem:
Currently, tiering exposes many options: promote/demote frequency, read/write frequency thresholds, max files, max size, and the high and low watermarks.
The current default values for these parameters are unlikely to match what customers would actually use.
We need to set these defaults close to a typical user's desired settings.
E.g., the demote/promote frequency is currently 120 s, which would be too aggressive; no real-world user would want a file demoted that fast.
Hence we need these values to be corrected:
Following are the current default values:
[root@zod distrep]# gluster v get olala all|grep tier
[root@zod distrep]# gluster v get olala all|grep ctr
[root@zod distrep]# gluster v get olala all|grep thres
Version-Release number of selected component (if applicable):
[root@zod distrep]# rpm -qa|grep gluster|grep server
Most of these tiering tunable parameters are migration-related. Migration had problems in the early builds; e.g. see https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c2.
glusterfs*-3.7.5-14.el7.x86_64 is showing better migration behaviour. Migration speed and impact on application I/O is currently being analyzed with the 3.7.5-14 build. Based on those results we can revisit the migration-related tunable parameters.
I had a meeting with the tiering team where I suggested the following settings for the tiering parameter default values, instead of the current values:
I'll add a more detailed comment on the discussions and the rationale for these values in a bit.
Explanation for comment #4:
I was able to complete only some of the migration tests referred
to in comment #3. These are fairly complex tests, and migration in
gluster-tier is probabilistic in some cases, so it is not yet clear
whether this is a problem with migration functionality. The
suggestions here are therefore based on reasoning, not on actual
test results (which is also true of the current values). I have
elaborated on the reasoning behind these suggested values. The
tiering team will have another round of discussion among
themselves, and may alter or ignore these suggested values.
There is also a need to revisit migration and migration-related
parameters in future releases to allow more control particularly
between promotion and demotion. Currently, the same parameters
e.g. max-files, read/write-freq-threshold are used to control
both promotion and demotion.
Since this is late in the 3.1.2 release cycle, we want to keep
changes to a minimum.
The changes to cluster.tier-max-mb and cluster.tier-max-files are
intended to mitigate problems as reported in bz #1290667, where
migration of files selected as candidates in one cycle is not
completed within that cycle. The appropriate values for these
parameters will depend on the particular configuration and how
fast migration happens on that configuration. But in general,
migration in this release of gluster-tier is slow, and the
default values have been lowered to account for that.
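To illustrate the role of these caps, here is a minimal Python sketch of how per-cycle limits like cluster.tier-max-mb and cluster.tier-max-files bound the migration work attempted in one cycle. This is not Gluster source code; the function name, candidate list, and cap values are all hypothetical.

```python
# Hedged sketch (not Gluster internals): per-cycle migration caps.
# A lower cap keeps each cycle's workload small enough to finish
# in time on a configuration where migration is slow.
def select_for_migration(candidates, max_files, max_mb):
    """Take (name, size_mb) candidates in order until either cap is reached."""
    chosen, total_mb = [], 0
    for name, size_mb in candidates:
        if len(chosen) >= max_files or total_mb + size_mb > max_mb:
            break  # cap reached; remaining files wait for a later cycle
        chosen.append(name)
        total_mb += size_mb
    return chosen

candidates = [("f1", 100), ("f2", 400), ("f3", 800), ("f4", 50)]
print(select_for_migration(candidates, max_files=10, max_mb=1000))  # ['f1', 'f2']
```

With lower defaults, the candidate list selected in a cycle is more likely to be fully migrated before the next cycle starts, which is the failure mode bz #1290667 describes.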
The other change suggested in comment #4 was increasing the
demote-frequency to an hour. Currently, promote-frequency and
demote-frequency are both set to 120, i.e. 2 min, but they work differently.
Candidates for promotion are all files on the cold tier that were
accessed (enough times to meet the threshold) in the last
migration cycle, which will be a smaller set with a smaller
promote-frequency value; in contrast, candidates for demotion are
files on the hot tier that have _not_ been accessed in the last
cycle, which will be larger with a smaller demote-frequency
value. With the current demote-frequency of 2 min, the list of
candidate files to be demoted could be huge, resulting in too
many files getting demoted too soon.
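The asymmetry described above can be sketched in a few lines of Python. This is illustrative only, not Gluster's actual scanner; the file names and timestamps are made up.

```python
# Illustrative sketch of the promotion/demotion asymmetry (not Gluster code).
def promotion_candidates(cold_access_counts, threshold):
    """Cold-tier files accessed at least `threshold` times in the last cycle."""
    return {f for f, hits in cold_access_counts.items() if hits >= threshold}

def demotion_candidates(hot_last_access, now, cycle_seconds):
    """Hot-tier files NOT accessed within the last `cycle_seconds`."""
    return {f for f, t in hot_last_access.items() if now - t > cycle_seconds}

now = 10_000  # current time, seconds
hot = {"a": 9_950, "b": 9_000, "c": 5_000, "d": 9_990}  # last-access times

two_min = demotion_candidates(hot, now, 120)    # short cycle: big candidate set
one_hour = demotion_candidates(hot, now, 3600)  # long cycle: only long-idle files
assert one_hour <= two_min  # a longer demote-frequency can only shrink the set
print(sorted(two_min), sorted(one_hour))  # ['b', 'c'] ['c']
```

A shorter demote-frequency widens the "not accessed in the last cycle" window, so raising the default to an hour keeps the demotion set to genuinely idle files.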
Tested with 3.7.5-17, with the default values of cluster.tier-demote-frequency, cluster.tier-max-mb, and cluster.tier-max-files changed as mentioned in comment #4, so marking this bug as verified.
[root@dhcp35-231 ~]# rpm -qa | grep glusterfs
[root@dhcp35-231 ~]# gluster vol get delete all | grep cluster.tier-demote-frequency
[root@dhcp35-231 ~]# gluster vol get delete all | grep cluster.tier-max-mb
[root@dhcp35-231 ~]# gluster vol get delete all | grep cluster.tier-max-files
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.