Bug 1286346

Summary: Data Tiering: Don't allow or reset the frequency threshold values to zero when the record counter (features.record-counter) is turned off

Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Joseph Elwin Fernandes <josferna>
Component: tier
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: Sweta Anandpara <sanandpa>
Severity: urgent
Priority: urgent
Version: rhgs-3.1
Target Release: RHGS 3.1.2
CC: asrivast, dlambrig, josferna, nchilaka, rcyriac, rhs-bugs, sankarshan, storage-qa-internal
Keywords: Reopened, ZStream
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Clone Of: 1284067
Type: Bug
Last Closed: 2016-06-16 13:47:31 UTC
Bug Depends On: 1284067
Bug Blocks: 1260783, 1287560

Comment 1 Vijay Bellur 2015-11-28 11:42:46 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#1) for review on master by Joseph Fernandes

Comment 2 Vijay Bellur 2015-11-28 11:45:45 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#2) for review on master by Joseph Fernandes

Comment 3 Vijay Bellur 2015-11-28 12:03:59 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#3) for review on master by Joseph Fernandes

Comment 4 Vijay Bellur 2015-11-28 12:39:48 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#4) for review on master by Joseph Fernandes

Comment 5 Vijay Bellur 2015-11-28 14:22:58 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#5) for review on master by Joseph Fernandes

Comment 6 Vijay Bellur 2015-11-28 14:48:21 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#6) for review on master by Joseph Fernandes

Comment 7 Vijay Bellur 2015-11-29 14:39:20 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#7) for review on master by Dan Lambright (dlambrig)

Comment 8 Vijay Bellur 2015-11-29 15:00:31 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#8) for review on master by Dan Lambright (dlambrig)

Comment 9 Vijay Bellur 2015-12-01 16:26:38 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#9) for review on master by Dan Lambright (dlambrig)

Comment 10 Vijay Bellur 2015-12-01 18:47:05 UTC
REVIEW: http://review.gluster.org/12780 (tier/glusterd : Validation for frequency thresholds and record-counters) posted (#10) for review on master by Dan Lambright (dlambrig)

Comment 11 Vijay Bellur 2015-12-01 23:39:44 UTC
COMMIT: http://review.gluster.org/12780 committed in master by Dan Lambright (dlambrig) 
------
commit eea50e82365770dea56cdddca89cd6d3b8bb7a22
Author: Joseph Fernandes <josferna>
Date:   Sat Nov 28 17:03:41 2015 +0530

    tier/glusterd : Validation for frequency thresholds and record-counters
    
    1) If record-counters is set to off:
       check whether both frequency thresholds are non-zero; if so, print
       an error message and fail the volume set.
    2) If record-counters is set to on:
       check whether both frequency thresholds are zero; if so, print
       a note, but do not fail the volume set.
    3) If any of the frequency thresholds is set to a non-zero value,
       switch record-counters on, if not already on.
    4) If both frequency thresholds are set to zero,
       switch record-counters off, if not already off.
    
    NOTE: In this fix we have
    1) removed unnecessary ctr vol set options.
    2) changed ctr_hardlink_heal_expire_period to ctr_lookupheal_link_timeout.
    
    Change-Id: Ie7ccfd3f6e021056905a79de5a3d8f199312f315
    BUG: 1286346
    Signed-off-by: Joseph Fernandes <josferna>
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/12780
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
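
The rules in the commit message can be summarized with a rough sketch, for illustration only: every name here (tier_opts_t, validate_record_counters, set_freq_threshold) is hypothetical, not glusterd's actual code, which lives in xlators/mgmt/glusterd/src/glusterd-volume-set.c. Per the verification logs below, the "off" request is rejected when either threshold is non-zero.

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tier-related volume options. */
typedef struct {
        bool record_counters;
        int  write_freq_threshold;
        int  read_freq_threshold;
} tier_opts_t;

/* Rules 1 and 2: validate a request to toggle features.record-counters.
 * Returns 0 on success, -1 when the volume set must fail. */
static int
validate_record_counters (tier_opts_t *o, bool requested_on)
{
        if (!requested_on &&
            (o->write_freq_threshold != 0 || o->read_freq_threshold != 0)) {
                fprintf (stderr, "volume set: failed: set both frequency "
                         "thresholds to 0 first\n");
                return -1;
        }
        if (requested_on &&
            o->write_freq_threshold == 0 && o->read_freq_threshold == 0)
                /* Rule 2: succeed, but note that nothing is recorded yet. */
                printf ("note: both frequency thresholds are 0\n");

        o->record_counters = requested_on;
        return 0;
}

/* Rules 3 and 4: setting a threshold implicitly toggles the counters. */
static void
set_freq_threshold (tier_opts_t *o, bool is_write, int value)
{
        if (is_write)
                o->write_freq_threshold = value;
        else
                o->read_freq_threshold = value;

        o->record_counters = (o->write_freq_threshold != 0 ||
                              o->read_freq_threshold != 0);
}

Under this behaviour, the Case 4 sequence in comment 22 (setting cluster.read-freq-threshold to 100 while the counters are off) flips features.record-counters back on automatically.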

Comment 21 Rejy M Cyriac 2015-12-03 14:44:26 UTC
*** Bug 1284067 has been marked as a duplicate of this bug. ***

Comment 22 Sweta Anandpara 2015-12-08 15:11:03 UTC
Tested and verified this on the build glusterfs-3.7.5-9.el7rhgs.x86_64

The scenarios tested:

Case 1: If the read and write counters are set to a non-zero value, setting features.record-counters to 'off' fails with an error message, as expected.

Case 2: When the read and write counters are set to '0', features.record-counters is automatically set to 'off', as expected.

Case 3: If both the read/write counters are set to '0', setting features.record-counters to 'on' does *not* pop a note mentioning the same. The fix mentions that a note should be displayed.

Case 4: If either of the read/write counters is set to a non-zero value, features.record-counters is set to 'on', as expected.

Could you please check on case 3? It would be good to have a note displayed on the screen prompting the user to update the read/write counters.


Case 1 logs:
==============
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: 1
cluster.write-freq-threshold: 4
cluster.read-freq-threshold: 4
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# gluster v set nash features.record-counters 0
volume set: failed: Cannot set features.record-counters to "0" as cluster.write-freq-threshold is 4 and cluster.read-freq-threshold is 4. Please set both cluster.write-freq-threshold and  cluster.read-freq-threshold to 0, to set  features.record-counters to "0".
[root@dhcp37-55 ~]# gluster v set nash features.record-counters off
volume set: failed: Cannot set features.record-counters to "off" as cluster.write-freq-threshold is 4 and cluster.read-freq-threshold is 4. Please set both cluster.write-freq-threshold and  cluster.read-freq-threshold to 0, to set  features.record-counters to "off".
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 



Case 2 logs
=============

[root@dhcp37-55 ~]# gluster v set nash cluster.write-freq-threshold 0
volume set: success
[root@dhcp37-55 ~]# gluster v set nash cluster.read-freq-threshold 0
volume set: success
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: off
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 0
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 



Case 3 logs:
=============

[root@dhcp37-55 ~]# gluster v set nash features.record-counters on
volume set: success
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: on
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 0
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 


Case 4 logs
=============

[root@dhcp37-55 ~]# gluster v set nash features.record-counters off
volume set: success
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: off
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 0
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# gluster v set cluster.read-freq-threshold 100
Usage: volume set <VOLNAME> <KEY> <VALUE>
[root@dhcp37-55 ~]# gluster v set nash cluster.read-freq-threshold 100
volume set: success
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v get nash features.record-counters
Option                                  Value                                   
------                                  -----                                   
features.record-counters                on                                      
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: on
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 100
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]#
[root@dhcp37-55 ~]# gluster v set nash features.record-counters off
volume set: failed: Cannot set features.record-counters to "off" as cluster.write-freq-threshold is 0 and cluster.read-freq-threshold is 100. Please set both cluster.write-freq-threshold and  cluster.read-freq-threshold to 0, to set  features.record-counters to "off".
[root@dhcp37-55 ~]# gluster v set nash cluster.read-freq-threshold 0
volume set: success
[root@dhcp37-55 ~]# gluster v get  nash features.record-counters 
Option                                  Value                                   
------                                  -----                                   
features.record-counters                off                                     
[root@dhcp37-55 ~]# gluster v set nash cluster.write-freq-threshold 0x0
volume set: success
[root@dhcp37-55 ~]# gluster v get  nash features.record-counters 
Option                                  Value                                   
------                                  -----                                   
features.record-counters                off                                     
[root@dhcp37-55 ~]# gluster v set nash cluster.write-freq-threshold 0x1
volume set: success
[root@dhcp37-55 ~]# gluster v get  nash features.record-counters 
Option                                  Value                                   
------                                  -----                                   
features.record-counters                on                                      
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: on
cluster.write-freq-threshold: 0x1
cluster.read-freq-threshold: 0
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# rpm -qa | grep gluster
nfs-ganesha-gluster-2.2.0-11.el7rhgs.x86_64
glusterfs-cli-3.7.5-9.el7rhgs.x86_64
glusterfs-3.7.5-9.el7rhgs.x86_64
glusterfs-api-3.7.5-9.el7rhgs.x86_64
glusterfs-ganesha-3.7.5-9.el7rhgs.x86_64
glusterfs-libs-3.7.5-9.el7rhgs.x86_64
glusterfs-fuse-3.7.5-9.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-9.el7rhgs.x86_64
glusterfs-server-3.7.5-9.el7rhgs.x86_64
[root@dhcp37-55 ~]#

Comment 23 Joseph Elwin Fernandes 2015-12-09 11:07:41 UTC
With the present glusterd infrastructure, a message is carried back to the user only if the point of validation reports an error; validation happens within glusterd, not in the CLI. I have put a TODO remark in the fix for this. I understand a message should pop up here, but the scenario is not very harmful to the user, i.e. the thresholds are zero and record-counters is "on".

Please refer to http://review.gluster.org/#/c/12780/11/xlators/mgmt/glusterd/src/glusterd-volume-set.c and line number 109.
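
For illustration, a hedged sketch of the limitation described above (the names here are hypothetical, not glusterd's actual types): the per-option validation hook fills an error string that is carried back to the CLI only when validation fails, so a success-path note has no channel to the user.

#include <stddef.h>

typedef int (*opt_validate_fn) (const char *key, const char *value,
                                char **op_errstr);

static int
handle_volume_set (opt_validate_fn validate, const char *key,
                   const char *value)
{
        char *op_errstr = NULL;

        if (validate (key, value, &op_errstr) != 0) {
                /* Failure path: op_errstr travels back and the CLI prints
                 * it, as in the "volume set: failed: ..." messages above. */
                return -1;
        }
        /* Success path: anything placed in op_errstr is never displayed.
         * TODO (as noted in the fix): propagate notes on success too. */
        return 0;
}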

Comment 24 Sweta Anandpara 2015-12-10 07:12:41 UTC
Agreed - it is not very harmful to the user, but usability would be better if a note were displayed. I assume you have a way to track the TODO remark in the fix; is there a plan to fix it in a current or future release?

Moving this bug to verified in 3.1.2, based on the above note. Logs are already pasted in my previous update.

Comment 25 Sweta Anandpara 2015-12-11 10:40:34 UTC
Raised bug 1290746 to track case 3, mentioned in comment 22.

Comment 27 errata-xmlrpc 2016-03-01 05:59:16 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

Comment 28 Niels de Vos 2016-06-16 13:47:31 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user