Bug 1275912

Summary:

AFR self-heal-daemon option is still set on volume though tier is detached

Product:

[Red Hat Storage] Red Hat Gluster Storage

Reporter:

Bhaskarakiran <byarlaga>

Component:

tier

Assignee:

Mohammed Rafi KC <rkavunga>

Status:

CLOSED ERRATA

QA Contact:

Sweta Anandpara <sanandpa>

Severity:

high

Docs Contact:

Priority:

high

Version:

rhgs-3.1

CC:

dlambrig, mzywusko, nchilaka, rcyriac, rhs-bugs, rkavunga, sankarshan, storage-qa-internal

Target Milestone:

---

Keywords:

ZStream

Target Release:

RHGS 3.1.2

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

glusterfs-3.7.5-9

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Clones:

1285046

Environment:

Last Closed:

2016-03-01 05:46:31 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

1260783, 1260923, 1285046, 1285978

Attachments:

Description	Flags
cli logs	none

Description Bhaskarakiran 2015-10-28 06:10:36 UTC

Description of problem:
=======================

Although the hot tier (a replica / dist-rep) has been detached, the AFR self-heal-daemon option is not removed from the volume. I checked the vol files, and they are clean. Only the option itself needs to be removed.

[root@transformers vol1]# gluster v info vol1
 
Volume Name: vol1
Type: Disperse
Volume ID: ed22b68c-b982-40ca-86c0-d85cd127dbc1
Status: Started
Number of Bricks: 1 x (8 + 4) = 12
Transport-type: tcp
Bricks:
Brick1: transformers:/rhs/brick1/b1
Brick2: interstellar:/rhs/brick1/b2
Brick3: transformers:/rhs/brick2/b3
Brick4: interstellar:/rhs/brick2/b4
Brick5: transformers:/rhs/brick3/b5
Brick6: interstellar:/rhs/brick3/b6
Brick7: transformers:/rhs/brick4/b7
Brick8: interstellar:/rhs/brick4/b8
Brick9: transformers:/rhs/brick5/b9
Brick10: interstellar:/rhs/brick5/b10
Brick11: transformers:/rhs/brick6/b11
Brick12: interstellar:/rhs/brick6/b12
Options Reconfigured:
ganesha.enable: off
cluster.self-heal-daemon: on ********************
features.ctr-enabled: on
cluster.disperse-self-heal-daemon: enable
server.event-threads: 4
client.event-threads: 4
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
nfs.disable: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
nfs-ganesha: disable
[root@transformers vol1]# 

[root@transformers vol1]# gluster v set vol1 cluster.self-heal-daemon off
volume set: failed: Volume vol1 is not of replicate type
[root@transformers vol1]# gluster v set vol1 cluster.self-heal-daemon on
volume set: failed: Volume vol1 is not of replicate type
[root@transformers vol1]# 
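
The condition shown above (a non-replicate volume that still lists cluster.self-heal-daemon under "Options Reconfigured") can be checked mechanically. The following is a minimal sketch, assuming the 'gluster v info' output is piped in on stdin. The function name has_stale_shd_option is hypothetical and not part of the gluster CLI, and treating Tier volumes as "not stale" is an assumption made because a hot tier may itself be replicated.

```shell
#!/bin/sh
# Hypothetical helper (not a gluster command): reads `gluster v info <vol>`
# output on stdin. Returns 0 if cluster.self-heal-daemon is listed under
# "Options Reconfigured:" although the volume type is not replicate-based.
has_stale_shd_option() {
    awk '
        # Remember the volume type, e.g. "Disperse" or "Distributed-Replicate".
        /^Type:/ { vtype = $0; sub(/^Type:[ \t]*/, "", vtype) }
        # Only lines after this header are reconfigured options.
        /^Options Reconfigured:/ { in_opts = 1; next }
        in_opts && /^cluster\.self-heal-daemon:/ { listed = 1 }
        END {
            # Assumption: Tier volumes are skipped, since the hot tier
            # may be a replicate volume that legitimately uses the option.
            if (listed && vtype !~ /Replicate|Tier/) exit 0
            exit 1
        }
    '
}
```

Possible usage: gluster v info vol1 | has_stale_shd_option && echo "stale AFR option on vol1"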

Version-Release number of selected component (if applicable):
=============================================================
3.7.5-0.3

[root@transformers vol1]# gluster --version
glusterfs 3.7.5 built on Oct 15 2015 08:48:29
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@transformers vol1]# 



Comment 3 Mohammed Rafi KC 2015-11-25 07:25:11 UTC

http://review.gluster.org/#/c/12736/

Comment 4 Sweta Anandpara 2015-12-07 12:59:37 UTC

Tested this on the build glusterfs-3.7.5-9.el7rhgs.x86_64

Set up a 4+2 disperse volume as the cold tier and a 2x2 dist-rep as the hot tier. Checked cluster.self-heal-daemon and verified it was ON.
Ran the 'gluster v tier <volname> detach' commands (start, status, commit) and checked self-heal-daemon again; it was still ON.
Tried to set it OFF manually, which failed with the error that the volume is not of replicate type.

Rafi, could you please check on your end? I don't believe the fix is working correctly yet.

Pasted below are the logs:

[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Tier
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.203:/rhs/thinbrick2/nash2
Brick2: 10.70.37.55:/rhs/thinbrick2/nash2
Brick3: 10.70.37.203:/rhs/thinbrick2/nash
Brick4: 10.70.37.55:/rhs/thinbrick2/nash
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.37.55:/rhs/thinbrick1/nash
Brick6: 10.70.37.203:/rhs/thinbrick1/nash
Brick7: 10.70.37.210:/rhs/thinbrick1/nash
Brick8: 10.70.37.141:/rhs/thinbrick1/nash
Brick9: 10.70.37.210:/rhs/thinbrick2/nash
Brick10: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
cluster.self-heal-daemon: on
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.ctr-enabled: on
cluster.tier-mode: test
features.record-counters: on
cluster.write-freq-threshold: 0x0
cluster.read-freq-threshold: 4
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v tier nash detach start
volume detach-tier start: success
ID: 4b64b3fe-34f0-41e0-afc0-3885a0f001fd
[root@dhcp37-55 ~]# gluster v tier nash detach status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes           100             0             0            completed               0.00
                            10.70.37.203                0        0Bytes             0             0             0            completed               0.00
[root@dhcp37-55 ~]# gluster v tier nash detach commit
Removing tier can result in data loss. Do you want to Continue? (y/n) y
volume detach-tier commit: success
Check the detached bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick. 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v info nash
 
Volume Name: nash
Type: Disperse
Volume ID: 66caac13-cb0a-4a5d-93e3-544ad19472c2
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.37.55:/rhs/thinbrick1/nash
Brick2: 10.70.37.203:/rhs/thinbrick1/nash
Brick3: 10.70.37.210:/rhs/thinbrick1/nash
Brick4: 10.70.37.141:/rhs/thinbrick1/nash
Brick5: 10.70.37.210:/rhs/thinbrick2/nash
Brick6: 10.70.37.141:/rhs/thinbrick2/nash
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: off
ganesha.enable: off
features.record-counters: on
cluster.write-freq-threshold: 0x0
cluster.read-freq-threshold: 4
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v get nash cluster.self-heal-daemon
Option                                  Value                                   
------                                  -----                                   
cluster.self-heal-daemon                on                                      
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v set nash cluster.self-heal-daemon off
volume set: failed: Volume nash is not of replicate type
[root@dhcp37-55 ~]# rpm -qa | grep gluster
nfs-ganesha-gluster-2.2.0-11.el7rhgs.x86_64
glusterfs-cli-3.7.5-9.el7rhgs.x86_64
glusterfs-3.7.5-9.el7rhgs.x86_64
glusterfs-api-3.7.5-9.el7rhgs.x86_64
glusterfs-ganesha-3.7.5-9.el7rhgs.x86_64
glusterfs-libs-3.7.5-9.el7rhgs.x86_64
glusterfs-fuse-3.7.5-9.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-9.el7rhgs.x86_64
glusterfs-server-3.7.5-9.el7rhgs.x86_64
[root@dhcp37-55 ~]#

Comment 5 Mohammed Rafi KC 2015-12-07 15:39:28 UTC

gluster volume get always shows the default value when an option has not been set (or has been reset). The default for self-heal-daemon is on, which is why it shows as on. If the option does not appear in volume info, it means the option is not explicitly set.

If you create a plain regular volume, volume get will likewise show the option as enabled.
Similarly, if you reset all options for a volume using gluster volume reset and then run volume get, you will get the value on.
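
To tell an explicitly set option apart from a default that volume get merely reports, one can look for the option under "Options Reconfigured" in the volume info output. The following is a minimal sketch, assuming the 'gluster v info' output was saved to a file; the function name option_origin and the file path in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: option_origin <option> <file-with-volume-info-output>
# Prints "explicit" if the option is listed under "Options Reconfigured:"
# in saved `gluster v info` output, otherwise prints "default".
option_origin() {
    opt=$1
    info_file=$2
    if awk -v opt="$opt" '
            # Only lines after this header are reconfigured options.
            /^Options Reconfigured:/ { in_opts = 1; next }
            # Match lines that start with "<option>:" exactly.
            in_opts && index($0, opt ":") == 1 { found = 1 }
            END { if (found) exit 0; exit 1 }
        ' "$info_file"
    then
        echo explicit
    else
        echo default
    fi
}
```

Possible usage: gluster v info nash > /tmp/nash.info; option_origin cluster.self-heal-daemon /tmp/nash.info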

Comment 6 Sweta Anandpara 2015-12-08 03:13:28 UTC

>>If it didn't show on volume info, which means option is not.

So when I create a 'distribute' volume, the option cluster.self-heal-daemon should not show up in volume info, since it is not enabled.

And if I create a 'distribute-replicate' or 'replicate' volume, then the option should show up in the volume info output.

Is that correct?
If it is, then why does the option not show up when a new dist-rep volume is created?

[root@dhcp37-55 ~]# gluster v create ozon replica 2 10.70.37.55:/rhs/thinbrick1/ozon 10.70.37.141:/rhs/thinbrick1/ozon 10.70.37.55:/rhs/thinbrick2/ozon 10.70.37.141:/rhs/thinbrick2/ozon force
volume create: ozon: success: please start the volume to access data
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# 
[root@dhcp37-55 ~]# gluster v list
gluster_shared_storage
nash
ozon
testvol
tmp_vol
[root@dhcp37-55 ~]#
[root@dhcp37-55 ~]# gluster v start ozon
volume start: ozon: success
[root@dhcp37-55 ~]# gluster v info ozon
 
Volume Name: ozon
Type: Distributed-Replicate
Volume ID: fe3cad14-ab79-4418-bfa0-fde212446233
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.55:/rhs/thinbrick1/ozon
Brick2: 10.70.37.141:/rhs/thinbrick1/ozon
Brick3: 10.70.37.55:/rhs/thinbrick2/ozon
Brick4: 10.70.37.141:/rhs/thinbrick2/ozon
Options Reconfigured:
performance.readdir-ahead: on
nfs-ganesha: disable
cluster.enable-shared-storage: enable
[root@dhcp37-55 ~]#

Comment 7 Mohammed Rafi KC 2015-12-08 04:39:55 UTC

volume info shows only the reconfigured options. Whatever the default value of an option is, it stays in effect even if the option is not set explicitly.

Comment 8 Sweta Anandpara 2015-12-08 09:38:44 UTC

Verified this on the build glusterfs-3.7.5-9.el7rhgs.x86_64

Verified that the option cluster.self-heal-daemon does not show up in the volume info output after detaching a tier.

The combinations tested were:
Cold tier           Hot tier
---------           --------
Distribute          Replicate
Disperse            Dist-rep
Dist-rep            Dist-rep
Dist-rep            Distribute

In each combination, the option was set via the 'gluster volume heal <volname> enable' command as well as via 'gluster volume set <volname> cluster.self-heal-daemon'.

Moving this bug to verified in 3.1.2. Detailed logs are attached.

Comment 9 Sweta Anandpara 2015-12-08 09:41:18 UTC

Created attachment 1103492
cli logs

Comment 11 errata-xmlrpc 2016-03-01 05:46:31 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html