Bug 1386477

Summary: [Eventing]: TIER_DETACH_FORCE and TIER_DETACH_COMMIT events seen even after confirming negatively
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Sweta Anandpara <sanandpa>
Component: glusterfs
Assignee: Milind Changire <mchangir>
Status: CLOSED ERRATA
QA Contact: Sweta Anandpara <sanandpa>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, rhinduja, vbellur
Target Milestone: ---
Target Release: RHGS 3.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.8.4-4
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-23 06:12:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1351528

Description Sweta Anandpara 2016-10-19 05:10:52 UTC
Description of problem:
=======================
Have a 4-node cluster with eventing enabled and a tiered volume with a 1*(4+2) disperse cold tier and a 1*4 distribute hot tier. Detach the tier either with 'gluster volume tier detach start' followed by 'gluster volume tier detach commit', or with 'gluster volume tier detach force' directly. In either case the CLI prompts for confirmation before going ahead. When we answer 'n', the operation is correctly aborted; however, the corresponding event is still delivered to the webhook listener, giving any application consuming these events the wrong impression that the detach actually took place.
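
For context, glusterfs-events delivers each event as a JSON payload in an HTTP POST to every registered webhook (registered with, for example, 'gluster-eventsapi webhook-add <URL>' from the glusterfs-events package). Below is a minimal sketch of such a listener, not the exact one used in this report; Python 3 standard library only, port 9000 and path are arbitrary choices.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class GlusterEventHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Each gluster event arrives as a JSON body in an HTTP POST.
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length) or b'{}')
        print(payload.get('event'), payload.get('message'))
        self.send_response(200)
        self.end_headers()

if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 9000), GlusterEventHandler).serve_forever()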


Version-Release number of selected component (if applicable):
============================================================
3.8.4-2



How reproducible:
==================
Always


Steps to Reproduce:
====================
1. Have a 4-node cluster with eventing enabled.
2. Create a 1*(4+2) disperse volume 'disp' and attach a 1*4 hot tier to it.
3. Execute 'gluster volume tier disp detach force' and, when asked for confirmation, answer 'n'.
4. Execute 'gluster volume tier disp detach start'.
5. When the rebalance completes, execute 'gluster volume tier disp detach commit'. When prompted for confirmation, answer 'n'. (A scripted sketch of steps 3-5 follows this list.)
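
A scripted sketch of steps 3-5, under these assumptions: it is run as root on one of the cluster nodes, the volume is named 'disp', and the gluster CLI reads the y/n confirmation from stdin.

import subprocess
import time

def tier_detach(subcmd, answer=None):
    # Thin wrapper around the gluster CLI; 'answer' feeds the y/n prompt on stdin.
    return subprocess.run(['gluster', 'volume', 'tier', 'disp', 'detach', subcmd],
                          input=answer, text=True, capture_output=True)

# Step 3: decline 'detach force' -- no TIER_DETACH_FORCE event should follow.
print(tier_detach('force', answer='n\n').stdout)

# Step 4: start the detach and wait (crudely) for the rebalance to complete.
tier_detach('start')
while 'completed' not in tier_detach('status').stdout:
    time.sleep(5)

# Step 5: decline the commit -- no TIER_DETACH_COMMIT event should follow.
print(tier_detach('commit', answer='n\n').stdout)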


Actual results:
===============
Step 3 generates a TIER_DETACH_FORCE event and step 5 generates a TIER_DETACH_COMMIT event:

{u'message': {u'vol': u'disp'}, u'event': u'TIER_DETACH_FORCE', u'ts': 1476851551, u'nodeid': u'ed362eb3-421c-4a25-ad0e-82ef157ea328'}

{u'message': {u'vol': u'disp'}, u'event': u'TIER_DETACH_COMMIT', u'ts': 1476850418, u'nodeid': u'ed362eb3-421c-4a25-ad0e-82ef157ea328'}
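
For illustration, the fields a consumer receives are 'event', 'message.vol', 'ts' and 'nodeid'. The sketch below shows the kind of handler that gets misled by the spurious events; the reaction it takes is hypothetical and only there to make the point.

def handle_event(payload):
    event = payload.get('event')
    vol = payload.get('message', {}).get('vol')
    if event in ('TIER_DETACH_FORCE', 'TIER_DETACH_COMMIT'):
        # A consumer would reasonably conclude the detach actually happened and
        # act on it, even though the operator answered 'n' and nothing changed.
        print("treating hot tier of volume '%s' as detached" % vol)

# The payload above, as received by the webhook:
handle_event({'message': {'vol': 'disp'}, 'event': 'TIER_DETACH_FORCE',
              'ts': 1476851551, 'nodeid': 'ed362eb3-421c-4a25-ad0e-82ef157ea328'})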



Expected results:
=================
Steps 3 and 5 should not have generated any event, since the confirmation prompt was answered negatively.


Additional info:
================

[root@dhcp46-239 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.46.240
Uuid: 72c4f894-61f7-433e-a546-4ad2d7f0a176
State: Peer in Cluster (Connected)

Hostname: 10.70.46.242
Uuid: 1e8967ae-51b2-4c27-907e-a22a83107fd0
State: Peer in Cluster (Connected)

Hostname: 10.70.46.218
Uuid: 0dea52e0-8c32-4616-8ef8-16db16120eaa
State: Peer in Cluster (Connected)
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# rpm -qa | grep gluster
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-3.8.4-2.el7rhgs.x86_64
glusterfs-api-devel-3.8.4-2.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-1.el7rhgs.x86_64
glusterfs-libs-3.8.4-2.el7rhgs.x86_64
glusterfs-api-3.8.4-2.el7rhgs.x86_64
python-gluster-3.8.4-2.el7rhgs.noarch
glusterfs-geo-replication-3.8.4-2.el7rhgs.x86_64
glusterfs-rdma-3.8.4-2.el7rhgs.x86_64
glusterfs-fuse-3.8.4-2.el7rhgs.x86_64
glusterfs-cli-3.8.4-2.el7rhgs.x86_64
glusterfs-server-3.8.4-2.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-2.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-2.el7rhgs.x86_64
glusterfs-devel-3.8.4-2.el7rhgs.x86_64
glusterfs-events-3.8.4-2.el7rhgs.x86_64
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# gluster v info
 
Volume Name: disp
Type: Tier
Volume ID: a9999464-b094-4213-a422-c11fed555674
Status: Started
Snapshot Count: 0
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 4
Brick1: 10.70.46.218:/bricks/brick2/disp_tier4
Brick2: 10.70.46.242:/bricks/brick2/disp_tier3
Brick3: 10.70.46.240:/bricks/brick2/disp_tier2
Brick4: 10.70.46.239:/bricks/brick2/disp_tier1
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.46.239:/bricks/brick0/disp1
Brick6: 10.70.46.240:/bricks/brick0/disp2
Brick7: 10.70.46.242:/bricks/brick0/disp3
Brick8: 10.70.46.218:/bricks/brick0/disp4
Brick9: 10.70.46.239:/bricks/brick1/disp5
Brick10: 10.70.46.240:/bricks/brick1/disp6
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
transport.address-family: inet
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]#

Comment 3 Milind Changire 2016-10-24 12:37:50 UTC
Will be fixed in the patch for https://bugzilla.redhat.com/show_bug.cgi?id=1387098

Comment 5 Milind Changire 2016-11-07 15:08:41 UTC
Fixed as part of: https://bugzilla.redhat.com/show_bug.cgi?id=1386185

Comment 7 Sweta Anandpara 2016-11-14 09:48:55 UTC
Tested and verified this on build 3.8.4-5.

We no longer see an event when we answer 'n' to the confirmation message displayed during 'tier detach force' and 'tier detach commit'.

Tried this multiple times; the expected event is seen only when the confirmation is answered affirmatively.

Moving this to verified in 3.2. Detailed logs are pasted below.

[root@dhcp46-239 ~]# rpm -qa | grep gluster
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-api-3.8.4-5.el7rhgs.x86_64
python-gluster-3.8.4-5.el7rhgs.noarch
glusterfs-client-xlators-3.8.4-5.el7rhgs.x86_64
glusterfs-server-3.8.4-5.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64
glusterfs-devel-3.8.4-5.el7rhgs.x86_64
glusterfs-libs-3.8.4-5.el7rhgs.x86_64
glusterfs-fuse-3.8.4-5.el7rhgs.x86_64
glusterfs-api-devel-3.8.4-5.el7rhgs.x86_64
glusterfs-rdma-3.8.4-5.el7rhgs.x86_64
glusterfs-3.8.4-5.el7rhgs.x86_64
glusterfs-cli-3.8.4-5.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-5.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-4.el7rhgs.x86_64
glusterfs-events-3.8.4-5.el7rhgs.x86_64
[root@dhcp46-239 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.46.240
Uuid: 72c4f894-61f7-433e-a546-4ad2d7f0a176
State: Peer in Cluster (Connected)

Hostname: 10.70.46.242
Uuid: 1e8967ae-51b2-4c27-907e-a22a83107fd0
State: Peer in Cluster (Connected)

Hostname: 10.70.46.218
Uuid: 0dea52e0-8c32-4616-8ef8-16db16120eaa
State: Peer in Cluster (Connected)
[root@dhcp46-239 ~]#
[root@dhcp46-239 ~]# gluster v info
 
Volume Name: ozone
Type: Tier
Volume ID: 376cdde0-194f-460a-b273-3904a704a7dd
Status: Started
Snapshot Count: 0
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: 10.70.46.218:/bricks/brick1/ozone_tier1
Brick2: 10.70.46.218:/bricks/brick0/ozone_tier0
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick3: 10.70.46.239:/bricks/brick0/ozone0
Brick4: 10.70.46.240:/bricks/brick0/ozone2
Brick5: 10.70.46.242:/bricks/brick0/ozone2
Brick6: 10.70.46.239:/bricks/brick1/ozone3
Brick7: 10.70.46.240:/bricks/brick1/ozone4
Brick8: 10.70.46.242:/bricks/brick1/ozone5
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: disable
[root@dhcp46-239 ~]#
[root@dhcp46-239 ~]# gluster v tier ozone detach start
volume detach-tier start: success
ID: 2cc1504c-773c-47ac-91a1-73fd58ef3cea
[root@dhcp46-239 ~]# gluster v tier ozone detach status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                            10.70.46.218                0        0Bytes             0             0             0            completed        0:0:1
[root@dhcp46-239 ~]# gluster v tier ozone detach 
Usage: volume tier <VOLNAME> status
volume tier <VOLNAME> start [force]
volume tier <VOLNAME> attach [<replica COUNT>] <NEW-BRICK>... [force]
volume tier <VOLNAME> detach <start|stop|status|commit|force>

Tier command failed
[root@dhcp46-239 ~]#
[root@dhcp46-239 ~]# gluster v tier ozone detach commit
Removing tier can result in data loss. Do you want to Continue? (y/n) n
[root@dhcp46-239 ~]# gluster v tier ozone detach commit
Removing tier can result in data loss. Do you want to Continue? (y/n) y
volume detach-tier commit: success
Check the detached bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick. 
[root@dhcp46-239 ~]#
[root@dhcp46-239 ~]# gluster  v tier ozone attach replica 2 10.70.46.218:/bricks/brick0/ozone_tier0 10.70.46.218:/bricks/brick1/ozone_tier1
volume attach-tier: success
Tiering Migration Functionality: ozone: success: Attach tier is successful on ozone. use tier status to check the status.
ID: d3937d1a-5842-4c6e-8759-b6b565464ba7

[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# 
[root@dhcp46-239 ~]# gluster v tier ozone detach force
Removing tier can result in data loss. Do you want to Continue? (y/n) n
[root@dhcp46-239 ~]# gluster v tier ozone detach force
Removing tier can result in data loss. Do you want to Continue? (y/n) y
volume detach-tier commit force: success
[root@dhcp46-239 ~]# gluster  v tier ozone attach replica 2 10.70.46.218:/bricks/brick0/ozone_tier0 10.70.46.218:/bricks/brick1/ozone_tier1
volume attach-tier: success
Tiering Migration Functionality: ozone: success: Attach tier is successful on ozone. use tier status to check the status.
ID: 49d1c39d-0267-481b-ad00-b0ae42686641

[root@dhcp46-239 ~]#

Comment 9 errata-xmlrpc 2017-03-23 06:12:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html