Bug 1236067 - Data Tiering: Don't allow a detach-tier commit if detach-tier start has failed to complete
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: hari gowtham
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks: 1309999 1314617
 
Reported: 2015-06-26 13:02 UTC by Nag Pavan Chilakam
Modified: 2018-02-06 17:34 UTC
CC: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1309999 (view as bug list)
Environment:
Last Closed: 2018-02-06 17:34:35 UTC
Target Upstream Version:


Attachments (Terms of Use)
server#1 logs sosreports (12.51 MB, application/x-xz)
2015-06-26 13:02 UTC, Nag Pavan Chilakam
server#2 logs sosreports (9.33 MB, application/x-xz)
2015-06-26 13:06 UTC, Nag Pavan Chilakam

Description Nag Pavan Chilakam 2015-06-26 13:02:59 UTC
Created attachment 1043504 [details]
server#1 logs sosreports

Description of problem:
======================
If detach-tier start has failed for any reason, the subsequent commit should not be allowed to succeed.
If the user really needs to detach the tier regardless, he/she can still use the force option.

Refer to bug#1236038 and bug#1236052 for related information.
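The requested behaviour amounts to a small guard on the commit path. A minimal sketch follows; the state names and messages are hypothetical illustrations of the check, not the actual glusterd code:

```python
# Hypothetical sketch of the requested commit guard; state names and
# messages are illustrative only, not the actual glusterd implementation.

DETACH_NOT_STARTED, DETACH_IN_PROGRESS, DETACH_FAILED, DETACH_COMPLETE = range(4)

def can_commit_detach(detach_state, force=False):
    """Return (allowed, message) for a detach-tier commit request."""
    if force:
        # An explicit force bypasses the safety check, as the CLI already allows.
        return True, "volume detach-tier commit force: success"
    if detach_state == DETACH_COMPLETE:
        return True, "volume detach-tier commit: success"
    if detach_state == DETACH_FAILED:
        # The case reported here: start failed, so a plain commit must fail.
        return False, "volume detach-tier commit: failed: detach-tier start failed"
    return False, "volume detach-tier commit: failed: detach-tier not started"
```

With such a guard, the plain commit in step 6 below would be rejected while `commit force` would still go through.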

Version-Release number of selected component (if applicable):
=================================================================
[root@tettnang glusterfs]# rpm -qa|grep gluster
glusterfs-api-3.7.1-5.el7rhgs.x86_64
glusterfs-libs-3.7.1-5.el7rhgs.x86_64
glusterfs-rdma-3.7.1-5.el7rhgs.x86_64
glusterfs-3.7.1-5.el7rhgs.x86_64
glusterfs-cli-3.7.1-5.el7rhgs.x86_64
glusterfs-debuginfo-3.7.1-5.el7rhgs.x86_64
glusterfs-client-xlators-3.7.1-5.el7rhgs.x86_64
glusterfs-server-3.7.1-5.el7rhgs.x86_64
glusterfs-geo-replication-3.7.1-5.el7rhgs.x86_64
glusterfs-fuse-3.7.1-5.el7rhgs.x86_64
[root@tettnang glusterfs]# gluster --version
glusterfs 3.7.1 built on Jun 23 2015 22:08:15
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.


Steps to Reproduce:
===================

1. Create a tiered volume using two nodes for bricks.
2. Kill the glusterd process on one of the nodes.
3. Do a detach-tier start of the volume. The command output will be as below:
[root@tettnang glusterfs]# gluster v detach-tier xyz start
volume detach-tier start: success
ID: 236cd41b-18fe-41ff-bbf3-b0c318386ec6
4. But if you issue a detach-tier status, it will show as below:
[root@tettnang glusterfs]# gluster v detach-tier xyz status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             1             0               failed               0.00


5. Check the volume status:
[root@tettnang glusterfs]# gluster v status xyz
Status of volume: xyz
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick tettnang:/rhs/brick7/xyz              49162     0          Y       23202
Cold Bricks:
Brick tettnang:/rhs/brick1/xyz              49159     0          Y       20879
Brick tettnang:/rhs/brick2/xyz              49160     0          Y       20901
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on zod                           N/A       N/A        N       N/A  
 
Task Status of Volume xyz
------------------------------------------------------------------------------
Task                 : Remove brick        
ID                   : 236cd41b-18fe-41ff-bbf3-b0c318386ec6
Removed bricks:     
tettnang:/rhs/brick7/xyz
yarrow:/rhs/brick7/xyz
Status               : failed             


6. Now if you issue a detach-tier commit it should fail, but it succeeds and detaches the tier:
[root@tettnang glusterfs]# gluster v detach-tier xyz commit
volume detach-tier commit: success
Check the detached bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick. 




[root@tettnang glusterfs]# gluster v info xyz
 
Volume Name: xyz
Type: Distributed-Replicate
Volume ID: 7fe8858f-4af3-44df-bc56-dd10e19e8cf6
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: tettnang:/rhs/brick1/xyz
Brick2: yarrow:/rhs/brick1/xyz
Brick3: tettnang:/rhs/brick2/xyz
Brick4: yarrow:/rhs/brick2/xyz
Options Reconfigured:
performance.readdir-ahead: on
[root@tettnang glusterfs]# 
[root@tettnang glusterfs]# 



Expected results:
=================
The detach-tier commit operation should fail if detach-tier start has failed for any reason.


Additional info:
====================
Refer to similar bugs raised for the same issue: bug#1236038 and bug#1236052.



CLI Logs:
========

[root@tettnang glusterfs]# gluster v info xyz
 
Volume Name: xyz
Type: Distributed-Replicate
Volume ID: 7fe8858f-4af3-44df-bc56-dd10e19e8cf6
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: tettnang:/rhs/brick1/xyz
Brick2: yarrow:/rhs/brick1/xyz
Brick3: tettnang:/rhs/brick2/xyz
Brick4: yarrow:/rhs/brick2/xyz
Options Reconfigured:
performance.readdir-ahead: on
[root@tettnang glusterfs]# gluster v attach-tier xyz tettnang:/rhs/brick7/xyz yarrow:/rhs/brick7/xyz
Attach tier is recommended only for testing purposes in this release. Do you want to continue? (y/n) y
volume attach-tier: failed: /rhs/brick7/xyz is already part of a volume
[root@tettnang glusterfs]# gluster v attach-tier xyz tettnang:/rhs/brick7/xyz yarrow:/rhs/brick7/xyz force
Attach tier is recommended only for testing purposes in this release. Do you want to continue? (y/n) y
volume attach-tier: success
gluster v status xyz
volume rebalance: xyz: success: Rebalance on xyz has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: ce287beb-53df-4484-abd7-6216eb994f89

[root@tettnang glusterfs]# gluster v status xyz
Status of volume: xyz
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick7/xyz                49164     0          Y       7798 
Brick tettnang:/rhs/brick7/xyz              49162     0          Y       23202
Cold Bricks:
Brick tettnang:/rhs/brick1/xyz              49159     0          Y       20879
Brick yarrow:/rhs/brick1/xyz                49161     0          Y       7075 
Brick tettnang:/rhs/brick2/xyz              49160     0          Y       20901
Brick yarrow:/rhs/brick2/xyz                49162     0          Y       7093 
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on zod                           N/A       N/A        N       N/A  
NFS Server on yarrow                        N/A       N/A        N       N/A  
 
Task Status of Volume xyz
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : ce287beb-53df-4484-abd7-6216eb994f89
Status               : in progress         
 
[root@tettnang glusterfs]# gluster v  info xyz
 
Volume Name: xyz
Type: Tier
Volume ID: 7fe8858f-4af3-44df-bc56-dd10e19e8cf6
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: yarrow:/rhs/brick7/xyz
Brick2: tettnang:/rhs/brick7/xyz
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: tettnang:/rhs/brick1/xyz
Brick4: yarrow:/rhs/brick1/xyz
Brick5: tettnang:/rhs/brick2/xyz
Brick6: yarrow:/rhs/brick2/xyz
Options Reconfigured:
performance.readdir-ahead: on
[root@tettnang glusterfs]# gluster v  info xyz
 
Volume Name: xyz
Type: Tier
Volume ID: 7fe8858f-4af3-44df-bc56-dd10e19e8cf6
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: yarrow:/rhs/brick7/xyz
Brick2: tettnang:/rhs/brick7/xyz
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: tettnang:/rhs/brick1/xyz
Brick4: yarrow:/rhs/brick1/xyz
Brick5: tettnang:/rhs/brick2/xyz
Brick6: yarrow:/rhs/brick2/xyz
Options Reconfigured:
performance.readdir-ahead: on
[root@tettnang glusterfs]# gluster v status xyz
Status of volume: xyz
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick tettnang:/rhs/brick7/xyz              49162     0          Y       23202
Cold Bricks:
Brick tettnang:/rhs/brick1/xyz              49159     0          Y       20879
Brick tettnang:/rhs/brick2/xyz              49160     0          Y       20901
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on zod                           N/A       N/A        N       N/A  
 
Task Status of Volume xyz
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : ce287beb-53df-4484-abd7-6216eb994f89
Status               : in progress         
 
[root@tettnang glusterfs]# gluster v detach-tier xyz 
Usage: volume detach-tier <VOLNAME>  <start|stop|status|commit|[force]>
[root@tettnang glusterfs]# gluster v detach-tier xyz start
volume detach-tier start: success
ID: 236cd41b-18fe-41ff-bbf3-b0c318386ec6
[root@tettnang glusterfs]# gluster v detach-tier xyz status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             1             0               failed               0.00
[root@tettnang glusterfs]# gluster v status xyz
Status of volume: xyz
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick tettnang:/rhs/brick7/xyz              49162     0          Y       23202
Cold Bricks:
Brick tettnang:/rhs/brick1/xyz              49159     0          Y       20879
Brick tettnang:/rhs/brick2/xyz              49160     0          Y       20901
NFS Server on localhost                     N/A       N/A        N       N/A  
NFS Server on zod                           N/A       N/A        N       N/A  
 
Task Status of Volume xyz
------------------------------------------------------------------------------
Task                 : Remove brick        
ID                   : 236cd41b-18fe-41ff-bbf3-b0c318386ec6
Removed bricks:     
tettnang:/rhs/brick7/xyz
yarrow:/rhs/brick7/xyz
Status               : failed              
 
[root@tettnang glusterfs]# gluster v detach-tier commit
Usage: volume detach-tier <VOLNAME>  <start|stop|status|commit|[force]>
[root@tettnang glusterfs]# gluster v detach-tier xyz commit
volume detach-tier commit: success
Check the detached bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick. 
[root@tettnang glusterfs]# gluster v detach-tier xyz
Usage: volume detach-tier <VOLNAME>  <start|stop|status|commit|[force]>
[root@tettnang glusterfs]# gluster v detach-tier xyz status
volume detach-tier status: failed: Detach-tier not started.
[root@tettnang glusterfs]# gluster v info xyz
 
Volume Name: xyz
Type: Distributed-Replicate
Volume ID: 7fe8858f-4af3-44df-bc56-dd10e19e8cf6
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: tettnang:/rhs/brick1/xyz
Brick2: yarrow:/rhs/brick1/xyz
Brick3: tettnang:/rhs/brick2/xyz
Brick4: yarrow:/rhs/brick2/xyz
Options Reconfigured:
performance.readdir-ahead: on


Comment 2 Nag Pavan Chilakam 2015-06-26 13:06:28 UTC
Created attachment 1043505 [details]
server#2 logs sosreports

Comment 4 hari gowtham 2016-02-19 08:14:07 UTC
The steps for reproduction mentioned above now result in an error message saying the node is down when detach start is issued, so the detach commit cannot be issued.

But if, instead of killing glusterd, we kill a brick and then issue detach start,
the detach continues.

As Nag has mentioned, a brick being down should not allow detach start to work.

Steps for reproduction:

1) Create a tiered volume.
2) Kill one brick on the hot tier.
3) Issue detach start. (It shouldn't start, as it might lead to data loss.)
But detach start works (the detach status is failed).

Expectation:
detach start should throw an error such as:
Found stopped brick

Actual result:
detach start succeeds.
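The expected pre-check can be sketched as follows; the helper name and message format are hypothetical and only illustrate the check, not the actual glusterd implementation:

```python
# Illustrative pre-check for detach-tier start (hypothetical helper, not
# glusterd code): refuse to start the detach when any hot-tier brick is
# offline, since migrating files off a tier with a dead brick risks data loss.

def can_start_detach(hot_tier_bricks):
    """hot_tier_bricks: iterable of (brick_path, is_online) pairs."""
    stopped = [path for path, online in hot_tier_bricks if not online]
    if stopped:
        return False, "volume detach-tier start: failed: Found stopped brick " + stopped[0]
    return True, "volume detach-tier start: success"
```

Run against the reproduction scenario (one hot-tier brick killed), this check would reject the start instead of letting it run and fail.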

Comment 6 Shyamsundar 2018-02-06 17:34:35 UTC
Thank you for your bug report.

Documentation has been updated for the problem reported in this bug.

Further, we are no longer releasing any bug fixes or other updates for Tier. This bug will be set to CLOSED WONTFIX to reflect this.

