Bug 815745 - Volume type change: can't change a Distribute(N)-Replicate(2) to Distribute(N) through remove-brick
Summary: Volume type change: can't change a Distribute(N)-Replicate(2) to Distribute(N) through remove-brick
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: pre-release
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact: shylesh
URL:
Whiteboard:
Depends On:
Blocks: 817967
 
Reported: 2012-04-24 12:21 UTC by shylesh
Modified: 2013-12-19 00:08 UTC
CC List: 3 users

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:38:06 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: 3.3.0qa42
Embargoed:



Description shylesh 2012-04-24 12:21:02 UTC
Description of problem:
Can't change the volume type from distribute-replicate to distribute through remove-brick
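For reference, the operation being attempted is the replica-count-reduction form of remove-brick. A minimal sketch of the general syntax, with <VOLNAME> and the brick arguments as placeholders (the exact reproduction follows below):

# Drop the replica count from 2 to 1 by removing one brick from each
# replica set (placeholders, not a specific volume):
gluster volume remove-brick <VOLNAME> replica 1 <BRICK-FROM-SET-1> <BRICK-FROM-SET-2>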

Version-Release number of selected component (if applicable):
Mainline

How reproducible:


Steps to Reproduce:
1. Create a Distribute(2)-Replicate(2) volume
Volume Name: rem
Type: Distributed-Replicate
Volume ID: 9b0b4f73-8553-411d-bd70-c7a776143b1e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/re1
Brick2: 10.16.157.66:/home/bricks/re2
Brick3: 10.16.157.63:/home/bricks/re3
Brick4: 10.16.157.66:/home/bricks/re4

2. [root@gqac023 mnt]# gluster volume remove-brick rem replica 1   10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
[root@gqac023 mnt]# echo $?
255

 
  
Actual results:
log says
========
[2012-04-24 08:09:37.322993] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 10.16.157.63:/home/bricks/re3
[2012-04-24 08:09:37.323009] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-04-24 08:09:37.323019] I [glusterd-brick-ops.c:781:glusterd_handle_remove_brick] 0-management: failed to validate brick 10.16.157.63:/home/bricks/re3 (3 0 2)
[2012-04-24 08:09:37.323028] E [glusterd-brick-ops.c:833:glusterd_handle_remove_brick] 0-: Bricks are from same subvol
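For context on the "same subvol" message: in a Distributed-Replicate volume, consecutive bricks in the volume-info listing form the replica sets. The annotation below is an interpretation of the volume info from step 1, not tool output:

# Replica-set layout implied by "gluster volume info rem":
#   subvol 0 (replica pair): Brick1 (re1), Brick2 (re2)
#   subvol 1 (replica pair): Brick3 (re3), Brick4 (re4)
# The bricks passed to remove-brick (re3 and re2) therefore come from
# different subvols; the validation appears to be sensitive to the
# order in which they are listed (see Comment 1 below).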

Comment 1 Amar Tumballi 2012-04-27 07:13:52 UTC
Shylesh,

Can you try

[root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
10.16.157.63:/home/bricks/re2 10.16.157.66:/home/bricks/re3

Instead of 

> [root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
> 10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2

[Notice the change in the order of bricks]
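If the reordered command succeeds, the volume should end up as plain Distribute. A quick way to confirm, sketched from the volume info in the original report (expected values, not captured output):

gluster volume info rem
# Expect: Type: Distribute, Number of Bricks: 2, with one surviving
# brick from each former replica pair.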

Comment 2 Anand Avati 2012-05-04 04:28:44 UTC
CHANGE: http://review.gluster.com/3235 (cli: fix remove-brick output behavior in failure cases) merged in master by Vijay Bellur (vijay)

Comment 3 shylesh 2012-05-24 04:41:10 UTC
Volume Name: another
Type: Distributed-Replicate
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.66:/home/bricks/another1
Brick3: 10.16.157.69:/home/bricks/another2
Brick4: 10.16.157.63:/home/bricks/another3



[root@gqac022 ~]# gluster v remove-brick another replica 1 10.16.157.66:/home/bricks/another1 10.16.157.69:/home/bricks/another2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick commit force unsuccessful


Volume Name: another
Type: Distribute
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.63:/home/bricks/another3


Though remove-brick reported the operation as unsuccessful, the configuration actually changed.

[2012-05-24 04:35:54.540952] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542297] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542328] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542617] E [glusterd-volgen.c:2146:volgen_graph_build_clients] 0-: volume inconsistency: total number of bricks (6) is not divisible with number of bricks per cluster (4) in a multi-cluster setup
[2012-05-24 04:35:54.542652] E [glusterd-op-sm.c:2324:glusterd_op_ac_send_commit_op] 0-management: Commit failed
[2012-05-24 04:35:54.542664] I [glusterd-op-sm.c:2254:glusterd_op_modify_op_ctx] 0-management: op_ctx modification not required
[2012-05-24 04:35:54.543487] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 420dd6d2-44b7-4dfa-8133-48c0326995cd
[2012-05-24 04:35:54.543514] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 2c121e31-9551-4b76-b588-d1302cab6a68
[2012-05-24 04:35:54.543537] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: f8222994-1e66-49a3-966c-53dc012e8308
[2012-05-24 04:35:54.543548] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock
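A minimal way to reproduce the mismatch described above, assuming the same volume; the exit status contradicts the actual state:

gluster v remove-brick another replica 1 10.16.157.66:/home/bricks/another1 10.16.157.69:/home/bricks/another2
echo $?                      # non-zero, and the CLI prints "unsuccessful"
gluster volume info another  # yet Type has already changed to Distribute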

Comment 4 shylesh 2012-05-24 05:28:05 UTC
The above failure was because the peers in the cluster were not in sync, so other operations were also failing; the actual issue tracked by this bug is fixed.
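For completeness, a sketch of how the peer-sync state can be checked before retrying such an operation (run on any node in the cluster):

gluster peer status
# Each healthy peer should show: State: Peer in Cluster (Connected)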

