Bug 815745

Summary: Volume type change: can't change a Distribute(N)-Replicate(2) to Distribute(N) through remove-brick
Product: [Community] GlusterFS
Component: glusterd
Version: pre-release
Reporter: shylesh <shmohan>
Assignee: Amar Tumballi <amarts>
QA Contact: shylesh <shmohan>
CC: amarts, gluster-bugs, vraman
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Hardware: x86_64
OS: Linux
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.4.0
Verified Versions: 3.3.0qa42
Last Closed: 2013-07-24 17:38:06 UTC
Bug Blocks: 817967

Description shylesh 2012-04-24 12:21:02 UTC
Description of problem:
Can't change the volume type from distribute-replicate to distribute through remove-brick

Version-Release number of selected component (if applicable):
Mainline

How reproducible:


Steps to Reproduce:
1. Create a Distribute(2)-Replicate(2) volume (a command sketch follows step 2):
Volume Name: rem
Type: Distributed-Replicate
Volume ID: 9b0b4f73-8553-411d-bd70-c7a776143b1e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/re1
Brick2: 10.16.157.66:/home/bricks/re2
Brick3: 10.16.157.63:/home/bricks/re3
Brick4: 10.16.157.66:/home/bricks/re4

2. [root@gqac023 mnt]# gluster volume remove-brick rem replica 1   10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
[root@gqac023 mnt]# echo $?
255
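
For completeness, a sketch of the reproduction sequence (the create/start commands are not shown in the original report, so the brick order is assumed to match the volume info listing in step 1):

# gluster volume create rem replica 2 \
      10.16.157.63:/home/bricks/re1 10.16.157.66:/home/bricks/re2 \
      10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re4
# gluster volume start rem
# gluster volume info rem
  (should report Type: Distributed-Replicate, Number of Bricks: 2 x 2 = 4)
# gluster volume remove-brick rem replica 1 \
      10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2
  (prompts about data loss, then exits with status 255 as shown above)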

Actual results:
log says
========
[2012-04-24 08:09:37.322993] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 10.16.157.63:/home/bricks/re3
[2012-04-24 08:09:37.323009] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-04-24 08:09:37.323019] I [glusterd-brick-ops.c:781:glusterd_handle_remove_brick] 0-management: failed to validate brick 10.16.157.63:/home/bricks/re3 (3 0 2)
[2012-04-24 08:09:37.323028] E [glusterd-brick-ops.c:833:glusterd_handle_remove_brick] 0-: Bricks are from same subvol

Comment 1 Amar Tumballi 2012-04-27 07:13:52 UTC
Shylesh,

Can you try

[root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
10.16.157.63:/home/bricks/re2 10.16.157.66:/home/bricks/re3

Instead of 

> [root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
> 10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2

[Notice the change in the order of bricks]

Comment 2 Anand Avati 2012-05-04 04:28:44 UTC
CHANGE: http://review.gluster.com/3235 (cli: fix remove-brick output behavior in failure cases) merged in master by Vijay Bellur (vijay)

Comment 3 shylesh 2012-05-24 04:41:10 UTC
Volume Name: another
Type: Distributed-Replicate
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.66:/home/bricks/another1
Brick3: 10.16.157.69:/home/bricks/another2
Brick4: 10.16.157.63:/home/bricks/another3



[root@gqac022 ~]# gluster v remove-brick another replica 1 10.16.157.66:/home/bricks/another1 10.16.157.69:/home/bricks/another2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick commit force unsuccessful


Volume Name: another
Type: Distribute
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.63:/home/bricks/another3


Though remove-brick reported unsuccessful, the configuration has actually changed.

[2012-05-24 04:35:54.540952] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542297] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542328] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542617] E [glusterd-volgen.c:2146:volgen_graph_build_clients] 0-: volume inconsistency: total number of bricks (6) is not divisible with number of bricks per cluster (4) in a multi-cluster setup
[2012-05-24 04:35:54.542652] E [glusterd-op-sm.c:2324:glusterd_op_ac_send_commit_op] 0-management: Commit failed
[2012-05-24 04:35:54.542664] I [glusterd-op-sm.c:2254:glusterd_op_modify_op_ctx] 0-management: op_ctx modification not required
[2012-05-24 04:35:54.543487] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 420dd6d2-44b7-4dfa-8133-48c0326995cd
[2012-05-24 04:35:54.543514] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 2c121e31-9551-4b76-b588-d1302cab6a68
[2012-05-24 04:35:54.543537] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: f8222994-1e66-49a3-966c-53dc012e8308
[2012-05-24 04:35:54.543548] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

Comment 4 shylesh 2012-05-24 05:28:05 UTC
The above failure was because the peers in the cluster were not in sync, so other operations were failing; the actual issue tracked by this bug is fixed.
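
A quick way to double-check that the peers are back in sync before re-running the test (a suggested check, not part of the original verification):

# gluster peer status
  (every peer should show State: Peer in Cluster (Connected))
# gluster volume info another
  (run on each peer; Type and Number of Bricks should match everywhere)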