Bug 815745

Summary: Volume type change: can't change a Distribute(N)-Replicate(2) to Distribute(N) through remove-brick
Product: [Community] GlusterFS
Component: glusterd
Version: pre-release
Reporter: shylesh <shmohan>
Assignee: Amar Tumballi <amarts>
QA Contact: shylesh <shmohan>
CC: amarts, gluster-bugs, vraman
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Hardware: x86_64
OS: Linux
Type: Bug
Doc Type: Bug Fix
Fixed In Version: glusterfs-3.4.0
Verified Versions: 3.3.0qa42
Last Closed: 2013-07-24 17:38:06 UTC
Bug Blocks: 817967

Description shylesh 2012-04-24 12:21:02 UTC
Description of problem:
Can't change the volume type from distribute-replicate to distribute through remove-brick

Version-Release number of selected component (if applicable):
Mainline

How reproducible:


Steps to Reproduce:
1. Create a Distribute(2)-Replicate(2) volume (a command sketch follows step 2):
Volume Name: rem
Type: Distributed-Replicate
Volume ID: 9b0b4f73-8553-411d-bd70-c7a776143b1e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/re1
Brick2: 10.16.157.66:/home/bricks/re2
Brick3: 10.16.157.63:/home/bricks/re3
Brick4: 10.16.157.66:/home/bricks/re4

2. [root@gqac023 mnt]# gluster volume remove-brick rem replica 1   10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
[root@gqac023 mnt]# echo $?
255
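
For completeness, a sketch of the reproduction sequence (the create/start commands are not shown in the original report, so the brick order is assumed to match the volume info listing in step 1):

# gluster volume create rem replica 2 \
      10.16.157.63:/home/bricks/re1 10.16.157.66:/home/bricks/re2 \
      10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re4
# gluster volume start rem
# gluster volume info rem
  (should report Type: Distributed-Replicate, Number of Bricks: 2 x 2 = 4)
# gluster volume remove-brick rem replica 1 \
      10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2
  (prompts about data loss, then exits with status 255 as shown above)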

Actual results:
log says
========
[2012-04-24 08:09:37.322993] I [glusterd-utils.c:855:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 10.16.157.63:/home/bricks/re3
[2012-04-24 08:09:37.323009] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-04-24 08:09:37.323019] I [glusterd-brick-ops.c:781:glusterd_handle_remove_brick] 0-management: failed to validate brick 10.16.157.63:/home/bricks/re3 (3 0 2)
[2012-04-24 08:09:37.323028] E [glusterd-brick-ops.c:833:glusterd_handle_remove_brick] 0-: Bricks are from same subvol

Comment 1 Amar Tumballi 2012-04-27 07:13:52 UTC
Shylesh,

Can you try

[root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
10.16.157.63:/home/bricks/re2 10.16.157.66:/home/bricks/re3

Instead of 

> [root@gqac023 mnt]# gluster volume remove-brick rem replica 1  
> 10.16.157.63:/home/bricks/re3 10.16.157.66:/home/bricks/re2

[Notice the change in the order of bricks]

Comment 2 Anand Avati 2012-05-04 04:28:44 UTC
CHANGE: http://review.gluster.com/3235 (cli: fix remove-brick output behavior in failure cases) merged in master by Vijay Bellur (vijay)

Comment 3 shylesh 2012-05-24 04:41:10 UTC
Volume Name: another
Type: Distributed-Replicate
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.66:/home/bricks/another1
Brick3: 10.16.157.69:/home/bricks/another2
Brick4: 10.16.157.63:/home/bricks/another3



[root@gqac022 ~]# gluster v remove-brick another replica 1 10.16.157.66:/home/bricks/another1 10.16.157.69:/home/bricks/another2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick commit force unsuccessful


Volume Name: another
Type: Distribute
Volume ID: eb78eeac-bd11-430a-98c3-c9cbe264f67e
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.16.157.63:/home/bricks/another0
Brick2: 10.16.157.63:/home/bricks/another3


Though remove-brick reported unsuccessful, the configuration has actually changed.

[2012-05-24 04:35:54.540952] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542297] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542328] I [glusterd-utils.c:812:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-05-24 04:35:54.542617] E [glusterd-volgen.c:2146:volgen_graph_build_clients] 0-: volume inconsistency: total number of bricks (6) is not divisible with number of bricks per cluster (4) in a multi-cluster setup
[2012-05-24 04:35:54.542652] E [glusterd-op-sm.c:2324:glusterd_op_ac_send_commit_op] 0-management: Commit failed
[2012-05-24 04:35:54.542664] I [glusterd-op-sm.c:2254:glusterd_op_modify_op_ctx] 0-management: op_ctx modification not required
[2012-05-24 04:35:54.543487] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 420dd6d2-44b7-4dfa-8133-48c0326995cd
[2012-05-24 04:35:54.543514] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 2c121e31-9551-4b76-b588-d1302cab6a68
[2012-05-24 04:35:54.543537] I [glusterd-rpc-ops.c:606:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: f8222994-1e66-49a3-966c-53dc012e8308
[2012-05-24 04:35:54.543548] I [glusterd-op-sm.c:2627:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

Comment 4 shylesh 2012-05-24 05:28:05 UTC
The above failure was because the peers in the cluster were not in sync, so other operations were failing; the actual issue tracked by this bug is fixed.
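
A quick way to double-check that the peers are back in sync before re-running the test (a suggested check, not part of the original verification):

# gluster peer status
  (every peer should show State: Peer in Cluster (Connected))
# gluster volume info another
  (run on each peer; Type and Number of Bricks should match everywhere)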