Description of problem: simultaneous remove-brick commands corrupt volumes.

Steps to Reproduce:

1. Set up a simple replicated volume across two nodes:
{code}
root@gluster1:~# gluster volume info
Volume Name: hosting-test
Type: Replicate
Volume ID: 0dcadde0-b981-472d-851a-08fbfff40ae3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster2.justindev:/export/brick1/sdb1
Brick2: gluster1.justindev:/export/brick1/sdb1
{code}

2. Add a third brick to the replica:
{code}
root@gluster2:~# gluster volume add-brick hosting-test replica 3 gluster1.justindev:/export/brick2/sdc1
Add Brick successful
root@gluster2:~# gluster volume info
Volume Name: hosting-test
Type: Replicate
Volume ID: 0dcadde0-b981-472d-851a-08fbfff40ae3
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster2.justindev:/export/brick1/sdb1
Brick2: gluster1.justindev:/export/brick1/sdb1
Brick3: gluster1.justindev:/export/brick2/sdc1
{code}

3. And now for the fun bit: remove the brick at the same time from both nodes (a scripted version of this race is sketched at the end of this report). One command will fail, but both nodes will report a healthy volume. Here's the node that wins:
{code}
root@gluster1:~# echo y | gluster volume remove-brick hosting-test replica 2 gluster1.justindev:/export/brick2/sdc1
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) Remove Brick commit force successful
root@gluster1:~# gluster volume info
Volume Name: hosting-test
Type: Replicate
Volume ID: 0dcadde0-b981-472d-851a-08fbfff40ae3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster2.justindev:/export/brick1/sdb1
Brick2: gluster1.justindev:/export/brick1/sdb1
{code}

And the node that fails:
{code}
root@gluster2:~# echo y | gluster volume remove-brick hosting-test replica 2 gluster1.justindev:/export/brick2/sdc1
Operation failed
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) root@gluster2:~#
root@gluster2:~# gluster volume info
Volume Name: hosting-test
Type: Replicate
Volume ID: 0dcadde0-b981-472d-851a-08fbfff40ae3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster2.justindev:/export/brick1/sdb1
Brick2: gluster1.justindev:/export/brick1/sdb1
{code}

4. Stop and start gluster on either node, and we get funky maths:
{code}
root@gluster2:~# service glusterfs-server stop
glusterfs-server stop/waiting
root@gluster2:~# service glusterfs-server start
glusterfs-server start/running, process 11739
root@gluster2:~# gluster volume info
Volume Name: hosting-test
Type: Replicate
Volume ID: f8d7132b-6bb1-40d4-8414-b2168cdf2cd7
Status: Started
Number of Bricks: 0 x 3 = 2
Transport-type: tcp
Bricks:
Brick1: gluster2.justindev:/export/brick1/sdb1
Brick2: gluster1.justindev:/export/brick1/sdb1
{code}

Actual results: after the restart, the volume reports an impossible brick count ("Number of Bricks: 0 x 3 = 2") and a Volume ID different from the one it was created with.

Expected results: the volume continues operating normally.

Additional info: Ubuntu 13.04, using the 3.3 packages from http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.2/Ubuntu.README
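For convenience, the simultaneous removal in step 3 can be driven from a single shell. This is a minimal sketch, not part of the original report: it assumes passwordless ssh as root to both nodes and reuses the volume, host, and brick names from the transcripts above.
{code}
#!/bin/bash
# Sketch: race two remove-brick commands against each other.
# Assumes passwordless ssh to root@gluster1 and root@gluster2,
# and the volume/brick names from the transcripts above.
VOL=hosting-test
BRICK=gluster1.justindev:/export/brick2/sdc1

# Fire both removals at (nearly) the same instant, one per node.
ssh root@gluster1 "echo y | gluster volume remove-brick $VOL replica 2 $BRICK" &
ssh root@gluster2 "echo y | gluster volume remove-brick $VOL replica 2 $BRICK" &
wait

# One command fails, yet both nodes claim a healthy 1 x 2 volume.
ssh root@gluster1 "gluster volume info $VOL"
ssh root@gluster2 "gluster volume info $VOL"
{code}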
This bug is worse than my initial description suggests. I can reproduce it, on both 3.3 and 3.4, with just these steps (a scripted version follows the list):
1. Create a simple replicated volume across two nodes, one brick on each node.
2. Add a third brick to the volume from one of the existing nodes.
3. Remove the brick.
4. Restart gluster.
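A minimal script for this simplified reproduction, as a sketch: it assumes the two nodes are already peer-probed, the brick directories already exist, and it reuses the illustrative host and brick names from the transcripts in the description.
{code}
#!/bin/bash
# Sketch: simplified reproduction, run from gluster1.
# Assumes gluster2 is already peer-probed and the brick dirs exist.
VOL=hosting-test

# 1. Create and start a simple 1 x 2 replicated volume.
gluster volume create $VOL replica 2 \
    gluster2.justindev:/export/brick1/sdb1 \
    gluster1.justindev:/export/brick1/sdb1
gluster volume start $VOL

# 2-3. Grow to replica 3, then immediately shrink back to replica 2.
gluster volume add-brick $VOL replica 3 gluster1.justindev:/export/brick2/sdc1
echo y | gluster volume remove-brick $VOL replica 2 gluster1.justindev:/export/brick2/sdc1

# 4. Restart gluster; 'volume info' now shows the broken brick count,
# e.g. "Number of Bricks: 0 x 3 = 2".
service glusterfs-server restart
gluster volume info $VOL
{code}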
*** This bug has been marked as a duplicate of bug 1002556 ***