Red Hat Bugzilla – Bug 1017020
Volume rebalance starts even if a few bricks are down in a distributed volume
Last modified: 2015-11-27 07:21:09 EST
Description of problem: GlusterFS starts the rebalance task even when a few bricks are down in a distributed volume. However, when you check the rebalance status, it is reported as failed on all nodes. The same happens for replicated/striped volumes when a complete set of bricks in a replica group is down, and similar behaviour can be seen in the 'Remove Brick Start' asynchronous task.
Start a rebalance on a distributed volume in which a few brick processes are down.
Steps to Reproduce:
1. Create a volume with 3 bricks
2. Kill one of the brick processes
3. Start a volume rebalance on the volume created (see the CLI sketch below)
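A minimal CLI sketch of the reproduction, assuming illustrative names (hosts server1-server3, brick paths /bricks/b*, volume dist-vol) that are not from the original report:

  # 1. Create and start a plain distributed volume with 3 bricks
  gluster volume create dist-vol server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/b3
  gluster volume start dist-vol

  # 2. Kill one of the brick processes; its PID is shown by volume status
  gluster volume status dist-vol
  kill <pid-of-one-brick-glusterfsd>

  # 3. The rebalance is accepted and returns a task ID instead of erroring out
  gluster volume rebalance dist-vol start

  # Checking the status then shows the task as failed on all nodes
  gluster volume rebalance dist-vol status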
The rebalance task starts with a task ID, but if you check the task/rebalance status, it is reported as failed on all nodes.
'Rebalance Start' should report an error without starting the rebalance task.
The same issue is reproducible in a replicated volume when all bricks of a replica group are down. Similar behaviour is observed in the remove-brick start (brick migration) use case; see the sketch below.
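The replica and remove-brick variants, sketched with the same illustrative names:

  # Distributed-replicated volume: kill both bricks of one replica pair,
  # i.e. the glusterfsd processes backing server1:/bricks/r1 and
  # server2:/bricks/r2, then start a rebalance
  gluster volume create repl-vol replica 2 server1:/bricks/r1 server2:/bricks/r2 server1:/bricks/r3 server2:/bricks/r4
  gluster volume start repl-vol
  gluster volume rebalance repl-vol start

  # Remove-brick start (brick migration) is likewise accepted with a brick down
  gluster volume remove-brick dist-vol server3:/bricks/b3 start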
I am raising this bug's priority to Urgent because the RHSC Corbett Rebalance feature would not be complete without it. I have discussed this with Vivek, and he is going to assign it to someone soon.
We need the fix in the U2 branch by 18th Oct, or by 21st Oct at the latest.
This is a bug and not an RFE. Making that change.