Bug 1017020

Summary: Volume rebalance starts even if few bricks are down in a distributed volume
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: distribute
Version: 2.1
Hardware: Unspecified
OS: Linux
Severity: high
Priority: medium
Status: CLOSED DEFERRED
Reporter: Ramesh N <rnachimu>
Assignee: Nithya Balachandran <nbalacha>
QA Contact: Anoop <annair>
CC: dpati, kaushal, rwheeler, spalai, ssaha, ssampat, vagarwal, vbellur
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Clones: 1286186 (view as bug list)
Last Closed: 2015-11-27 12:17:28 UTC
Type: Bug
Bug Depends On: 928646
Bug Blocks: 1286186

Description Ramesh N 2013-10-09 07:25:34 UTC
Description of problem: GlusterFS starts the rebalance task even when a few bricks are down in a distributed volume, but the rebalance status then reports failure on all the nodes. The same happens on replicated/striped volumes when a complete set of bricks in a replica group is down, and the same behaviour can be seen in the 'Remove Brick Start' asynchronous task.

How reproducible:
   Start rebalance on a distributed volume in which a few brick processes are down.

Steps to Reproduce:
1. Create a distributed volume with 3 bricks
2. Kill one of the brick processes
3. Start a rebalance on the volume
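
The steps above can be sketched as a CLI session (hostnames, the volume name 'distvol', and brick paths are hypothetical; any 3-brick distributed volume reproduces this). This is an illustration of the reproduction flow, not output captured from the affected build:

```shell
# Create and start a 3-brick distributed volume (hypothetical hosts/paths).
gluster volume create distvol server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/b3
gluster volume start distvol

# Find the PID of one brick process and kill it on the node hosting it.
gluster volume status distvol
kill -9 <brick-pid>

# Rebalance still starts and returns a task ID...
gluster volume rebalance distvol start

# ...but the status subsequently shows 'failed' on all nodes.
gluster volume rebalance distvol status
```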

Actual results:

   The rebalance task starts and returns a task ID, but checking the task/rebalance status shows it has failed on all the nodes.

Expected results:

   'Rebalance Start' should report an error without starting the rebalance task.

Additional info:

  The same is reproducible on a replicated volume when all bricks of a replica group are down. Similar behaviour is observed in the remove-brick start (brick migration) use case.

Comment 2 Dusmant 2013-10-10 08:54:18 UTC
I am moving this bug's priority to Urgent, because the RHSC Corbett Rebalance feature would not be complete without it. I have discussed with Vivek and he is going to assign it to someone soon.

We need the fix in the U2 branch by 18th Oct, or by 21st Oct at the latest.

Comment 3 Sayan Saha 2014-10-10 18:24:54 UTC
This is a bug and not an RFE. Making that change.