Bug 1035976

Summary: Rebalance starts on a volume even if one of the participating node's glusterd is down
Product: Red Hat Gluster Storage Reporter: Shubhendu Tripathi <shtripat>
Component: distribute Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED DEFERRED QA Contact: shylesh <shmohan>
Severity: high Docs Contact:
Priority: high    
Version: 2.1 CC: amukherj, nsathyan, sdharane, spalai, vagarwal, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1286157 1286159 (view as bug list) Environment:
Last Closed: 2015-11-27 12:10:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1035460, 1286157, 1286159    

Description Shubhendu Tripathi 2013-11-29 05:31:12 UTC
Description of problem:
Rebalance starts on a volume that has bricks on multiple nodes even when glusterd is down on one of those nodes. However, if rebalance status is then queried on the same volume, it shows the status as FAILED.

Version-Release number of selected component (if applicable):


How reproducible:
Occasionally.
Mostly it fails with the error "Rebalance cannot be started, one of the bricks is down".

Steps to Reproduce:
1. Create a two node cluster
2. Create a volume say vol1 with bricks on both the nodes
3. Bring the glusterd down on one of the nodes using "service glusterd stop"
4. Start rebalance on the volume vol1
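
The steps above can be sketched as a sequence of gluster CLI commands; the hostnames (node1, node2) and brick paths are placeholders, not taken from the report:

```shell
# Sketch of the reproduction; node1/node2 and /bricks paths are assumptions.
gluster peer probe node2                    # step 1: form a two-node cluster
gluster volume create vol1 \
    node1:/bricks/vol1 node2:/bricks/vol1   # step 2: bricks on both nodes
gluster volume start vol1
ssh node2 service glusterd stop             # step 3: stop glusterd on one node
gluster volume rebalance vol1 start         # step 4: sometimes reports success
gluster volume rebalance vol1 status        #         ...but status shows FAILED
```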

Actual results:
Sometimes the rebalance start succeeds, but the rebalance status then shows FAILED.

Expected results:
Rebalance start should always fail at the staging phase if glusterd is down on any of the participating nodes.

Additional info:

Comment 2 Vivek Agarwal 2013-12-10 10:03:12 UTC
Per bug triage discussion with Shanks and Dusmant, removing it from corbett

Comment 6 Kaushal 2014-06-02 06:53:22 UTC
The issue here is with the order of daemonizing and graph initialization in the glusterfsd process. The fetching of the volfiles and the graph initialization is done after the process is daemonized.
When glusterd starts the rebalance process, it returns after the process has daemonized, assuming (correctly) that the process started, and reports that starting the rebalance process was successful. But in this particular case the graph initialization of the rebalance process fails, as it cannot connect to the brick on the downed peer (this failure occurs even if just glusterd is down, because the brick port cannot be obtained by the client xlator). Since rebalance requires all DHT subvolumes to be online, the process kills itself. This leads to the rebalance status showing as failed almost immediately. This is similar to rebalance ending up in failed status when a peer goes down during rebalance.

This could be fixed in two ways:
1. Make sure that the rebalance process has correctly initialized its graph and is connected to all the bricks before returning success. This is quite hard to do, and probably requires new tooling to support this approach.

2. Check that all the peers and bricks involved are online during the staging of rebalance start. This is comparatively easier to do, as we already have a mechanism to check volume quorum thanks to volume snapshots. But this is also a big change involving significant code churn, as the volume-quorum framework as it currently exists is somewhat tied to snapshots.
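
Option 2 amounts to a pre-flight check before staging rebalance. A minimal sketch of that idea, outside of glusterd (this is not glusterd code): parse the XML that `gluster volume status <vol> --xml` emits and refuse to start rebalance unless every brick reports online. The XML layout in the sample below mirrors the cliOutput format but should be treated as illustrative.

```python
# Hypothetical pre-flight check, not actual glusterd code: refuse to start
# rebalance if any brick in the volume status reports offline.
import xml.etree.ElementTree as ET

def all_bricks_online(status_xml: str) -> bool:
    """Return True only if every <node> in the volume status is online.

    Assumes the cliOutput-style layout of `gluster volume status --xml`,
    where each brick appears as a <node> with a <status> of "1" (online)
    or "0" (offline). Treat the exact schema as an assumption.
    """
    root = ET.fromstring(status_xml)
    nodes = root.findall(".//volStatus/volumes/volume/node")
    if not nodes:
        return False  # no bricks reported at all: do not proceed
    return all(node.findtext("status") == "1" for node in nodes)

# Illustrative sample: one brick online, one offline (its peer's glusterd down)
SAMPLE = """<cliOutput>
  <volStatus><volumes><volume>
    <node><hostname>node1</hostname><path>/bricks/b1</path><status>1</status></node>
    <node><hostname>node2</hostname><path>/bricks/b2</path><status>0</status></node>
  </volume></volumes></volStatus>
</cliOutput>"""

if __name__ == "__main__":
    # One brick is down, so the check fails and rebalance would be refused.
    print(all_bricks_online(SAMPLE))  # False
```

The point of the sketch is only the ordering: the brick-liveness check happens before the rebalance daemon is ever spawned, which is what the staging-phase check in option 2 would give glusterd.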

Removing devel_ack and the Denali label as solving this issue for the Denali is not easy or straight forward.

Comment 9 Susant Kumar Palai 2015-11-27 12:10:25 UTC
Cloning this to 3.1, to be fixed in a future release.