Bug 1041857

Summary: rest api: rebalance causes hosts to become Non Operational after adding 4 bricks sequentially to a 6 node dist volume
Product: Red Hat Gluster Storage
Reporter: Dustin Tsang <dtsang>
Component: rhsc-sdk
Assignee: Shubhendu Tripathi <shtripat>
Status: CLOSED NOTABUG
QA Contact: Dustin Tsang <dtsang>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 2.1
CC: dtsang, knarra, mmahoney, mmccune, pprakash, rhs-bugs, ssampat
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-12-13 03:26:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
  automation log (flags: none)
  vdsm.log (flags: none)
  engine.log (flags: none)

Description Dustin Tsang 2013-12-12 19:44:27 UTC
Created attachment 836010 [details]
automation log

Description of problem:

rest api: rebalance causes hosts to become Non Operational after adding 4 bricks sequentially to a 6 node dist volume. 

* glusterd is down on each of the nodes
* vdsmd is up on each of the nodes
* the glusterd process terminates shortly after each attempt to start it


Version-Release number of selected component (if applicable):
rhsc-cb11

How reproducible:
100% of the time

Steps to Reproduce:
1. Set up a 2-node cluster.
2. Via REST, add a 6-brick distribute volume and start the volume.
3. Via REST, add 4 bricks to the volume, one at a time.
=> Hosts are up and the volume is fine.
4. Via REST or the GUI, start rebalance on the volume.
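The REST sequence above can be sketched as request descriptors. The endpoint paths and XML payload shapes below follow oVirt/RHSC REST conventions but are assumptions here (as are the cluster id and brick paths); the volume name is the one identified later in comment 7. Consult the installed API's /api entry point for the actual resource layout.

```python
CLUSTER = "c0ffee"  # hypothetical cluster id
VOLUME = "StartMigrationDuringRebalanceTest"  # volume name from comment 7


def create_volume_request(bricks):
    """Step 2: POST a distribute volume with the given (server_id, brick_dir) pairs."""
    brick_xml = "".join(
        f"<brick><server_id>{srv}</server_id><brick_dir>{d}</brick_dir></brick>"
        for srv, d in bricks
    )
    return ("POST",
            f"/api/clusters/{CLUSTER}/glustervolumes",
            f"<gluster_volume><name>{VOLUME}</name>"
            f"<volume_type>distribute</volume_type>"
            f"<bricks>{brick_xml}</bricks></gluster_volume>")


def add_brick_request(server_id, brick_dir):
    """Step 3: add a single brick to the running volume."""
    return ("POST",
            f"/api/clusters/{CLUSTER}/glustervolumes/{VOLUME}/bricks",
            f"<bricks><brick><server_id>{server_id}</server_id>"
            f"<brick_dir>{brick_dir}</brick_dir></brick></bricks>")


def rebalance_request():
    """Step 4: the action that returns HTTP 400 in this bug."""
    return ("POST",
            f"/api/clusters/{CLUSTER}/glustervolumes/{VOLUME}/rebalance",
            "<action/>")
```

Step 3 would issue add_brick_request() four times, one brick per call, before rebalance_request() is sent.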

Actual results:
Rebalance fails to start. Hosts are Non Operational. 

Response given:
HTTP 400 connection failed. please check if gluster daemon is operational.

Expected results:
Rebalance starts successfully.

Additional info:

Comment 1 Dustin Tsang 2013-12-12 19:47:33 UTC
Created attachment 836012 [details]
vdsm.log

Comment 2 Dustin Tsang 2013-12-12 19:51:00 UTC
Created attachment 836013 [details]
engine.log

Comment 4 Dustin Tsang 2013-12-12 22:04:10 UTC
adding all bricks at once for step 4 in comment#0 also results in hosts becoming Non Operational.

Comment 5 Dustin Tsang 2013-12-12 22:05:28 UTC
correction to comment#4 s/step 4/step 3/

Comment 6 Dustin Tsang 2013-12-12 23:38:57 UTC
Step 3 doesn't need to run to cause the hosts to go Non Operational.

Comment 7 Dustin Tsang 2013-12-13 03:26:54 UTC
The issue seems to be with the volume name "StartMigrationDuringRebalanceTest".
Reopening this bug as a glusterfs bug.