Bug 1567173 - Able to start rebalance even when the newly added brick is down
Summary: Able to start rebalance even when the newly added brick is down
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Nithya Balachandran
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-13 14:02 UTC by Vijay Avuthu
Modified: 2018-04-18 04:24 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-18 04:24:23 UTC
Embargoed:



Description Vijay Avuthu 2018-04-13 14:02:48 UTC
Description of problem:

Rebalance can be started (without the force option) even when a newly added brick is down.

Version-Release number of selected component (if applicable):

glusterfs-server-3.12.2-7.el7rhgs.x86_64

How reproducible:

Always

Steps to Reproduce:
1) Create a 2 x 3 (distributed-replicate) volume and start it
2) Run I/O from the client
3) Add new bricks to the volume
4) Kill one of the newly added bricks
5) Start rebalance without the force option (command sketch below)
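
The steps map to roughly the following commands. This is only a sketch: the volume name (testvol), server names (srv1-srv6), and brick paths are placeholders, not the ones used in this report.

# gluster volume create testvol replica 3 \
      srv1:/bricks/brick0/b0 srv2:/bricks/brick0/b1 srv3:/bricks/brick0/b2 \
      srv4:/bricks/brick0/b3 srv5:/bricks/brick0/b4 srv6:/bricks/brick0/b5
# gluster volume start testvol
(run application I/O from a client mount of testvol)
# gluster volume add-brick testvol \
      srv4:/bricks/brick3/b9 srv5:/bricks/brick3/b10 srv6:/bricks/brick3/b11
# gluster volume status testvol          # note the PID of one newly added brick
# kill <pid-of-new-brick>                # that brick now shows Online "N" in status
# gluster volume rebalance testvol start

With one of the new bricks down, the rebalance start above should be rejected (as it is on the older build shown further below), but on glusterfs-server-3.12.2-7 it succeeds.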

Actual results:

Rebalance starts successfully instead of failing:

# gluster vol rebalance 23 start
volume rebalance: 23: success: Rebalance on 23 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: d4860d83-2959-41c3-8806-44552702d30f
# 

Expected results:

Rebalance should fail to start when a newly added brick is down.

Additional info:

> Output of volume status after the newly added brick is down:

# gluster vol status 23
Status of volume: 23
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.61:/bricks/brick0/testvol_di
stributed-replicated_brick0                 49153     0          Y       18275
Brick 10.70.35.174:/bricks/brick0/testvol_d
istributed-replicated_brick1                49153     0          Y       1468 
Brick 10.70.35.17:/bricks/brick0/testvol_di
stributed-replicated_brick2                 49153     0          Y       7750 
Brick 10.70.35.163:/bricks/brick0/testvol_d
istributed-replicated_brick3                49153     0          Y       9256 
Brick 10.70.35.136:/bricks/brick0/testvol_d
istributed-replicated_brick4                49153     0          Y       30581
Brick 10.70.35.214:/bricks/brick0/testvol_d
istributed-replicated_brick5                49153     0          Y       7817 
Brick 10.70.35.163:/bricks/brick3/testvol_d
istributed-replicated_brick9                N/A       N/A        N       N/A  
Brick 10.70.35.136:/bricks/brick3/testvol_d
istributed-replicated_brick10               49153     0          Y       30581
Brick 10.70.35.214:/bricks/brick3/testvol_d
istributed-replicated_brick11               49153     0          Y       7817 
Self-heal Daemon on localhost               N/A       N/A        Y       18429
Self-heal Daemon on dhcp35-163.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       9376 
Self-heal Daemon on dhcp35-174.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       1585 
Self-heal Daemon on dhcp35-17.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       7862 
Self-heal Daemon on dhcp35-136.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       30706
Self-heal Daemon on dhcp35-214.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       7942 
 
Task Status of Volume 23
------------------------------------------------------------------------------
There are no active volume tasks
 
# 


# gluster vol rebalance 23 start
volume rebalance: 23: success: Rebalance on 23 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: d4860d83-2959-41c3-8806-44552702d30f
# 

> I executed the same scenario on an older build (glusterfs-server-3.8.4-54.4.el7rhgs.x86_64); the output of the same commands is below, and rebalance correctly fails to start there:

# gluster vol status oldbuild
Status of volume: oldbuild
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gqas013.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b0                           49164     0          Y       24972
Brick gqas016.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b1                           49164     0          Y       24275
Brick gqas006.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b2                           49164     0          Y       23737
Brick gqas008.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b3                           49164     0          Y       24210
Brick gqas007.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b4                           49164     0          Y       23448
Brick gqas003.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b5                           49164     0          Y       31227
Brick gqas008.sbu.lab.eng.bos.redhat.com:/g
luster/brick11/b6                           N/A       N/A        N       N/A  
Brick gqas007.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b7                           49165     0          Y       23642
Brick gqas003.sbu.lab.eng.bos.redhat.com:/g
luster/brick10/b8                           49165     0          Y       31415
Self-heal Daemon on localhost               N/A       N/A        Y       25229
Self-heal Daemon on gqas016.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       24468
Self-heal Daemon on gqas006.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23940
Self-heal Daemon on gqas008.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       24437
Self-heal Daemon on gqas003.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       31444
Self-heal Daemon on gqas007.sbu.lab.eng.bos
.redhat.com                                 N/A       N/A        Y       23671
 
Task Status of Volume oldbuild
------------------------------------------------------------------------------
There are no active volume tasks
 
# 

# gluster vol rebalance oldbuild start
volume rebalance: oldbuild: failed: Received rebalance on volume with  stopped brick /gluster/brick11/b6
#

