Bug 1232717

Summary: Rebalance status changes from 'completed' to 'not started' when replace-brick volume operation is performed
Product: [Community] GlusterFS
Reporter: Sakshi <sabansal>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: mainline
CC: amukherj, bugs, nbalacha, nlevinki, smohan, spandura, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-4.1.3 (or later)
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1057553
Environment:
Last Closed: 2018-08-29 03:35:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:

Description Sakshi 2015-06-17 11:26:59 UTC
+++ This bug was initially created as a clone of Bug #1057553 +++

Description of problem:
============================
When a rebalance on a volume completes successfully, "gluster volume status <volume_name>" reports the Rebalance task as "completed".

If we then perform a "replace-brick commit force" on a brick of the same volume, the status of the already completed rebalance task changes to "not started".

root@domU-12-31-39-0A-99-B2 [Jan-23-2014- 4:19:15] >gluster v status exporter
Status of volume: exporter
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick domU-12-31-39-0A-99-B2.compute-1.internal:/rhs/br
icks/exporter						49152	Y	19405
Brick ip-10-194-111-63.ec2.internal:/rhs/bricks/exporte
r							49152	Y	3812
Brick ip-10-234-21-235.ec2.internal:/rhs/bricks/exporte
r							49152	Y	20226
Brick ip-10-2-34-53.ec2.internal:/rhs/bricks/exporter	49152	Y	20910
Brick ip-10-83-5-197.ec2.internal:/rhs/bricks/exporter	49152	Y	8705
Brick ip-10-159-26-108.ec2.internal:/rhs/bricks/exporte
r							49152	Y	20196
Brick domU-12-31-39-07-74-A5.compute-1.internal:/rhs/br
icks/exporter						49152	Y	6553
Brick ip-10-62-118-194.ec2.internal:/rhs/bricks/exporte
r							49152	Y	6391
Brick ip-10-181-128-26.ec2.internal:/rhs/bricks/exporte
r							49152	Y	8569
Brick domU-12-31-39-0B-DC-01.compute-1.internal:/rhs/br
icks/exporter						49152	Y	7145
Brick ip-10-34-105-112.ec2.internal:/rhs/bricks/exporte
r							49152	Y	7123
Brick ip-10-29-156-183.ec2.internal:/rhs/bricks/exporte
r							49152	Y	7103
NFS Server on localhost					2049	Y	15111
Self-heal Daemon on localhost				N/A	Y	15118
NFS Server on ip-10-234-21-235.ec2.internal		2049	Y	13597
Self-heal Daemon on ip-10-234-21-235.ec2.internal	N/A	Y	13604
NFS Server on ip-10-159-26-108.ec2.internal		2049	Y	16799
Self-heal Daemon on ip-10-159-26-108.ec2.internal	N/A	Y	16806
NFS Server on ip-10-2-34-53.ec2.internal		2049	Y	18988
Self-heal Daemon on ip-10-2-34-53.ec2.internal		N/A	Y	18995
NFS Server on domU-12-31-39-0B-DC-01.compute-1.internal	2049	Y	8440
Self-heal Daemon on domU-12-31-39-0B-DC-01.compute-1.in
ternal							N/A	Y	8447
NFS Server on ip-10-29-156-183.ec2.internal		2049	Y	8384
Self-heal Daemon on ip-10-29-156-183.ec2.internal	N/A	Y	8391
NFS Server on ip-10-194-111-63.ec2.internal		2049	Y	3823
Self-heal Daemon on ip-10-194-111-63.ec2.internal	N/A	Y	3829
NFS Server on ip-10-62-118-194.ec2.internal		2049	Y	21600
Self-heal Daemon on ip-10-62-118-194.ec2.internal	N/A	Y	21607
NFS Server on ip-10-181-128-26.ec2.internal		2049	Y	17712
Self-heal Daemon on ip-10-181-128-26.ec2.internal	N/A	Y	17719
NFS Server on ip-10-34-105-112.ec2.internal		2049	Y	9538
Self-heal Daemon on ip-10-34-105-112.ec2.internal	N/A	Y	9545
NFS Server on domU-12-31-39-07-74-A5.compute-1.internal	2049	Y	19344
Self-heal Daemon on domU-12-31-39-07-74-A5.compute-1.in
ternal							N/A	Y	19351
NFS Server on ip-10-83-5-197.ec2.internal		2049	Y	16001
Self-heal Daemon on ip-10-83-5-197.ec2.internal		N/A	Y	16008
 
Task Status of Volume exporter
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 1c08ae18-0458-4a63-9ba8-f7484b3bc3ff
Status               : completed 


root@domU-12-31-39-0A-99-B2 [Jan-24-2014- 4:16:44] >gluster volume replace-brick exporter ip-10-234-21-235.ec2.internal:/rhs/bricks/exporter ip-10-182-165-181.ec2.internal:/rhs/bricks/exporter commit force ; gluster volume replace-brick exporter ip-10-2-34-53.ec2.internal:/rhs/bricks/exporter ip-10-46-226-179.ec2.internal:/rhs/bricks/exporter commit force ; gluster volume replace-brick exporter ip-10-62-118-194.ec2.internal:/rhs/bricks/exporter ip-10-80-109-233.ec2.internal:/rhs/bricks/exporter commit force ; gluster volume replace-brick exporter ip-10-29-156-183.ec2.internal:/rhs/bricks/exporter ip-10-232-7-75.ec2.internal:/rhs/bricks/exporter commit force 


root@domU-12-31-39-0A-99-B2 [Jan-24-2014-10:01:25] >gluster v status exporter
Status of volume: exporter
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick domU-12-31-39-0A-99-B2.compute-1.internal:/rhs/br
icks/exporter						49152	Y	19405
Brick ip-10-194-111-63.ec2.internal:/rhs/bricks/exporte
r							49152	Y	3812
Brick ip-10-182-165-181.ec2.internal:/rhs/bricks/export
er							49152	Y	3954
Brick ip-10-46-226-179.ec2.internal:/rhs/bricks/exporte
r							49152	Y	3933
Brick ip-10-83-5-197.ec2.internal:/rhs/bricks/exporter	49152	Y	8705
Brick ip-10-159-26-108.ec2.internal:/rhs/bricks/exporte
r							49152	Y	20196
Brick domU-12-31-39-07-74-A5.compute-1.internal:/rhs/br
icks/exporter						49152	Y	6553
Brick ip-10-80-109-233.ec2.internal:/rhs/bricks/exporte
r							49152	Y	6450
Brick ip-10-181-128-26.ec2.internal:/rhs/bricks/exporte
r							49152	Y	8569
Brick domU-12-31-39-0B-DC-01.compute-1.internal:/rhs/br
icks/exporter						49152	Y	7145
Brick ip-10-34-105-112.ec2.internal:/rhs/bricks/exporte
r							49152	Y	7123
Brick ip-10-232-7-75.ec2.internal:/rhs/bricks/exporter	49152	Y	3935
NFS Server on localhost					2049	Y	2736
Self-heal Daemon on localhost				N/A	Y	2741
NFS Server on ip-10-34-105-112.ec2.internal		2049	Y	29815
Self-heal Daemon on ip-10-34-105-112.ec2.internal	N/A	Y	29821
NFS Server on ip-10-182-165-181.ec2.internal		2049	Y	4242
Self-heal Daemon on ip-10-182-165-181.ec2.internal	N/A	Y	4249
NFS Server on ip-10-232-7-75.ec2.internal		2049	Y	4129
Self-heal Daemon on ip-10-232-7-75.ec2.internal		N/A	Y	4139
NFS Server on ip-10-194-111-63.ec2.internal		2049	Y	23275
Self-heal Daemon on ip-10-194-111-63.ec2.internal	N/A	Y	23280
NFS Server on ip-10-80-109-233.ec2.internal		2049	Y	6689
Self-heal Daemon on ip-10-80-109-233.ec2.internal	N/A	Y	6695
NFS Server on domU-12-31-39-0B-DC-01.compute-1.internal	2049	Y	30069
Self-heal Daemon on domU-12-31-39-0B-DC-01.compute-1.in
ternal							N/A	Y	30075
NFS Server on ip-10-159-26-108.ec2.internal		2049	Y	7495
Self-heal Daemon on ip-10-159-26-108.ec2.internal	N/A	Y	7501
NFS Server on ip-10-83-5-197.ec2.internal		2049	Y	7996
Self-heal Daemon on ip-10-83-5-197.ec2.internal		N/A	Y	8001
NFS Server on domU-12-31-39-07-74-A5.compute-1.internal	2049	Y	6137
Self-heal Daemon on domU-12-31-39-07-74-A5.compute-1.in
ternal							N/A	Y	6143
NFS Server on ip-10-181-128-26.ec2.internal		2049	Y	4389
Self-heal Daemon on ip-10-181-128-26.ec2.internal	N/A	Y	4394
NFS Server on ip-10-46-226-179.ec2.internal		2049	Y	4186
Self-heal Daemon on ip-10-46-226-179.ec2.internal	N/A	Y	4192
 
Task Status of Volume exporter
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 1c08ae18-0458-4a63-9ba8-f7484b3bc3ff
Status               : not started        

Version-Release number of selected component (if applicable):
===================================================================
glusterfs 3.4.0.57rhs built on Jan 13 2014 06:59:05

How reproducible:
=================
Often

Steps to Reproduce:
======================
1. Create a distribute-replicate volume (2 x 3).

2. Create a fuse mount and add files/dirs from the mount point.

3. Add new bricks to the volume, changing the volume type to 3 x 3.

4. Start rebalance and wait for it to complete.

5. Check "gluster volume status" to confirm the rebalance task is reported as completed.

6. Bring down a brick.

7. Replace the offline brick with a new brick using replace-brick "commit force".

8. Check "gluster volume status" again (a consolidated shell sketch of these steps follows the list).

Actual results:
=================
The rebalance status is reset from "completed" to "not started".

Expected results:
===================
The rebalance status for a given task ID should not change once the rebalance task is complete; more generally, the rebalance status should not be modified by any operation that does not involve rebalance. (A small shell check for this is sketched below.)
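
A quick way to check for this behaviour on a given build (a sketch only; "testvol" and the brick paths are placeholders) is to capture the task-status section before and after the replace-brick and compare:

gluster volume status testvol | grep -A 4 "Task Status" > /tmp/task_before
gluster volume replace-brick testvol server1:/bricks/old server4:/bricks/new commit force
gluster volume status testvol | grep -A 4 "Task Status" > /tmp/task_after
# any difference (in particular the Status line flipping to "not started")
# indicates the regression described above
diff /tmp/task_before /tmp/task_after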

Comment 1 Anand Avati 2015-06-17 11:33:00 UTC
REVIEW: http://review.gluster.org/11277 (dht: retaining rebalance status after replace-brick) posted (#1) for review on master by Sakshi Bansal (sabansal)

Comment 2 Mike McCune 2016-03-28 23:31:34 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 5 Amar Tumballi 2018-08-29 03:35:58 UTC
This update is done in bulk based on the state of the patch and the time since last activity. If the issue is still seen, please reopen the bug.