1006247 – Rebalance Status message not showing correct status.

Summary: Rebalance Status message not showing correct status.

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Sub Component:
Version:	3.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Kaushal
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	982104

Reported:	2013-09-10 10:02 UTC by Kaushal
Modified:	2014-04-17 11:47 UTC
CC List:	6 users
Fixed In Version:	glusterfs-3.5.0
Clone Of:	982104
Environment:
Last Closed:	2014-04-17 11:47:36 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Kaushal 2013-09-10 10:02:28 UTC

+++ This bug was initially created as a clone of Bug #982104 +++

Description of problem:
======================== 
After running a rebalance operation on the volume, add some more bricks and check the rebalance status. It shows 'not started' but still lists the files that were rebalanced in the previous operation.


Version-Release number of selected component (if applicable):
============================================================ 
3.4.0.12rhs.beta3-1.el6rhs.x86_64


How reproducible:

Steps to Reproduce:
=================== 
1. Create a distributed volume

2. Add 2 bricks and start rebalance

3. Check rebalance status

gluster v rebalance vol_11 status
Node          Rebalanced-files   size      scanned   failures   status        run time in secs

localhost     28                 280.0MB   305       0          completed     9.00
10.70.34.85   26                 260.0MB   278       0          completed     9.00
10.70.34.86   40                 400.0MB   344       0          completed     10.00

4. Add 2 more bricks 

5. Check rebalance status (without starting another rebalance operation)

gluster v rebalance vol_11 status
Node          Rebalanced-files   size      scanned   failures   status        run time in secs

localhost     28                 280.0MB   305       0          not started   9.00
10.70.34.85   26                 260.0MB   278       0          not started   9.00
10.70.34.86   40                 400.0MB   344       0          not started   10.00

Actual results:
==============
The status shows 'not started', whereas the rebalanced-files column shows the number of files rebalanced in the previous operation.

Expected results:
================ 
If the status shows 'not started', then the other fields (rebalanced files, size, scanned and run time) should show '0'.

Ideally, if a new rebalance operation has not been started, the status should still show the status of the previous rebalance operation.
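
The first expectation above can be checked mechanically. The following is only a rough sketch in Python, assuming the column layout shown in the 'rebalance status' output in this report (the layout may differ in other releases). The function name inconsistent_rows and the regular expression are made up for illustration and are not part of gluster.

```python
import re

# One data row of 'gluster v rebalance <vol> status', as laid out in this report:
# node, rebalanced files, size (number + unit), scanned, failures, status, run time.
# Assumption: the status column may contain spaces ("not started").
ROW = re.compile(
    r"^(?P<node>\S+)\s+(?P<files>\d+)\s+(?P<size>\d+(?:\.\d+)?)[A-Za-z]+\s+"
    r"(?P<scanned>\d+)\s+(?P<failures>\d+)\s+(?P<status>.+?)\s+"
    r"(?P<runtime>\d+(?:\.\d+)?)$"
)


def inconsistent_rows(output):
    """Return nodes that report 'not started' together with non-zero counters."""
    bad = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, blank line, or a layout this sketch does not know
        if match["status"] != "not started":
            continue
        counters = (
            int(match["files"]),
            float(match["size"]),
            int(match["scanned"]),
            float(match["runtime"]),
        )
        if any(counters):
            bad.append(match["node"])
    return bad
```

Fed the output from step 5, this would list all three nodes; fed the output from step 3, it would return an empty list, since every row there is 'completed'.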

Additional info:

--- Additional comment from Sahina Bose on 2013-08-28 11:40:12 IST ---

Another related issue:

When a gluster rebalance has completed and a brick is added to the volume afterwards, 'gluster volume status all' gives incorrect output:

[root@localhost ~]# gluster volume status all
Status of volume: dv1
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.42.152:/brcks/dvb1				49156	Y	11341
Brick 10.70.42.152:/brcks/dvb2				49157	Y	11351
Brick 10.70.42.152:/brcks/dvb3				49158	Y	27642
NFS Server on localhost					2049	Y	27652
 
           Task                                      ID         Status
           ----                                      --         ------
      Rebalance    4a96c34d-fe5e-48b3-b349-80621abb85f3              0


Here, the task ID belongs to the previous rebalance operation.
Its status is incorrectly reported as "Not Started" (shown as 0 in the output above).

This affects the task monitoring in RHSC. Can this be fixed, please?

Comment 1 Anand Avati 2013-09-11 07:28:17 UTC

REVIEW: http://review.gluster.org/5895 (glusterd: Reset rebalance task on add-brick) posted (#1) for review on master by Kaushal M (kaushal)

Comment 2 Anand Avati 2013-09-16 04:06:07 UTC

REVIEW: http://review.gluster.org/5895 (glusterd: Don't reset rebalance status on add-brick) posted (#2) for review on master by Kaushal M (kaushal)

Comment 3 Anand Avati 2013-09-18 05:00:45 UTC

REVIEW: http://review.gluster.org/5895 (glusterd: Don't reset rebalance status on add-brick) posted (#3) for review on master by Kaushal M (kaushal)

Comment 4 Anand Avati 2013-09-18 15:59:14 UTC

COMMIT: http://review.gluster.org/5895 committed in master by Anand Avati (avati) 
------
commit 67c28b19355c47e96d1420405cc38753a3e5f9be
Author: Kaushal M <kaushal>
Date:   Tue Sep 10 15:33:00 2013 +0530

    glusterd: Don't reset rebalance status on add-brick
    
    The rebalance status was being reset to 'Not started' when add-brick was
    performed. This would lead to odd cases where a 'rebalance status' on a
    volume would show status as 'not started' but would also include the
    rebalance statistics. This also affected the showing of asynchronus task
    status in 'volume status' command.
    
    By not resetting the status prevent the above issues from happening.
    Since we use the running/not-running of the rebalance process as the
    check when performing other operations we can safely leave the rebalance
    stats collected on an add-brick.
    
    Change-Id: I4c69d9c789d081c6de7e7a81dd0d4eba2e83ec17
    BUG: 1006247
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/5895
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Gluster Build System <jenkins.com>
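
The behaviour change described in the commit message above can be modelled in a few lines. This is a toy Python model only, not glusterd's actual C code: the classes RebalanceInfo and Volume, the functions add_brick_old, add_brick_fixed and can_start_rebalance, and the field names are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RebalanceInfo:
    # Hypothetical stand-in for the per-volume rebalance state.
    status: str = "not started"
    files: int = 0
    scanned: int = 0


@dataclass
class Volume:
    bricks: list = field(default_factory=list)
    rebalance: RebalanceInfo = field(default_factory=RebalanceInfo)
    rebalance_running: bool = False  # is a rebalance process currently alive?


def add_brick_old(volume, brick):
    """Behaviour reported in this bug: the status is reset, the stats are kept."""
    volume.bricks.append(brick)
    volume.rebalance.status = "not started"


def add_brick_fixed(volume, brick):
    """Behaviour after the fix: add-brick leaves the rebalance state untouched."""
    volume.bricks.append(brick)


def can_start_rebalance(volume):
    """Other operations check whether the process is running, not the stored status."""
    return not volume.rebalance_running
```

In this model, calling add_brick_old on a volume whose rebalance has completed leaves status 'not started' next to a non-zero file count, which is the mismatch reported here. add_brick_fixed keeps 'completed', and can_start_rebalance still permits a new rebalance because it only looks at whether a process is running.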

Comment 5 Anand Avati 2013-09-19 10:44:18 UTC

REVIEW: http://review.gluster.org/5971 (glusterd: Don't reset rebalance status on add-brick) posted (#1) for review on release-3.4 by Kaushal M (kaushal)

Comment 6 Anand Avati 2013-09-19 21:08:00 UTC

COMMIT: http://review.gluster.org/5971 committed in release-3.4 by Anand Avati (avati) 
------
commit ac92dccc8727acaa3c9e9353fba80817947552bf
Author: Kaushal M <kaushal>
Date:   Tue Sep 10 15:33:00 2013 +0530

    glusterd: Don't reset rebalance status on add-brick
    
     Backport of 67c28b19355c47e96d1420405cc38753a3e5f9be from master
    
    The rebalance status was being reset to 'Not started' when add-brick was
    performed. This would lead to odd cases where a 'rebalance status' on a
    volume would show status as 'not started' but would also include the
    rebalance statistics. This also affected the showing of asynchronus task
    status in 'volume status' command.
    
    By not resetting the status prevent the above issues from happening.
    Since we use the running/not-running of the rebalance process as the
    check when performing other operations we can safely leave the rebalance
    stats collected on an add-brick.
    
    BUG: 1006247
    Change-Id: Idade88d9e5a6f27659490b3e6d85495d426ef0a3
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/5971
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Comment 7 Niels de Vos 2014-04-17 11:47:36 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
