1457731 – [Scale] : Rebalance ETA (towards the end) may be inaccurate, even on a moderately large data set.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	distribute
Sub Component:
Version:	rhgs-3.3
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.3.0
Assignee:	Nithya Balachandran
QA Contact:	Ambarish
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1417151 1464110

Reported:	2017-06-01 08:09 UTC by Ambarish
Modified:	2017-09-21 04:58 UTC
CC List:	6 users
Fixed In Version:	glusterfs-3.8.4-31
Doc Text:
Clone Of:
Clones:	1464110
Environment:
Last Closed:	2017-09-21 04:45:37 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:2774	0	normal	SHIPPED_LIVE	glusterfs bug fix and enhancement update	2017-09-21 08:16:29 UTC

Description Ambarish 2017-06-01 08:09:17 UTC

Description:
------------
 
Added bricks to a distributed-replicate volume, then ran rebalance.
 
These are the rebalance ETAs at different intervals :
 
[T4 > T3 > T2 > T1]
 
**At time T1**
 
 
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            63949         9.8GB        295287             0             0          in progress        0:34:57
      gqas015.sbu.lab.eng.bos.redhat.com            64644         9.9GB        300745             0             0          in progress        0:34:57
Estimated time left for rebalance to complete :        0:00:38
volume rebalance: butcher: success
 
 
**At time T2**
 
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            64010         9.8GB        295597             0             0          in progress        0:34:58
      gqas015.sbu.lab.eng.bos.redhat.com            64705         9.9GB        300918             0             0          in progress        0:34:58
Estimated time left for rebalance to complete :        0:01:09
 
 
**At time T3**
 
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            68057        10.0GB        313569             0             0          in progress        0:36:46
      gqas015.sbu.lab.eng.bos.redhat.com            68904        10.2GB        319823             0             0          in progress        0:36:46
Estimated time left for rebalance to complete :        0:00:09
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            68110        10.0GB        313882             0             0          in progress        0:36:48
      gqas015.sbu.lab.eng.bos.redhat.com            68958        10.2GB        319948             0             0          in progress        0:36:48
Estimated time left for rebalance to complete :        0:01:10
volume rebalance: butcher: success
 
 
 
**At time T4** (when it finally completed)
 
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            74885       104.4GB        345001             0             0            completed        1:12:32
      gqas015.sbu.lab.eng.bos.redhat.com            74658        10.5GB        345747             0             0            completed        0:39:54
volume rebalance: butcher: success
 
 
 
So at interval T1, the reported ETA for completion is 38 seconds.

At T2, one second of run time later, it suddenly increased to slightly more than a minute (1:09).

The same thing happens at T3: the ETA goes from 0:00:09 to 0:01:10 within two seconds of run time.

So, basically, the ETA keeps looping: it starts at about 1:10, counts down to 0, and then starts again at 1:10.

This continued for about another half an hour, after which the rebalance finally completed (compare the run time column across the intervals).
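
The drift is easier to see when each sample is converted into a projected finish time (run time plus reported ETA). The sketch below does only that arithmetic for the four samples quoted above. The helper names `to_seconds` and `projected_finish` are made up for this illustration and are not part of the gluster CLI.

```python
def to_seconds(hms):
    """Convert an 'h:m:s' string, as printed by rebalance status, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (run time, reported ETA) pairs copied from the status outputs at T1, T2 and T3.
SAMPLES = [
    ("0:34:57", "0:00:38"),  # T1
    ("0:34:58", "0:01:09"),  # T2
    ("0:36:46", "0:00:09"),  # T3, first status call
    ("0:36:48", "0:01:10"),  # T3, second status call
]

# Run time of the slower node (localhost) in the T4 output.
ACTUAL_FINISH = to_seconds("1:12:32")


def projected_finish(samples):
    """Return run time + ETA, in seconds, for every sample."""
    return [to_seconds(run) + to_seconds(eta) for run, eta in samples]


if __name__ == "__main__":
    for (run, eta), finish in zip(SAMPLES, projected_finish(SAMPLES)):
        print(f"run time {run}, ETA {eta}: projected finish at {finish}s, "
              f"{ACTUAL_FINISH - finish}s short of the actual {ACTUAL_FINISH}s")
```

Every projection lands between 2135 s and 2278 s of run time, while the slower node actually finished at 4352 s.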
 
 
##NUM_FILES##
[root@gqac011 gluster-mount]# find . -mindepth 1 -type f | wc -l
 
352120
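
For comparison, a rough estimate based on the file count alone can be sketched as follows. This is not how GlusterFS computes its ETA. It assumes, purely for illustration, that each node scans about as many files as the mount reports and that the scan rate stays constant, and it ignores the amount of data being migrated. The function name `naive_eta_seconds` is hypothetical.

```python
def naive_eta_seconds(scanned, elapsed_seconds, total_files):
    """Extrapolate remaining time from the average scan rate so far.

    Illustrative assumption only: constant scan rate, and total_files is
    the number of files the node will eventually scan.
    """
    rate = scanned / elapsed_seconds          # files scanned per second
    remaining = max(total_files - scanned, 0)  # files still to scan
    return remaining / rate


# localhost row at T1: 295287 files scanned after 0:34:57 (2097 seconds).
T1_SCANNED = 295287
T1_ELAPSED = 34 * 60 + 57
TOTAL_FILES = 352120  # from the find | wc -l output above

if __name__ == "__main__":
    eta = naive_eta_seconds(T1_SCANNED, T1_ELAPSED, TOTAL_FILES)
    print(f"naive ETA at T1: about {eta:.0f} seconds")
```

For the localhost row at T1 this gives roughly 400 seconds, against a reported ETA of 38 seconds.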

Comment 6 Atin Mukherjee 2017-06-22 13:14:24 UTC

Upstream patch: https://review.gluster.org/#/c/17607/

Comment 18 errata-xmlrpc 2017-09-21 04:45:37 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
