1475192 – [Scale] : Rebalance ETA shows the initial estimate to be ~140 days,finishes within 18 hours though.

Bug 1475192 - [Scale] : Rebalance ETA shows the initial estimate to be ~140 days,finishes within 18 hours though.


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	distribute
Sub Component:
Version:	3.12
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Nithya Balachandran
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	1460936 1467209
Blocks:

Reported:	2017-07-26 08:01 UTC by Nithya Balachandran
Modified:	2017-09-05 17:37 UTC
CC List:	9 users
Fixed In Version:	glusterfs-3.12.0
Clone Of:	1467209
Environment:
Last Closed:	2017-09-05 17:37:29 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:


Description Nithya Balachandran 2017-07-26 08:01:44 UTC

+++ This bug was initially created as a clone of Bug #1467209 +++

+++ This bug was initially created as a clone of Bug #1460936 +++

Description of problem:
-----------------------

This is slightly different than https://bugzilla.redhat.com/show_bug.cgi?id=1457731.

Rebalance ETA showed the initial estimate to be ~140 days at one point :

[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             3899        40.0GB          8162             0             0          in progress        0:33:53
                                 server1                6       150.0GB           508             0             0          in progress        0:33:53
Estimated time left for rebalance to complete :     3301:23:54
volume rebalance: butcher: success
[root@server2 ~]# 


It finished within 18 hours though :

[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost          1058854        72.1GB       5320040             0             0            completed       18:44:51
                                 server1          1062859       451.6GB       4843484             0             0            completed       18:44:51
volume rebalance: butcher: success
[root@server2 ~]# 



Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-27

How reproducible:
-----------------

1/1

Additional info:
---------------

[root@gqas014 ~]# gluster v info
 
Volume Name: butcher
Type: Distribute
Volume ID: f297fb8e-f276-4f96-8a58-a1215112d3b2
Status: Started
Snapshot Count: 0
Number of Bricks: 24
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick4: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick7: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick8: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick9: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick10: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick13: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick14: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick15: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick16: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick18: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick19: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick20: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick21: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick22: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Brick24: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 4
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
[root@gqas014 ~]#

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-06-13 04:05:59 EDT ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs‑3.3.0' to '?'. 

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Nithya Balachandran on 2017-06-13 04:17:15 EDT ---

Did the estimates change as the rebalance progressed? These values are recalculated as the rebalance proceeds and expected to become more accurate over time.

--- Additional comment from Ambarish on 2017-06-13 04:19:05 EDT ---

This is the rebalance ETA at different intervals:

*Interval1* :

[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              421        40.0GB           973             0             0          in progress        0:12:18
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:12:18
Estimated time left for rebalance to complete :     1198:26:30
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              421        40.0GB           973             0             0          in progress        0:12:20
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:12:20
Estimated time left for rebalance to complete :     1201:41:22
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              421        40.0GB           973             0             0          in progress        0:12:22
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:12:22
Estimated time left for rebalance to complete :     1204:56:14
volume rebalance: butcher: success
[root@gqas014 ~]# 




*Interval2* :


[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1958        40.0GB          4137             0             0          in progress        0:21:10
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:21:10
Estimated time left for rebalance to complete :     2062:21:32
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             1990        40.0GB          4144             0             0          in progress        0:21:11
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:21:11
Estimated time left for rebalance to complete :     2063:58:58
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             2052        40.0GB          4210             0             0          in progress        0:21:17
      gqas015.sbu.lab.eng.bos.redhat.com                2        20.0GB           508             0             0          in progress        0:21:17
Estimated time left for rebalance to complete :     2073:43:34
volume rebalance: butcher: success
[root@gqas014 ~]# 


*Interval3*

[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             3894        40.0GB          8096             0             0          in progress        0:33:50
      gqas015.sbu.lab.eng.bos.redhat.com                6       150.0GB           508             0             0          in progress        0:33:50
Estimated time left for rebalance to complete :     3296:31:35
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             3898        40.0GB          8102             0             0          in progress        0:33:52
      gqas015.sbu.lab.eng.bos.redhat.com                6       150.0GB           508             0             0          in progress        0:33:52
Estimated time left for rebalance to complete :     3299:46:28
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             3899        40.0GB          8162             0             0          in progress        0:33:53
      gqas015.sbu.lab.eng.bos.redhat.com                6       150.0GB           508             0             0          in progress        0:33:53
Estimated time left for rebalance to complete :     3301:23:54
volume rebalance: butcher: success
[root@gqas014 ~]# 








*Interval4* :

[root@gqas014 glusterfs]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            12439        40.1GB         26304             0             0          in progress        1:01:17
      gqas015.sbu.lab.eng.bos.redhat.com             5840       420.8GB         15875             0             0          in progress        1:01:17
Estimated time left for rebalance to complete :      190:05:11
volume rebalance: butcher: success
[root@gqas014 glusterfs]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            12451        40.1GB         26390             0             0          in progress        1:01:20
      gqas015.sbu.lab.eng.bos.redhat.com             5852       420.8GB         15897             0             0          in progress        1:01:20
Estimated time left for rebalance to complete :      189:58:36
volume rebalance: butcher: success
[root@gqas014 glusterfs]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            12459        40.1GB         26391             0             0          in progress        1:01:22
      gqas015.sbu.lab.eng.bos.redhat.com             5857       420.8GB         15907             0             0          in progress        1:01:22
Estimated time left for rebalance to complete :      189:57:35
volume rebalance: butcher: success
[root@gqas014 glusterfs]# 




*Interval 5* :

[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            64709        41.4GB        137889             0             0          in progress        1:35:49
      gqas015.sbu.lab.eng.bos.redhat.com            63986       422.8GB        165367             0             0          in progress        1:35:49
Estimated time left for rebalance to complete :       32:47:45
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            64767        41.4GB        137939             0             0          in progress        1:35:50
      gqas015.sbu.lab.eng.bos.redhat.com            64014       422.8GB        165526             0             0          in progress        1:35:50
Estimated time left for rebalance to complete :       32:47:21
volume rebalance: butcher: success
[root@gqas014 ~]# gluster v rebalance butcher status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            64784        41.4GB        137947             0             0          in progress        1:35:59
      gqas015.sbu.lab.eng.bos.redhat.com            64039       422.8GB        165583             0             0          in progress        1:35:59
Estimated time left for rebalance to complete :       32:50:19
volume rebalance: butcher: success
[root@gqas014 ~]# 


As you can see, it shows 3k+ hours for nearly an hour (till Interval 4).
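
For scale, the h:m:s readings above can be converted to days. This is a minimal Python sketch; the helper name hms_to_seconds is made up for illustration and is not part of the gluster CLI.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'h:m:s' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


# "Estimated time left" values copied from the status output above,
# one per interval, in the order they were captured.
readings = ["1198:26:30", "2062:21:32", "3301:23:54", "190:05:11", "32:47:45"]

for eta in readings:
    days = hms_to_seconds(eta) / 86400  # 86400 seconds per day
    print(f"{eta:>12} is about {days:.1f} days")
```

The peak reading, 3301:23:54, works out to roughly 137.6 days, which is where the "~140 days" in the bug title comes from.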

--- Additional comment from Nithya Balachandran on 2017-06-14 02:15:27 EDT ---

The rebalance estimate feature works best when the files are of a uniform size.
This is not the case with this setup where the volume contains a mix of both large and small files.


From the logs, it looks like rebalance initially spent a lot of time migrating very large files:


[2017-06-12 13:14:26.923797] I [MSGID: 109028] [dht-rebalance.c:4669:gf_defrag_status_get] 0-glusterfs: Files migrated: 2, size: 21474836480, lookups: 514, failures: 0, skipped: 0
[2017-06-12 13:14:28.069317] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=514,elapsed time = 507.000000,rate_lookedup=1.013807
[2017-06-12 13:14:28.069357] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 2929242 seconds
[2017-06-12 13:14:28.069369] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 2928735 seconds


So far only 2 files have been migrated, but the initially calculated file count is well over 200K files. At this lookup rate the estimate is already about 34 days (2929242 seconds), and it later peaked at roughly 140 days (3301 hours) in the CLI output.


As rebalance proceeds and starts processing the smaller files, the rate goes up and the estimated time goes down.

This starts roughly around :
[2017-06-12 14:41:47.655006] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=137397,elapsed time = 5746.000000,rate_lookedup=23.911765
[2017-06-12 14:41:47.655044] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 124193 seconds
[2017-06-12 14:41:47.655058] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 118447 seconds


and the estimated time now is roughly 1/20th the originally calculated time (roughly 32 hours).


As the rebalance proceeds further:
[2017-06-13 03:23:00.853181] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=3557582,elapsed time = 51419.000000,rate_lookedup=69.188082
[2017-06-13 03:23:00.853216] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 51563 seconds
[2017-06-13 03:23:00.853227] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 144 seconds


The estimated time is now 51563 s (roughly 14 hours).
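
The three log excerpts above are consistent with a simple file-count model: rate_lookedup is num_files_lookedup divided by the elapsed time, and the seconds left is the estimated total minus the elapsed time. The sketch below rechecks that arithmetic in Python. The function names are hypothetical, and the implied total file count (rate times estimated total) is an inference from the logged fields, not something read from dht-rebalance.c.

```python
def lookup_rate(files_looked_up: int, elapsed_s: float) -> float:
    """Files looked up per second, as logged in rate_lookedup."""
    return files_looked_up / elapsed_s


def seconds_left(estimated_total_s: int, elapsed_s: float) -> float:
    """Remaining time is the estimated total minus the time already spent."""
    return estimated_total_s - elapsed_s


# (num_files_lookedup, elapsed seconds, logged total seconds, logged seconds left)
# All four values per row are copied from the log lines quoted above.
samples = [
    (514, 507.0, 2929242, 2928735),
    (137397, 5746.0, 124193, 118447),
    (3557582, 51419.0, 51563, 144),
]

for looked_up, elapsed, total_s, logged_left in samples:
    rate = lookup_rate(looked_up, elapsed)
    # Assumption: total time = total files / rate, so total files = rate * total time.
    implied_total_files = rate * total_s
    print(
        f"rate={rate:.6f} files/s, "
        f"total={total_s / 86400:.2f} days, "
        f"left={seconds_left(total_s, elapsed):.0f} s (logged {logged_left}), "
        f"implied file count ~{implied_total_files:,.0f}"
    )
```

Under this model the estimate is driven entirely by the lookup rate, which is why a few very large files at the start inflate it so much.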



What can we do here?
1. Try this out with more uniform file sizes.
2. BZ 1460894 has some improvements to directory processing which should help with the initial file calculations.
3. See if processing based on size instead of inode counts will help here. This will need to be tried out on large data sets and will need QE effort.

--- Additional comment from Atin Mukherjee on 2017-06-14 02:49:33 EDT ---

(In reply to Nithya Balachandran from comment #4)
> The rebalance estimate feature works best when the files are of a uniform
> size.
> This is not the case with this setup where the volume contains a mix of both
> large and small files.
> 
> 
> From the logs, it looks like rebalance initially spent a lot of time
> migrating very large files:
> 
> 
> 1413 [2017-06-12 13:14:26.923797] I [MSGID: 109028]
> [dht-rebalance.c:4669:gf_defrag_status_get] 0-glusterfs: Files migrated: 2,
> size: 21474836480, lookups: 514, failures: 0, skipped: 0
> 1414 [2017-06-12 13:14:28.069317] I
> [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME:
> num_files_lookedup=514,elapsed time = 507.000000,rate_lookedup=1.013807
> 1415 [2017-06-12 13:14:28.069357] I
> [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated
> total time to complete = 2929242 seconds
> 1416 [2017-06-12 13:14:28.069369] I
> [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left
> = 2928735 seconds
> 
> 
> So far only 2 files have been migrated but initially calculated file count
> shows well over 200K files. Based on this the estimated time is roughly 140
> days.  
> 
> 
> As rebalance proceeds and starts processing the smaller files, the rate goes
> up and the estimated time goes down.
> 
> This starts roughly around :
> [2017-06-12 14:41:47.655006] I [dht-rebalance.c:4578:gf_defrag_status_get]
> 0-glusterfs: TIME: num_files_lookedup=137397,elapsed time =
> 5746.000000,rate_lookedup=23.911765
> [2017-06-12 14:41:47.655044] I [dht-rebalance.c:4581:gf_defrag_status_get]
> 0-glusterfs: TIME: Estimated total time to complete = 124193 seconds
> [2017-06-12 14:41:47.655058] I [dht-rebalance.c:4584:gf_defrag_status_get]
> 0-glusterfs: TIME: Seconds left = 118447 seconds
> 
> 
> and the estimated time now is roughly 1/20th the originally calculated time
> (roughly 32 hours).
> 
> 
> As the rebalance proceed further,
> [2017-06-13 03:23:00.853181] I [dht-rebalance.c:4578:gf_defrag_status_get]
> 0-glusterfs: TIME: num_files_lookedup=3557582,elapsed time =
> 51419.000000,rate_lookedup=69.188082
> [2017-06-13 03:23:00.853216] I [dht-rebalance.c:4581:gf_defrag_status_get]
> 0-glusterfs: TIME: Estimated total time to complete = 51563 seconds
> [2017-06-13 03:23:00.853227] I [dht-rebalance.c:4584:gf_defrag_status_get]
> 0-glusterfs: TIME: Seconds left = 144 seconds
> 
> 
> The estimated time is now 51563 s (roughly 14 hours).
> 
> 
> 
> What can we do here?
> 1. Try this out with more uniform file sizes
> 2. BZ 1460894 has some improvements wrt dir processing which should help
> with the initial file calculations
> 3. See if processing based on size instead of inode counts will help here.
> This will need to be tried out on large data sets and will need QE effort.

Point 3 is not in scope of rhgs-3.3.0, right Nithya?

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-06-22 08:58:45 EDT ---

This bug is automatically being provided 'pm_ack+' for the release flag 'rhgs‑3.3.0', the current release of Red Hat Gluster Storage 3 under active development, having been appropriately marked for the release, and having been provided ACK from Development and QE

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-06-23 07:27:38 EDT ---

Since this bug has been approved for the RHGS 3.3.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.3.0+', and through the Internal Whiteboard entry of '3.3.0', the Target Release is being automatically set to 'RHGS 3.3.0'

--- Additional comment from Nithya Balachandran on 2017-07-03 03:33:15 EDT ---

--- Additional comment from Worker Ant on 2017-07-03 03:47:31 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#1) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-04 09:23:36 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#2) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-06 13:57:54 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#3) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-06 23:34:02 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#4) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-07 00:24:24 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#5) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-07 01:54:13 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#6) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Worker Ant on 2017-07-09 08:06:25 EDT ---

REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#7) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-10 10:35:38 EDT ---

COMMIT: https://review.gluster.org/17668 committed in master by Raghavendra G (rgowdapp) 
------
commit 9156a743aa76c955d18c9bfcb7c1a38ba00da890
Author: N Balachandran <nbalacha>
Date:   Mon Jul 3 13:13:35 2017 +0530

    cluster/dht: Use size to calculate estimates
    
    The earlier approach of using the number of files
    to determine when the rebalance would complete did
    not work well when file sizes differed widely.
    
    The new approach now gets the total data size and
    uses that information to determine how long
    the rebalance is expected to take.
    
    Change-Id: I84e80a0893efab72ff06130e4596fa71c9c8c868
    BUG: 1467209
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17668
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: MOHIT AGRAWAL <moagrawa>
    Reviewed-by: Raghavendra G <rgowdapp>
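
To illustrate why a size-based estimate behaves better on mixed file sizes, here is a minimal Python sketch comparing the two approaches on a made-up workload. Everything in it is an assumption for illustration: the function names, the workload (2 files of 10 GiB followed by 200000 files of 1 MiB), the constant 40 MiB/s throughput, and the absence of per-file overhead. It is not the code from the patch.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def eta_by_count(files_done: int, files_total: int, elapsed_s: float) -> float:
    """Seconds left if every file is assumed to take the same time."""
    rate = files_done / elapsed_s  # files per second
    return (files_total - files_done) / rate


def eta_by_size(bytes_done: int, bytes_total: int, elapsed_s: float) -> float:
    """Seconds left if time is assumed proportional to bytes processed."""
    rate = bytes_done / elapsed_s  # bytes per second
    return (bytes_total - bytes_done) / rate


# Hypothetical workload: the two large files happen to be processed first.
large_files, large_size = 2, 10 * GIB
small_files, small_size = 200_000, 1 * MIB
throughput = 40 * MIB  # bytes per second, assumed constant

bytes_done = large_files * large_size
bytes_total = bytes_done + small_files * small_size
elapsed = bytes_done / throughput  # time spent on the two large files

print("count-based ETA (s):", eta_by_count(large_files, large_files + small_files, elapsed))
print("size-based ETA (s): ", eta_by_size(bytes_done, bytes_total, elapsed))
print("model's true ETA (s):", (bytes_total - bytes_done) / throughput)
```

In this toy model the size-based estimate matches the true remaining time, while the count-based one is off by several orders of magnitude. Real rebalances also have per-file overhead, so neither estimate would be exact in practice.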

--- Additional comment from Worker Ant on 2017-07-25 05:08:23 EDT ---

REVIEW: https://review.gluster.org/17867 (cluster/dht: Update size processed for non-migrated files) posted (#1) for review on master by N Balachandran (nbalacha)

--- Additional comment from Worker Ant on 2017-07-25 17:52:52 EDT ---

COMMIT: https://review.gluster.org/17867 committed in master by Jeff Darcy (jeff.us) 
------
commit 24ab0ef44a1646223b59e33d0109d8424f8eddd0
Author: N Balachandran <nbalacha>
Date:   Tue Jul 25 14:28:00 2017 +0530

    cluster/dht: Update size processed for non-migrated files
    
    The size of non-migrated files was not added to the
    size_processed causing incorrect rebalance estimate
    calculations. This has been fixed.
    
    Change-Id: I9f338c44da22b856e9fdc6dc558f732ae9a22f15
    BUG: 1467209
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17867
    Reviewed-by: Amar Tumballi <amarts>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
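
The effect of leaving non-migrated files out of size_processed can be sketched as follows. This is a hedged Python illustration only: size_processed is the name used in the commit message, but the FileEntry type, the count_non_migrated flag, and the sample numbers are hypothetical and do not come from the patch.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    size: int              # bytes
    needs_migration: bool  # False means the file is scanned but stays in place


def size_processed(files, count_non_migrated: bool) -> int:
    """Sum the sizes of files handled so far.

    With count_non_migrated=False only migrated files are counted,
    which models the behaviour described as incorrect in the commit message.
    """
    return sum(f.size for f in files if f.needs_migration or count_non_migrated)


def seconds_left(total_size: int, processed: int, elapsed_s: float) -> float:
    """Size-based estimate: remaining bytes divided by the observed byte rate."""
    rate = processed / elapsed_s
    return (total_size - processed) / rate


# Hypothetical state after 10 seconds: four files scanned, two of them migrated.
scanned = [
    FileEntry(100, True),
    FileEntry(300, False),
    FileEntry(100, True),
    FileEntry(500, False),
]
total_size = 4000  # bytes this node has to process in total (made up)
elapsed = 10.0

before = size_processed(scanned, count_non_migrated=False)
after = size_processed(scanned, count_non_migrated=True)
print("without non-migrated files:", seconds_left(total_size, before, elapsed), "s left")
print("with non-migrated files:   ", seconds_left(total_size, after, elapsed), "s left")
```

If the total size covers every file but the processed size only covers migrated ones, the rate is understated and the remaining work overstated, so the estimate comes out too high.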

Comment 1 Worker Ant 2017-07-26 08:04:11 UTC

REVIEW: https://review.gluster.org/17873 (cluster/dht: Update size processed for non-migrated files) posted (#1) for review on release-3.12 by N Balachandran (nbalacha)

Comment 2 Worker Ant 2017-07-31 17:33:23 UTC

COMMIT: https://review.gluster.org/17873 committed in release-3.12 by Shyamsundar Ranganathan (srangana) 
------
commit c394cb71cf422f68c4910c54b8a835f83fe64bc2
Author: N Balachandran <nbalacha>
Date:   Tue Jul 25 14:28:00 2017 +0530

    cluster/dht: Update size processed for non-migrated files
    
    The size of non-migrated files was not added to the
    size_processed causing incorrect rebalance estimate
    calculations. This has been fixed.
    
    > BUG: 1467209
    > Signed-off-by: N Balachandran <nbalacha>
    > Reviewed-on: https://review.gluster.org/17867
    > Reviewed-by: Amar Tumballi <amarts>
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > CentOS-regression: Gluster Build System <jenkins.org>
    (cherry picked from commit 24ab0ef44a1646223b59e33d0109d8424f8eddd0)
    Change-Id: I9f338c44da22b856e9fdc6dc558f732ae9a22f15
    BUG: 1475192
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17873
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 3 Shyamsundar 2017-09-05 17:37:29 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/
