--- Additional comment from Nithya Balachandran on 2017-06-14 02:15:27 EDT ---

The rebalance estimate feature works best when the files are of a uniform size. This is not the case with this setup, where the volume contains a mix of large and small files.

From the logs, it looks like rebalance initially spent a lot of time migrating very large files:

[2017-06-12 13:14:26.923797] I [MSGID: 109028] [dht-rebalance.c:4669:gf_defrag_status_get] 0-glusterfs: Files migrated: 2, size: 21474836480, lookups: 514, failures: 0, skipped: 0
[2017-06-12 13:14:28.069317] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=514,elapsed time = 507.000000,rate_lookedup=1.013807
[2017-06-12 13:14:28.069357] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 2929242 seconds
[2017-06-12 13:14:28.069369] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 2928735 seconds

At this point only 2 files have been migrated, but the initially calculated file count shows well over 200K files. Based on this, the estimated time is roughly 34 days (2929242 seconds).

As rebalance proceeds and starts processing the smaller files, the rate goes up and the estimated time goes down. This starts roughly around:

[2017-06-12 14:41:47.655006] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=137397,elapsed time = 5746.000000,rate_lookedup=23.911765
[2017-06-12 14:41:47.655044] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 124193 seconds
[2017-06-12 14:41:47.655058] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 118447 seconds

The estimated time is now roughly 1/20th of the originally calculated time (roughly 32 hours).

As the rebalance proceeds further:

[2017-06-13 03:23:00.853181] I [dht-rebalance.c:4578:gf_defrag_status_get] 0-glusterfs: TIME: num_files_lookedup=3557582,elapsed time = 51419.000000,rate_lookedup=69.188082
[2017-06-13 03:23:00.853216] I [dht-rebalance.c:4581:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete = 51563 seconds
[2017-06-13 03:23:00.853227] I [dht-rebalance.c:4584:gf_defrag_status_get] 0-glusterfs: TIME: Seconds left = 144 seconds

The estimated total time is now 51563 s (roughly 14 hours).
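The arithmetic behind the logged values can be sketched as follows. This is a minimal illustration, not the code in dht-rebalance.c: the function and variable names are hypothetical, and the total file count is an assumption, since the logs do not show it (the value used here is back-calculated from the first logged estimate).

```python
def estimate_by_file_count(total_files, files_looked_up, elapsed_s):
    """Return (rate, estimated_total_s, seconds_left) from file counts alone."""
    rate = files_looked_up / elapsed_s        # files looked up per second so far
    estimated_total_s = total_files / rate    # treats every file as equally costly
    seconds_left = estimated_total_s - elapsed_s
    return rate, estimated_total_s, seconds_left


# Values taken from the first log sample (13:14:28).
FILES_LOOKED_UP = 514
ELAPSED_S = 507.0

# ASSUMPTION: the total file count does not appear in the logs. This value is
# back-calculated from the logged estimate (2929242 s * 514 / 507).
TOTAL_FILES = 2_969_685

rate, total_s, left_s = estimate_by_file_count(TOTAL_FILES, FILES_LOOKED_UP, ELAPSED_S)
print(f"rate_lookedup={rate:.6f}")
print(f"estimated total = {total_s:.0f} s, seconds left = {left_s:.0f} s")
```

Because the rate is measured in files per second, two 10 GiB files at the start of the run drag it down to about one file per second, and the projection over the whole file count becomes very large. In all three log samples, "Seconds left" equals the estimated total minus the elapsed time.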
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#1) for review on master by N Balachandran (nbalacha)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#2) for review on master by N Balachandran (nbalacha)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#3) for review on master by N Balachandran (nbalacha)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#4) for review on master by N Balachandran (nbalacha)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#5) for review on master by N Balachandran (nbalacha)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#6) for review on master by Atin Mukherjee (amukherj)
REVIEW: https://review.gluster.org/17668 (cluster/dht: Use size to calculate estimates) posted (#7) for review on master by N Balachandran (nbalacha)
COMMIT: https://review.gluster.org/17668 committed in master by Raghavendra G (rgowdapp)
------
commit 9156a743aa76c955d18c9bfcb7c1a38ba00da890
Author: N Balachandran <nbalacha>
Date:   Mon Jul 3 13:13:35 2017 +0530

    cluster/dht: Use size to calculate estimates

    The earlier approach of using the number of files to determine when the
    rebalance would complete did not work well when file sizes differed
    widely.

    The new approach now gets the total data size and uses that information
    to determine how long the rebalance is expected to take.

    Change-Id: I84e80a0893efab72ff06130e4596fa71c9c8c868
    BUG: 1467209
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17668
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: MOHIT AGRAWAL <moagrawa>
    Reviewed-by: Raghavendra G <rgowdapp>
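As a rough illustration of the size-based approach described in this commit message, the estimate can be driven by bytes processed rather than files looked up. This is only a sketch under stated assumptions, not the committed implementation: the function name is hypothetical, and the total data size (2 TiB) is an invented value for the example, since this report does not state it. The processed size and elapsed time come from the first log sample in the earlier comment.

```python
def estimate_by_size(total_bytes, bytes_processed, elapsed_s):
    """Return (rate_bytes_per_s, estimated_total_s, seconds_left), or None."""
    if bytes_processed <= 0 or elapsed_s <= 0:
        return None                           # nothing processed yet, no estimate
    rate = bytes_processed / elapsed_s        # bytes processed per second so far
    estimated_total_s = total_bytes / rate
    seconds_left = max(estimated_total_s - elapsed_s, 0.0)
    return rate, estimated_total_s, seconds_left


# From the first log sample: 2 files, 21474836480 bytes, 507 s elapsed.
BYTES_PROCESSED = 21_474_836_480
ELAPSED_S = 507.0

# ASSUMPTION: hypothetical total data size, chosen only for illustration.
TOTAL_BYTES = 2 * 1024**4

rate, total_s, left_s = estimate_by_size(TOTAL_BYTES, BYTES_PROCESSED, ELAPSED_S)
print(f"rate = {rate / 1024**2:.1f} MiB/s")
print(f"estimated total = {total_s:.0f} s, seconds left = {left_s:.0f} s")
```

With a byte-based rate, the two very large files at the start of the run count for what they cost, so the early estimate is no longer inflated by a low files-per-second rate.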
REVIEW: https://review.gluster.org/17867 (cluster/dht: Update size processed for non-migrated files) posted (#1) for review on master by N Balachandran (nbalacha)
COMMIT: https://review.gluster.org/17867 committed in master by Jeff Darcy (jeff.us)
------
commit 24ab0ef44a1646223b59e33d0109d8424f8eddd0
Author: N Balachandran <nbalacha>
Date:   Tue Jul 25 14:28:00 2017 +0530

    cluster/dht: Update size processed for non-migrated files

    The size of non-migrated files was not added to the size_processed
    causing incorrect rebalance estimate calculations. This has been fixed.

    Change-Id: I9f338c44da22b856e9fdc6dc558f732ae9a22f15
    BUG: 1467209
    Signed-off-by: N Balachandran <nbalacha>
    Reviewed-on: https://review.gluster.org/17867
    Reviewed-by: Amar Tumballi <amarts>
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
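The effect of this fix can be sketched with a toy example. It assumes, hypothetically, that the total size covers every file scanned, whether or not the file ends up being migrated. The file list, the helper name, and the timings are invented for illustration. Only size_processed is a name taken from the commit message.

```python
def seconds_left(total_bytes, size_processed, elapsed_s):
    """Remaining time projected from the byte rate observed so far."""
    rate = size_processed / elapsed_s
    return total_bytes / rate - elapsed_s


# Hypothetical files as (size_in_bytes, needs_migration).
files = [(100, True), (300, False), (200, True), (400, False)]
total_bytes = sum(size for size, _ in files)

# Suppose the first two files have been processed after 10 seconds.
processed = files[:2]
elapsed_s = 10.0

# Before the fix: only migrated files contribute to size_processed.
migrated_only = sum(size for size, migrated in processed if migrated)
# After the fix: every processed file contributes, migrated or not.
all_processed = sum(size for size, _ in processed)

before_fix = seconds_left(total_bytes, migrated_only, elapsed_s)
after_fix = seconds_left(total_bytes, all_processed, elapsed_s)
print(f"before fix: {before_fix:.0f} s left, after fix: {after_fix:.0f} s left")
```

When skipped files are left out of size_processed, the measured rate understates the real progress and the remaining time is overestimated.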
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/