Bug 1457731
| Summary: | [Scale] : Rebalance ETA (towards the end) may be inaccurate, even on a moderately large data set. | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ambarish <asoman> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED ERRATA | QA Contact: | Ambarish <asoman> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | rhgs-3.3 | CC: | amukherj, bturner, rhinduja, rhs-bugs, storage-qa-internal, tdesala | |
| Target Milestone: | --- | |||
| Target Release: | RHGS 3.3.0 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.8.4-31 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1464110 (view as bug list) | Environment: | ||
| Last Closed: | 2017-09-21 04:45:37 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1417151, 1464110 | |||
upstream patch : https://review.gluster.org/#/c/17607/

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774
Description:
------------

Added bricks to a distributed-replicate volume and ran rebalance.

These are the rebalance ETAs at different intervals [T4 > T3 > T2 > T1]:

**At time T1**

[root@gqas014 ~]# gluster v rebalance butcher status
Node                                Rebalanced-files  size         scanned      failures     skipped      status        run time in h:m:s
---------                           -----------       -----------  -----------  -----------  -----------  ------------  --------------
localhost                           63949             9.8GB        295287       0            0            in progress   0:34:57
gqas015.sbu.lab.eng.bos.redhat.com  64644             9.9GB        300745       0            0            in progress   0:34:57
Estimated time left for rebalance to complete : 0:00:38
volume rebalance: butcher: success

**At time T2**

[root@gqas014 ~]# gluster v rebalance butcher status
Node                                Rebalanced-files  size         scanned      failures     skipped      status        run time in h:m:s
---------                           -----------       -----------  -----------  -----------  -----------  ------------  --------------
localhost                           64010             9.8GB        295597       0            0            in progress   0:34:58
gqas015.sbu.lab.eng.bos.redhat.com  64705             9.9GB        300918       0            0            in progress   0:34:58
Estimated time left for rebalance to complete : 0:01:09

**At time T3**

[root@gqas014 ~]# gluster v rebalance butcher status
Node                                Rebalanced-files  size         scanned      failures     skipped      status        run time in h:m:s
---------                           -----------       -----------  -----------  -----------  -----------  ------------  --------------
localhost                           68057             10.0GB       313569       0            0            in progress   0:36:46
gqas015.sbu.lab.eng.bos.redhat.com  68904             10.2GB       319823       0            0            in progress   0:36:46
Estimated time left for rebalance to complete : 0:00:09
volume rebalance: butcher: success

[root@gqas014 ~]# gluster v rebalance butcher status
Node                                Rebalanced-files  size         scanned      failures     skipped      status        run time in h:m:s
---------                           -----------       -----------  -----------  -----------  -----------  ------------  --------------
localhost                           68110             10.0GB       313882       0            0            in progress   0:36:48
gqas015.sbu.lab.eng.bos.redhat.com  68958             10.2GB       319948       0            0            in progress   0:36:48
Estimated time left for rebalance to complete : 0:01:10
volume rebalance: butcher: success

**At time T4** // When it finally completed:

[root@gqas014 ~]# gluster v rebalance butcher status
Node                                Rebalanced-files  size         scanned      failures     skipped      status        run time in h:m:s
---------                           -----------       -----------  -----------  -----------  -----------  ------------  --------------
localhost                           74885             104.4GB      345001       0            0            completed     1:12:32
gqas015.sbu.lab.eng.bos.redhat.com  74658             10.5GB       345747       0            0            completed     0:39:54
volume rebalance: butcher: success
[root@gqas014 ~]#
[root@gqas014 ~]#

So at interval T1, it says the ETA for completion is 38 seconds. At T2, one second of run time later, it suddenly increased to slightly more than a minute. You can see the same thing happening at the T3 interval.

So, basically, the ETA keeps looping for a while: it starts at about 1:10, counts down to 0, and starts again at 1:10. This continued for another half an hour, after which the rebalance finally completed (you can see the time difference in the run time column across the intervals).

##NUM_FILES##

[root@gqac011 gluster-mount]# find . -mindepth 1 -type f | wc -l
352120
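The inconsistency in the samples above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original report: it takes the (run time, estimated time left) pairs copied from the status output in the description, flags consecutive samples where the ETA grew, and computes the completion time each sample implied. The helper names (`parse_hms`, `find_eta_jumps`, `projected_finish`) are hypothetical and say nothing about how glusterd computes the estimate.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'h:m:s' string such as '0:34:57' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# (run time, estimated time left) pairs copied from the status output
# in the description.
SAMPLES = [
    ("0:34:57", "0:00:38"),  # T1
    ("0:34:58", "0:01:09"),  # T2
    ("0:36:46", "0:00:09"),  # T3, first call
    ("0:36:48", "0:01:10"),  # T3, second call
]


def find_eta_jumps(samples):
    """Return (elapsed, old_eta, new_eta) for consecutive samples where the ETA grew."""
    parsed = [(parse_hms(run), parse_hms(eta)) for run, eta in samples]
    jumps = []
    for (run_a, eta_a), (run_b, eta_b) in zip(parsed, parsed[1:]):
        if eta_b > eta_a:
            jumps.append((run_b - run_a, eta_a, eta_b))
    return jumps


def projected_finish(samples):
    """Return the run time at which each sample implied the rebalance would finish."""
    return [parse_hms(run) + parse_hms(eta) for run, eta in samples]


if __name__ == "__main__":
    for elapsed, old, new in find_eta_jumps(SAMPLES):
        print(f"ETA rose from {old} to {new} after only {elapsed} of run time")
    for finish in projected_finish(SAMPLES):
        print(f"projected completion at run time {finish}")
```

Every projected completion falls before the 0:39:54 run time reported for gqas015 at T4, and well before the 1:12:32 reported for localhost.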