Bug 1479446
| Summary: | Rebalance estimate (ETA) shows wrong details (the initial 10-minute wait message reappears) while rebalance is still in progress | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED ERRATA | QA Contact: | Prasad Desala <tdesala> | |
| Severity: | high | Docs Contact: | ||
| Priority: | medium | |||
| Version: | rhgs-3.3 | CC: | amukherj, apaladug, nchilaka, pasik, rhs-bugs, sanandpa, saraut, sheggodu, storage-qa-internal, tdesala | |
| Target Milestone: | --- | Keywords: | ZStream | |
| Target Release: | RHGS 3.4.z Batch Update 2 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.12.2-27 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1479528 (view as bug list) | Environment: | ||
| Last Closed: | 2018-12-17 17:07:02 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1479528, 1511271 | |||
Rebalance status at the end:
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 23842 194.1MB 37488 0 0 completed 0:44:21
dhcp46-101.lab.eng.blr.redhat.com 23440 285.5MB 39635 0 0 completed 0:43:29
volume rebalance: nrep2: success
Is this reproducible? Prasad, can you check this as part of your testing (comment #3, i.e., whether this is reproducible)?

Verified this BZ on glusterfs version 3.12.2-30. Followed the same steps as in the description; the rebalance ETA was displayed as expected. Moving this BZ to Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:3827
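The verification described above could be automated with a small check on the rebalance status output. The following is only a sketch and is not part of glusterfs: the function names and the 600-second threshold are illustrative assumptions, while the message text and the h:m:s run-time format are taken from the outputs in this report.

```python
import re

# Message text copied from the status outputs in this bug report.
WAIT_MSG = ("The estimated time for rebalance to complete will be "
            "unavailable for the first 10 minutes.")

# Matches an "in progress" node row and captures its h:m:s run time.
ROW_RE = re.compile(r"in progress\s+(\d+):(\d{2}):(\d{2})")


def max_run_time_seconds(status_output: str) -> int:
    """Return the largest run time (seconds) among in-progress nodes, or 0."""
    times = [int(h) * 3600 + int(m) * 60 + int(s)
             for h, m, s in ROW_RE.findall(status_output)]
    return max(times, default=0)


def wait_message_is_stale(status_output: str, threshold_s: int = 600) -> bool:
    """True if the 10-minute wait message appears after the first 10 minutes.

    threshold_s is a hypothetical parameter; 600 mirrors the "first 10
    minutes" wording of the message.
    """
    return (WAIT_MSG in status_output
            and max_run_time_seconds(status_output) > threshold_s)
```

Feeding it the captured output of each `gluster v rebal nrep2 status` call would flag the outputs taken at 0:40:28 and later in the description, and none of the earlier ones.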
Description of problem:
==============================
I did a remove-brick operation to convert a 2x2 volume to 1x2 while I/O was running from 3 different Ganesha mounts.
I noticed that at a later stage (maybe >80% completed), the message "The estimated time for rebalance to complete will be unavailable for the first 10 minutes." appears again.
I think this happens when the estimated rebalance time has elapsed but the rebalance itself has not yet completed.

Last login: Tue Aug 8 19:32:38 2017 from 10.70.35.77
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 5145 7.4MB 10594 0 0 in progress 0:06:38
dhcp46-101.lab.eng.blr.redhat.com 4142 21.7MB 8722 0 0 in progress 0:06:38
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 5993 31.3MB 11970 0 0 in progress 0:08:38
dhcp46-101.lab.eng.blr.redhat.com 5050 26.6MB 10415 0 0 in progress 0:08:38
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 8059 62.0MB 16022 0 0 in progress 0:13:13
dhcp46-101.lab.eng.blr.redhat.com 7208 76.2MB 14071 0 0 in progress 0:13:13
Estimated time left for rebalance to complete : 0:47:28
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 10699 110.9MB 21188 0 0 in progress 0:19:58
dhcp46-101.lab.eng.blr.redhat.com 9949 119.4MB 16739 0 0 in progress 0:19:58
Estimated time left for rebalance to complete : 0:47:25
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 16839 151.7MB 28114 0 0 in progress 0:33:23
dhcp46-101.lab.eng.blr.redhat.com 16754 184.3MB 27528 0 0 in progress 0:33:23
Estimated time left for rebalance to complete : 0:00:48
volume rebalance: nrep2: success
[root@dhcp46-42 ~]#
[root@dhcp46-42 ~]#
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 20687 192.2MB 32058 0 0 in progress 0:39:16
dhcp46-101.lab.eng.blr.redhat.com 20965 189.6MB 32669 0 0 in progress 0:39:16
Estimated time left for rebalance to complete : 0:00:06
volume rebalance: nrep2: success
[root@dhcp46-42 ~]#

============== SEE FROM BELOW ==================
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 21521 192.8MB 33069 0 0 in progress 0:40:28
dhcp46-101.lab.eng.blr.redhat.com 22456 189.6MB 35708 0 0 in progress 0:40:28
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 21669 192.8MB 33372 0 0 in progress 0:40:36
dhcp46-101.lab.eng.blr.redhat.com 22614 189.6MB 35708 0 0 in progress 0:40:36
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 21718 192.8MB 33372 0 0 in progress 0:40:40
dhcp46-101.lab.eng.blr.redhat.com 22667 189.6MB 36020 0 0 in progress 0:40:40
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]#
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
localhost 23842 194.1MB 37488 0 0 in progress 0:43:47
dhcp46-101.lab.eng.blr.redhat.com 23440 285.5MB 39635 0 0 completed 0:43:29
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success

Version-Release number of selected component (if applicable):
[root@dhcp46-42 ~]# rpm -qa|grep gluster
glusterfs-api-3.8.4-38.el7rhgs.x86_64
python-gluster-3.8.4-34.el7rhgs.noarch
glusterfs-server-3.8.4-38.el7rhgs.x86_64
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-16.el7rhgs.x86_64
glusterfs-3.8.4-38.el7rhgs.x86_64
glusterfs-cli-3.8.4-38.el7rhgs.x86_64
glusterfs-rdma-3.8.4-38.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.2.x86_64
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch
glusterfs-libs-3.8.4-38.el7rhgs.x86_64
glusterfs-fuse-3.8.4-38.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-38.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-38.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-38.el7rhgs.x86_64

Steps to Reproduce:
1. Had a 1x2 volume; did an add-brick to convert it to 2x2, and rebalance was run (with some files skipped).
2. Ran a Linux untar from one client, lookups from another client (running until the end), and rename, move, chmod, and chgrp from a third client, but only for some time; those operations completed well before the rebalance reached this state.
3. Observed the rebalance ETA.

Actual results:
==========
The ETA again shows the initial 10-minute wait message.
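
The reporter's guess in the description (the wait message returns once the estimate has run out) can be modelled with a small sketch. This is purely a hypothetical model of the suspected behaviour, not the glusterfs CLI code: the function names, the state labels, and the 600-second grace period are all assumptions made for illustration.

```python
def eta_state_suspected(elapsed_s: int, time_left_s: int) -> str:
    """Hypothetical model of the suspected behaviour.

    A zero estimate is treated as "no estimate yet", so the wait message
    is chosen regardless of how long the rebalance has been running.
    """
    return "eta" if time_left_s > 0 else "wait-10-min"


def eta_state_guarded(elapsed_s: int, time_left_s: int,
                      grace_s: int = 600) -> str:
    """Hypothetical alternative that also looks at the elapsed time.

    The wait message is only chosen inside the grace period; afterwards
    a zero estimate is reported as a separate "overdue" state.
    """
    if elapsed_s < grace_s:
        return "wait-10-min"
    return "eta" if time_left_s > 0 else "overdue"
```

With the values from the description, `eta_state_suspected(2428, 0)` (run time 0:40:28, estimate exhausted) yields the wait state, which matches what was observed, while `eta_state_guarded(2428, 0)` does not.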