+++ This bug was initially created as a clone of Bug #1303028 +++
+++ This bug was initially created as a clone of Bug #1302968 +++

On my 16-node setup, after about a day, the rebalance status showed the elapsed time reset to zero on 3 nodes. After another 4-5 hours, the elapsed time stopped ticking on all nodes except one, which is still ticking. As a result, the promote/demote and scanned-files stats have stopped getting updated.

[root@dhcp37-202 ~]# gluster v rebal nagvol status
    Node            Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
    ------------    ----------------   ------   -------   --------   -------   -----------   ----------------
    localhost       2                  0Bytes   35287     0          0         in progress   29986.00
    10.70.37.195    0                  0Bytes   35281     0          0         in progress   29986.00
    10.70.35.155    0                  0Bytes   35003     0          0         in progress   29986.00
    10.70.35.222    0                  0Bytes   35002     0          0         in progress   29986.00
    10.70.35.108    0                  0Bytes   0         0          0         in progress   29985.00
    10.70.35.44     0                  0Bytes   0         0          0         in progress   29986.00
    10.70.35.89     0                  0Bytes   0         0          0         in progress   146477.00
    10.70.35.231    0                  0Bytes   0         0          0         in progress   29986.00
    10.70.35.176    0                  0Bytes   35487     0          0         in progress   29986.00
    10.70.35.232    0                  0Bytes   0         0          0         in progress   0.00
    10.70.35.173    0                  0Bytes   0         0          0         in progress   0.00
    10.70.35.163    0                  0Bytes   35314     0          0         in progress   29986.00
    10.70.37.101    0                  0Bytes   0         0          0         in progress   0.00
    10.70.37.69     0                  0Bytes   35385     0          0         in progress   29986.00
    10.70.37.60     0                  0Bytes   35255     0          0         in progress   29986.00
    10.70.37.120    0                  0Bytes   35250     0          0         in progress   29986.00
volume rebalance: nagvol: success

[root@dhcp37-202 ~]# gluster v rebal nagvol status
    Node            Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
    ------------    ----------------   ------   -------   --------   -------   -----------   ----------------
    localhost       2                  0Bytes   35287     0          0         in progress   29986.00
    10.70.37.195    0                  0Bytes   35281     0          0         in progress   29986.00
    10.70.35.155    0                  0Bytes   35003     0          0         in progress   29986.00
    10.70.35.222    0                  0Bytes   35002     0          0         in progress   29986.00
    10.70.35.108    0                  0Bytes   0         0          0         in progress   29985.00
    10.70.35.44     0                  0Bytes   0         0          0         in progress   29986.00
    10.70.35.89     0                  0Bytes   0         0          0         in progress   146488.00
    10.70.35.231    0                  0Bytes   0         0          0         in progress   29986.00
    10.70.35.176    0                  0Bytes   35487     0          0         in progress   29986.00
    10.70.35.232    0                  0Bytes   0         0          0         in progress   0.00
    10.70.35.173    0                  0Bytes   0         0          0         in progress   0.00
    10.70.35.163    0                  0Bytes   35314     0          0         in progress   29986.00
    10.70.37.101    0                  0Bytes   0         0          0         in progress   0.00
    10.70.37.69     0                  0Bytes   35385     0          0         in progress   29986.00
    10.70.37.60     0                  0Bytes   35255     0          0         in progress   29986.00
    10.70.37.120    0                  0Bytes   35250     0          0         in progress   29986.00

Also, the tier status shows as below:

[root@dhcp37-202 ~]# gluster v tier nagvol status
    Node            Promoted files   Demoted files   Status
    ------------    --------------   -------------   -----------
    localhost       0                0               in progress
    10.70.37.195    0                0               in progress
    10.70.35.155    0                0               in progress
    10.70.35.222    0                0               in progress
    10.70.35.108    0                0               in progress
    10.70.35.44     0                0               in progress
    10.70.35.89     0                0               in progress
    10.70.35.231    0                0               in progress
    10.70.35.176    0                0               in progress
    10.70.35.232    0                0               in progress
    10.70.35.173    0                0               in progress
    10.70.35.163    0                0               in progress
    10.70.37.101    0                0               in progress
    10.70.37.69     0                0               in progress
    10.70.37.60     0                0               in progress
    10.70.37.120    0                0               in progress
Tiering Migration Functionality: nagvol: success

-> I was running some I/O, but nothing very heavy.
-> An NFS problem was also reported: music files stopped playing with "permission denied".
-> I saw file promotions happening.
-> glusterd was restarted on only one of the nodes in the last 2 days.

glusterfs-client-xlators-3.7.5-17.el7rhgs.x86_64
glusterfs-server-3.7.5-17.el7rhgs.x86_64
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.5-17.el7rhgs.x86_64
glusterfs-api-3.7.5-17.el7rhgs.x86_64
glusterfs-cli-3.7.5-17.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-17.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-17.el7rhgs.x86_64
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
python-gluster-3.7.5-16.el7rhgs.noarch
glusterfs-libs-3.7.5-17.el7rhgs.x86_64
glusterfs-fuse-3.7.5-17.el7rhgs.x86_64
glusterfs-rdma-3.7.5-17.el7rhgs.x86_64

sosreports will be attached.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-01-29 02:45:42 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'. If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Vijay Bellur on 2016-01-29 05:57:58 EST ---

REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#1) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Vijay Bellur on 2016-01-29 12:35:09 EST ---

REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#2) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Vijay Bellur on 2016-01-30 03:35:01 EST ---

REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#3) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Vijay Bellur on 2016-01-31 12:51:07 EST ---

REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#4) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Vijay Bellur on 2016-02-22 06:26:39 EST ---

REVIEW: http://review.gluster.org/13319 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#5) for review on master by mohammed rafi kc (rkavunga)

--- Additional comment from Vijay Bellur on 2016-02-23 00:42:08 EST ---

COMMIT: http://review.gluster.org/13319 committed in master by Atin Mukherjee (amukherj)
------
commit a67331f3f79e827ffa4f7a547f6898e12407bbf9
Author: Mohammed Rafi KC <rkavunga>
Date:   Fri Jan 29 16:24:02 2016 +0530

    glusterd/rebalance: initialize defrag variable after glusterd restart

    During rebalance restart after glusterd restarted, we are not connecting
    to the rebalance process from glusterd, because the defrag variable in
    volinfo will be null. Initializing the variable will connect the rpc.

    Change-Id: Id820cad6a3634a9fc976427fbe1c45844d3d4b9b
    BUG: 1303028
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/13319
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Dan Lambright <dlambrig>
    CentOS-regression: Gluster Build System <jenkins.com>
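For reference, below is a minimal, self-contained C sketch of the control flow the commit message describes, under the assumption that the problem reduces to "the reconnect path is skipped when the per-volume defrag pointer is NULL after a glusterd restart". The struct and function names (defrag_info_t, volinfo_t, reconnect_to_rebalance, init_defrag_if_needed) are illustrative stand-ins, not the real glusterd symbols; the actual change lives in glusterd's restart/reconnect path in the GlusterFS source tree.

/* Simplified model only; NOT the actual glusterd code or symbol names. */
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for glusterd's per-volume rebalance (defrag) state. */
typedef struct {
    int rpc_connected;          /* set once we have dialed the rebalance daemon */
} defrag_info_t;

typedef struct {
    const char    *volname;
    defrag_info_t *defrag;      /* NULL right after glusterd restart (the bug) */
} volinfo_t;

/* Reconnect path: silently does nothing when defrag is NULL, which is why
 * the rebalance/tier counters stopped updating for the restarted node. */
static void reconnect_to_rebalance(volinfo_t *vol)
{
    if (vol->defrag == NULL) {
        printf("%s: defrag is NULL, skipping RPC reconnect\n", vol->volname);
        return;
    }
    vol->defrag->rpc_connected = 1;
    printf("%s: reconnected to rebalance process\n", vol->volname);
}

/* The fix, in spirit: initialize defrag before attempting the reconnect
 * that happens when glusterd comes back up. */
static void init_defrag_if_needed(volinfo_t *vol)
{
    if (vol->defrag == NULL)
        vol->defrag = calloc(1, sizeof(*vol->defrag));
}

int main(void)
{
    volinfo_t vol = { .volname = "nagvol", .defrag = NULL };

    reconnect_to_rebalance(&vol);   /* pre-patch behaviour: reconnect skipped */

    init_defrag_if_needed(&vol);    /* post-patch behaviour: init, then...    */
    reconnect_to_rebalance(&vol);   /* ...the RPC reconnect can proceed       */

    free(vol.defrag);
    return 0;
}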
REVIEW: http://review.gluster.org/13494 (glusterd/rebalance: initialize defrag variable after glusterd restart) posted (#1) for review on release-3.7 by mohammed rafi kc (rkavunga)
COMMIT: http://review.gluster.org/13494 committed in release-3.7 by Atin Mukherjee (amukherj)
------
commit d9cc672719b96168c46bc82334f44efc010adad5
Author: Mohammed Rafi KC <rkavunga>
Date:   Fri Jan 29 16:24:02 2016 +0530

    glusterd/rebalance: initialize defrag variable after glusterd restart

    During rebalance restart after glusterd restarted, we are not connecting
    to the rebalance process from glusterd, because the defrag variable in
    volinfo will be null. Initializing the variable will connect the rpc.

    Backport of:
    >Change-Id: Id820cad6a3634a9fc976427fbe1c45844d3d4b9b
    >BUG: 1303028
    >Signed-off-by: Mohammed Rafi KC <rkavunga>
    >Reviewed-on: http://review.gluster.org/13319
    >Smoke: Gluster Build System <jenkins.com>
    >NetBSD-regression: NetBSD Build System <jenkins.org>
    >Reviewed-by: Dan Lambright <dlambrig>
    >CentOS-regression: Gluster Build System <jenkins.com>

    (cherry picked from commit a67331f3f79e827ffa4f7a547f6898e12407bbf9)

    Change-Id: Ieec82a798da937002e09fb9325c93678a5eefca8
    BUG: 1311041
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/13494
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>
This bug was accidentally moved from POST to MODIFIED due to an error in automation; please contact mmccune with any questions.
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.7.9, please open a new bug report.

glusterfs-3.7.9 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-March/025922.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user