+++ This bug was initially created as a clone of Bug #1023921 +++

Description of problem:
Rebalance status does not give the correct output when glusterd is brought down and brought back up after a while.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.35.1u2rhs-1.el6rhs.x86_64
glusterfs-api-3.4.0.35.1u2rhs-1.el6rhs.x86_64
samba-glusterfs-3.6.9-160.3.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distribute volume with 2 bricks.
2. Stop glusterd on one of the nodes.
3. Start rebalance on the volume created.
4. Check the rebalance status using the command "gluster vol rebalance <vol_name> status". The following is seen in the output:

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    0          1           0          failed         0.00
10.70.37.43     0                   0Bytes    0          1           0          failed         0.00
10.70.37.75     0                   0Bytes    0          1           0          failed         0.00
volume rebalance: vol_dis: success:

5. Bring glusterd back up on the node where it was stopped.
6. Check the rebalance status again. The following is seen in the output:

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    10         0           0          completed      0.00
10.70.37.43     0                   0Bytes    10         0           0          completed      0.00
10.70.37.75     0                   0Bytes    10         0           3          completed      0.00
10.70.37.108    0                   0Bytes    10         0           2          completed      0.00
volume rebalance: vol_dis: success:

(A consolidated command sketch of steps 1-6 is given after the comments below.)

Actual results:
The rebalance status shown is the one from before glusterd was brought down.

Expected results:
It should always show the output of the last rebalance run.

Additional info:

--- Additional comment from RHEL Product and Program Management on 2013-10-28 16:24:45 IST ---

Since this issue was entered in bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from RamaKasturi on 2013-10-29 18:10:37 IST ---

The above issue is not seen in glusterfs update1.

1) The following is the output when glusterd was brought down and rebalance was run:

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    0          1           0          failed         0.00
10.70.34.85     0                   0Bytes    0          1           0          failed         0.00
10.70.34.86     0                   0Bytes    0          1           0          failed         0.00
volume rebalance: vol_dis: success:

2) The following is the output seen when glusterd is brought up and the status is checked using the command "gluster vol rebalance vol_dis status":

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    0          1           0          not started    0.00
10.70.37.43     0                   0Bytes    0          1           0          failed         0.00
10.70.37.75     0                   0Bytes    0          1           0          failed         0.00
volume rebalance: vol_dis: success:
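For reference, the "Steps to Reproduce" above correspond roughly to the following commands; the brick paths and the hostnames server1/server2 are hypothetical placeholders, and the exact service commands may differ per distribution (el6 uses 'service'):

# 1. Create and start a 2-brick distribute volume.
gluster volume create vol_dis server1:/bricks/b1 server2:/bricks/b2
gluster volume start vol_dis

# 2. Stop glusterd on one of the nodes (run on that node).
service glusterd stop

# 3. Start rebalance from a node where glusterd is still running.
gluster volume rebalance vol_dis start

# 4. Check the status; every node reports 'failed'.
gluster volume rebalance vol_dis status

# 5. Bring glusterd back up on the stopped node.
service glusterd start

# 6. Check the status again; the stale, pre-shutdown output is shown.
gluster volume rebalance vol_dis status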
--- Additional comment from RamaKasturi on 2013-10-29 19:02:58 IST ---

The following is also seen when doing the above steps:

1. Create a distribute volume with 2 bricks.
2. Add a brick to the volume.
3. Stop glusterd on one of the nodes and start rebalance.
4. The following is the output seen:

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    0          1           0          failed         0.00
10.70.37.43     0                   0Bytes    0          1           0          failed         0.00
10.70.37.75     0                   0Bytes    0          1           0          failed         0.00
volume rebalance: vol_dis: success:

5. Start glusterd on the node again and check the status. The following output comes:

[root@localhost ~]# gluster vol rebalance vol_dis status
Node            Rebalanced-files    size      scanned    failures    skipped    status         run time in secs
------------    ----------------    ------    -------    --------    -------    -----------    ----------------
localhost       0                   0Bytes    42         0           15         in progress    17.00
10.70.37.43     0                   0Bytes    60         0           2          completed      0.00
10.70.37.75     0                   0Bytes    60         0           0          completed      0.00
10.70.37.108    0                   0Bytes    1          0           0          in progress    17.00
volume rebalance: vol_dis: success:

Actual results:
After doing step 5, the rebalance process starts automatically, which it should not (a command sketch for checking this follows the comments below).

--- Additional comment from Dusmant on 2013-10-30 15:35:20 IST ---

Needed by RHSC

--- Additional comment from Kaushal on 2013-11-28 10:09:01 IST ---

Taking the bug under my name as I'm actively working on this right now. I should have done this earlier, but since I was the only one working on the RHSC dependencies at that time, I left it at that. My mistake.
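A hypothetical way to confirm the unwanted auto-start in the add-brick variant above; the exact process signature of the rebalance daemon can vary between versions:

# Run on the node where glusterd was stopped, after bringing it back up.
service glusterd start

# On an affected build, a rebalance daemon reappears even though no new
# rebalance was requested; on a correctly behaving build neither check
# shows one.
ps -ef | grep '[r]ebalance'
gluster volume status vol_dis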
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#2) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#3) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#4) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#5) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#6) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/6334 (glusterd: Improve rebalance handling during volume sync) posted (#7) for review on master by Kaushal M (kaushal)
COMMIT: http://review.gluster.org/6334 committed in master by Vijay Bellur (vbellur)

------

commit cb44756616f2ef9a6480adf104efa108300b06c3
Author: Kaushal M <kaushal>
Date:   Fri Nov 22 11:27:14 2013 +0530

    glusterd: Improve rebalance handling during volume sync

    Glusterd will now correctly copy existing rebalance information when a
    volinfo is updated during volume sync. If the existing rebalance
    information was stale, then any existing rebalance process will be
    terminated. A new rebalance process will be started only if there is
    no existing rebalance process. The rebalance process will not be
    started if the existing rebalance session had completed, failed or
    been stopped.

    Change-Id: I68c5984267c188734da76770ba557662d4ea3ee0
    BUG: 1036464
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/6334
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
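A rough post-fix sanity check, following the behaviour described in the commit message; the volume name is a placeholder and the wait between steps is elided:

VOL=vol_dis

# Run a rebalance session and let it finish (completed, failed or stopped).
gluster volume rebalance $VOL start
gluster volume rebalance $VOL status    # repeat until no longer 'in progress'

# Restart glusterd; with the fix, no new rebalance process should be
# spawned for a session that already finished, and the status is kept.
service glusterd restart
ps -ef | grep '[r]ebalance'
gluster volume rebalance $VOL status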
REVIEW: http://review.gluster.org/6564 (glusterd: Improve rebalance handling during volume sync) posted (#1) for review on release-3.5 by Krishnan Parthasarathi (kparthas)
COMMIT: http://review.gluster.org/6564 committed in release-3.5 by Vijay Bellur (vbellur)

------

commit b07107511c51ae518a1a952ff9c223673cd218a8
Author: Krishnan Parthasarathi <kparthas>
Date:   Mon Dec 23 14:07:50 2013 +0530

    glusterd: Improve rebalance handling during volume sync

    Backport of http://review.gluster.org/6334

    Glusterd will now correctly copy existing rebalance information when a
    volinfo is updated during volume sync. If the existing rebalance
    information was stale, then any existing rebalance process will be
    terminated. A new rebalance process will be started only if there is
    no existing rebalance process. The rebalance process will not be
    started if the existing rebalance session had completed, failed or
    been stopped.

    Change-Id: I68c5984267c188734da76770ba557662d4ea3ee0
    BUG: 1036464
    Signed-off-by: Kaushal M <kaushal>
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6564
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/
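Before re-testing, it may help to confirm the installed version first; a minimal check (package names may differ per distribution):

gluster --version          # should report glusterfs 3.6.0beta1 or later
rpm -q glusterfs-server    # on RPM-based distributions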
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users