Description of problem:
After adding a new brick from a new node, on a 2 node Distribute volume, rebalance was run, and allowed to be completed, as per the output of the rebalance status command. But still the gluster volume status output shows rebalance as running.

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.8rhs-1.el6rhs.x86_64
gluster-swift-container-1.4.8-4.el6.noarch
glusterfs-fuse-3.4.0.8rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.8rhs-1.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
vdsm-gluster-4.10.2-4.0.qa5.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-3.4.0.8rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.8rhs-1.el6rhs.x86_64
gluster-swift-object-1.4.8-4.el6.noarch

How reproducible:

Steps to Reproduce:
1. Create 2 node Distribute volume, start using the volume, then add another brick, and run rebalance.

Volume Name: RHEV-RHS_Dist
Type: Distribute
Volume ID: 4e1ba3b2-3d16-4c0d-b131-8aad989af138
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Brick2: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Brick3: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Options Reconfigured:
cluster.subvols-per-directory: 1
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

2. Check output of rebalance status command for rebalance completion.
#gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 2       25.0GB           19            6     completed            311.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

3. Check output of gluster volume status command, and rebalance is reported as running

[Fri May 17 16:02:30 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
Brick rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49164   Y       30715
NFS Server on localhost                                 2049    Y       31230
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       30725
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       30651
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30816

Task         ID                                      Status
----         --                                      ------
Rebalance    f6c0d972-4f79-439a-9570-974b6e7c69d8    3

The whole string of commands is given below.
------------------------------------------------------------------------
[Fri May 17 15:47:22 root@rhs-client45:~ ] #gluster volume info

Volume Name: RHEV-RHS_Dist
Type: Distribute
Volume ID: 4e1ba3b2-3d16-4c0d-b131-8aad989af138
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Brick2: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Options Reconfigured:
cluster.subvols-per-directory: 1
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[Fri May 17 15:47:30 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
NFS Server on localhost                                 2049    Y       30398
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       29855
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30021
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       29917

There are no active volume tasks

[Fri May 17 15:49:16 root@rhs-client45:~ ] #gluster volume add-brick RHEV-RHS_Dist rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
volume add-brick: success

[Fri May 17 15:49:44 root@rhs-client45:~ ] #gluster volume info

Volume Name: RHEV-RHS_Dist
Type: Distribute
Volume ID: 4e1ba3b2-3d16-4c0d-b131-8aad989af138
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Brick2: rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Brick3: rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/RHEV-RHS_Dist
Options Reconfigured:
cluster.subvols-per-directory: 1
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[Fri May 17 15:49:52 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
Brick rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49164   Y       30715
NFS Server on localhost                                 2049    Y       31230
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       30651
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       30725
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30816

There are no active volume tasks

[Fri May 17 15:49:59 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist start
volume rebalance: RHEV-RHS_Dist: success: Starting rebalance on volume RHEV-RHS_Dist has been successful.
ID: f6c0d972-4f79-439a-9570-974b6e7c69d8

[Fri May 17 15:50:52 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
Brick rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49164   Y       30715
NFS Server on localhost                                 2049    Y       31230
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       30651
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       30725
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30816

Task         ID                                      Status
----         --                                      ------
Rebalance    f6c0d972-4f79-439a-9570-974b6e7c69d8    1

[Fri May 17 15:51:05 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           5        3.0MB           11            0   in progress             20.00
rhs-client37.lab.eng.blr.redhat.com                 0       0Bytes           10            0   in progress             20.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 15:51:12 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 0       0Bytes           10            0   in progress            206.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 15:54:18 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 1       15.0GB           14            2   in progress            286.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 15:55:38 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 2       25.0GB           19            6     completed            311.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 16:02:30 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
Brick rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49164   Y       30715
NFS Server on localhost                                 2049    Y       31230
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       30725
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       30651
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30816

Task         ID                                      Status
----         --                                      ------
Rebalance    f6c0d972-4f79-439a-9570-974b6e7c69d8    3

[Fri May 17 16:11:09 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 2       25.0GB           19            6     completed            311.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 16:11:23 root@rhs-client45:~ ] #gluster volume status
Status of volume: RHEV-RHS_Dist
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30387
Brick rhs-client37.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49163   Y       30010
Brick rhs-client15.lab.eng.blr.redhat.com:/rhs/brick4/R
HEV-RHS_Dist                                            49164   Y       30715
NFS Server on localhost                                 2049    Y       31230
NFS Server on 838d97b8-6881-43ba-8f67-b0d17fea74cf      2049    Y       30816
NFS Server on f167208d-13df-4532-897f-0887204a2e39      2049    Y       30725
NFS Server on 9abcd448-f230-411c-9565-8f75a782f56a      2049    Y       30651

Task         ID                                      Status
----         --                                      ------
Rebalance    f6c0d972-4f79-439a-9570-974b6e7c69d8    3

[Fri May 17 16:20:43 root@rhs-client45:~ ] #gluster volume rebalance RHEV-RHS_Dist status
Node                                 Rebalanced-files         size      scanned     failures        status  run time in secs
---------                                 -----------  -----------  -----------  -----------  ------------    --------------
localhost                                           6       10.0GB           23            3     completed            171.00
rhs-client37.lab.eng.blr.redhat.com                 2       25.0GB           19            6     completed            311.00
rhs-client4.lab.eng.blr.redhat.com                  0       0Bytes           18            0     completed              2.00
rhs-client15.lab.eng.blr.redhat.com                 0       0Bytes           18            0     completed              2.00
volume rebalance: RHEV-RHS_Dist: success:

[Fri May 17 16:20:46 root@rhs-client45:~ ] #
------------------------------------------------------------------------

Actual results:
Inconsistency on status of the rebalance task in the output of the two commands -
gluster volume status
gluster volume rebalance <VOLUME-NAME> status

Expected results:
All status commands must be consistent and accurate in reporting status of tasks.

Additional info:
The rebalance status being shown in 'volume status' is completed. The confusion in this case arose because the rebalance status was being shown as an index number instead of a string: '3' is completed, whereas '1' is started/running. With the changes done for bug 955611, 'volume status' now shows proper strings instead of an unexplained index number. So instead of getting '3' as the rebalance status in 'volume status', you'd get 'completed'. Moving this to MODIFIED, as the changes introduced for 955611 should prevent this confusion from recurring.
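For anyone still reading output from an affected build, the two codes named above can be translated with a small helper. This is only a sketch: the function and table names are made up for illustration, and only the two values mentioned in this comment ('1' and '3') are mapped. Any other value is reported as unknown rather than guessed.

```python
# Hypothetical helper, not part of gluster. Only the two status codes
# named in this bug report are mapped; everything else stays unknown.
KNOWN_REBALANCE_STATUS = {
    1: "started/running",
    3: "completed",
}


def describe_rebalance_status(code):
    """Return a readable label for a numeric status from the old 'volume status' output."""
    try:
        return KNOWN_REBALANCE_STATUS[int(code)]
    except (KeyError, ValueError, TypeError):
        # Unmapped or non-numeric value: report it verbatim instead of guessing.
        return "unknown ({})".format(code)


if __name__ == "__main__":
    for raw in ("1", "3"):
        print(raw, "->", describe_rebalance_status(raw))
```

The status column from the old 'volume status' task table can be passed in as a string or an integer.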
The fix is available in glusterfs-3.4.0.35.1u2rhs.
Verified on glusterfs-server-3.4.0.44.1u2rhs-1.el6rhs.x86_64

The outputs are much clearer now. Examples of current outputs given below.

....................................................................
# gluster volume status
Status of volume: revol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs1:/srv/rhs/brick1/revol                        49152   Y       6835
Brick rhs2:/srv/rhs/brick1/revol                        49152   Y       6738
NFS Server on localhost                                 2049    Y       6847
NFS Server on rhs4                                      2049    Y       7583
NFS Server on rhs3                                      2049    Y       6686
NFS Server on rhs2                                      2049    Y       6750

Task Status of Volume revol
------------------------------------------------------------------------------
There are no active volume tasks

....

# gluster volume status
Status of volume: revol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs1:/srv/rhs/brick1/revol                        49152   Y       6835
Brick rhs2:/srv/rhs/brick1/revol                        49152   Y       6738
Brick rhs3:/srv/rhs/brick2/revol                        49152   Y       6820
NFS Server on localhost                                 2049    Y       7045
NFS Server on rhs4                                      2049    Y       7715
NFS Server on rhs3                                      2049    Y       6832
NFS Server on rhs2                                      2049    Y       6907

Task Status of Volume revol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 62cff49e-cd7f-417e-a1ff-7bfcd245203b
Status               : in progress

....

# gluster volume status
Status of volume: revol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs1:/srv/rhs/brick1/revol                        49152   Y       6835
Brick rhs2:/srv/rhs/brick1/revol                        49152   Y       6738
Brick rhs3:/srv/rhs/brick2/revol                        49152   Y       6820
NFS Server on localhost                                 2049    Y       7045
NFS Server on rhs4                                      2049    Y       7715
NFS Server on rhs3                                      2049    Y       6832
NFS Server on rhs2                                      2049    Y       6907

Task Status of Volume revol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 62cff49e-cd7f-417e-a1ff-7bfcd245203b
Status               : completed
....................................................................
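A check like the one above could be scripted by pulling the "Task / ID / Status" entries out of the new 'volume status' text. The sketch below assumes the output layout is exactly as pasted in this comment (a "Key : value" line per field); the function name is hypothetical and the parser works on captured text only, it does not call gluster itself.

```python
import re

# Matches "Task : ...", "ID : ..." and "Status : ..." lines from the task
# section. Header lines such as "Task Status of Volume revol" or
# "Status of volume: revol" do not match, because a colon must follow the
# key directly (optionally after whitespace).
_FIELD = re.compile(r"^\s*(Task|ID|Status)\s*:\s*(.+?)\s*$")


def parse_volume_tasks(status_output):
    """Return a list of dicts (keys: task, id, status) found in the given text."""
    tasks = []
    current = {}
    for line in status_output.splitlines():
        match = _FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2)
        # A new "Task" line starts the next entry.
        if key == "Task" and current:
            tasks.append(current)
            current = {}
        current[key.lower()] = value
    if current:
        tasks.append(current)
    return tasks


# Sample text copied from the verification output in this comment.
SAMPLE = """Status of volume: revol
Task Status of Volume revol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 62cff49e-cd7f-417e-a1ff-7bfcd245203b
Status               : completed
"""

if __name__ == "__main__":
    for task in parse_volume_tasks(SAMPLE):
        print(task["task"], task["id"], task["status"])
```

The parsed status string can then be compared against the per-node status column of 'gluster volume rebalance <VOLUME-NAME> status' to confirm the two commands agree.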
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html