Bug 764116 (GLUSTER-2384) - volume rebalance is unsuccessful
Summary: volume rebalance is unsuccessful
Keywords:
Status: CLOSED DUPLICATE of bug 763990
Alias: GLUSTER-2384
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.1.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-02-07 12:41 UTC by Saurabh
Modified: 2013-12-19 00:06 UTC
CC: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
attachment of glusterd.vol.log file from the brick (166.39 KB, text/x-log)
2011-02-07 09:44 UTC, Saurabh

Description Saurabh 2011-02-07 09:44:08 UTC
Created attachment 432

Comment 1 Saurabh 2011-02-07 09:45:06 UTC
Assigning to Amar, as discussed with him.

Comment 2 Saurabh 2011-02-07 12:41:05 UTC
Hello,
 

   I have a distribute-replicate volume to which I added bricks, but the rebalance after the addition is unsuccessful.


  ---- it is a distribute-replicate volume-------


gluster> volume info

Volume Name: repdist
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist
Brick2: domU-12-31-39-02-9A-DC.compute-1.internal:/mnt/repdist1
Brick3: domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist
Brick4: domU-12-31-39-03-6D-DE.compute-1.internal:/mnt/repdist1
Options Reconfigured:
diagnostics.brick-log-level: WARNING
diagnostics.client-log-level: DEBUG


----- added the bricks here------------
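(The exact add-brick command is not captured in this transcript; judging from the before/after volume info, it was presumably something along the lines of the command below. This is a reconstruction, not pasted output.)

gluster> volume add-brick repdist domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist1 domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist2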

gluster> volume info repdist

Volume Name: repdist
Type: Distributed-Replicate
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist
Brick2: domU-12-31-39-02-9A-DC.compute-1.internal:/mnt/repdist1
Brick3: domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist
Brick4: domU-12-31-39-03-6D-DE.compute-1.internal:/mnt/repdist1
Brick5: domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist1
Brick6: domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist2
Options Reconfigured:
diagnostics.brick-log-level: WARNING
diagnostics.client-log-level: DEBUG
gluster> volume rebalance repdist start
starting rebalance on volume repdist has been unsuccessful
Rebalance already started on volume repdist
gluster> volume rebalance repdist status
rebalance not started
gluster>



gluster> volume rebalance repdist stop
stopped rebalance process of volume repdist 

------- rebalance fails -------------

gluster> volume remove-brick repdist domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist1 domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist2
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick successful
gluster> volume info repdist

Volume Name: repdist
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: domU-12-31-39-02-75-92.compute-1.internal:/mnt/repdist
Brick2: domU-12-31-39-02-9A-DC.compute-1.internal:/mnt/repdist1
Brick3: domU-12-31-39-03-B4-C0.compute-1.internal:/mnt/repdist
Brick4: domU-12-31-39-03-6D-DE.compute-1.internal:/mnt/repdist1
gluster> volume rebalance repdist start
starting rebalance on volume repdist has been unsuccessful
Rebalance already started on volume repdist
gluster> volume rebalance repdist status
rebalance not started
gluster>



  I have already tried stopping and restarting glusterd, but the issue remains the same.
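(A minimal sketch of the restart step, assuming a SysV-style init script like the one shown later in comment 3; the exact commands on these hosts may differ:)

# run on each server in the cluster
/etc/init.d/glusterd stop
/etc/init.d/glusterd start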


   As per a recent discussion with Amar, he pointed out that this may be an issue similar to bug 1922; however, I am filing this bug for further debugging of the present scenario.

Comment 3 stsmith 2011-02-15 15:44:29 UTC
I encountered a similar issue:

root@iss01:~# gluster volume rebalance image-service start
starting rebalance on volume image-service has been unsuccessful

I didn't notice any relevant messages in the logs. After googling, I guessed that the issue might be caused by time not being in sync across my peers. I then ran ntpdate manually and installed ntp on each peer. After restarting glusterd on each peer, I was able to rebalance (a rough sketch of the time-sync commands follows the transcript below):

root@iss01:~# gluster volume rebalance image-service start
starting rebalance on volume image-service has been unsuccessful
Rebalance already started on volume image-service
root@iss01:~# /etc/init.d/glusterd restart
 * Stopping glusterd service glusterd
   ...done.
 * Starting glusterd service glusterd
   ...done.
root@iss01:~# gluster volume rebalance image-service status
rebalance not started
root@iss01:~# gluster volume rebalance image-service start
starting rebalance on volume image-service has been successful
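(A rough sketch of the time-sync step mentioned above; the NTP server and the package manager are assumptions based on the Ubuntu/Debian-style init output, not commands copied from my shell history:)

# run on each peer: step the clock once, then install/start ntpd to keep it in sync
ntpdate pool.ntp.org
apt-get install ntp
/etc/init.d/ntp start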

Thanks,
Stephen

Comment 4 Amar Tumballi 2011-02-23 09:31:46 UTC
With the rebalance enhancements, this issue will be solved.

*** This bug has been marked as a duplicate of bug 2258 ***

