Bug 916226 - Rebalance on replicate volume does not complete
Summary: Rebalance on replicate volume does not complete
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 917325 952693
 
Reported: 2013-02-27 15:08 UTC by Pranith Kumar K
Modified: 2013-07-24 18:03 UTC
CC List: 2 users

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 917325 (view as bug list)
Environment:
Last Closed: 2013-07-24 18:03:26 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2013-02-27 15:08:16 UTC
Description of problem:
With the GlusterFS test framework, executing the following script results in a rebalance that never completes.
#!/bin/bash

. $(dirname $0)/../include.rc
. $(dirname $0)/../volume.rc

cleanup;

TEST glusterd
TEST pidof glusterd

TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}0 $H0:$B0/${V0}1 $H0:$B0/${V0}2 $H0:$B0/${V0}3
TEST $CLI volume set $V0 cluster.eager-lock on
TEST $CLI volume start $V0

## Mount FUSE
TEST glusterfs -s $H0 --volfile-id $V0 $M0;

TEST mkdir $M0/dir{1..10};
TEST touch $M0/dir{1..10}/files{1..10};

# add two more bricks (a new replica pair)
TEST $CLI volume add-brick $V0 $H0:$B0/${V0}4 $H0:$B0/${V0}5

TEST $CLI volume rebalance $V0 start force
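
The script only starts the rebalance; a completion wait along the following lines is what never succeeds before the fix. This is a sketch only: the helper name and timeout are made up and are not part of the original test.

# Hypothetical completion check, for illustration: poll the rebalance
# status column until it reports "completed". Before the fix this wait
# times out because the rebalance never finishes.
function rebalance_done {
        $CLI volume rebalance $V0 status | grep -q "completed" && echo "Y" || echo "N"
}

EXPECT_WITHIN 120 "Y" rebalance_done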


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Vijay Bellur 2013-03-01 22:51:35 UTC
CHANGE: http://review.gluster.org/4588 (cluster/afr: Turn on eager-lock for fd DATA transactions) merged in master by Anand Avati (avati)

Comment 2 yin.yin 2013-03-04 08:17:09 UTC
The rebalance status check in bug-916226 is not correct:

EXPECT_WITHIN 60 "success:" rebalance_status_field $V0

The rebalance is still in progress, not finished, but the CLI already prints "success:", so the check passes prematurely. A different method should be used to verify the rebalance status.

[root@cc02 tests]# gluster volume rebalance patchy status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             1             0    in progress            56.00
volume rebalance: patchy: success:
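
One way to make the check stricter is to key off the status column instead of the trailing "success:" footer, which only reflects that the status query itself succeeded. The sketch below is illustrative only (the helper name is made up; it is not the patch referenced in the next comment):

# Count "completed" entries in the status column for the local node
# rather than matching the CLI footer.
function rebalance_status_completed {
        gluster volume rebalance $V0 status | grep localhost | grep -c "completed"
}

EXPECT_WITHIN 60 "1" rebalance_status_completed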

Comment 3 Pranith Kumar K 2013-03-04 10:52:43 UTC
(In reply to comment #2)
> The rebalance status check in bug-916226 is not correct:
> 
> EXPECT_WITHIN 60 "success:" rebalance_status_field $V0
> 
> The rebalance is still in progress, not finished, but the CLI already
> prints "success:", so the check passes prematurely. A different method
> should be used to verify the rebalance status.
> 
> [root@cc02 tests]# gluster volume rebalance patchy status
>                                     Node Rebalanced-files          size       scanned      failures         status run time in secs
>                                ---------      -----------   -----------   -----------   -----------   ------------   --------------
>                                localhost                0        0Bytes             1             0    in progress            56.00
> volume rebalance: patchy: success:

Thanks a lot for pointing this out. The following patch should address this.
http://review.gluster.org/4614

Comment 4 Anand Avati 2013-04-29 15:17:56 UTC
REVIEW: http://review.gluster.org/4899 (cluster/afr: Turn on eager-lock for fd DATA transactions) posted (#1) for review on release-3.4 by Emmanuel Dreyfus (manu)

Comment 5 Anand Avati 2013-05-07 12:00:43 UTC
COMMIT: http://review.gluster.org/4899 committed in release-3.4 by Vijay Bellur (vbellur) 
------
commit eaa3cdcb80befe3fe7c6b181672bface9d4ff539
Author: Emmanuel Dreyfus <manu>
Date:   Mon Apr 29 17:15:56 2013 +0200

    cluster/afr: Turn on eager-lock for fd DATA transactions
    
    Problem:
    With the present implementation, eager-lock is issued for
    any fd fop, so it also gets carried over to METADATA
    transactions. However, the lk-owner is set to the local->fd
    address only for DATA transactions; for METADATA transactions
    it is frame->root. Because of this, the unlock on the
    eager-lock fails and rebalance hangs.
    
    Fix:
    Enable eager-lock for fd DATA transactions
    
    This is a backport of change If30df7486a0b2f5e4150d3259d1261f81473ce8a
    http://review.gluster.org/#/c/4588/
    
    BUG: 916226
    Change-Id: Id41ac17f467c37e7fd8863e0c19932d7b16344f8
    Signed-off-by: Emmanuel Dreyfus <manu>
    Reviewed-on: http://review.gluster.org/4899
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

