Bug 916226

Summary: Rebalance on replicate volume does not complete
Product: [Community] GlusterFS Reporter: Pranith Kumar K <pkarampu>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: mainline CC: gluster-bugs, yinyin2010
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 917325 (view as bug list) Environment:
Last Closed: 2013-07-24 18:03:26 UTC Type: Bug
Regression: --- Mount Type: ---
Bug Depends On:    
Bug Blocks: 917325, 952693    

Description Pranith Kumar K 2013-02-27 15:08:16 UTC
Description of problem:
With the gluster test framework, if we execute the following script, the rebalance never completes.
#!/bin/bash

. $(dirname $0)/../include.rc
. $(dirname $0)/../volume.rc

cleanup;

TEST glusterd
TEST pidof glusterd

TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}0 $H0:$B0/${V0}1 $H0:$B0/${V0}2 $H0:$B0/${V0}3
TEST $CLI volume set $V0 cluster.eager-lock on
TEST $CLI volume start $V0

## Mount FUSE
TEST glusterfs -s $H0 --volfile-id $V0 $M0;

TEST mkdir $M0/dir{1..10};
TEST touch $M0/dir{1..10}/files{1..10};

# add a new replica pair of bricks
TEST $CLI volume add-brick $V0 $H0:$B0/${V0}4 $H0:$B0/${V0}5

TEST $CLI volume rebalance $V0 start force
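
For reference, here is a minimal sketch of how such a script is typically run with the gluster test framework; the file name and location below are assumptions for illustration, not part of this report.

# Assuming the script above is saved as tests/bugs/bug-916226.t in a
# glusterfs source checkout:
cd /path/to/glusterfs
prove -vf tests/bugs/bug-916226.t
# or run the entire regression suite:
./run-tests.sh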



Comment 1 Vijay Bellur 2013-03-01 22:51:35 UTC
CHANGE: http://review.gluster.org/4588 (cluster/afr: Turn on eager-lock for fd DATA transactions) merged in master by Anand Avati (avati)

Comment 2 yin.yin 2013-03-04 08:17:09 UTC
The rebalance status check in bug-916226 is not correct:

EXPECT_WITHIN 60 "success:" rebalance_status_field $V0

The rebalance is still in progress, not finished, yet the command already prints "success:", so the check passes too early. Another method should be used to check the rebalance status.

[root@cc02 tests]# gluster volume rebalance patchy status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes             1             0    in progress            56.00
volume rebalance: patchy: success:
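
A minimal sketch of an alternative check, for illustration only (the actual test fix is the patch referenced in the next comment); the helper name and polling loop are hypothetical, and it keys off the per-node status column instead of the trailing "success:" line:

#!/bin/bash
# Hypothetical helper: treat rebalance as done only when the per-node status
# column reads "completed", instead of matching the trailing
# "volume rebalance: <vol>: success:" line, which is printed even while the
# rebalance is still in progress.
rebalance_completed () {
    local vol=$1
    # Still running if any node reports "in progress".
    gluster volume rebalance "$vol" status | grep -q "in progress" && return 1
    # Done once a node reports "completed".
    gluster volume rebalance "$vol" status | grep -q "completed"
}

# Example: poll for up to 120 seconds before giving up.
for i in $(seq 1 120); do
    rebalance_completed patchy && break
    sleep 1
done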

Comment 3 Pranith Kumar K 2013-03-04 10:52:43 UTC
(In reply to comment #2)
> The rebalance status check in bug-916226 is not correct:
> 
> EXPECT_WITHIN 60 "success:" rebalance_status_field $V0
> 
> The rebalance is still in progress, not finished, yet the command already
> prints "success:", so the check passes too early. Another method should be
> used to check the rebalance status.
> 
> [root@cc02 tests]# gluster volume rebalance patchy status
>                                     Node Rebalanced-files          size       scanned      failures         status run time in secs
>                                ---------      -----------   -----------   -----------   -----------   ------------   --------------
>                                localhost                0        0Bytes             1             0    in progress            56.00
> volume rebalance: patchy: success:

Thanks a lot for pointing this out. The following patch should address this.
http://review.gluster.org/4614

Comment 4 Anand Avati 2013-04-29 15:17:56 UTC
REVIEW: http://review.gluster.org/4899 (cluster/afr: Turn on eager-lock for fd DATA transactions) posted (#1) for review on release-3.4 by Emmanuel Dreyfus (manu)

Comment 5 Anand Avati 2013-05-07 12:00:43 UTC
COMMIT: http://review.gluster.org/4899 committed in release-3.4 by Vijay Bellur (vbellur) 
------
commit eaa3cdcb80befe3fe7c6b181672bface9d4ff539
Author: Emmanuel Dreyfus <manu>
Date:   Mon Apr 29 17:15:56 2013 +0200

    cluster/afr: Turn on eager-lock for fd DATA transactions
    
    Problem:
    With the present implementation, eager-lock is issued for any fd
    fop, so the eager-lock also gets applied to METADATA transactions.
    However, the lk-owner is set to the local->fd address only for DATA
    transactions; for METADATA transactions it is frame->root. Because
    of this mismatch, the unlock of the eager-lock fails and rebalance
    hangs.
    
    Fix:
    Enable eager-lock for fd DATA transactions
    
    This is a backport of change If30df7486a0b2f5e4150d3259d1261f81473ce8a
    http://review.gluster.org/#/c/4588/
    
    BUG: 916226
    Change-Id: Id41ac17f467c37e7fd8863e0c19932d7b16344f8
    Signed-off-by: Emmanuel Dreyfus <manu>
    Reviewed-on: http://review.gluster.org/4899
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>