Bug 1285289

Summary: Files skipped during rebalance because destination brick size is lesser than source brick size
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Shruti Sampat <ssampat>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED NOTABUG
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: rgowdapp, rhs-bugs
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-28 07:15:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Shruti Sampat 2015-11-25 11:17:36 UTC
Description of problem:
-----------------------

While running rebalance on a 3x2 distributed-replicate volume, I see files being skipped. The following is from the rebalance logs:

[2015-11-25 16:29:59.914769] I [dht-rebalance.c:1002:dht_migrate_file] 0-vol-dht: /newfile_944: attempting to move from vol-replicate-1 to vol-replicate-0
[2015-11-25 16:29:59.914796] I [dht-rebalance.c:1002:dht_migrate_file] 0-vol-dht: /newfile_952: attempting to move from vol-replicate-1 to vol-replicate-0
[2015-11-25 16:29:59.921714] W [MSGID: 109023] [dht-rebalance.c:657:__dht_check_free_space] 0-vol-dht: data movement attempted from node (vol-replicate-1:142365920) with higher disk space to a node (vol-replicate-2:141751520) with lesser disk space, file { blocks:204800, name:(/newfile_927) }
[2015-11-25 16:29:59.922552] I [dht-rebalance.c:1002:dht_migrate_file] 0-vol-dht: /newfile_967: attempting to move from vol-replicate-1 to vol-replicate-2
[2015-11-25 16:29:59.931073] W [MSGID: 109023] [dht-rebalance.c:657:__dht_check_free_space] 0-vol-dht: data movement attempted from node (vol-replicate-1:142365920) with higher disk space to a node (vol-replicate-0:142161120) with lesser disk space, file { blocks:204800, name:(/newfile_952) }
[2015-11-25 16:29:59.931814] W [MSGID: 109023] [dht-rebalance.c:657:__dht_check_free_space] 0-vol-dht: data movement attempted from node (vol-replicate-1:142365920) with higher disk space to a node (vol-replicate-0:142161120) with lesser disk space, file { blocks:204800, name:(/newfile_944) }
[2015-11-25 16:29:59.939816] W [MSGID: 109023] [dht-rebalance.c:657:__dht_check_free_space] 0-vol-dht: data movement attempted from node (vol-replicate-1:142365920) with higher disk space to a node (vol-replicate-2:141751520) with lesser disk space, file { blocks:204800, name:(/newfile_967) }
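
To list every skipped file together with the free-space figures that triggered the skip, the MSGID 109023 warnings can be pulled out of the rebalance log with a small parser. The following is a minimal sketch that assumes the message layout is exactly as in the excerpt above; the names `WARNING_RE` and `parse_skip_warning` are made up for illustration and are not part of GlusterFS.

```python
import re

# Pattern for the MSGID 109023 warning, modelled on the log excerpt in this report.
# Assumption: the message layout is the same for every skipped file.
WARNING_RE = re.compile(
    r"data movement attempted from node \((?P<src>[^:()]+):(?P<src_free>\d+)\) "
    r"with higher disk space to a node \((?P<dst>[^:()]+):(?P<dst_free>\d+)\) "
    r"with lesser disk space, file \{ blocks:(?P<blocks>\d+), "
    r"name:\((?P<name>[^)]*)\) \}"
)


def parse_skip_warning(line):
    """Return a dict describing one skipped file, or None if the line is not a skip warning."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return {
        "src": match.group("src"),
        "src_free": int(match.group("src_free")),
        "dst": match.group("dst"),
        "dst_free": int(match.group("dst_free")),
        "blocks": int(match.group("blocks")),
        "name": match.group("name"),
    }


# One warning copied verbatim from the log excerpt.
SAMPLE = (
    "[2015-11-25 16:29:59.931073] W [MSGID: 109023] "
    "[dht-rebalance.c:657:__dht_check_free_space] 0-vol-dht: data movement "
    "attempted from node (vol-replicate-1:142365920) with higher disk space to a "
    "node (vol-replicate-0:142161120) with lesser disk space, "
    "file { blocks:204800, name:(/newfile_952) }"
)
```

Calling `parse_skip_warning(SAMPLE)` yields the source and destination subvolume names, their logged free-space values, and the file's block count. Informational `dht_migrate_file` lines do not match and return `None`.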

These files are present as `T' files on the destination brick.

In my opinion, the destination brick has enough space to accommodate each file, so the files should be migrated.
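
The mismatch between this expectation and the logged behaviour can be shown with arithmetic on the values from one warning. The sketch below assumes the logged figures are 512-byte blocks. The report does not state the unit, but 204800 such blocks equal 100 MiB, which matches the file size used in the reproduction steps. The second function only restates the wording of the warning (source has more free space than destination). It is not quoted from `dht-rebalance.c`, and the exact condition used there may differ.

```python
SECTOR_BYTES = 512  # assumption: logged free space and file size are in 512-byte blocks


def blocks_to_mib(blocks):
    """Convert a block count to MiB under the 512-byte assumption."""
    return blocks * SECTOR_BYTES / (1024 * 1024)


def fits_on_destination(dst_free_blocks, file_blocks):
    """The reporter's criterion: the destination has room for the file."""
    return file_blocks <= dst_free_blocks


def source_has_more_free_space(src_free_blocks, dst_free_blocks):
    """The situation the warning describes: source is emptier than destination."""
    return src_free_blocks > dst_free_blocks


# Values taken from the warning for /newfile_952.
SRC_FREE = 142365920   # vol-replicate-1
DST_FREE = 142161120   # vol-replicate-0
FILE_BLOCKS = 204800
```

For these values both `fits_on_destination(DST_FREE, FILE_BLOCKS)` and `source_has_more_free_space(SRC_FREE, DST_FREE)` are true: the file would fit, yet the source subvolume has more free space than the destination, which is the situation the warning reports when it skips the file.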

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
glusterfs-3.7.1-11.el7rhgs.x86_64

How reproducible:
-----------------
Frequently

Steps to Reproduce:
-------------------
1. Mounted a 2x2 distributed-replicate volume, running inside one container, on another container.
2. Created about 1000 files of 100 MB each.
3. Added another pair of bricks to make the volume 3x2.
4. Ran rebalance on the volume.
5. After the rebalance completed, renamed each file from the client container.
6. Started rebalance on the volume again.
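
The client-side parts of the steps above (creating the files and renaming them) can be scripted. This is a minimal sketch, not the reporter's actual procedure: the helper names, the file-name prefixes, and the mount path in the usage note are all assumptions, and the brick and rebalance operations still have to be run separately with the gluster CLI.

```python
import os


def create_files(directory, count, size_bytes, prefix="file_"):
    """Create `count` zero-filled files of `size_bytes` each and return their paths."""
    chunk = b"\0" * (1024 * 1024)  # write in 1 MiB chunks to keep memory use small
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"{prefix}{i}")
        remaining = size_bytes
        with open(path, "wb") as handle:
            while remaining > 0:
                n = min(remaining, len(chunk))
                handle.write(chunk[:n])
                remaining -= n
        paths.append(path)
    return paths


def rename_files(directory, old_prefix, new_prefix):
    """Rename every file starting with `old_prefix` and return the new names."""
    renamed = []
    # Snapshot the listing first so files renamed in this pass are not revisited.
    for name in sorted(os.listdir(directory)):
        if name.startswith(old_prefix):
            new_name = new_prefix + name[len(old_prefix):]
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
            renamed.append(new_name)
    return renamed
```

With a hypothetical mount point such as `/mnt/vol`, step 2 would be `create_files("/mnt/vol", 1000, 100 * 1024 * 1024)` and step 5 would be `rename_files("/mnt/vol", "file_", "newfile_")`.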

Actual results:
---------------
Some files were skipped during migration.

Expected results:
-----------------
Files should not be skipped as long as the destination brick has sufficient space to accommodate them.