Bug 956276

Summary: DHT - rebalance - If no. of 'subvols-per-directory' is less than no. of up sub-volumes then log of each rebalance run says 'found anomalies in <dir>. holes=#X overlaps=#Y'
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rachana Patel <racpatel>
Component: glusterfs
Assignee: Ravishankar N <ravishankar>
Status: CLOSED ERRATA
QA Contact: amainkar
Severity: medium
Docs Contact:
Priority: medium
Version: 2.1
CC: amarts, asriram, kaushal, rhs-bugs, sdharane, vagarwal, vbellur
Target Milestone: ---
Keywords: Reopened, ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.4.0.36rhs-1
Doc Type: Bug Fix
Doc Text:
Previously, after a brick was added, the internal DHT (distribute) layout was rewritten without a rebalance being triggered. With this update, adding a brick no longer triggers a layout rewrite, which resolves the confusing log messages.
Last Closed: 2013-11-27 15:24:44 UTC
Type: Bug

Description Rachana Patel 2013-04-24 15:00:16 UTC
Description of problem:
DHT rebalance: if 'subvols-per-directory' is set lower than the number of up sub-volumes, the log of every rebalance run reports 'found anomalies in <dir>. holes=#X overlaps=#Y'.

Version-Release number of selected component (if applicable):
3.4.0.1rhs-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a distributed volume with two or more sub-volumes and start it.

2. FUSE-mount the volume on client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>", for example:

mount -t glusterfs XXX:/<volname> /mnt/XXX

3. Set the 'subvols-per-directory' option to a value lower than the number of up sub-volumes.

4. Run the rebalance command for that volume with the force option and check its status.

5. Check the rebalance log. It will contain an entry like 'found anomalies in <dir>. holes=#X overlaps=#Y' for every directory, and rebalance will fix the layout of each of those directories (expected behaviour).

On the backend, verify that the layout has been fixed for all directories.


[root@mia ~]# gluster volume rebalance v1 status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost               25        0Bytes           101             0      completed             1.00
             fred.lab.eng.blr.redhat.com               31        0Bytes            82             0      completed             1.00
             fred.lab.eng.blr.redhat.com               31        0Bytes            82             0      completed             1.00
              fan.lab.eng.blr.redhat.com                9        0Bytes            91             0      completed             1.00
volume rebalance: v1: success: 

[root@mia ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fan ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000000000000000000000
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fred ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3
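
The trusted.glusterfs.dht values above are easier to compare once decoded. The sketch below assumes the value is four big-endian 32-bit integers in the order count, type, start, stop; that field order is an assumption about this GlusterFS version and is not stated in this report. The function name decode_dht_xattr is hypothetical.

```python
import struct


def decode_dht_xattr(hex_value):
    """Split a 16-byte trusted.glusterfs.dht value into four uint32 fields.

    Assumed field order (not confirmed by the report): count, type, start, stop.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    count, layout_type, start, stop = struct.unpack(">IIII", raw)
    return {"count": count, "type": layout_type, "start": start, "stop": stop}


# Values copied from the getfattr output of the three bricks.
BRICK_XATTRS = {
    "mia": "0x0000000100000000000000007ffffffe",
    "fan": "0x00000001000000000000000000000000",
    "fred": "0x00000001000000007fffffffffffffff",
}

if __name__ == "__main__":
    for host, value in BRICK_XATTRS.items():
        fields = decode_dht_xattr(value)
        print("%-4s start=0x%08x stop=0x%08x" % (host, fields["start"], fields["stop"]))
```

Under that assumption, mia holds 0x00000000 to 0x7ffffffe, fred holds 0x7fffffff to 0xffffffff, and fan holds a zero start and stop, which would be consistent with fan being left out of this directory's layout by 'subvols-per-directory'.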

less /var/log/glusterfs/v1-rebalance.log
.....
[2013-04-24 10:12:06.271006] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1
[2013-04-24 10:12:06.272652] I [dht-common.c:2542:dht_setxattr] 0-v1-dht: fixing the layout of /
[2013-04-24 10:12:06.274116] I [dht-rebalance.c:1069:gf_defrag_migrate_data] 0-v1-dht: migrate data called on /
[2013-04-24 10:12:06.330302] I [dht-rebalance.c:1268:gf_defrag_migrate_data] 0-v1-dht: Migration operation on dir / took 0.06 secs
......



6. Do not perform any I/O on the mount point (do not access any file or directory).

Run the same rebalance command again and check its output.
Once the rebalance has completed, check the log file.
On the backend, verify that the layout of the directory has not changed.


[root@mia ~]# gluster volume rebalance v1 start force
volume rebalance: v1: success: Starting rebalance on volume v1 has been successful.
ID: a4ec8573-0171-4f2c-9320-acb706918bad
[root@mia ~]# gluster volume rebalance v1 status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes            80             0      completed             1.00
             fred.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
             fred.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
              fan.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
volume rebalance: v1: success: 
[root@mia ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fan ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000000000000000000000
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fred ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

less /var/log/glusterfs/v1-rebalance.log
....
[2013-04-24 10:16:06.438141] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1
[2013-04-24 10:16:06.439634] I [dht-common.c:2542:dht_setxattr] 0-v1-dht: fixing the layout of /
[2013-04-24 10:16:06.441094] I [dht-rebalance.c:1069:gf_defrag_migrate_data] 0-v1-dht: migrate data called on /
[2013-04-24 10:16:06.498667] I [dht-rebalance.c:1268:gf_defrag_migrate_data] 0-v1-dht: Migration operation on dir / took 0.06 secs
....
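
To compare rebalance runs without reading the whole log, the anomaly entries can be filtered out programmatically. This is a minimal sketch that matches only the message format seen in the excerpts in this report; the function name find_anomalies and the regular expression are illustrative, and other GlusterFS versions may word the message differently.

```python
import re

# Matches the dht_layout_normalize message as it appears in this report.
ANOMALY_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] I \[dht-layout\.c:\d+:dht_layout_normalize\] "
    r"\S+ found anomalies in (?P<path>.+)\. "
    r"holes=(?P<holes>\d+) overlaps=(?P<overlaps>\d+)$"
)


def find_anomalies(lines):
    """Return (timestamp, path, holes, overlaps) for each anomaly log line."""
    hits = []
    for line in lines:
        match = ANOMALY_RE.match(line.strip())
        if match:
            hits.append((
                match.group("ts"),
                match.group("path"),
                int(match.group("holes")),
                int(match.group("overlaps")),
            ))
    return hits


# Lines copied from the two rebalance runs in this report.
SAMPLE = [
    "[2013-04-24 10:12:06.271006] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1",
    "[2013-04-24 10:12:06.272652] I [dht-common.c:2542:dht_setxattr] 0-v1-dht: fixing the layout of /",
    "[2013-04-24 10:16:06.438141] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1",
]

if __name__ == "__main__":
    for ts, path, holes, overlaps in find_anomalies(SAMPLE):
        print(ts, path, "holes=%d" % holes, "overlaps=%d" % overlaps)
```

For a real log, the same function can be fed an open file object, for example open("/var/log/glusterfs/v1-rebalance.log"), since it only iterates over lines.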

Actual results:
Every rebalance run logs an entry like 'found anomalies in <dir>. holes=#X overlaps=#Y' for every directory.


Expected results:
Once the layout has been fixed and nothing has changed (no new directory, no change to subvols-per-directory, no add-brick or remove-brick), a later rebalance run should not report holes or overlaps again.
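
The expectation can be illustrated with a simplified range check. The sketch below is not the dht_layout_normalize algorithm; it is a hypothetical checker that assumes the on-disk ranges are inclusive (start, stop) pairs over a 32-bit hash space and that a brick with start and stop both zero has no range assigned for the directory.

```python
HASH_MAX = 0xFFFFFFFF


def count_holes_and_overlaps(ranges):
    """Count gaps and overlaps among inclusive (start, stop) ranges.

    Simplified illustration only; not the GlusterFS implementation.
    """
    holes = 0
    overlaps = 0
    expected = 0  # next hash value that still needs an owner
    for start, stop in sorted(ranges):
        if start > expected:
            holes += 1      # nothing covers expected .. start - 1
        elif start < expected:
            overlaps += 1   # this range begins inside covered space
        expected = max(expected, stop + 1)
    if expected <= HASH_MAX:
        holes += 1          # tail of the hash space is uncovered
    return holes, overlaps


# Ranges of the two bricks that carry a non-zero range for "/" (mia and fred).
ASSIGNED = [(0x00000000, 0x7FFFFFFE), (0x7FFFFFFF, 0xFFFFFFFF)]

if __name__ == "__main__":
    print(count_holes_and_overlaps(ASSIGNED))
    # Same layout, but with fan's zero range treated as a real range.
    print(count_holes_and_overlaps(ASSIGNED + [(0, 0)]))
```

With only the two assigned ranges the checker finds no holes and no overlaps, which matches the expectation that a fixed, unchanged layout is complete. Counting the zero range as a real range makes this simple checker report an overlap; whether something similar happens inside dht_layout_normalize is not established by this report.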

Additional info:

Comment 3 Amar Tumballi 2013-09-16 12:22:33 UTC
Ravi, can you try with patch http://review.gluster.org/5762 for this?

Comment 4 Amar Tumballi 2013-09-16 13:31:17 UTC
Considering this is a medium priority issue with log level mismatches, should we remove 'Big Bend U1' from Internal whiteboard?

Comment 5 Ravishankar N 2013-09-17 10:08:00 UTC
Tested on glusterfs 3.4.0.33rhs; unable to reproduce this issue (even without the patch mentioned in comment 3). Moving it to ON_QA so that it can be verified and closed.

Comment 6 Rachana Patel 2013-09-26 06:26:08 UTC
Unable to reproduce this with glusterfs 3.4.0.33rhs, hence moving it to VERIFIED.

Comment 8 errata-xmlrpc 2013-11-27 15:24:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html