Bug 956276 - DHT - rebalance - If no. of 'subvols-per-directory' is less than no. of up sub-volumes then log of each rebalance run says 'found anomalies in <dir>. holes=#X overlaps=#Y'
Summary: DHT - rebalance - If no. of 'subvols-per-directory' is less than no. of up su...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: amainkar
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-04-24 15:00 UTC by Rachana Patel
Modified: 2015-04-20 13:49 UTC
CC: 7 users

Fixed In Version: glusterfs-3.4.0.36rhs-1
Doc Type: Bug Fix
Doc Text:
Previously, after adding a brick, the DHT (distribute) layout used to be rewritten without triggering a rebalance. With this update, adding bricks no longer triggers a layout rewrite, which resolves the problem of confusing logs.
Clone Of:
Environment:
Last Closed: 2013-11-27 15:24:44 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1769 0 normal SHIPPED_LIVE Red Hat Storage 2.1 enhancement and bug fix update #1 2013-11-27 20:17:39 UTC

Description Rachana Patel 2013-04-24 15:00:16 UTC
Description of problem:
DHT - rebalance - If no. of 'subvols-per-directory' is less than no. of up sub-volumes then log of each rebalance run says 'found anomalies in <dir>. holes=#X overlaps=#Y'

Version-Release number of selected component (if applicable):
3.4.0.1rhs-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a Distributed volume having 2 or more sub-volumes and start the volume.
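Step 1 can be sketched as below; the hostnames, brick paths, and three-brick count are hypothetical, and a running glusterd on each node is assumed:

```shell
# Hypothetical hosts and brick paths; adjust to the actual cluster.
gluster volume create v1 \
    server1:/rhs/brick1/v1 server2:/rhs/brick1/v1 server3:/rhs/brick1/v1
gluster volume start v1
gluster volume info v1    # confirm Type: Distribute and the brick count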

2. FUSE-mount the volume from client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>"

mount -t glusterfs XXX:/<volname> /mnt/XXX

3. Change the 'subvols-per-directory' option and make sure it is less than the no. of up sub-volumes.
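For example, with three up sub-volumes the option can be set as below (the value 1 is just an illustrative choice; any value below the up sub-volume count reproduces the condition):

```shell
# Set 'subvols-per-directory' lower than the number of up sub-volumes.
gluster volume set v1 cluster.subvols-per-directory 1
gluster volume info v1    # option should appear under 'Options Reconfigured'
```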

4. Run the rebalance command for that volume with the force option and check its status.


5. Check the log for that rebalance run.
The log will have entries like 'found anomalies in <dir>. holes=#X overlaps=#Y' for all directories,

and it will fix the layout for those directories. (expected behaviour)

On the backend, verify that the layout has been fixed for all directories.
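Steps 4 and 5 combined, using the same commands that appear in the transcripts below:

```shell
gluster volume rebalance v1 start force
gluster volume rebalance v1 status
# Step 5: scan the rebalance log for the anomaly messages.
grep 'found anomalies' /var/log/glusterfs/v1-rebalance.log
```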


[root@mia ~]# gluster volume rebalance v1 status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost               25        0Bytes           101             0      completed             1.00
             fred.lab.eng.blr.redhat.com               31        0Bytes            82             0      completed             1.00
             fred.lab.eng.blr.redhat.com               31        0Bytes            82             0      completed             1.00
              fan.lab.eng.blr.redhat.com                9        0Bytes            91             0      completed             1.00
volume rebalance: v1: success: 

[root@mia ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fan ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000000000000000000000
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fred ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3
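The trusted.glusterfs.dht values above can be decoded by hand. A minimal sketch, assuming the usual 16-byte big-endian [count][type][start][stop] encoding of a single-range layout xattr (parse_dht_xattr is a hypothetical helper name):

```python
import struct

def parse_dht_xattr(hex_value):
    """Decode a 16-byte trusted.glusterfs.dht value into
    (count, type, start, stop); four big-endian 32-bit fields."""
    return struct.unpack(">IIII", bytes.fromhex(hex_value))

# Values copied from the getfattr output above (leading 0x stripped).
for host, val in [
    ("mia",  "0000000100000000000000007ffffffe"),
    ("fan",  "00000001000000000000000000000000"),
    ("fred", "00000001000000007fffffffffffffff"),
]:
    cnt, typ, start, stop = parse_dht_xattr(val)
    print(f"{host}: range 0x{start:08x} - 0x{stop:08x}")
```

Decoded this way, mia and fred together cover the full 32-bit hash range, while fan carries an all-zero range (the sub-volume excluded by 'subvols-per-directory'), which is the kind of layout that dht_layout_normalize reports as an anomaly.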

less /var/log/glusterfs/v1-rebalance.log
.....
[2013-04-24 10:12:06.271006] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1
[2013-04-24 10:12:06.272652] I [dht-common.c:2542:dht_setxattr] 0-v1-dht: fixing the layout of /
[2013-04-24 10:12:06.274116] I [dht-rebalance.c:1069:gf_defrag_migrate_data] 0-v1-dht: migrate data called on /
[2013-04-24 10:12:06.330302] I [dht-rebalance.c:1268:gf_defrag_migrate_data] 0-v1-dht: Migration operation on dir / took 0.06 secs
......



6. Don't do any I/O on the mount point (don't access any file or dir).

Now execute the same command again and check the output.
Once the rebalance is completed, check the log file.
On the backend, verify that the layout has not changed for any directory.


[root@mia ~]# gluster volume rebalance v1 start force
volume rebalance: v1: success: Starting rebalance on volume v1 has been successful.
ID: a4ec8573-0171-4f2c-9320-acb706918bad
[root@mia ~]# gluster volume rebalance v1 status
                                    Node Rebalanced-files          size       scanned      failures         status run time in secs
                               ---------      -----------   -----------   -----------   -----------   ------------   --------------
                               localhost                0        0Bytes            80             0      completed             1.00
             fred.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
             fred.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
              fan.lab.eng.blr.redhat.com                0        0Bytes            80             0      completed             0.00
volume rebalance: v1: success: 
[root@mia ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fan ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000000000000000000000
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

[root@fred ~]# getfattr -d -m . -e hex /rhs/brick1/v1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/v1
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.volume-id=0x3315a3d66d524dbcbe69670bc844aee3

less /var/log/glusterfs/v1-rebalance.log
....
[2013-04-24 10:16:06.438141] I [dht-layout.c:624:dht_layout_normalize] 0-v1-dht: found anomalies in /. holes=1 overlaps=1
[2013-04-24 10:16:06.439634] I [dht-common.c:2542:dht_setxattr] 0-v1-dht: fixing the layout of /
[2013-04-24 10:16:06.441094] I [dht-rebalance.c:1069:gf_defrag_migrate_data] 0-v1-dht: migrate data called on /
[2013-04-24 10:16:06.498667] I [dht-rebalance.c:1268:gf_defrag_migrate_data] 0-v1-dht: Migration operation on dir / took 0.06 secs
....

Actual results:
The log contains entries like 'found anomalies in <dir>. holes=#X overlaps=#Y' for every directory on each rebalance run, even when nothing has changed.


Expected results:
Once the layout has been fixed and nothing has changed (no new directories, no change to 'subvols-per-directory', no add-brick or remove-brick), subsequent rebalance runs should not report holes or overlaps.

Additional info:

Comment 3 Amar Tumballi 2013-09-16 12:22:33 UTC
Ravi, can you try with patch http://review.gluster.org/5762 for this?

Comment 4 Amar Tumballi 2013-09-16 13:31:17 UTC
Considering this is a medium priority issue with log level mismatches, should we remove 'Big Bend U1' from Internal whiteboard?

Comment 5 Ravishankar N 2013-09-17 10:08:00 UTC
Tested on glusterfs 3.4.0.33rhs; unable to reproduce this issue (even without patch mentioned in comment # 3). Moving it to ON_QA so that it can be verified and closed.

Comment 6 Rachana Patel 2013-09-26 06:26:08 UTC
Unable to reproduce this with glusterfs 3.4.0.33rhs; hence moving it to Verified.

Comment 8 errata-xmlrpc 2013-11-27 15:24:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html

