Bug 762596 (GLUSTER-864) - dht self-heal overwrites layout causing 'missing' files.
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-864
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.0.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Anand Avati
Reported: 2010-04-27 15:21 UTC by Amar Tumballi
Modified: 2015-12-01 16:45 UTC (History)
Doc Type: Bug Fix
Regression: RTP

Description Amar Tumballi 2010-04-27 15:21:53 UTC
Bug in distribute's selfheal/layout code.

BEFORE EXPANSION OF DISTRIBUTE:

root@ubuntu:/mnt/glusterfs/acpi# getfattr -n trusted.glusterfs.dht /export/*/acpi -e hex
getfattr: Removing leading '/' from absolute path names
# file: export/dir1/acpi
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

# file: export/dir2/acpi
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

# file: export/dir3/acpi
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

# file: export/dir4/acpi
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

root@ubuntu:/mnt/glusterfs/acpi#

AFTER EXPANSION OF DISTRIBUTE:

root@ubuntu:~# getfattr -n trusted.glusterfs.dht /export/*/acpi -e hex
getfattr: Removing leading '/' from absolute path names
# file: export/dir1/acpi
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

# file: export/dir2/acpi
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

# file: export/dir3/acpi
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

# file: export/dir4/acpi
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

/export/dir5/acpi: trusted.glusterfs.dht: No such attribute
/export/dir6/acpi: trusted.glusterfs.dht: No such attribute
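
The hex values in the listings above are easier to compare with the trace log once they are decoded. The following is a minimal sketch, assuming the xattr holds four 32-bit big-endian words in the order count, type, start, stop. That field order is inferred from the values in this report (for example, 0xbffffffd to 0xffffffff matches the range 3221225469 - 4294967295 in the trace), not taken from the DHT source, and decode_dht_xattr is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Decode a hex dump of trusted.glusterfs.dht as printed by `getfattr -e hex`.
 * ASSUMPTION: the value is four 32-bit big-endian words in the order
 * count, type, start, stop (inferred from this report, not from the source).
 * Returns 0 on success, -1 if the string does not parse. */
int
decode_dht_xattr (const char *hex, uint32_t fields[4])
{
        int n = sscanf (hex,
                        "0x%8" SCNx32 "%8" SCNx32 "%8" SCNx32 "%8" SCNx32,
                        &fields[0], &fields[1], &fields[2], &fields[3]);

        return (n == 4) ? 0 : -1;
}

/* Print the decoded hash range for one brick directory. */
void
print_dht_range (const char *brick, const char *hex)
{
        uint32_t f[4];

        if (decode_dht_xattr (hex, f) != 0) {
                printf ("%s: could not parse xattr\n", brick);
                return;
        }
        printf ("%s: %" PRIu32 " - %" PRIu32 " (type %" PRIu32 ")\n",
                brick, f[2], f[3], f[1]);
}
```

Calling print_dht_range on the value of export/dir1/acpi before and after expansion makes the change visible as decimal ranges, in the same form the trace log uses.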

HERE IS THE TRACE LOG:

[2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 0 - 1073741822 (type 0) from localhost2-1
[2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 1073741823 - 2147483645 (type 0) from localhost3-1
[2010-03-05 12:01:47] D [dht-common.c:114:dht_lookup_dir_cbk] distribute: lookup of /acpi on localhost6-1 returned error (No such file or directory)
[2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 3221225469 - 4294967295 (type 0) from localhost1-1
[2010-03-05 12:01:47] D [dht-common.c:114:dht_lookup_dir_cbk] distribute: lookup of /acpi on localhost5-1 returned error (No such file or directory)
[2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 2147483646 - 3221225468 (type 0) from localhost4-1
[2010-03-05 12:01:47] D [dht-layout.c:589:dht_layout_normalize] distribute: path=/acpi err=No such file or directory on subvol=localhost5-1
[2010-03-05 12:01:47] D [dht-layout.c:589:dht_layout_normalize] distribute: path=/acpi err=No such file or directory on subvol=localhost6-1
[2010-03-05 12:01:47] D [dht-common.c:164:dht_lookup_dir_cbk] distribute: fixing assignment on /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 0 - 1073741822 on localhost1-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 1073741823 - 2147483645 on localhost2-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 2147483646 - 3221225468 on localhost3-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 3221225469 - 4294967291 on localhost4-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:269:dht_selfheal_dir_mkdir] distribute: creating directory /acpi on subvol localhost5-1
[2010-03-05 12:01:47] T [dht-selfheal.c:269:dht_selfheal_dir_mkdir] distribute: creating directory /acpi on subvol localhost6-1
[2010-03-05 12:01:47] T [dht-selfheal.c:174:dht_selfheal_dir_xattr] distribute: 4 subvolumes missing xattr for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 0 - 1073741822 (type 0) on subvolume localhost1-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 1073741823 - 2147483645 (type 0) on subvolume localhost2-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 2147483646 - 3221225468 (type 0) on subvolume localhost3-1 for /acpi
[2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 3221225469 - 4294967295 (type 0) on subvolume localhost4-1 for /acpi

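Self-heal is reassigning the layout: the range 0 - 1073741822 was read from disk on localhost2-1, but the fix hands that same range to localhost1-1, and the other ranges shift with it.

Check the following code: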

int
dht_selfheal_layout_alloc_start (xlator_t *this, loc_t *loc,
                                dht_layout_t *layout)
{
       int           start = 0;
       dht_conf_t   *conf = NULL;
       uint32_t      hashval = 0;
       int           ret = 0;

       conf = this->private;

       ret = dht_hash_compute (layout->type, loc->path, &hashval);
       if (ret == 0) {
               /* start index depends on the current subvolume count */
               start = (hashval % layout->cnt);
       }

       return start;
}

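layout->cnt changes after expansion (the volume grew from four to six subvolumes here), so hashval % layout->cnt can yield a different start index for the same path.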
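
To make the effect of the modulo concrete, here is a minimal sketch that isolates the arithmetic from the function quoted above. alloc_start is a hypothetical stand-in, and the hash value used in the note below is made up for illustration; it is not the real hash of /acpi.

```c
#include <stdint.h>

/* Same arithmetic as the quoted function: choose the index of the
 * subvolume that receives the first hash range.
 * ASSUMPTION: cnt is positive; the caller supplies the hash value. */
int
alloc_start (uint32_t hashval, int cnt)
{
        return (int) (hashval % (uint32_t) cnt);
}
```

With an illustrative hash of 10, alloc_start (10, 4) is 2 and alloc_start (10, 6) is 4, so the first range lands on a different subvolume once the count grows, even though the path and its hash are unchanged.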

Comment 1 Anand Avati 2010-04-30 03:10:22 UTC
PATCH: http://patches.gluster.com/patch/3177 in master (dht: don't overwrite the layout after the subvolume expansion)

Comment 2 Anand Avati 2010-04-30 03:10:30 UTC
PATCH: http://patches.gluster.com/patch/3176 in release-3.0 (dht: don't overwrite the layout after the subvolume expansion)

