| Summary: | dht self-heal overwrites layout causing 'missing' files. | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Amar Tumballi <amarts> |
| Component: | distribute | Assignee: | Anand Avati <aavati> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | low | ||
| Version: | 3.0.3 | CC: | anush, chrisw, gluster-bugs, pavan, vraman |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | --- |
| Regression: | RTP | Mount Type: | --- |

PATCH: http://patches.gluster.com/patch/3177 in master (dht: don't overwrite the layout after the subvolume expansion)

PATCH: http://patches.gluster.com/patch/3176 in release-3.0 (dht: don't overwrite the layout after the subvolume expansion)

Bug in distribute's selfheal/layout code.

BEFORE EXPANSION OF DISTRIBUTE:

    root@ubuntu:/mnt/glusterfs/acpi# getfattr -n trusted.glusterfs.dht /export/*/acpi -e hex
    getfattr: Removing leading '/' from absolute path names
    # file: export/dir1/acpi
    trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

    # file: export/dir2/acpi
    trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

    # file: export/dir3/acpi
    trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

    # file: export/dir4/acpi
    trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

    root@ubuntu:/mnt/glusterfs/acpi#

AFTER EXPANSION OF DISTRIBUTE:

    root@ubuntu:~# getfattr -n trusted.glusterfs.dht /export/*/acpi -e hex
    getfattr: Removing leading '/' from absolute path names
    # file: export/dir1/acpi
    trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

    # file: export/dir2/acpi
    trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

    # file: export/dir3/acpi
    trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

    # file: export/dir4/acpi
    trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

    /export/dir5/acpi: trusted.glusterfs.dht: No such attribute
    /export/dir6/acpi: trusted.glusterfs.dht: No such attribute

HERE IS THE TRACE LOG:

    [2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 0 - 1073741822 (type 0) from localhost2-1
    [2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 1073741823 - 2147483645 (type 0) from localhost3-1
    [2010-03-05 12:01:47] D [dht-common.c:114:dht_lookup_dir_cbk] distribute: lookup of /acpi on localhost6-1 returned error (No such file or directory)
    [2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 3221225469 - 4294967295 (type 0) from localhost1-1
    [2010-03-05 12:01:47] D [dht-common.c:114:dht_lookup_dir_cbk] distribute: lookup of /acpi on localhost5-1 returned error (No such file or directory)
    [2010-03-05 12:01:47] T [dht-layout.c:305:dht_disk_layout_merge] distribute: merged to layout: 2147483646 - 3221225468 (type 0) from localhost4-1
    [2010-03-05 12:01:47] D [dht-layout.c:589:dht_layout_normalize] distribute: path=/acpi err=No such file or directory on subvol=localhost5-1
    [2010-03-05 12:01:47] D [dht-layout.c:589:dht_layout_normalize] distribute: path=/acpi err=No such file or directory on subvol=localhost6-1
    [2010-03-05 12:01:47] D [dht-common.c:164:dht_lookup_dir_cbk] distribute: fixing assignment on /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 0 - 1073741822 on localhost1-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 1073741823 - 2147483645 on localhost2-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 2147483646 - 3221225468 on localhost3-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:371:dht_selfheal_layout_new_directory] distribute: gave fix: 3221225469 - 4294967291 on localhost4-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:269:dht_selfheal_dir_mkdir] distribute: creating directory /acpi on subvol localhost5-1
    [2010-03-05 12:01:47] T [dht-selfheal.c:269:dht_selfheal_dir_mkdir] distribute: creating directory /acpi on subvol localhost6-1
    [2010-03-05 12:01:47] T [dht-selfheal.c:174:dht_selfheal_dir_xattr] distribute: 4 subvolumes missing xattr for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 0 - 1073741822 (type 0) on subvolume localhost1-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 1073741823 - 2147483645 (type 0) on subvolume localhost2-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 2147483646 - 3221225468 (type 0) on subvolume localhost3-1 for /acpi
    [2010-03-05 12:01:47] T [dht-selfheal.c:124:dht_selfheal_dir_xattr_persubvol] distribute: setting hash range 3221225469 - 4294967295 (type 0) on subvolume localhost4-1 for /acpi

Self-heal is reassigning the layout of a directory that already had one. Check the following code:

    dht_selfheal_layout_alloc_start (xlator_t *this, loc_t *loc,
                                     dht_layout_t *layout)
    {
            int         start = 0;
            dht_conf_t *conf = NULL;
            uint32_t    hashval = 0;
            int         ret = 0;

            conf = this->private;

            ret = dht_hash_compute (layout->type, loc->path, &hashval);
            if (ret == 0) {
                    start = (hashval % layout->cnt);
            }

            return start;
    }

layout->cnt changes after expansion, so the start index computed for the same path changes too, and the existing hash ranges end up on different subvolumes than the ones holding the files.
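
To make the effect of the changing modulus concrete, here is a small standalone sketch. It is not GlusterFS code: `alloc_start` only mirrors the `hashval % layout->cnt` arithmetic quoted above, the hash value 9 is an arbitrary example, and `range_of_subvol` is a hypothetical helper that assumes ranges are handed out in order beginning at the start index and wrapping around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same arithmetic as the quoted function: start index = hashval % cnt. */
static int alloc_start(uint32_t hashval, int cnt)
{
    assert(cnt > 0); /* guard against division by zero */
    return (int)(hashval % (uint32_t)cnt);
}

/* ASSUMPTION (illustration only): range number k, with 0 being the lowest
 * hash range, goes to subvolume index (start + k) % cnt. This returns the
 * range number that subvolume index `subvol` would own. */
static int range_of_subvol(int subvol, int start, int cnt)
{
    return (subvol - start + cnt) % cnt;
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to print the two assignments */
int main(void)
{
    const uint32_t hashval = 9;    /* arbitrary value, not a real DHT hash */
    const int counts[] = { 4, 6 }; /* subvolume count before and after */

    for (int c = 0; c < 2; c++) {
        int cnt = counts[c];
        int start = alloc_start(hashval, cnt);

        printf("cnt=%d start=%d:", cnt, start);
        for (int s = 0; s < cnt; s++)
            printf(" subvol%d->range%d", s + 1,
                   range_of_subvol(s, start, cnt));
        printf("\n");
    }
    return 0;
}
#endif
```

With the same hash value, the start index is 1 for four subvolumes and 3 for six, so every range moves to a different subvolume. Under the same assumption, the "before" xattrs above (lowest range on dir2) are consistent with a start index of 1, and the "after" xattrs (lowest range on dir1) with a start index of 0.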