Bug 1221656
| Summary: | rebalance failing on one of the nodes | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | SATHEESARAN <sasundar> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 3.7.0 | CC: | bugs, gluster-bugs, nbalacha, rgowdapp, srangana | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.7.2 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1221696 1227262 (view as bug list) | Environment: | ||
| Last Closed: | 2015-06-20 09:48:17 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1221696, 1225997 | |||
| Bug Blocks: | 1227206, 1227262 | |||
Description
SATHEESARAN
2015-05-14 13:55:38 UTC
[root@~]# gluster volume rebalance vmstore start
volume rebalance: vmstore: success: Rebalance on vmstore has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 9372b71c-e6f4-44fb-a2e4-9707443f3457
[root@ ~]# gluster volume rebalance vmstore status
Node           Rebalanced-files    size      scanned    failures    skipped    status       run time in secs
-----------    ----------------    ------    -------    --------    -------    ---------    ----------------
localhost      0                   0Bytes    3          0           1          completed    1.00
10.70.37.58    0                   0Bytes    0          3           0          failed       0.00
volume rebalance: vmstore: success:
<snip_rebalance_logs>
[2015-05-14 19:17:41.419890] I [dht-rebalance.c:2112:gf_defrag_process_dir] 0-vmstore-dht: migrate data called on /
[2015-05-14 19:17:41.424661] I [dht-common.c:3539:dht_setxattr] 0-vmstore-dht: fixing the layout of /.trashcan
[2015-05-14 19:17:41.424688] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 0 (vmstore-replicate-0): 101834 chunks
[2015-05-14 19:17:41.424699] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 1 (vmstore-replicate-1): 101834 chunks
[2015-05-14 19:17:41.424708] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 2 (vmstore-replicate-2): 101834 chunks
[2015-05-14 19:17:41.434411] I [dht-rebalance.c:2112:gf_defrag_process_dir] 0-vmstore-dht: migrate data called on /.trashcan
[2015-05-14 19:17:41.446254] I [dht-common.c:3539:dht_setxattr] 0-vmstore-dht: fixing the layout of /.trashcan/internal_op
[2015-05-14 19:17:41.446279] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 0 (vmstore-replicate-0): 101834 chunks
[2015-05-14 19:17:41.446290] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 1 (vmstore-replicate-1): 101834 chunks
[2015-05-14 19:17:41.446298] I [dht-selfheal.c:1494:dht_fix_layout_of_directory] 0-vmstore-dht: subvolume 2 (vmstore-replicate-2): 101834 chunks
[2015-05-14 19:17:41.453365] I [dht-rebalance.c:2112:gf_defrag_process_dir] 0-vmstore-dht: migrate data called on /.trashcan/internal_op
[2015-05-14 19:17:41.458214] I [dht-common.c:3539:dht_setxattr] 0-vmstore-dht: fixing the layout of /.trashcan/internal_op
[2015-05-14 19:17:41.458542] E [dht-rebalance.c:2368:gf_defrag_settle_hash] 0-vmstore-dht: fix layout on /.trashcan/internal_op failed
[2015-05-14 19:17:41.458824] E [MSGID: 109016] [dht-rebalance.c:2528:gf_defrag_fix_layout] 0-vmstore-dht: Fix layout failed for /.trashcan
</snip_rebalance_logs>
Following is the mail conversation from Nithya to gluster-devel for this issue:

<snip>
The rebalance failure is due to the interaction of the lookup-unhashed changes and rebalance local crawl changes.
</snip>

REVIEW: http://review.gluster.org/10788 (dht/rebalance : Fixed rebalance failure) posted (#1) for review on release-3.7 by N Balachandran (nbalacha)

REVIEW: http://review.gluster.org/10788 (dht/rebalance : Fixed rebalance failure) posted (#3) for review on release-3.7 by Shyamsundar Ranganathan (srangana)

*** Bug 1225997 has been marked as a duplicate of this bug. ***

The required changes to fix this bug have not made it into glusterfs-3.7.1. This bug is now getting tracked for glusterfs-3.7.2.

REVIEW: http://review.gluster.org/10788 (dht/rebalance : Fixed rebalance failure) posted (#5) for review on release-3.7 by N Balachandran (nbalacha)

REVIEW: http://review.gluster.org/10788 (dht/rebalance : Fixed rebalance failure) posted (#6) for review on release-3.7 by Shyamsundar Ranganathan (srangana)

COMMIT: http://review.gluster.org/10788 committed in release-3.7 by Raghavendra G (rgowdapp)
------
commit 3e8f9c1da61bf70ed635a655e966df574d1e15cd
Author: Nithya Balachandran <nbalacha>
Date: Thu May 14 19:33:44 2015 +0530

    dht/rebalance : Fixed rebalance failure

    The rebalance process determines the local subvols for the node it is
    running on and only acts on files in those subvols. If a dist-rep or
    dist-disperse volume is created on 2 nodes by dividing the bricks
    equally across the nodes, one process might determine it has no
    local_subvols.

    When trying to update the commit hash, the function attempts to lock
    all local subvols. On the node with no local_subvols the dht inode lock
    operation fails, in turn causing the rebalance to fail.

    In a dist-rep volume with 2 nodes, if brick 0 of each replica set is on
    node1 and brick 1 is on node2, node2 will find that it has no local
    subvols.

    Change-Id: I7d73b5b4bf1c822eae6df2e6f79bd6a1606f4d1c
    BUG: 1221656
    Signed-off-by: Nithya Balachandran <nbalacha>
    Reviewed-on-master: http://review.gluster.org/10786
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Susant Palai <spalai>
    Reviewed-on: http://review.gluster.org/10788
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.2, please reopen this bug report.

glusterfs-3.7.2 has been announced on the Gluster Packaging mailinglist [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2015-June/000006.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
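
The scenario described in the commit message above can be modelled with a short Python sketch. This is only an illustration, not GlusterFS code: the function names (local_subvols, lock_local_subvols, settle_hash) are hypothetical, the rule that a replica set is "local" only to the node hosting its first brick is an assumption taken from the commit message's example, and the optional guard is one possible way to avoid the failure, not necessarily what the actual patch does.

```python
# Toy model of the failure; all names here are hypothetical, not GlusterFS APIs.

# A 2-node dist-rep volume: brick 0 of every replica set is on node1,
# brick 1 is on node2 (the layout from the commit message).
VOLUME = {
    "vmstore-replicate-0": ["node1", "node2"],
    "vmstore-replicate-1": ["node1", "node2"],
    "vmstore-replicate-2": ["node1", "node2"],
}


def local_subvols(replica_sets, node):
    """Return the subvols this node treats as local.

    Assumption: a replica set is local only to the node that hosts
    its first brick.
    """
    return [name for name, bricks in replica_sets.items() if bricks[0] == node]


def lock_local_subvols(subvols):
    """Stand-in for the dht inode lock; assumed to fail on an empty list."""
    if not subvols:
        raise RuntimeError("no local subvols to lock")
    return len(subvols)


def settle_hash(replica_sets, node, guard=False):
    """Model the commit-hash update step for one node's rebalance process."""
    subvols = local_subvols(replica_sets, node)
    if guard and not subvols:
        # Hypothetical guard: nothing local to update, so do not try to lock.
        return "skipped"
    try:
        lock_local_subvols(subvols)
    except RuntimeError:
        return "failed"
    return "completed"


if __name__ == "__main__":
    for node in ("node1", "node2"):
        print(node, local_subvols(VOLUME, node), settle_hash(VOLUME, node))
```

In this model node1 owns all three subvols and completes, while node2 ends up with an empty list and fails at the lock step, which matches the "completed" / "failed" split in the rebalance status output from the description.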