Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1036904

Summary: fix-layout should keep going if the folder it is currently processing gets deleted
Product: [Community] GlusterFS Reporter: Pierre-Francois Laquerre <pierre.francois>
Component: distribute Assignee: bugs <bugs>
Status: CLOSED EOL QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.4.1 CC: bugs, gluster-bugs, pierre.francois
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-07 12:32:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pierre-Francois Laquerre 2013-12-02 21:30:40 UTC
Description of problem: fix-layout currently aborts completely if a folder it was working on gets deleted in the background. It should instead ignore the error and move on to the next folder.


Version-Release number of selected component (if applicable): 3.4.1 (also seen in 3.4.0)


How reproducible: always.


Steps to Reproduce:
1. gluster volume rebalance $volname fix-layout start
2. On any server, rm -rf the folder whose layout is currently being fixed (the rebalance log shows which one). The rm must complete *before* fix-layout moves on to another folder. Alternatively, remove one of its subfolders that has not yet been processed; the list of children seems to be built as soon as fix-layout enters the folder.


Actual results: fix-layout completely aborts on the affected server(s).


Expected results: fix-layout should realize that the folder disappeared, emit a warning *and move on to the next folder in the queue*.
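
The expected behaviour can be illustrated with a minimal sketch. This is not GlusterFS code: walk_fix_layout and fix_one are hypothetical names, and the walk runs over an ordinary local directory tree rather than a volume. It only shows the control flow requested here, assuming (as observed above) that the list of children is taken when a folder is entered: a folder that has disappeared produces a warning and is skipped, and the traversal continues with the next folder in the queue.

```python
import logging
import os

log = logging.getLogger("fix-layout-sketch")


def walk_fix_layout(root, fix_one):
    """Apply fix_one(path) to root and to every directory below it.

    fix_one is a hypothetical stand-in for the per-directory layout fix.
    Returns (fixed, skipped): the number of directories processed and the
    number that vanished before they could be fully processed.
    """
    fixed = 0
    skipped = 0
    queue = [root]
    while queue:
        path = queue.pop()
        try:
            fix_one(path)
            # Snapshot the subdirectories on entry. Any of them may be
            # deleted before the walk reaches it.
            children = [
                entry.path
                for entry in os.scandir(path)
                if entry.is_dir(follow_symlinks=False)
            ]
        except FileNotFoundError:
            # The directory was removed in the background: warn and move
            # on to the next queued directory instead of aborting the run.
            log.warning("%s disappeared during fix-layout, skipping", path)
            skipped += 1
            continue
        fixed += 1
        queue.extend(children)
    return fixed, skipped
```

With this structure, a deletion only affects the removed folder and its descendants; the failure does not propagate to the parent folders or end the run.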


Additional info:

[2013-12-02 00:39:34.718390] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-0: remote operation failed: No such file or directory
[2013-12-02 00:39:35.983941] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-0: remote operation failed: No such file or directory
[2013-12-02 00:39:35.984261] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-1: remote operation failed: No such file or directory
[2013-12-02 00:39:35.984287] I [afr-lk-common.c:1075:afr_lock_blocking] 0-bigdata-replicate-0: unable to lock on even one child
[2013-12-02 00:39:35.984302] I [afr-transaction.c:1063:afr_post_blocking_inodelk_cbk] 0-bigdata-replicate-0: Blocking inodelks failed.
[2013-12-02 00:39:35.987386] E [dht-rebalance.c:1318:gf_defrag_fix_layout] 0-bigdata-dht: Lookup failed on /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp/prop-base
[2013-12-02 00:39:35.987427] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp/prop-base
[2013-12-02 00:39:35.988212] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp
[2013-12-02 00:39:35.988993] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn
[2013-12-02 00:39:35.989774] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt
[2013-12-02 00:39:35.990562] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages
[2013-12-02 00:39:35.991337] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua
[2013-12-02 00:39:35.992150] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf
[2013-12-02 00:39:35.992946] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas
[2013-12-02 00:39:35.993745] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers
[2013-12-02 00:39:35.994556] I [dht-rebalance.c:1714:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 19501.00 secs
[2013-12-02 00:39:35.994576] I [dht-rebalance.c:1717:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 0, failures: 9, skipped: 0
[2013-12-02 00:39:36.025637] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3211ee890d] (-->/lib64/libpthread.so.0() [0x3212607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x40533d]))) 0-: received signum (15), shutting down

(in this case, /a/b/c/d/e/f/g/h/i was removed in the middle of fix-layout)

Comment 2 Niels de Vos 2015-05-17 22:00:51 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify if newer versions are affected with the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 3 Kaleb KEITHLEY 2015-10-07 12:32:44 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release, please reopen this and change the version, or open a new bug.