Description of problem:
fix-layout currently aborts completely if a folder it was working on gets deleted in the background. It should instead ignore the error and move on to the next folder.

Version-Release number of selected component (if applicable):
3.4.1 (also seen in 3.4.0)

How reproducible:
Always.

Steps to Reproduce:
1. gluster volume rebalance $volname fix-layout start
2. On any server, rm -rf the folder whose layout is currently being fixed (it can be found in the rebalance log; rm should complete *before* fix-layout moves on to another folder), or one of its subfolders that has not yet been processed (the list of children seems to be built right when fix-layout enters the folder).

Actual results:
fix-layout aborts completely on the affected server(s).

Expected results:
fix-layout should realize that the folder disappeared, emit a warning, *and move on to the next folder in the queue*.

Additional info:
[2013-12-02 00:39:34.718390] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-0: remote operation failed: No such file or directory
[2013-12-02 00:39:35.983941] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-0: remote operation failed: No such file or directory
[2013-12-02 00:39:35.984261] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-bigdata-client-1: remote operation failed: No such file or directory
[2013-12-02 00:39:35.984287] I [afr-lk-common.c:1075:afr_lock_blocking] 0-bigdata-replicate-0: unable to lock on even one child
[2013-12-02 00:39:35.984302] I [afr-transaction.c:1063:afr_post_blocking_inodelk_cbk] 0-bigdata-replicate-0: Blocking inodelks failed.
[2013-12-02 00:39:35.987386] E [dht-rebalance.c:1318:gf_defrag_fix_layout] 0-bigdata-dht: Lookup failed on /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp/prop-base
[2013-12-02 00:39:35.987427] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp/prop-base
[2013-12-02 00:39:35.988212] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn/tmp
[2013-12-02 00:39:35.988993] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt/.svn
[2013-12-02 00:39:35.989774] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages/qt
[2013-12-02 00:39:35.990562] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua/packages
[2013-12-02 00:39:35.991337] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf/qtlua
[2013-12-02 00:39:35.992150] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas/torchf
[2013-12-02 00:39:35.992946] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers/sdas
[2013-12-02 00:39:35.993745] E [dht-rebalance.c:1431:gf_defrag_fix_layout] 0-bigdata-dht: Fix layout failed for /oldusers
[2013-12-02 00:39:35.994556] I [dht-rebalance.c:1714:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 19501.00 secs
[2013-12-02 00:39:35.994576] I [dht-rebalance.c:1717:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 0, failures: 9, skipped: 0
[2013-12-02 00:39:36.025637] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3211ee890d] (-->/lib64/libpthread.so.0() [0x3212607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x40533d]))) 0-: received signum (15), shutting down

(In this case, /a/b/c/d/e/f/g/h/i was removed in the middle of fix-layout.)
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases: the last two releases before 3.7 are still maintained, which at the moment are 3.6 and 3.5. This bug has been filed against the 3.4 release and will not get fixed in a 3.4 version any more.

Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. If updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" drop-down below the comment box to "bugs". If there is no response by the end of the month, this bug will be closed automatically.
GlusterFS 3.4.x has reached end-of-life. If this bug still exists in a later release, please reopen it and update the version, or open a new bug.