Description of problem:
=======================
On a distributed volume, I triggered remove-brick start and, while migration was in progress, continuously sent lookups and renames on a directory. Rebalance failed during the remove-brick process.

Version-Release number of selected component (if applicable):
3.7.9-10.el7rhgs.x86_64

How reproducible:
=================
Not always

Steps to Reproduce:
===================
1. Create a 12-brick distributed volume and start it.
2. FUSE mount the volume on a client.
3. From the mount point, start untarring the Linux kernel package and create directories and files.
4. Run a continuous loop of ls -lRt from the mount point.
5. Remove a few bricks from the volume.
6. While migration is in progress, keep renaming directories.
7. Check the remove-brick status.

Actual results:
===============
Rebalance failed during the remove-brick process.

Expected results:
=================
Rebalance should not fail, and all files from the bricks being removed should be migrated to the other bricks.

Additional info:
================
In the current code, if it is a remove-brick operation, we abort migration on any kind of failure.

Code snippet from gf_defrag_fix_layout:

        ret = syncop_lookup (this, loc, &iatt, NULL, NULL, NULL);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR, "Lookup failed on %s",
                        loc->path);
                ret = -1;
                goto out;
        }

Since the reproducer involves renaming directories, this is a race condition: readdirp returned the old name, and the lookup done as part of fix-layout happens after the rename, leading to the failure. I think we can make remove-brick ignore ENOENT errors (not sure about ESTALE). For ESTALE we may need to consider all the cases. I will send a patch once I resolve the ESTALE part.
Since the operation is remove-brick, the race pointed out in comment 3 can result in directories not being migrated (no fix-layout and no migration for the entire sub-tree).

In my opinion we should retain the failure as it is, which will be an indication to the admin that there may be files left on the removed brick.

Nithya, I need your input on this.
upstream patch: http://review.gluster.org/#/c/15846
While verifying Bug 1400037 on 3.8.4-8, I saw the following:

1) Continuous multiple errors for failed lookups:

<SNIP>
[2016-12-13 11:22:11.661154] E [dht-rebalance.c:3334:gf_defrag_fix_layout] 0-samsung-dht: /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_07/d_005/_07_500_.d lookup failed with 2
[2016-12-13 11:22:11.663923] E [dht-rebalance.c:3334:gf_defrag_fix_layout] 0-samsung-dht: /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_07/d_005/_07_501_.d lookup failed with 2
</SNIP>

2) Setxattr failure:

[2016-12-13 11:28:05.935641] E [dht-rebalance.c:3348:gf_defrag_fix_layout] 0-samsung-dht: Setxattr failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009/d_007/_05_9732_.d

3) Fix-layout failing with errors:

[2016-12-13 11:28:05.936034] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009/d_007
[2016-12-13 11:28:05.936398] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009
[2016-12-13 11:28:05.936739] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05
[2016-12-13 11:28:05.937916] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com
[2016-12-13 11:28:05.938438] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir

4) Rebalance failure:

[2016-12-13 11:28:05.939820] I [MSGID: 109028] [dht-rebalance.c:4126:gf_defrag_status_get] 0-samsung-dht: Rebalance is failed. Time taken is 378.00 secs
[2016-12-13 11:28:05.939848] I [MSGID: 109028] [dht-rebalance.c:4130:gf_defrag_status_get] 0-samsung-dht: Files migrated: 0, size: 0, lookups: 0, failures: 7, skipped: 0
Here is an update on the scope of the fix that is upstream right now.

The patch is immune to a single directory rename, but not to continuous directory renames, e.g. renaming 1->2, 2->3, 3->4 and so on. In its current state the patch does try to get the new name of the directory and carry on with that name. But in the scenario of continuous renames, even the new name rebalance obtained may no longer exist, since the client would have renamed that entry as well.

As part of my testing, I renamed the directory just before fix-layout was called, and rebalance carried on successfully with the new name.

I want to set the right expectation here, so that there are no surprises.

Regards,
Susant
(In reply to Susant Kumar Palai from comment #4)
> Since the operation is remove-brick, and the race pointed in comment 3 can
> result in directories being not migrated (no fix-layout + no migration for
> the entire sub-tree).
>
> In my opinion we should retain the failure as it is, which will be an
> indication to admin that there may be files left on the removed brick.
>
> Nithya, need your input on this.

Yes, I agree.
*** Bug 1368093 has been marked as a duplicate of this bug. ***