Description of problem:
=======================
On a distributed volume, I triggered remove-brick start and, while migration was in progress, continuously sent lookups and renames on a directory. Rebalance failed during the remove-brick process.

Version-Release number of selected component (if applicable):
3.7.9-10.el7rhgs.x86_64

How reproducible:
=================
Not always

Steps to Reproduce:
===================
1. Create a 12-brick distributed volume and start it.
2. FUSE mount the volume on a client.
3. From the mount point, start untarring the Linux kernel package and create directories and files.
4. Run a continuous loop of ls -lRt from the mount point.
5. Remove a few bricks from the volume.
6. While migration is in progress, keep renaming directories.
7. Check the remove-brick status.

Actual results:
===============
Rebalance failed during the remove-brick process.

Expected results:
=================
Rebalance should not fail, and all files from the bricks being removed should be migrated to the other bricks.

Additional info:
================
In the current code, if it is a remove-brick operation, we abort migration on any kind of failure.

Code snippet from gf_defrag_fix_layout:

        ret = syncop_lookup (this, loc, &iatt, NULL, NULL, NULL);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR, "Lookup failed on %s",
                        loc->path);
                ret = -1;
                goto out;
        }

Since the reproducer involves renaming directories, this is a race condition: readdirp returned the old name, and the lookup done as part of fix-layout happens after the rename, leading to the failure. I think we can make remove-brick ignore ENOENT errors (not sure about ESTALE). For ESTALE we may need to consider all the cases. I will send a patch once I resolve the ESTALE part.
Since the operation is remove-brick, the race pointed out in comment 3 can result in directories not being migrated (no fix-layout and no migration for the entire sub-tree).

In my opinion we should retain the failure as it is, which will be an indication to the admin that there may be files left on the removed brick.

Nithya, I need your input on this.
upstream patch: http://review.gluster.org/#/c/15846
While verifying Bug 1400037 on 3.8.4-8, I saw the following:

1) Continuous multiple errors for failed lookups:

<SNIP>
[2016-12-13 11:22:11.661154] E [dht-rebalance.c:3334:gf_defrag_fix_layout] 0-samsung-dht: /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_07/d_005/_07_500_.d lookup failed with 2
[2016-12-13 11:22:11.663923] E [dht-rebalance.c:3334:gf_defrag_fix_layout] 0-samsung-dht: /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_07/d_005/_07_501_.d lookup failed with 2
</SNIP>

2) Setxattr failure:

[2016-12-13 11:28:05.935641] E [dht-rebalance.c:3348:gf_defrag_fix_layout] 0-samsung-dht: Setxattr failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009/d_007/_05_9732_.d

3) Fix-layout failing with errors:

[2016-12-13 11:28:05.936034] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009/d_007
[2016-12-13 11:28:05.936398] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05/d_009
[2016-12-13 11:28:05.936739] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com/thrd_05
[2016-12-13 11:28:05.937916] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir/dhcp47-116.lab.eng.blr.redhat.com
[2016-12-13 11:28:05.938438] E [MSGID: 109016] [dht-rebalance.c:3378:gf_defrag_fix_layout] 0-samsung-dht: Fix layout failed for /file_srcdir

4) Rebalance failure:

[2016-12-13 11:28:05.939820] I [MSGID: 109028] [dht-rebalance.c:4126:gf_defrag_status_get] 0-samsung-dht: Rebalance is failed. Time taken is 378.00 secs
[2016-12-13 11:28:05.939848] I [MSGID: 109028] [dht-rebalance.c:4130:gf_defrag_status_get] 0-samsung-dht: Files migrated: 0, size: 0, lookups: 0, failures: 7, skipped: 0
Here is an update on the scope of the fix that is upstream right now.

The patch is immune to a single directory rename, but not to continuous directory renames, e.g. renaming 1->2, 2->3, 3->4 and so on. In its current state the patch does try to get the new name of the directory and carry on with that name. But in the scenario of continuous renames, even the new name rebalance obtained may no longer exist, since the client would have renamed that entry as well.

As part of my testing, I renamed the directory just before fix-layout was called, and rebalance carried on successfully with the new name.

I want to set the right expectation here, so that there are no surprises.

Regards,
Susant
(In reply to Susant Kumar Palai from comment #4)
> Since the operation is remove-brick, and the race pointed in comment 3 can
> result in directories being not migrated (no fix-layout + no migration for
> the entire sub-tree).
>
> In my opinion we should retain the failure as it is, which will be an
> indication to admin that there may be files left on the removed brick.
>
> Nithya, need your input on this.

Yes, I agree.
*** Bug 1368093 has been marked as a duplicate of this bug. ***