Bug 1661889

Summary: Metadata heal picks a different brick as the source each time if there are no pending xattrs.
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: replicate
Assignee: Ravishankar N <ravishankar>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Version: mainline
CC: bugs, guillaume.pavese, ksubrahm, pkarampu
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-03-12 14:48:42 UTC

Description Ravishankar N 2018-12-24 09:13:44 UTC
Description of problem:
There were a few instances reported, both upstream and downstream, where an RHHI setup had the shard xattrs missing on a file on all 3 copies of the replica, potentially leading to VM pause.

Comments that I had made on a downstream BZ regarding this problem:

> As for the xattrs missing on all bricks of the replica: even though
> metadata heal does a removexattr and setxattr as part of healing, it
> does so only on the 'sink' bricks, so the xattr must still remain on the
> 'source' brick. I'm going through the code and seeing if there is a
> possibility of picking a brick where, say, the removexattr succeeded but
> the setxattr failed as the source for a subsequent spurious metadata
> heal, so that the xattr gets removed on all bricks.

> Okay, so I found one corner case where xattrs can go missing from all
> bricks. If a metadata heal is triggered (genuine, or spurious as in
> this case due to mismatching bitrot xattrs) and there are no afr pending
> xattrs indicating which brick(s) are good and which are bad, then all
> bricks are considered sources. afr_choose_source_by_policy() then picks
> the local brick as the source, the other bricks are treated as sinks,
> and the metadata heal is initiated.
>
> One mount can pick its local brick (say brick1) as the source. During
> metadata heal, the removexattr succeeds on the 2 sink bricks (brick2 and
> brick3) but the setxattr fails because of, say, ENOTCONN. Thus 2 bricks
> have their shard xattrs missing.
> In an RHHI setup, it can so happen that another mount, which is local to
> one of the 2 sink bricks, again triggers metadata heal on the same file,
> this time picking one of the bad bricks (say brick2) as the source.
> Brick1 is now a sink for this heal and the shard xattr gets removed from
> it, resulting in all 3 bricks being left without the xattr. Let me see
> what the best way to fix this is.
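
To make the sequence concrete, here is a minimal, hypothetical C model of the two-mount race described above. It is not GlusterFS code: the brick structure, metadata_heal(), dump() and the xattr value are invented for illustration, and the 'connected' flag stands in for a setxattr on a sink failing with ENOTCONN.

/*
 * Hypothetical, simplified model of the race described above. NOT GlusterFS
 * code: everything here is invented for illustration only.
 */
#include <stdio.h>
#include <string.h>

#define NBRICKS 3

struct brick {
    char shard_xattr[64];   /* "" means the xattr is absent on this brick */
    int  connected;         /* 0 => setxattr on this brick will fail      */
};

/* Metadata heal as described in the comment: remove the xattr on every
 * sink, then try to copy it back from the chosen source brick. */
void metadata_heal(struct brick *b, int source)
{
    for (int i = 0; i < NBRICKS; i++) {
        if (i == source)
            continue;                        /* the source is left untouched */
        b[i].shard_xattr[0] = '\0';          /* removexattr on the sink      */
        if (b[i].connected)                  /* setxattr may never happen    */
            strcpy(b[i].shard_xattr, b[source].shard_xattr);
    }
}

void dump(struct brick *b, const char *when)
{
    printf("%s:", when);
    for (int i = 0; i < NBRICKS; i++)
        printf(" brick%d=[%s]", i + 1,
               b[i].shard_xattr[0] ? b[i].shard_xattr : "missing");
    printf("\n");
}

int main(void)
{
    struct brick b[NBRICKS] = {
        { "shard-xattr-value", 1 },
        { "shard-xattr-value", 1 },
        { "shard-xattr-value", 1 },
    };

    dump(b, "initially");

    /* Heal #1: the mount local to brick1 picks it as the source because
     * there are no pending xattrs; the sinks lose the xattr on removexattr
     * but the follow-up setxattr fails (ENOTCONN). */
    b[1].connected = 0;
    b[2].connected = 0;
    metadata_heal(b, 0);
    dump(b, "after heal #1 (source brick1, setxattr fails on sinks)");

    /* Heal #2: another mount, local to brick2, picks the now-bad brick2 as
     * the source; brick1 becomes a sink and loses the xattr too. */
    b[1].connected = 1;
    b[2].connected = 1;
    metadata_heal(b, 1);
    dump(b, "after heal #2 (source brick2)");

    return 0;
}

Heal #1 leaves brick2 and brick3 without the xattr; heal #2, with brick2 as source, then strips it from brick1 as well, matching the end state described above.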


Version-Release number of selected component (if applicable):
It had a high chance of occurring in glusterfs 3.8 (RHGS-3.3.1) if bitrot was enabled and then disabled, which caused spurious metadata heals to be launched during each lookup on the file. (The bitrot bug itself has been fixed in subsequent releases.)

Comment 1 Worker Ant 2018-12-24 09:17:20 UTC
REVIEW: https://review.gluster.org/21922 (afr: mark pending xattrs as a part of metadata heal) posted (#1) for review on master by Ravishankar N
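
The patch title above suggests the fix direction: mark afr pending xattrs on the sinks before the heal modifies them, so an interrupted heal leaves evidence of which bricks are bad. Below is a hedged sketch of that idea on top of the toy model from the description, not the actual patch: the 'pending' field and both helpers are invented for illustration, and real AFR stores the pending xattrs on the other bricks of the replica to blame the sink, rather than on the sink itself as this simplification does.

/*
 * Hypothetical sketch only, not the actual patch. 'pending' is an invented
 * stand-in for the afr pending/dirty xattrs.
 */
#include <string.h>

#define NBRICKS 3

struct brick {
    char shard_xattr[64];
    int  connected;
    int  pending;            /* set => this brick was an unfinished sink */
};

void metadata_heal_with_pending(struct brick *b, int source)
{
    /* Step 1: mark every sink as pending BEFORE modifying it. */
    for (int i = 0; i < NBRICKS; i++)
        if (i != source)
            b[i].pending = 1;

    /* Step 2: the usual removexattr + setxattr on each sink. */
    for (int i = 0; i < NBRICKS; i++) {
        if (i == source)
            continue;
        b[i].shard_xattr[0] = '\0';
        if (b[i].connected)
            strcpy(b[i].shard_xattr, b[source].shard_xattr);
    }

    /* Step 3: clear the marker only on sinks that were healed cleanly. */
    for (int i = 0; i < NBRICKS; i++)
        if (i != source && b[i].connected)
            b[i].pending = 0;
}

/* With markers in place, source selection has something to go on: a brick
 * that still carries a pending marker is never chosen as the source, so the
 * half-healed brick2 from the scenario above cannot wipe the xattr off
 * brick1 in a later heal. */
int choose_source(const struct brick *b, int preferred_local)
{
    if (!b[preferred_local].pending)
        return preferred_local;
    for (int i = 0; i < NBRICKS; i++)
        if (!b[i].pending)
            return i;
    return -1;               /* every brick is suspect: needs intervention */
}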

Comment 2 Amar Tumballi 2019-06-25 13:13:52 UTC
Looks like there is another design by Karthik for this issue, as per the reviews upstream. Should this be closed as a duplicate of that, or is Karthik going to update this bug as well?

Comment 3 Ravishankar N 2019-06-26 05:19:40 UTC
(In reply to Amar Tumballi from comment #2)
> Looks like there is another design by Karthik for this issue, as per the
> reviews upstream. Should this be closed as a duplicate of that, or is
> Karthik going to update this bug as well?

Amar, are you referring to BZ 1717819 / https://review.gluster.org/#/c/glusterfs/+/22831/? That bug is different from this one...

Comment 4 Amar Tumballi 2019-07-01 05:42:37 UTC
Ravi, I was referring to the patch on this bug (https://review.gluster.org/21922). I went through the discussion on that patch, which suggested that something else is being worked on instead.

Comment 5 Worker Ant 2020-03-12 14:48:42 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/1067 and will be tracked there from now on. Visit the GitHub issue URL for further details.