Bug 1805097

Summary: Changes to self-heal logic w.r.t. detecting metadata split-brains
Product: [Community] GlusterFS Reporter: Karthik U S <ksubrahm>
Component: replicate Assignee: Karthik U S <ksubrahm>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6 CC: bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1717819 Environment:
Last Closed: 2020-02-25 07:07:48 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1717819    
Bug Blocks: 1806846    

Description Karthik U S 2020-02-20 09:06:17 UTC
+++ This bug was initially created as a clone of Bug #1717819 +++

Description of problem:

We currently have no roll-back (undo) of post-ops when quorum is not met. Although the FOP is still unwound with failure, the xattrs remain on disk. Because of these partial post-ops and partial heals (healing when only 2 bricks are up), we can end up in metadata split-brain purely from the AFR xattrs' point of view, i.e. each brick is blamed by at least one of the others for metadata. These scenarios are hit when the client/shd frequently connects to and disconnects from the bricks.
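
To make the condition above concrete, here is a minimal sketch of the "every brick is blamed by at least one other" check. It is not the AFR code: the `pending` matrix layout and the function names `blamed_bricks` and `all_bricks_blamed` are assumptions made for illustration. `pending[i][j] > 0` is taken to mean that brick i holds a pending metadata xattr blaming brick j.

```python
# Illustrative model only; names and data layout are hypothetical,
# not taken from the GlusterFS source.

def blamed_bricks(pending):
    """Return the set of brick indices blamed by at least one other brick.

    pending[i][j] > 0 is assumed to mean: brick i blames brick j
    for metadata. Self-blame entries (i == j) are ignored.
    """
    n = len(pending)
    return {
        j
        for j in range(n)
        for i in range(n)
        if i != j and pending[i][j] > 0
    }


def all_bricks_blamed(pending):
    """True when every brick is blamed by at least one other brick,
    i.e. the xattr-only view of metadata split-brain described above."""
    return len(blamed_bricks(pending)) == len(pending)


# Replica 3 after partial post-ops and partial heals:
# brick 0 blames 1, brick 1 blames 2, brick 2 blames 0.
cyclic = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]

# Brick 0 blames both others and nobody blames brick 0.
one_clean = [
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
]

print(all_bricks_blamed(cyclic))     # every brick is blamed
print(all_bricks_blamed(one_clean))  # brick 0 is not blamed by anyone
```

In the `cyclic` case no brick is left unblamed, which is the state this report describes as reachable through leftover xattrs alone. In `one_clean`, brick 0 remains unblamed.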

Comment 1 Worker Ant 2020-02-20 09:19:46 UTC
REVIEW: https://review.gluster.org/24155 (Cluster/afr: Don't treat all bricks having metadata pending as split-brain) posted (#1) for review on release-6 by Karthik U S

Comment 2 Worker Ant 2020-02-25 07:07:48 UTC
REVIEW: https://review.gluster.org/24155 (Cluster/afr: Don't treat all bricks having metadata pending as split-brain) merged (#2) on release-6 by hari gowtham