Bug 1537480

Summary: quorum-reads option can give inconsistent reads
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Pranith Kumar K <pkarampu>
Component: replicate
Assignee: Karthik U S <ksubrahm>
Status: CLOSED DEFERRED
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: medium
Docs Contact:
Priority: medium
Version: rhgs-3.3
CC: ksubrahm, pkarampu, ravishankar, rhs-bugs, sabose, storage-qa-internal, vavuthu
Target Milestone: ---
Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1541438
Environment:
Last Closed: 2020-02-06 07:39:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1541438    
Bug Blocks:    

Description Pranith Kumar K 2018-01-23 10:15:56 UTC
Description of problem:
For a file, Brick-A has pending operations on Brick-B, Brick-B has pending operations on Brick-C, and Brick-C has pending operations on Brick-A. Since no brick is blamed by both of the other two bricks, any of these bricks can be considered a good copy and a heal can be done. Reads will fail until the heal happens.
The inconsistent-read issue we found happens when any one of the bricks goes down in this state. If Brick-A goes down, reads will be served from Brick-B; if Brick-B goes down, reads will be served from Brick-C; and if Brick-C goes down, reads will be served from Brick-A. All these reads could return different content.
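
The cyclic state above can be illustrated with a small Python model. This is only a sketch, not AFR code: the `PENDING` map, the function names, and the selection rule (an up brick is readable if no other up brick has pending operations on it) are assumptions chosen to reproduce the behaviour described here, not the actual read-subvolume logic in the replicate translator.

```python
# Simplified model of the cyclic-blame state; NOT the real AFR implementation.
# PENDING[x] is the set of bricks that brick x has pending operations on
# (i.e. the bricks that x "blames").
PENDING = {
    "Brick-A": {"Brick-B"},
    "Brick-B": {"Brick-C"},
    "Brick-C": {"Brick-A"},
}


def read_sources(pending, up):
    """Return the up bricks that no other up brick blames.

    Assumed rule for illustration only: a brick that is down cannot
    contribute its blame, so its accusations are ignored.
    """
    up_set = set(up)
    blamed = set()
    for brick in up:
        blamed |= pending.get(brick, set()) & up_set
    return sorted(b for b in up if b not in blamed)


def simulate(pending):
    """Compute read sources with all bricks up and with each brick down."""
    bricks = sorted(pending)
    result = {"none": read_sources(pending, bricks)}
    for down in bricks:
        up = [b for b in bricks if b != down]
        result[down] = read_sources(pending, up)
    return result


if __name__ == "__main__":
    for down, sources in simulate(PENDING).items():
        print(f"down={down}: read sources={sources}")
```

Under this assumed rule, no brick is readable while all three are up, and each single-brick failure selects a different surviving brick as the read source, which is how the same file could return three different contents.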


Version-Release number of selected component (if applicable):


How reproducible:
It is extremely difficult to hit this case. We are mostly going to simulate it by putting breakpoints in gdb.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Karthik U S 2018-02-12 07:30:48 UTC
To fix this issue we need to change the read, write, and self-heal transactions, which are the heart of AFR. It is better to give the change some time to become stable upstream before taking it downstream, so we are targeting it for a later release.

Comment 3 Karthik U S 2018-02-12 07:32:41 UTC
Upstream patch: https://review.gluster.org/#/c/19477/