Bug 1624358

Summary: Thin-arbiter reads do not rely on in-memory information.
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: replicate
Assignee: Ravishankar N <ravishankar>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Version: mainline
CC: aspandey, bugs, ksubrahm, pasik, pkarampu
Keywords: Triaged
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2020-03-12 12:47:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---

Description Ravishankar N 2018-08-31 10:28:02 UTC
Description of problem:

This is how TA (thin-arbiter) reads work as of patch https://review.gluster.org/#/c/glusterfs/+/20994/2:

If both data bricks are up, the read subvol is chosen based on read_subvols. If only one data brick is up:
- First query the data brick that is up. If it blames the other brick, allow the reads.
- If it doesn't, query the TA to obtain the source of truth.
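
A rough standalone sketch of this flow; every helper name here (up_bricks_count, pick_from_read_subvols, up_brick_blames_peer, ta_blames_up_brick) is a hypothetical stand-in for the real AFR internals, not an actual symbol:

/* Illustrative only -- not actual AFR code. */
static int up_bricks_count (void)        { return 1; }  /* pretend one data brick is down */
static int pick_from_read_subvols (void) { return 0; }  /* normal read-subvol policy      */
static int up_brick_blames_peer (void)   { return 0; }  /* pending xattrs on the up brick */
static int ta_blames_up_brick (void)     { return 0; }  /* thin-arbiter's verdict         */

static int
choose_read_subvol (int up_brick)
{
        if (up_bricks_count () == 2)
                return pick_from_read_subvols ();       /* normal read path     */

        /* Only one data brick is up: query it first. */
        if (up_brick_blames_peer ())
                return up_brick;                        /* it is the good copy  */

        /* No blame on the up brick: the thin-arbiter is the source of truth. */
        return ta_blames_up_brick () ? -1 : up_brick;   /* -1 => fail the read  */
}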


However, we need to see if we can re-use the AFR_TA_DOM_NOTIFY lock even for read-txns, so that once ta_bad_child_index is stored in memory, we can reuse it for subsequent reads until shd resets it after healing.

The rough changes to the read-txn when one data brick is down would be as follows (subject to discussion and acceptance; a rough C sketch follows the steps below):

0. If priv->bad_child_index is valid, goto 4
1. get afr xattrs of the data-brick that is up.
2. If it contains pending xattrs, update priv->bad_child_index with that value
3. Otherwise:
{
      TA LOCK (AFR_TA_DOM_NOTIFY)
      TA LOCK (AFR_TA_DOM_MODIFY)
      get afr xattr from TA.
      update priv->bad_child_index if xattr present on TA
      TA UNLOCK (AFR_TA_DOM_MODIFY)
      If priv->bad_child_index is still AFR_CHILD_UNKNOWN
           {TA UNLOCK (AFR_TA_DOM_NOTIFY) }
      else retain AFR_TA_DOM_NOTIFY
}
4. Serve the read depending on whether the data brick that is up is good or bad.
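
A rough C sketch of these steps, assuming exactly one data brick is down. All names here (ta_read_ctx_t, afr_ta_lock/afr_ta_unlock, the xattr helpers, CHILD_UNKNOWN) are hypothetical placeholders, not the real AFR symbols:

#include <stdio.h>

#define CHILD_UNKNOWN (-1)

typedef struct {
        int bad_child_index;     /* cached "bad" brick, CHILD_UNKNOWN if unset */
        int notify_lock_held;    /* do we hold AFR_TA_DOM_NOTIFY on the TA?    */
} ta_read_ctx_t;

/* Hypothetical stubs standing in for network/xattr operations. */
static void afr_ta_lock (const char *dom)    { printf ("lock   %s\n", dom); }
static void afr_ta_unlock (const char *dom)  { printf ("unlock %s\n", dom); }
static int  pending_xattr_on_up_brick (void) { return 0; }  /* pretend none      */
static int  bad_child_per_ta_xattr (void)    { return 1; }  /* pretend brick-1   */

static int
ta_read_txn (ta_read_ctx_t *ctx, int up_brick)
{
        /* Step 0: reuse the cached result from an earlier transaction. */
        if (ctx->bad_child_index != CHILD_UNKNOWN)
                goto serve;

        /* Steps 1-2: the up brick's pending xattrs are authoritative. */
        if (pending_xattr_on_up_brick ()) {
                ctx->bad_child_index = !up_brick;   /* it blames its peer */
                goto serve;
        }

        /* Step 3: fall back to the thin-arbiter under both domains. */
        afr_ta_lock ("AFR_TA_DOM_NOTIFY");
        afr_ta_lock ("AFR_TA_DOM_MODIFY");
        /* Update only if the TA xattr actually names a bad brick. */
        ctx->bad_child_index = bad_child_per_ta_xattr ();
        afr_ta_unlock ("AFR_TA_DOM_MODIFY");

        if (ctx->bad_child_index == CHILD_UNKNOWN)
                afr_ta_unlock ("AFR_TA_DOM_NOTIFY");
        else
                ctx->notify_lock_held = 1;   /* retained until shd heals */

serve:
        /* Step 4: the read is allowed only if the surviving brick is good. */
        return (ctx->bad_child_index != up_brick) ? 0 : -1;
}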


While this is simple, some of the problems that need solving are:
- What happens when multiple read requests come in?
- What happens when both reads and writes come in?

We must ensure that from a given mount only one AFR_TA_DOM_NOTIFY lock request is sent to the TA, irrespective of the number of reads and writes (see the sketch below for one possible approach).
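
One possible way to enforce this, purely illustrative and using none of the actual AFR symbols: a mount-wide flag guarded by a mutex, so that only the first transaction actually sends the AFR_TA_DOM_NOTIFY lock request and later reads/writes simply reuse the held lock:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ta_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ta_cond  = PTHREAD_COND_INITIALIZER;
static bool ta_query_in_progress = false;   /* someone is talking to the TA    */
static bool ta_notify_lock_held  = false;   /* AFR_TA_DOM_NOTIFY already held  */

/* Hypothetical stand-in for the actual lock request sent to the TA brick. */
static void send_ta_dom_notify_lock (void) { /* network call */ }

void
acquire_ta_notify_lock_once (void)
{
        pthread_mutex_lock (&ta_mutex);

        /* Wait for any in-flight query; whoever started it will finish it. */
        while (ta_query_in_progress)
                pthread_cond_wait (&ta_cond, &ta_mutex);

        if (!ta_notify_lock_held) {
                ta_query_in_progress = true;
                pthread_mutex_unlock (&ta_mutex);

                send_ta_dom_notify_lock ();   /* only one txn ever gets here */

                pthread_mutex_lock (&ta_mutex);
                ta_notify_lock_held  = true;
                ta_query_in_progress = false;
                pthread_cond_broadcast (&ta_cond);
        }
        pthread_mutex_unlock (&ta_mutex);
}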

Comment 1 Yaniv Kaul 2019-07-01 06:05:30 UTC
Status?

Comment 2 Ravishankar N 2019-07-02 04:03:27 UTC
Not being looked into at the moment. The bug was created to keep track of this technical debt.

Comment 3 Worker Ant 2020-03-12 12:47:20 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/949 and will be tracked there from now on. Visit the GitHub issue URL for further details.