Bug 1536257 - Take full lock on files in 3 way replication
Summary: Take full lock on files in 3 way replication
Keywords:
Status: CLOSED DUPLICATE of bug 1535438
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1535438
Blocks:
 
Reported: 2018-01-19 01:10 UTC by Pranith Kumar K
Modified: 2018-01-31 05:52 UTC
CC List: 5 users

Fixed In Version:
Clone Of: 1535438
Environment:
Last Closed: 2018-01-31 05:52:03 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments:

Description Pranith Kumar K 2018-01-19 01:10:57 UTC
+++ This bug was initially created as a clone of Bug #1535438 +++

Description of problem:

Need a way to take a full lock on files in a replica 3 volume, which helps prevent files from going into split-brain.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Worker Ant on 2018-01-17 07:13:35 EST ---

REVIEW: https://review.gluster.org/19218 (cluster/afr: Adding option to take full file lock) posted (#1) for review on master by Karthik U S

--- Additional comment from Worker Ant on 2018-01-18 19:15:53 EST ---

COMMIT: https://review.gluster.org/19218 committed in master by "Karthik U S" <ksubrahm> with a commit message - cluster/afr: Adding option to take full file lock

Problem:
In replica 3 volumes there is a possibility of ending up in a split-brain
scenario when multiple clients write data to the same file at
non-overlapping regions in parallel.

Scenario:
- Initially all the copies are good and all the clients get the value of
  data-readables as all good.
- Client C0 performs write W1, which fails on brick B0 and succeeds on the
  other two bricks.
- C1 performs write W2, which fails on B1 and succeeds on the other two bricks.
- C2 performs write W3, which fails on B2 and succeeds on the other two bricks.
- All three writes happen in parallel and fall on different ranges, so AFR
  takes granular locks and the writes proceed in parallel. Since each client
  had its data-readables as all good, none of them sees the file going into
  split-brain in the in_flight_split_brain check, and each therefore performs
  the post-op, marking the pending xattrs. Now all the bricks are blamed by
  each other, ending up in split-brain (see the sketch after this list).

Fix:
Add an option to take either a full lock or a range lock on files while
doing data transactions, to prevent the possibility of ending up in
split-brain. With this change, files take a full lock by default while
doing IO. To get the old range-lock behaviour, change the value of
"cluster.full-lock" to "no".
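As a rough illustration of the difference between the two modes, the sketch below uses ordinary POSIX byte-range locks via Python's fcntl module rather than AFR's internal inodelk code; the file name and offsets are made up. A lock length of 0 covers the whole file, while the range mode locks only the region being written:

import fcntl

def lock_for_write(fd, offset, length, full_lock=True):
    if full_lock:
        # cluster.full-lock = yes (new default): lock the entire file, so
        # concurrent writers are serialized even on non-overlapping ranges.
        fcntl.lockf(fd, fcntl.LOCK_EX, 0, 0)
    else:
        # cluster.full-lock = no (old behaviour): lock only the written range,
        # which allows the parallel writes described in the scenario above.
        fcntl.lockf(fd, fcntl.LOCK_EX, length, offset)

with open("demo.dat", "wb") as f:   # made-up file name
    lock_for_write(f.fileno(), offset=4096, length=128, full_lock=True)
    f.write(b"payload")             # lock is released when the file is closed

The volume-level switch itself would be toggled with something like
"gluster volume set <volname> cluster.full-lock no" (volume name assumed),
matching the default-on behaviour described above.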

Change-Id: I7893fa33005328ed63daa2f7c35eeed7c5218962
BUG: 1535438
Signed-off-by: karthik-us <ksubrahm>

Comment 1 Karthik U S 2018-01-31 05:52:03 UTC
Closing this bug since it is fixed in 3.13 as part of BZ #1535438.

*** This bug has been marked as a duplicate of bug 1535438 ***

