Bug 1552414 - Take full lock on files in 3 way replication
Summary: Take full lock on files in 3 way replication
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Karthik U S
QA Contact: Vijay Avuthu
URL:
Whiteboard:
Depends On: 1535438
Blocks: 1503137
 
Reported: 2018-03-07 05:43 UTC by Karthik U S
Modified: 2018-09-20 04:40 UTC
CC List: 6 users

Fixed In Version: glusterfs-3.12.2-6
Doc Type: Bug Fix
Doc Text:
In replica 3 volumes, there was a possibility of ending up in split-brain when multiple clients simultaneously wrote data to non-overlapping regions of the same file. With the new cluster.full-lock option, a full file lock is taken for each write, which maintains data consistency and avoids split-brain. By default, cluster.full-lock is set to take full file locks; it can be reconfigured to take range locks if needed.
Clone Of:
Environment:
Last Closed: 2018-09-04 06:44:11 UTC
Embargoed:




Links
Red Hat Product Errata RHSA-2018:2607 (last updated 2018-09-04 06:45:12 UTC)

Description Karthik U S 2018-03-07 05:43:42 UTC
Description of problem:
In replica 3 volumes there is a possibility of ending up in split-brain when multiple clients write data to non-overlapping regions of the same file in parallel.

Version-Release number of selected component (if applicable):


How reproducible:
It is very rare to hit this case; the following scenario needs to be imitated using gdb to test it.

Steps to Reproduce:
- Client C0 performs write W1, which fails on brick B0 and succeeds on the other two bricks.
- Client C1 performs write W2, which fails on B1 and succeeds on the other two bricks.
- Client C2 performs write W3, which fails on B2 and succeeds on the other two bricks.
- All three writes happen in parallel and fall on different ranges, so AFR takes granular (range) locks and the writes proceed concurrently. Since each client initially saw its data-readable bricks as good, the in_flight_split_brain check does not detect the file going into split-brain, so each client performs the post-op and marks the pending xattrs. Now all the bricks are blamed by each other, ending up in split-brain (the pending xattrs can be inspected as sketched below).
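One way to confirm the end state on the bricks is to read the AFR pending xattrs directly. A minimal sketch, assuming a volume named testvol with a brick at /bricks/brick0 and a file named file1 (all hypothetical names); getfattr, the trusted.afr.* xattrs, and the heal info split-brain subcommand are standard GlusterFS debugging surface:

    # On each server: non-zero trusted.afr.<volname>-client-N values
    # mean this brick is blaming brick N.
    getfattr -d -m . -e hex /bricks/brick0/file1

    # The file should also be listed by:
    gluster volume heal testvol info split-brain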


Actual results:
We end up in split-brain in replica 3 volumes.

Expected results:
We should not end up in split-brain in replica 3 volumes.

Additional info:

Comment 2 Karthik U S 2018-03-07 08:44:55 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/131966/

Comment 3 Karthik U S 2018-03-09 07:16:43 UTC
Upstream patch: https://review.gluster.org/#/c/19218/

Comment 8 Vijay Avuthu 2018-08-08 07:21:37 UTC
Update:
=======

Build Used: glusterfs-3.12.2-15.el7rhgs.x86_64

Discussed with Karthik; the scenarios covered are below.

Scenario 1:

1) create 1 * 3 volume and start
2) Disable eager lock
3) verify that full-lock is on (it should be on by default)
4) write 1GB of file from mount point
5) take state dump for the volume while write is in progress
6) verify the state dump 
       - there should be ONE active lock which contains start=0, len=0
       - others should be blocked
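A minimal command sketch for scenario 1, assuming a 1 x 3 volume named testvol mounted at /mnt/testvol and a file named file1 (hypothetical names); cluster.eager-lock, cluster.full-lock, and gluster volume get/set/statedump are standard gluster CLI, while the statedump location and lock-entry format can vary by version:

    # 1-2) After creating and starting the 1 x 3 replica volume, disable eager lock.
    gluster volume set testvol cluster.eager-lock off

    # 3) Confirm that full-lock is on (the default).
    gluster volume get testvol cluster.full-lock

    # 4) Write a 1 GB file from the mount point, in the background.
    dd if=/dev/zero of=/mnt/testvol/file1 bs=1M count=1024 &

    # 5) Take a statedump of the bricks while the write is in flight.
    gluster volume statedump testvol

    # 6) AFR data locks appear as inodelk entries in the brick dumps;
    #    a full file lock shows start=0, len=0 on one ACTIVE entry.
    grep inodelk /var/run/gluster/*.dump.*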

Scenario 2:

1) create 1 * 3 volume and start
2) Disable eager lock
3) disable full-lock 
4) write 1GB of file from mount point
5) take state dump for the volume while write is in progress
6) verify the state dump 
       - there should be range locks
       - locks are not blocked
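Scenario 2 differs from the sketch above only in step 3, where the volume is reconfigured to take range locks (same hypothetical volume name):

    # 3) Disable full file locks so AFR takes range locks again.
    gluster volume set testvol cluster.full-lock off

    # 6) In the statedump, the granted inodelk entries now carry the
    #    actual write ranges (non-zero start/len) and none are blocked.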

Scenario 3:

1) create 1 * 3 volume and start
2) Enable eager lock
3) enable full-lock      
4) write 1GB of the same file from 2 clients
5) take state dump for the volume while write is in progress
6) verify the state dump 
       - there should be ONE active lock which contains start=0, len=0 
       - others should be blocked
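For scenario 3 the writes come from two clients in parallel; a minimal sketch, assuming the volume is mounted at /mnt/testvol on both clients, with seek used so the two writers touch different regions of the same file (file name and offsets are illustrative):

    # On client 1: write the first 1 GB of the file.
    dd if=/dev/zero of=/mnt/testvol/file2 bs=1M count=1024 conv=notrunc &

    # On client 2, at the same time: write the next 1 GB of the same file.
    dd if=/dev/zero of=/mnt/testvol/file2 bs=1M count=1024 seek=1024 conv=notrunc &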


Scenario 4:

1) create 1 * 3 volume and start
2) Enable eager lock
3) disable full-lock 
4) overwrite 1GB of the same file (which was written in the previous scenario) from 2 clients
5) take state dump for the volume while write is in progress
6) verify the state dump 
       - there should be range locks which are active


Moving status to verified

Comment 12 errata-xmlrpc 2018-09-04 06:44:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

