Bug 1328224 - RFE : Feature: Automagic unsplit-brain policies for AFR
Summary: RFE : Feature: Automagic unsplit-brain policies for AFR
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On: 1262161
Blocks: 1339639
Reported: 2016-04-18 18:52 UTC by Ravishankar N
Modified: 2017-03-27 18:12 UTC (History)

Fixed In Version: glusterfs-3.9.0
Clone Of: 1262161
Clones: 1339639
Environment:
Last Closed: 2017-03-27 18:12:03 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ravishankar N 2016-04-18 18:52:40 UTC
+++ This bug was initially created as a clone of Bug #1262161 +++

Description of problem:
From time to time, GlusterFS users, admins (and even developers) can do unfortunate things to a volume that cause split-brain in files and directories.  When the so-called "wise fool" algorithm (aka change logs) cannot determine a clean version of a file, an I/O error is bubbled up to the user, ruining their GlusterFS clustered storage experience.

The present solution for these cases is to go into the backend and delete or move the undesired copies of the file, or to "pin" the file to a specific replica index (which is basically choosing randomly).  For large-scale installations of GlusterFS this really isn't a workable solution, and quite often a simple heuristic based on time, size or majority will suffice to resolve things automagically to most end users' satisfaction.

This patch introduces policy-based split-brain resolution.

Version-Release number of selected component (if applicable):
v3.6.x

How reproducible:
100%

Steps to Reproduce:
N/A

Actual results:
N/A

Expected results:
N/A

Additional info:
N/A

--- Additional comment from  on 2015-09-10 22:12:46 EDT ---

Also, to be clear, this patch should apply cleanly to the release-3.6 branch.

Comment 1 Vijay Bellur 2016-04-18 19:01:38 UTC
REVIEW: http://review.gluster.org/14026 (afr: Automagic unsplit-brain by [ctime|mtime|size|majority]) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 2 Ravishankar N 2016-04-25 10:13:54 UTC
Moved it to MODIFIED by mistake. Changing to POST.

Comment 3 Vijay Bellur 2016-05-02 13:17:35 UTC
REVIEW: http://review.gluster.org/14026 (afr: Automagic unsplit-brain by [ctime|mtime|size|majority]) posted (#2) for review on master by Ravishankar N (ravishankar)

Comment 4 Vijay Bellur 2016-05-11 07:42:50 UTC
REVIEW: http://review.gluster.org/14026 (afr: Automagic unsplit-brain by [ctime|mtime|size|majority]) posted (#3) for review on master by Ravishankar N (ravishankar)

Comment 5 Vijay Bellur 2016-05-24 05:25:07 UTC
REVIEW: http://review.gluster.org/14026 (afr: Automagic unsplit-brain by [ctime|mtime|size|majority]) posted (#4) for review on master by Ravishankar N (ravishankar)

Comment 6 Vijay Bellur 2016-05-25 08:59:45 UTC
REVIEW: http://review.gluster.org/14026 (afr: Automagic unsplit-brain by [ctime|mtime|size|majority]) posted (#5) for review on master by Ravishankar N (ravishankar)

Comment 7 Vijay Bellur 2016-05-25 18:55:11 UTC
COMMIT: http://review.gluster.org/14026 committed in master by Jeff Darcy (jdarcy) 
------
commit 2f29065ae4715c9c4a9d20c4d15311bebd3ddb0e
Author: Ravishankar N <ravishankar>
Date:   Mon May 2 18:45:44 2016 +0530

    afr: Automagic unsplit-brain by [ctime|mtime|size|majority]
    
    Introduce cluster.favorite-child-policy which when enabled with
    [ctime|mtime|size|majority], automatically heals files that are in
    split-brain.
    
    The majority policy will not pick a source if there is no majority.
    The other three policies pick the first brick with a valid reply and
    non-zero ctime/mtime/size as source.
    
    Change-Id: I3c099a0404082213860f74f2c9b4d207cfaedb76
    BUG: 1328224
    Original-author: Richard Wareing <rwareing>
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/14026
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Anuradha Talur <atalur>
    Reviewed-by: Jeff Darcy <jdarcy>
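
The policy behaviour described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the AFR code: the Reply class and pick_source function are hypothetical names, and two details are assumptions that the commit message does not spell out. First, for ctime/mtime/size the brick with the largest non-zero value is taken as source, with ties going to the first such brick. Second, for majority, replicas are treated as agreeing when they report the same mtime and size, and a strict majority of all bricks is required.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Reply:
    """Hypothetical per-brick lookup reply; not an actual AFR structure."""
    valid: bool
    ctime: int = 0
    mtime: int = 0
    size: int = 0


def pick_source(policy: str, replies: Sequence[Reply]) -> Optional[int]:
    """Return the index of the brick to heal from, or None if no source."""
    if policy in ("ctime", "mtime", "size"):
        best = None
        for i, reply in enumerate(replies):
            value = getattr(reply, policy)
            # Skip bricks without a valid reply or with a zero value.
            if not reply.valid or value == 0:
                continue
            # Assumption: largest value wins; strict '>' keeps the
            # first brick when several bricks tie.
            if best is None or value > getattr(replies[best], policy):
                best = i
        return best

    if policy == "majority":
        for i, reply in enumerate(replies):
            if not reply.valid:
                continue
            # Assumption: bricks "agree" when mtime and size both match.
            votes = sum(
                1
                for other in replies
                if other.valid
                and (other.mtime, other.size) == (reply.mtime, reply.size)
            )
            # No source is picked unless more than half of all bricks agree.
            if votes > len(replies) // 2:
                return i
        return None

    raise ValueError(f"unknown policy: {policy}")
```

With three bricks of which only two agree on mtime and size, pick_source("majority", ...) returns the index of the first agreeing brick; with no two bricks agreeing it returns None and the file stays in split-brain. The real option is set per volume through cluster.favorite-child-policy, as named in the commit message.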

Comment 8 Shyamsundar 2017-03-27 18:12:03 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/

