Bug 977797

Summary: meta-data split-brain prevents entry/data self-heal of dir/file respectively
Product: [Community] GlusterFS
Reporter: vsomyaju
Component: replicate
Assignee: vsomyaju
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: pre-release
CC: gluster-bugs, nsathyan
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-04-17 11:42:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description vsomyaju 2013-06-25 10:58:17 UTC
Description of problem:
If a metadata split-brain is detected, data/entry self-heal does not take place.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a replicate volume.
2. Turn the self-heal daemon off.
3. Kill one brick.
4. Perform an action at the mount point that changes metadata.
5. Kill the other brick and bring back the first brick.
6. Again perform a metadata-changing action at the mount point.
7. Bring back the most recently killed brick.
8. The files are now in metadata split-brain. Kill either brick, change
   the data at the mount point, then bring back the killed brick.
9. Running ls at the mount point will not heal the data split-brain.

Actual results:
Data does not get healed.

Expected results:
Data should get healed even though there is a metadata split-brain.

Additional info:

Comment 1 Anand Avati 2013-06-25 11:01:34 UTC
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)

Comment 2 Anand Avati 2013-06-26 13:05:45 UTC
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)

Comment 3 Anand Avati 2013-06-27 08:56:34 UTC
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)

Comment 4 Anand Avati 2013-06-28 13:44:19 UTC
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#5) for review on master by venkatesh somyajulu (vsomyaju)

Comment 5 Anand Avati 2013-07-02 17:25:53 UTC
COMMIT: http://review.gluster.org/5253 committed in master by Vijay Bellur (vbellur) 
------
commit ef8092fab7b6fa5a16cc0e22b75945758519d5a6
Author: Venkatesh Somyajulu <vsomyaju>
Date:   Fri Jun 28 19:11:47 2013 +0530

    cluster/afr: Allow data/entry self heal for metadata split-brain
    
    Problem:
    Currently, whenever there is a metadata split-brain, the variable
    sh->op_failed is set to 1 to denote that self-heal failed. But the
    data self-heal code path also relies on sh->op_failed, so it checks
    the variable and eventually fails to perform the data self-heal. A
    mechanism is needed to allow data self-heal even when the metadata
    is in split-brain.
    
    Fix:
    Some data-structure revamp was done in the
    http://review.gluster.com/#/c/5106/ fix, and this patch is based on
    it. Now we can store which particular self-heal failed, i.e.
    GFID_OR_MISSING_ENTRY_SELF_HEAL, METADATA, DATA, or ENTRY, and we
    can do two types of self-heal failure check:
    1. Individual type check: we can check which of the four
       (metadata, data, gfid-or-missing-entry, entry) self-heals
       failed.
    2. In afr_self_heal_completion_cbk, if any specific self-heal
       failed, we treat the complete self-heal as a failure, so that
       the corresponding circular buffer of the event history is
       populated accordingly.
    
    Change-Id: Icb91e513bcc752386fc8a78812405cfabe5cac2d
    BUG: 977797
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/5253
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 6 Niels de Vos 2014-04-17 11:42:56 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user