Description of problem: If metadata split-brain is detected, data/entry self heal does not take place.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce (a CLI transcript of these steps is sketched below):
1. Create a replicate volume.
2. Set the self-heal daemon off.
3. Kill one brick.
4. Perform an action at the mount point to change metadata.
5. Kill the other brick and bring back the first brick.
6. Again perform an action at the mount point to change the metadata.
7. Bring back the last-killed brick.
8. The files are now in metadata split-brain. Kill any brick and change the data at the mount point, then bring back the killed brick.
9. An ls at the mount point will not successfully heal the data.

Actual results: Data does not get healed.

Expected results: Data should get healed even though there is a metadata split-brain.

Additional info:
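For reference, a minimal transcript of the reproduction above, assuming a replica-2 volume; the volume name, hostnames, brick paths, and file name are hypothetical placeholders, and brick PIDs come from 'gluster volume status':

    # 1. Create and start a replica-2 volume, then mount it
    gluster volume create testvol replica 2 host1:/bricks/b1 host2:/bricks/b2
    gluster volume start testvol
    mount -t glusterfs host1:/testvol /mnt

    # 2. Turn the self-heal daemon off
    gluster volume set testvol cluster.self-heal-daemon off

    # 3-7. Create metadata split-brain: change metadata while each
    # brick in turn is down
    touch /mnt/file
    kill -9 <pid-of-brick-b1>
    chmod 600 /mnt/file                # metadata change seen only by b2
    kill -9 <pid-of-brick-b2>
    gluster volume start testvol force # restarts all down bricks
    kill -9 <pid-of-brick-b2>          # keep b2 down (force restarted it too)
    chmod 400 /mnt/file                # metadata change seen only by b1
    gluster volume start testvol force # b2 back: file is in metadata split-brain

    # 8. With the file in metadata split-brain, make the data differ
    kill -9 <pid-of-brick-b1>
    echo "new data" >> /mnt/file       # data change seen only by b2
    gluster volume start testvol force # b1 back

    # 9. Lookup should trigger data self heal, but it does not happen
    #    because of the metadata split-brain
    ls -l /mnt/file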
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/5253 (cluster/afr: Allow data/entry self heal for metadata split-brain) posted (#5) for review on master by venkatesh somyajulu (vsomyaju)
COMMIT: http://review.gluster.org/5253 committed in master by Vijay Bellur (vbellur)
------
commit ef8092fab7b6fa5a16cc0e22b75945758519d5a6
Author: Venkatesh Somyajulu <vsomyaju>
Date:   Fri Jun 28 19:11:47 2013 +0530

cluster/afr: Allow data/entry self heal for metadata split-brain

Problem:
Currently, whenever there is a metadata split-brain, the variable sh->op_failed is set to 1 to denote that self heal failed. But if we then proceed to data self heal, its code path also relies on sh->op_failed: it checks the variable and eventually fails to perform the data self heal. So a mechanism is needed to allow data self heal even when metadata is in split-brain.

Fix:
Some data structure revamp was done in the http://review.gluster.com/#/c/5106/ fix, and this patch is based on that fix. We can now store which particular self heal failed, i.e. GFID_OR_MISSING_ENTRY_SELF_HEAL, METADATA, DATA, or ENTRY, and we can do two types of self heal failure checks:
1. Individual type check: we can check which among all four (metadata, data, gfid-or-missing-entry, entry self heal) failed.
2. In afr_self_heal_completion_cbk, we check whether any specific self heal failed and, if so, treat the complete self heal as a failure, so that the corresponding circular buffer of the event history is populated accordingly.

Change-Id: Icb91e513bcc752386fc8a78812405cfabe5cac2d
BUG: 977797
Signed-off-by: Venkatesh Somyajulu <vsomyaju>
Reviewed-on: http://review.gluster.org/5253
Reviewed-by: Pranith Kumar Karampuri <pkarampu>
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
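To illustrate the idea, here is a minimal C sketch, not the actual AFR code: the enum values mirror the self-heal types named in the commit message, while the struct and function names are hypothetical.

    /* Sketch: track self-heal failure per type instead of with a
     * single op_failed flag. */
    #include <stdio.h>

    typedef enum {
            SH_GFID_OR_MISSING_ENTRY,
            SH_METADATA,
            SH_DATA,
            SH_ENTRY,
            SH_TYPE_MAX,
    } sh_type_t;

    typedef struct {
            int failed[SH_TYPE_MAX]; /* per-type failure, replaces op_failed */
    } sh_status_t;

    /* Individual type check: data self heal no longer aborts just
     * because metadata self heal failed (e.g. metadata split-brain). */
    static int
    sh_can_start_data_self_heal (sh_status_t *sh)
    {
            return !sh->failed[SH_DATA];
    }

    /* Completion check: report the overall self heal as failed if any
     * individual type failed, so the event history stays accurate. */
    static int
    sh_overall_failed (sh_status_t *sh)
    {
            for (int i = 0; i < SH_TYPE_MAX; i++)
                    if (sh->failed[i])
                            return 1;
            return 0;
    }

    int
    main (void)
    {
            sh_status_t sh = {0};

            sh.failed[SH_METADATA] = 1; /* metadata split-brain detected */

            printf ("data self heal allowed: %d\n",
                    sh_can_start_data_self_heal (&sh)); /* prints 1 */
            printf ("overall self heal failed: %d\n",
                    sh_overall_failed (&sh));           /* prints 1 */
            return 0;
    }

Splitting the single flag into per-type state lets the data self heal path ignore a metadata failure, while the completion callback can still report the heal as failed overall.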
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user