Bug 1232173 - Incomplete self-heal and split-brain on directories found when self-healing files/dirs on a replaced disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Anuradha
QA Contact:
URL:
Whiteboard:
Depends On: 1207829 1255611
Blocks: 1140649
Reported: 2015-06-16 08:55 UTC by Anuradha
Modified: 2016-09-20 02:00 UTC

Fixed In Version: glusterfs-3.7.3
Clone Of: 1207829
Environment:
Last Closed: 2015-07-30 09:47:26 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Comment 1 Anand Avati 2015-06-16 11:56:51 UTC
REVIEW: http://review.gluster.org/11253 (glusterd/ afr : set afr pending xattrs on replace brick) posted (#1) for review on release-3.7 by Anuradha Talur (atalur)

Comment 2 Anand Avati 2015-06-16 12:11:05 UTC
REVIEW: http://review.gluster.org/11254 (cluster/afr : set pending xattrs for replaced brick) posted (#2) for review on release-3.7 by Anuradha Talur (atalur)

Comment 3 Anand Avati 2015-06-22 06:52:26 UTC
REVIEW: http://review.gluster.org/11253 (glusterd/ afr : set afr pending xattrs on replace brick) posted (#2) for review on release-3.7 by Anuradha Talur (atalur)

Comment 4 Anand Avati 2015-06-26 07:09:15 UTC
REVIEW: http://review.gluster.org/11254 (cluster/afr : set pending xattrs for replaced brick) posted (#4) for review on release-3.7 by Anuradha Talur (atalur)

Comment 5 Anand Avati 2015-06-27 11:27:16 UTC
COMMIT: http://review.gluster.org/11253 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) 
------
commit e28ac41c5ffc7b87f09b5bf2fe7f43cd4d4a5af5
Author: Anuradha <atalur>
Date:   Fri Jun 5 16:46:39 2015 +0530

    glusterd/ afr : set afr pending xattrs on replace brick
    
    Backport of: http://review.gluster.org/10076/
    
    This patch is part one of a change to prevent data loss
    in a replicate volume when a replace-brick commit force
    operation is performed.
    
    Problem: After a replace-brick commit force, there is a
    chance that self-heal happens from the replaced (sink) brick
    rather than from the source brick, leading to data loss.
    
    Solution: During the commit phase of replace-brick, after the
    old brick is brought down, create a temporary mount and perform
    a setfattr operation (on a virtual xattr) telling AFR to mark
    the replaced brick as a sink.
    
    As part of this change, the replace-brick command now uses the
    mgmt_v3 framework rather than the op-state-machine framework.
    
    Many thanks to Krishnan Parthasarathi for helping me out on this.
    
    Change-Id: If0d51b5b3cef5b34d5672d46ea12eaa9d35fd894
    BUG: 1232173
    Signed-off-by: Anuradha Talur <atalur>
    Reviewed-on: http://review.gluster.org/11253
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
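
Editorial sketch, not part of the patch: the commit-phase step described in the commit message above amounts to a single setxattr call issued through a temporary client mount. The Python below only illustrates that shape. The function name, the `virtual_key` argument, and the use of the brick name as the xattr value are assumptions made for illustration; the actual key and payload are defined in the glusterd and AFR sources and should be checked there.

```python
import os
from typing import Callable, Optional

# Signature of a setxattr-like callable: (path, key, value) -> None
SetXattr = Callable[[str, str, bytes], None]


def mark_replaced_brick_as_sink(
    mount_point: str,
    virtual_key: str,
    new_brick: str,
    setxattr: Optional[SetXattr] = None,
) -> None:
    """Issue one setxattr on the root of a (temporary) client mount.

    Assumption: AFR intercepts `virtual_key` and reads the value as the
    name of the brick that must be treated as a heal sink. Both the key
    and the value format are hypothetical here.
    """
    if setxattr is None:
        # os.setxattr is available on Linux only.
        setxattr = os.setxattr
    setxattr(mount_point, virtual_key, new_brick.encode("utf-8"))
```

Passing the setter as a parameter keeps the sketch testable without a mounted volume; on a real system the default `os.setxattr` would be used against the temporary mount point.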

Comment 6 Anand Avati 2015-06-27 13:09:02 UTC
COMMIT: http://review.gluster.org/11254 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu) 
------
commit b319d712e97e1074cc6030220d00970d1262458b
Author: Anuradha <atalur>
Date:   Thu Jun 11 14:58:05 2015 +0530

    cluster/afr : set pending xattrs for replaced brick
    
    Backport of: http://review.gluster.org/10448/
               & http://review.gluster.org/11416
    
    This patch is part two of a change to prevent data loss
    in a replicate volume when a replace-brick commit force
    operation is performed.
    
    Problem: After a replace-brick commit force, there is a
    chance that self-heal happens from the replaced (sink) brick
    rather than from the source brick, leading to data loss.
    
    Solution: Mark pending changelogs on the afr children for
    the replaced afr-child so that heal is performed in the
    correct direction.
    
    Credits to Ravishankar N for patch 11416.
    
    Change-Id: Icb9807e49b4c1c4f1dcab115318d9a58ccf95675
    BUG: 1232173
    Reviewed-on: http://review.gluster.org/10448
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Krutika Dhananjay <kdhananj>
    Signed-off-by: Anuradha Talur <atalur>
    Reviewed-on: http://review.gluster.org/11254
    Tested-by: Gluster Build System <jenkins.com>
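
Editorial sketch, not part of the patch: the reason pending changelogs fix the heal direction can be modelled in a few lines of Python. An AFR changelog xattr value holds three 32-bit big-endian counters (data, metadata, entry). The brick names, the `find_sinks` helper, and the counter values used for the marked case are illustrative assumptions, and the real AFR source/sink selection is more involved than this model.

```python
import struct
from typing import Dict, List, Tuple


def decode_pending(value: bytes) -> Tuple[int, int, int]:
    """Decode a 12-byte changelog value into (data, metadata, entry)."""
    return struct.unpack(">III", value)


def find_sinks(pending: Dict[str, Dict[str, bytes]]) -> List[str]:
    """Return the bricks that some other brick blames.

    pending[accuser][accused] is the raw changelog value stored on
    `accuser` about `accused`. A non-zero counter means `accuser`
    holds changes that `accused` has not seen yet.
    """
    sinks = set()
    for accuser, blames in pending.items():
        for accused, raw in blames.items():
            if accused != accuser and any(decode_pending(raw)):
                sinks.add(accused)
    return sorted(sinks)


ZERO = struct.pack(">III", 0, 0, 0)
MARK = struct.pack(">III", 0, 1, 1)  # illustrative non-zero counters

# Without marking: the surviving brick blames nobody, so nothing
# identifies the freshly replaced brick1 as the sink.
before_fix = {"brick0": {"brick1": ZERO}, "brick1": {}}

# With marking: brick0 carries pending counters against brick1,
# so heal has an unambiguous direction (brick0 -> brick1).
after_fix = {"brick0": {"brick1": MARK}, "brick1": {}}
```

In this model `find_sinks(before_fix)` is empty while `find_sinks(after_fix)` names only the replaced brick, which is the property the commit message describes as healing "in the correct direction".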

Comment 7 Kaushal 2015-07-30 09:47:26 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed in glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

