Bug 1276203 - add-brick on a replicate volume could lead to data-loss
Summary: add-brick on a replicate volume could lead to data-loss
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Anuradha
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1248998 Gluster-HC-2 1320020 1365455 1366440 1366444
Reported: 2015-10-29 05:14 UTC by Anuradha
Modified: 2016-09-20 02:00 UTC

Fixed In Version: glusterfs-3.8rc2
Clone Of:
: 1320020
Environment:
Last Closed: 2016-06-16 13:41:47 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Anuradha 2015-10-29 05:14:26 UTC
Description of problem:
When the replica count of a replicate volume is increased with the add-brick command, self-heal can run in the wrong direction. If a fop fails on an old brick but succeeds on the newly added brick, the new brick can be picked as the heal source, causing a reverse heal and hence data loss.

To prevent this, pending xattrs should be marked to indicate that the new brick doesn't have the latest copy of the data yet.
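
The failure mode described above can be illustrated with a small model. The Python sketch below is not GlusterFS code. It assumes a simplified pending matrix in which `pending[i][j]` counts operations that brick i believes brick j has missed, and it treats any brick that no other brick blames as a valid heal source. The function names (`find_sources`, `add_brick`, `record_failed_fop`) are hypothetical and exist only for this illustration.

```python
def find_sources(pending):
    """Return indices of bricks that no other brick blames (simplified model)."""
    n = len(pending)
    return [
        j for j in range(n)
        if all(pending[i][j] == 0 for i in range(n) if i != j)
    ]


def add_brick(pending, mark_new_brick):
    """Grow an n x n pending matrix to (n+1) x (n+1) for a newly added brick.

    If mark_new_brick is True, every old brick records a pending mark
    against the new brick, meaning "the new brick is not up to date yet".
    """
    n = len(pending)
    grown = [row + [1 if mark_new_brick else 0] for row in pending]
    grown.append([0] * (n + 1))  # the new brick blames nobody
    return grown


def record_failed_fop(pending, failed):
    """A fop failed on brick `failed`; every other brick blames it."""
    for i, row in enumerate(pending):
        if i != failed:
            row[failed] += 1


replica2 = [[0, 0], [0, 0]]  # healthy replica-2 volume

# Without marks at add-brick time, then a fop fails on old brick 0.
unmarked = add_brick(replica2, mark_new_brick=False)
record_failed_fop(unmarked, failed=0)
print(find_sources(unmarked))  # [1, 2]

# With marks at add-brick time, same failure.
marked = add_brick(replica2, mark_new_brick=True)
record_failed_fop(marked, failed=0)
print(find_sources(marked))  # [1]
```

In the unmarked case the empty new brick (index 2) qualifies as a source, which is the reverse-heal risk. With the marks in place, only the old brick that saw the fop succeed remains a source.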

Comment 1 Vijay Bellur 2015-10-29 05:18:04 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#1) for review on master by Anuradha Talur (atalur)

Comment 2 Vijay Bellur 2015-10-29 07:40:55 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#2) for review on master by Anuradha Talur (atalur)

Comment 3 Vijay Bellur 2016-01-13 06:18:27 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#3) for review on master by Anuradha Talur (atalur)

Comment 4 Vijay Bellur 2016-01-13 06:18:30 UTC
REVIEW: http://review.gluster.org/12454 (afr :  Enable auto heal when replica count increases) posted (#3) for review on master by Anuradha Talur (atalur)

Comment 5 Vijay Bellur 2016-02-03 07:35:54 UTC
REVIEW: http://review.gluster.org/12454 (afr :  Enable auto heal when replica count increases) posted (#4) for review on master by Anuradha Talur (atalur)

Comment 6 Vijay Bellur 2016-02-23 05:29:21 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#4) for review on master by Anuradha Talur (atalur)

Comment 7 Vijay Bellur 2016-02-23 05:29:24 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#5) for review on master by Anuradha Talur (atalur)

Comment 8 Vijay Bellur 2016-02-23 06:43:38 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#5) for review on master by Anuradha Talur (atalur)

Comment 9 Vijay Bellur 2016-02-23 06:43:41 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#6) for review on master by Anuradha Talur (atalur)

Comment 10 Vijay Bellur 2016-02-29 05:14:15 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#6) for review on master by Anuradha Talur (atalur)

Comment 11 Vijay Bellur 2016-03-02 07:32:46 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#7) for review on master by Anuradha Talur (atalur)

Comment 12 Vijay Bellur 2016-03-03 08:55:01 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#7) for review on master by Anuradha Talur (atalur)

Comment 13 Vijay Bellur 2016-03-03 12:27:31 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#8) for review on master by Anuradha Talur (atalur)

Comment 14 Vijay Bellur 2016-03-14 08:25:47 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#8) for review on master by Anuradha Talur (atalur)

Comment 15 Vijay Bellur 2016-03-14 08:25:50 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#9) for review on master by Anuradha Talur (atalur)

Comment 16 Vijay Bellur 2016-03-16 05:25:46 UTC
REVIEW: http://review.gluster.org/12451 (glusterd / afr : Enable auto heal when replica count increases) posted (#9) for review on master by Anuradha Talur (atalur)

Comment 17 Vijay Bellur 2016-03-17 06:18:59 UTC
REVIEW: http://review.gluster.org/12454 (afr : Enable auto heal when replica count increases) posted (#10) for review on master by Anuradha Talur (atalur)

Comment 18 Vijay Bellur 2016-03-21 17:51:11 UTC
COMMIT: http://review.gluster.org/12451 committed in master by Atin Mukherjee (amukherj) 
------
commit 020bc022c342c4c015e29c63399757e36d653a49
Author: Anuradha Talur <atalur>
Date:   Wed Mar 16 10:55:09 2016 +0530

    glusterd / afr : Enable auto heal when replica count increases
    
    In replicate volumes, when a brick is added to a replicate
    group, heal to the new brick should be triggered.
    Also, the new brick should not be considered as source for
    healing till it is up to date.
    
    Previously, extended attributes had to be set manually on
    the bricks for this to happen. This patch is part 1 patch
    to automate this process.
    
    Change-Id: I29958448618372bfde23bf1dac5dd23dba1ad98f
    BUG: 1276203
    Signed-off-by: Anuradha Talur <atalur>
    Reviewed-on: http://review.gluster.org/12451
    Reviewed-by: Atin Mukherjee <amukherj>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Ravishankar N <ravishankar>
    Smoke: Gluster Build System <jenkins.com>

Comment 19 Vijay Bellur 2016-03-22 05:37:28 UTC
COMMIT: http://review.gluster.org/12454 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 8eaa3506ead4f11b81b146a9e56575c79f3aad7b
Author: Anuradha Talur <atalur>
Date:   Tue Feb 23 10:56:51 2016 +0530

    afr : Enable auto heal when replica count increases
    
    This patch is part two change to prevent data loss
    in a replicate volume on doing a add-brick operation.
    
    Problem: After doing add-brick, there is a chance
    that self heal might happen from the newly added
    brick rather than the source brick, leading to data loss.
    
    Solution: Mark pending changelogs on afr children for
    the new afr-child so that heal is performed in the
    correct direction.
    
    Change-Id: I11871e55eef3593aec874f92214a2d97da229b17
    BUG: 1276203
    Signed-off-by: Anuradha Talur <atalur>
    Reviewed-on: http://review.gluster.org/12454
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Tested-by: Pranith Kumar Karampuri <pkarampu>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
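
The "pending changelogs" mentioned in the commit message above are stored as extended attributes on the bricks. As a rough sketch, assuming the commonly documented AFR convention of an xattr named `trusted.afr.<volname>-client-<index>` whose value is three 32-bit big-endian counters (data, metadata, entry), the Python below shows how such a value can be encoded and decoded. The helper names are hypothetical, and which counters the actual patch sets is not stated in this report, so the counters are left as parameters.

```python
import struct


def pending_xattr_name(volname, child_index):
    """Name of the xattr in which a brick blames child `child_index` (assumed convention)."""
    return f"trusted.afr.{volname}-client-{child_index}"


def encode_pending(data=0, metadata=0, entry=0):
    """Pack the three pending counters as 32-bit big-endian integers (12 bytes)."""
    return struct.pack(">III", data, metadata, entry)


def decode_pending(value):
    """Unpack a 12-byte changelog value into its three counters."""
    data, metadata, entry = struct.unpack(">III", value)
    return {"data": data, "metadata": metadata, "entry": entry}


def marks_for_new_child(volname, new_child_index, **counters):
    """Xattrs an old brick would carry to blame the newly added child."""
    name = pending_xattr_name(volname, new_child_index)
    return {name: encode_pending(**counters)}


# Hypothetical volume "testvol" grown from replica 2 to replica 3:
# the new brick becomes child index 2.
marks = marks_for_new_child("testvol", 2, entry=1)
for name, value in marks.items():
    print(name, "0x" + value.hex(), decode_pending(value))
```

A non-zero counter on the old bricks for the new child tells self-heal that the new brick is a sink for that kind of change, so it cannot be chosen as the source.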

Comment 20 Niels de Vos 2016-06-16 13:41:47 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user