Bug 1104861

Summary:          AFR: self-heal metadata can be corrupted with remove-brick
Product:          [Community] GlusterFS
Reporter:         Joe Julian <joe>
Component:        replicate
Assignee:         Pranith Kumar K <pkarampu>
Status:           CLOSED WONTFIX
Severity:         unspecified
Priority:         unspecified
Version:          3.4.3
CC:               gluster-bugs, ravishankar
Keywords:         Triaged
Target Milestone: ---
Target Release:   ---
Hardware:         Unspecified
OS:               Unspecified
Doc Type:         Bug Fix
Type:             Bug
Regression:       ---
Last Closed:      2014-07-13 09:15:44 UTC

Description Joe Julian 2014-06-04 20:41:26 UTC
Description of problem:
If you remove a brick while self-heals are pending, the AFR changelog metadata will end up pointing to the wrong brick.

We had a replica 2 volume and one brick failed. We added another brick and started the self-heal. Wanting to see the actual status of the heal, I removed the dead brick from the volume, expecting the pending attributes for that brick to be removed and/or ignored. Instead, the pending attributes remained, and the surviving bricks were renumbered such that the attributes for the former second brick now point to the third.

Version-Release number of selected component (if applicable):
3.4.3

How reproducible:
Always

Steps to Reproduce:
1. Create a replica 3 volume
2. Down brick 2
3. Do some actions to the files on the volume
4. gluster volume remove-brick myvol replica 2 server2:/brick force

Actual results:
# file: ../../85/4b/854bccb8-9119-4449-9906-b57008aef492
trusted.afr.gv-swift-client-0=0x000000000000000000000000
trusted.afr.gv-swift-client-1=0x000000020000000100000000
trusted.afr.gv-swift-client-2=0x000000000000000000000000
trusted.gfid=0x854bccb8911944499906b57008aef492
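For readers decoding the values above: each trusted.afr.* changelog xattr packs three big-endian 32-bit counters for data, metadata, and entry pending operations, per the AFR changelog format described on the gluster.org wiki page linked in Comment 2. A minimal Python sketch (the helper name is mine, not GlusterFS code):

```python
import struct

def decode_afr_changelog(hex_value: str) -> dict:
    """Decode a trusted.afr.* xattr value into its three pending-op counters.

    The 12-byte value holds three big-endian 32-bit counters:
    data, metadata, and entry pending operations.
    """
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}

# The non-zero xattr from the output above:
print(decode_afr_changelog("0x000000020000000100000000"))
# {'data': 2, 'metadata': 1, 'entry': 0}
```

So client-1 (the removed brick) still has two data and one metadata operation marked pending, which is why the heal state cannot simply be dropped.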

# From myvol-fuse.vol:
volume myvol-replicate-0
    type cluster/replicate
    subvolumes myvol-client-0 myvol-client-1
end-volume

Expected results:

I expected the identification for the client to remain the same, i.e.
    volume myvol-replicate-0
        type cluster/replicate
        subvolumes myvol-client-0 myvol-client-2
    end-volume

and any entries in the indices directory that are clean on both client-0 and client-2 to be removed.

Comment 1 Pranith Kumar K 2014-06-05 01:59:55 UTC
http://review.gluster.org/#/c/7122
http://review.gluster.org/7155

The patches above prevent this problem; we found it during snapshot development.

CC Ravi

Comment 2 Pranith Kumar K 2014-06-05 02:02:38 UTC
Joe,

Ravi writes really good documents :-).

Check this page out for more information:
http://www.gluster.org/community/documentation/index.php/Features/persistent-AFR-changelog-xattributes

pranith

Comment 3 Joe Julian 2014-06-05 05:04:12 UTC
Can this get backported to release-3.4 and release-3.5?

Comment 4 Ravishankar N 2014-06-05 05:19:47 UTC
I don't think that is possible. The fix is tied to the next op-version (i.e. GD_OP_VERSION_MAX, which is 4 for release 3.6) so that there are no heterogeneous nodes; in other words, the feature does not take effect until all nodes are upgraded to 3.6. If we backported it to previous releases and even one node were not upgraded, we would have no way to detect that, which could lead to inconsistent volfiles amongst the nodes.

Comment 5 Ravishankar N 2014-06-05 05:23:29 UTC
To be clearer: for a 1x3 replica, if the middle brick were removed, nodes that have the fix will use trusted.afr.gv-swift-client-{0,2} for AFR's changelogs, while nodes that were not upgraded will still use trusted.afr.gv-swift-client-{0,1}.
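The difference between the two schemes can be sketched as follows. This is an illustration of the naming behavior only, not GlusterFS code; "myvol" and the function names are mine:

```python
# After removing the middle brick of a 1x3 replica (client indices 0, 1, 2),
# the two schemes name the surviving bricks' changelog xattrs differently.

def xattr_names_positional(volname: str, num_remaining: int) -> list:
    # Pre-fix behavior: surviving bricks are renumbered 0..N-1, so the
    # xattr names shift when a brick is removed and old pending attributes
    # end up attributed to the wrong brick.
    return [f"trusted.afr.{volname}-client-{i}" for i in range(num_remaining)]

def xattr_names_persistent(volname: str, surviving_indices: list) -> list:
    # Post-fix behavior (op-version >= 4): each brick keeps its original
    # client index for the life of the volume.
    return [f"trusted.afr.{volname}-client-{i}" for i in surviving_indices]

print(xattr_names_positional("myvol", 2))
# ['trusted.afr.myvol-client-0', 'trusted.afr.myvol-client-1']
print(xattr_names_persistent("myvol", [0, 2]))
# ['trusted.afr.myvol-client-0', 'trusted.afr.myvol-client-2']
```

A cluster mixing the two schemes would disagree on which xattr belongs to which brick, which is why the fix is gated on the op-version rather than backported.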

Comment 6 Pranith Kumar K 2014-07-13 06:52:36 UTC
Ravi,
   Could you close this bug if it can't be backported?

Pranith