Bug 1405130

Summary: `gluster volume heal <vol-name> split-brain' does not heal if data/metadata/entry self-heal options are turned off
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: replicate
Assignee: Ravishankar N <ravishankar>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: 3.8
CC: bugs, ravishankar, ssampat
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.8.8
Clone Of: 1405126
Last Closed: 2017-01-16 12:27:19 UTC
Type: Bug
Bug Depends On: 1233608, 1234054, 1403840, 1405126
Bug Blocks: 1223636

Description Ravishankar N 2016-12-15 16:41:56 UTC
+++ This bug was initially created as a clone of Bug #1405126 +++

+++ This bug was initially created as a clone of Bug #1234054 +++

+++ This bug was initially created as a clone of Bug #1233608 +++

Description of problem:
------------------------
Attempting to heal a file in data/metadata/entry split-brain with the `gluster volume heal <vol-name> split-brain' command fails if the corresponding data/metadata/entry self-heal volume option is turned off. This should not happen, because glfsheal should not take these options into consideration.

Sample output of the command -

# gluster v heal rep2 split-brain source-brick 10.70.37.134:/rhs/brick6/b1/ /bar
Healing /bar failed: File not in split-brain.
Volume heal failed.

# gluster v heal rep2 split-brain bigger-file /bar
Healing /bar failed: File not in split-brain.
Volume heal failed.

Volume configuration -

# gluster v info rep2
 
Volume Name: rep2
Type: Replicate
Volume ID: 0bf8fb07-8b09-4be8-94e7-29f4d3d7632f
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.208:/rhs/brick6/b1
Brick2: 10.70.37.134:/rhs/brick6/b1
Options Reconfigured:
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.metadata-self-heal: off
cluster.self-heal-daemon: off
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
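
The configuration that triggers the bug can be spotted by filtering the "Options Reconfigured" section of the output above. The following is a minimal sketch, not part of the original report; list_disabled_heal_options is a hypothetical helper name, and it assumes the "key: value" line format shown above.

```shell
# Hypothetical helper (not a gluster command): reads `gluster volume info`
# output on stdin and prints each self-heal related option that is "off".
list_disabled_heal_options() {
    awk -F': ' '
        /^cluster\.(entry|data|metadata)-self-heal: off *$/ { print $1 }
        /^cluster\.self-heal-daemon: off *$/               { print $1 }
    '
}

# Intended use (requires a running cluster, so it is left as a comment):
#   gluster volume info rep2 | list_disabled_heal_options
```

For the rep2 volume above, this would list all four cluster.* options.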


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
glusterfs-3.7.1-3.el6rhs.x86_64

How reproducible:
------------------
100%

Steps to Reproduce:
--------------------
1. Set the following options on a 1x2 volume -
 
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.metadata-self-heal: off
cluster.self-heal-daemon: off

2. Kill one brick of the replica set.
3. From the mount, write to an existing file, or perform metadata operations like chmod.
4. Start the volume with force.
5. Kill the other brick in the replica set.
6. Perform data/metadata operations on the same file.
7. Start the volume with force and try to heal the now split-brained file using the above-mentioned CLI.
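
The steps above can be sketched as a dry-run script that only prints the commands, so nothing is killed or modified by accident. This is an illustration under stated assumptions rather than the reporter's exact procedure: the mount point /mnt/rep2 and the function name are hypothetical, the brick PIDs must be taken from `gluster volume status', and appending to the file stands in for "data operations".

```shell
# Dry-run sketch: prints the reproduction commands instead of executing them.
VOL=${VOL:-rep2}
MNT=${MNT:-/mnt/rep2}   # assumption: hypothetical FUSE mount point
FILE=${FILE:-bar}

print_repro_steps() {
    brick1_pid=$1       # PID of the first brick process (from volume status)
    brick2_pid=$2       # PID of the second brick process
    # Step 1: disable the client-side heals and the self-heal daemon.
    for opt in entry-self-heal data-self-heal metadata-self-heal self-heal-daemon; do
        echo "gluster volume set $VOL cluster.$opt off"
    done
    # Steps 2-3: kill one brick, then modify the file through the mount.
    echo "kill $brick1_pid"
    echo "echo first >> $MNT/$FILE"
    # Step 4: bring the killed brick back.
    echo "gluster volume start $VOL force"
    # Steps 5-6: kill the other brick and modify the same file again.
    echo "kill $brick2_pid"
    echo "echo second >> $MNT/$FILE"
    # Step 7: restart and attempt the split-brain heal from the report.
    echo "gluster volume start $VOL force"
    echo "gluster volume heal $VOL split-brain bigger-file /$FILE"
}
```

Calling `print_repro_steps <pid1> <pid2>' and reviewing the output before running any of it keeps the procedure auditable.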

Actual results:
----------------
Heal fails.

Expected results:
------------------
Heal is expected to succeed.

Additional info:

--- Additional comment from Shruti Sampat on 2015-06-19 07:48:07 EDT ---

Heal also fails when trying to resolve split-brain from the client by setting extended attributes, when the data/metadata self-heal options are turned off. With the options turned on, heal works as expected. This needs to be fixed too as part of this BZ.
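
For reference, the client-side resolution mentioned here is driven by virtual extended attributes on the mount. The sketch below prints the usual sequence without running it. The function name is hypothetical, and the client xlator name (for example rep2-client-1) is an assumption that depends on the volume name and brick order.

```shell
# Dry-run sketch of the mount-based (setfattr) split-brain workflow.
print_mount_heal_cmds() {
    file=$1   # path of the file on the FUSE mount
    src=$2    # client xlator to use as the heal source, e.g. rep2-client-1
    # Inspect the split-brain state and the available choices.
    echo "getfattr -n replica.split-brain-status $file"
    # Pick a replica to read from while inspecting the file.
    echo "setfattr -n replica.split-brain-choice -v $src $file"
    # Finalize: heal the file using the chosen replica as the source.
    echo "setfattr -n replica.split-brain-heal-finalize -v $src $file"
}
```

Per this comment, the finalize step is the one that fails when the data/metadata self-heal options are off.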

--- Additional comment from Ravishankar N on 2015-06-20 12:03:24 EDT ---

The bug in the description needs to be fixed. After the initial discussion with Shruti, I gave some more thought to the expected behaviour for comment #1. It seems to me that if the client-side heal options are disabled via volume set, then split-brain healing from the mount (via the setfattr interface) should also honour that, i.e. it should not heal the file.

If a particular client wants to override heal options that are disabled volume-wide, it can always mount the volume with the heal options enabled as fuse mount options (--xlator-option *replicate*.data-self-heal=on etc.).
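
A per-client override of this kind could be assembled as follows. This is a sketch only: the function name, server address and mount point are placeholders, and the glob is quoted so the shell does not expand it before the glusterfs client sees it.

```shell
# Dry-run sketch: prints a glusterfs client command line that re-enables the
# three client-side heal options for this mount only.
print_mount_cmd() {
    server=$1   # volfile server, e.g. one of the brick hosts
    vol=$2      # volume name
    mnt=$3      # local mount point (placeholder)
    printf 'glusterfs --volfile-server=%s --volfile-id=%s' "$server" "$vol"
    for opt in data-self-heal metadata-self-heal entry-self-heal; do
        printf " --xlator-option '*replicate*.%s=on'" "$opt"
    done
    printf ' %s\n' "$mnt"
}
```

The volume-wide settings stay off for every other client; only the mount started this way heals from the client side.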

--- Additional comment from Worker Ant on 2016-12-15 11:40:33 EST ---

REVIEW: http://review.gluster.org/16143 (glfsheal: Explicitly enable self-heal xlator options) posted (#1) for review on release-3.9 by Ravishankar N (ravishankar)

Comment 1 Worker Ant 2016-12-15 16:44:04 UTC
REVIEW: http://review.gluster.org/16144 (glfsheal: Explicitly enable self-heal xlator options) posted (#1) for review on release-3.8 by Ravishankar N (ravishankar)

Comment 2 Worker Ant 2016-12-16 01:34:07 UTC
COMMIT: http://review.gluster.org/16144 committed in release-3.8 by Pranith Kumar Karampuri (pkarampu) 
------
commit 67feb849ef96431ad869ce904c5de727195758cf
Author: Ravishankar N <ravishankar>
Date:   Wed Dec 14 22:48:20 2016 +0530

    glfsheal: Explicitly enable self-heal xlator options
    
    Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
    can heal split-brain files even if they are disabled on the volume via volume
    set commands.
    
    > Reviewed-on: http://review.gluster.org/11333
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    > Tested-by: Pranith Kumar Karampuri <pkarampu>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    (cherry picked from commit 209c2d447be874047cb98d86492b03fa807d1832)
    
    Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
    BUG: 1405130
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/16144
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Worker Ant 2016-12-21 23:22:19 UTC
REVIEW: http://review.gluster.org/16261 (glfsheal: Explicitly enable self-heal xlator options) posted (#1) for review on release-3.8-fb by Kevin Vigor (kvigor)

Comment 4 Niels de Vos 2017-01-16 12:27:19 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.8, please open a new bug report.

glusterfs-3.8.8 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-January/000064.html
[2] https://www.gluster.org/pipermail/gluster-users/