Bug 1319406

Summary: gluster volume heal info shows conservative merge entries as in split-brain
Product: Red Hat Gluster Storage [Red Hat Storage]
Component: replicate
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Pranith Kumar K <pkarampu>
Assignee: Pranith Kumar K <pkarampu>
QA Contact: Nag Pavan Chilakam <nchilaka>
Docs Contact:
CC: asrivast, olim, pkarampu, rhinduja, rhs-bugs, storage-qa-internal
Keywords: ZStream
Target Milestone: ---
Target Release: RHGS 3.1.3
URL: 1319406
Whiteboard:
Fixed In Version: glusterfs-3.7.9-2
Doc Type: Bug Fix
Doc Text:
When directory operations failed with errors other than the brick being offline, the parent directory containing the failed entries was shown as being in split-brain even when it was not. This has been corrected so that the parent directory's state is now reported correctly.
Story Points: ---
Clone Of:
Clones: 1322253 (view as bug list)
Environment:
Last Closed: 2016-06-23 05:03:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1311817, 1322253, 1326212

Description Pranith Kumar K 2016-03-19 17:04:48 UTC
Description of problem:

This is a sample directory that is reported as being in split-brain even though no split-brain exists.

root@localhost - ~ 
22:25:02 :( ⚡ getfattr -d -m. -e hex /home/gfs/r2_?/a
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a

# file: home/gfs/r2_1/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000010000000100000000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a


root@localhost - ~ 
22:25:04 :) ⚡ getfattr -d -m. -e hex /home/gfs/r2_?
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a

# file: home/gfs/r2_1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a


root@localhost - ~ 
22:25:09 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
/ - Is in split-brain

Status: Connected
Number of entries: 1

Brick localhost.localdomain:/home/gfs/r2_1
/a 
/ - Is in split-brain

Status: Connected
Number of entries: 2


root@localhost - ~ 
22:25:19 :) ⚡ gluster volume set r2 entry-self-heal on
volume set: success

root@localhost - ~ 
22:26:40 :) ⚡ gluster volume heal r2 enable
Enable heal on volume r2 has been successful 

root@localhost - ~ 
22:26:45 :) ⚡ gluster volume heal r2
Launching heal operation to perform index self heal on volume r2 has been successful 
Use heal info commands to check status

root@localhost - ~ 
22:26:49 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
Status: Connected
Number of entries: 0

Brick localhost.localdomain:/home/gfs/r2_1
Status: Connected
Number of entries: 0
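
The trusted.afr.* values shown above encode three big-endian 32-bit pending-operation counters: data, metadata, and entry, in that order. A minimal Python sketch to decode them (the function name is illustrative, not from any gluster tooling):

```python
# Decode a GlusterFS AFR changelog xattr value (trusted.afr.<vol>-client-N).
# The 12-byte value packs three big-endian 32-bit counters of pending
# data, metadata, and entry operations, in that order.
def decode_afr_xattr(hex_value):
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    return {
        "data": int.from_bytes(raw[0:4], "big"),
        "metadata": int.from_bytes(raw[4:8], "big"),
        "entry": int.from_bytes(raw[8:12], "big"),
    }

# File /a on r2_1: pending data and metadata heals, no entry heal.
print(decode_afr_xattr("0x000000010000000100000000"))
# → {'data': 1, 'metadata': 1, 'entry': 0}

# The brick roots: only an entry heal pending -- the conservative-merge
# case that was wrongly reported as split-brain.
print(decode_afr_xattr("0x000000000000000000000001"))
# → {'data': 0, 'metadata': 0, 'entry': 1}
```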


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 5 Nag Pavan Chilakam 2016-05-23 11:42:23 UTC
QATP:
====
1) Have an AFR volume.
2) Mount the volume and create some directories and files in them.
3) Bring down one of the replica bricks.
4) From the mount, change the permissions and ownership of some directories and their files.
5) Disable the self-heal daemon and all the client-side heal options (data, metadata, entry), so as to avoid client-side healing.
6) Bring the brick back online.
7) Keep monitoring heal info and heal info split-brain in a loop until the test case is complete.
8) Enable heal and start a heal of the volume.
9) Check the healing.

The heal must complete, and the new file permissions and ownership must be updated on the sink brick, which can be verified on the backend brick.

Also, no split-brain errors must be seen, and all heals must pass successfully.
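
The steps above can be sketched as a shell sequence (the volume name r2 comes from the report; the brick-kill lookup and the dry-run wrapper are illustrative assumptions, not part of the test plan):

```shell
#!/bin/sh
# Dry-run sketch of the QATP above. The run() wrapper only echoes each
# command -- remove the echo to actually execute on a gluster node.
VOL=r2
run() { echo "+ $*"; }

# 3) bring down one replica brick: find its PID via 'gluster volume status'
#    and kill it (the exact PID lookup is environment-specific)
run gluster volume status "$VOL"

# 5) disable the self-heal daemon and all client-side heals
run gluster volume set "$VOL" cluster.self-heal-daemon off
run gluster volume set "$VOL" cluster.data-self-heal off
run gluster volume set "$VOL" cluster.metadata-self-heal off
run gluster volume set "$VOL" cluster.entry-self-heal off

# 6) bring the brick back online
run gluster volume start "$VOL" force

# 7) monitor heal state in a loop while the test runs
run gluster volume heal "$VOL" info
run gluster volume heal "$VOL" info split-brain

# 8) re-enable healing and trigger an index heal
run gluster volume heal "$VOL" enable
run gluster volume heal "$VOL"

# 9) verify: entry counts drop to 0 and no split-brain is reported
run gluster volume heal "$VOL" info
```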

Comment 6 Nag Pavan Chilakam 2016-05-23 11:54:53 UTC
Validation:
=========
The above case was automated and run both manually and via automation.
The case passed; hence moving to Verified.
[root@dhcp35-191 ~]# rpm -qa|grep gluste
glusterfs-cli-3.7.9-6.el7rhgs.x86_64
glusterfs-libs-3.7.9-6.el7rhgs.x86_64
glusterfs-fuse-3.7.9-6.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-6.el7rhgs.x86_64
glusterfs-server-3.7.9-6.el7rhgs.x86_64
python-gluster-3.7.9-5.el7rhgs.noarch
glusterfs-3.7.9-6.el7rhgs.x86_64
glusterfs-api-3.7.9-6.el7rhgs.x86_64

Comment 8 Pranith Kumar K 2016-06-15 09:03:17 UTC
When directory operations failed with errors other than the brick being offline, *the parent directories containing these* failed entries were shown as being in a split-brain state even when they were not. This has been corrected so that the state is shown correctly in this situation.

The corrected phrase is marked with asterisks.

Comment 9 Pranith Kumar K 2016-06-15 11:23:59 UTC
Looks good to me.

Comment 11 errata-xmlrpc 2016-06-23 05:03:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240