Description of problem:
This is the sample directory on which, even when there is no split-brain, it is shown as split-brain.

root@localhost - ~ 22:25:02 :( ⚡ getfattr -d -m. -e hex /home/gfs/r2_?/a
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a

# file: home/gfs/r2_1/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000010000000100000000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a

root@localhost - ~ 22:25:04 :) ⚡ getfattr -d -m. -e hex /home/gfs/r2_?
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a

# file: home/gfs/r2_1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a

root@localhost - ~ 22:25:09 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
/ - Is in split-brain
Status: Connected
Number of entries: 1

Brick localhost.localdomain:/home/gfs/r2_1
/a
/ - Is in split-brain
Status: Connected
Number of entries: 2

root@localhost - ~ 22:25:19 :) ⚡ gluster volume set r2 entry-self-heal on
volume set: success

root@localhost - ~ 22:26:40 :) ⚡ gluster volume heal r2 enable
Enable heal on volume r2 has been successful

root@localhost - ~ 22:26:45 :) ⚡ gluster volume heal r2
Launching heal operation to perform index self heal on volume r2 has been successful
Use heal info commands to check status

root@localhost - ~ 22:26:49 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
Status: Connected
Number of entries: 0

Brick localhost.localdomain:/home/gfs/r2_1
Status: Connected
Number of entries: 0

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
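For reference, here is a minimal sketch (not part of the original report) of how the AFR changelog xattrs shown above can be decoded: the 12-byte value of a trusted.afr.&lt;volume&gt;-client-&lt;n&gt; xattr is three big-endian 32-bit counters holding the data, metadata and entry pending counts. The helper name decode_afr_xattr is illustrative only.

# Decode an AFR changelog xattr into its three pending counters (bash).
decode_afr_xattr() {
    local hex=${1#0x}                      # strip the leading "0x"
    printf 'data=%d metadata=%d entry=%d\n' \
        "0x${hex:0:8}" "0x${hex:8:8}" "0x${hex:16:8}"
}

# Example values taken from the getfattr output above:
decode_afr_xattr 0x000000010000000100000000   # /a on r2_1 -> data=1 metadata=1 entry=0
decode_afr_xattr 0x000000000000000000000001   # /  on both bricks -> data=0 metadata=0 entry=1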
QATP:
====
1) Have an AFR (replicate) volume.
2) Mount the volume and create some directories and files in them.
3) Bring down one of the replica bricks.
4) From the mount, change the permissions and ownership of some directories and their files.
5) Disable the self-heal daemon and all the client-side heal options (data, metadata, entry), so as to avoid client-side healing.
6) Bring the brick back online.
7) Keep monitoring "heal info" and "heal info split-brain" in a loop until the test case is complete.
8) Enable heal and start a heal of the volume.
9) Check the healing: the heal must complete, and the new file permissions and ownership must be updated on the sink brick, which can be verified on the backend brick.

Also, no split-brain errors must be seen and all heals must pass successfully. A rough shell sketch of these steps follows below.
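A rough, non-authoritative shell sketch of the QATP steps above, assuming a 1x2 replicate volume named r2 mounted at /mnt/r2 with bricks under /home/gfs (as in this report); the directory name, user/group names and the kill-based way of taking a brick down are illustrative only.

# 1)-2) volume already mounted at /mnt/r2; create some test dirs and files
mkdir -p /mnt/r2/dir1
touch /mnt/r2/dir1/file{1..5}

# 3) bring down one replica brick (here: kill its brick process; picking the
#    PID out of "volume status" like this is only an illustration)
kill -9 "$(gluster volume status r2 | awk '/r2_1/ {print $NF}')"

# 4) from the mount, change permissions and ownership while the brick is down
chmod -R 750 /mnt/r2/dir1
chown -R testuser:testgroup /mnt/r2/dir1

# 5) disable the self-heal daemon and all client-side heal options
gluster volume set r2 cluster.self-heal-daemon off
gluster volume set r2 cluster.data-self-heal off
gluster volume set r2 cluster.metadata-self-heal off
gluster volume set r2 cluster.entry-self-heal off

# 6) bring the brick back online
gluster volume start r2 force

# 7) monitor heal state in a loop while the test runs
gluster volume heal r2 info
gluster volume heal r2 info split-brain

# 8) re-enable healing and trigger a heal
gluster volume heal r2 enable
gluster volume heal r2

# 9) verify: heal info must drop to zero entries, no split-brain may be
#    reported, and the new permissions/ownership must show up on the
#    previously-down (sink) brick
gluster volume heal r2 info
stat /home/gfs/r2_1/dir1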
Validation:
=========
The above case was automated and run both manually and through automation; it passed in both runs, hence moving to VERIFIED.

[root@dhcp35-191 ~]# rpm -qa|grep gluste
glusterfs-cli-3.7.9-6.el7rhgs.x86_64
glusterfs-libs-3.7.9-6.el7rhgs.x86_64
glusterfs-fuse-3.7.9-6.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-6.el7rhgs.x86_64
glusterfs-server-3.7.9-6.el7rhgs.x86_64
python-gluster-3.7.9-5.el7rhgs.noarch
glusterfs-3.7.9-6.el7rhgs.x86_64
glusterfs-api-3.7.9-6.el7rhgs.x86_64
When directory operations failed with errors other than the brick being offline, the parent directories containing the failed entries were shown as being in a split-brain state even when they were not. This has been corrected so that the state is now reported correctly in this situation.
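As a quick, hedged illustration of the distinction (not part of the doc text above): a directory flagged by plain "heal info" is only in a genuine entry split-brain when each brick's trusted.afr xattr carries a non-zero pending count blaming the other brick. In the getfattr output from the description, both bricks only carry counts against r2-client-0, which is consistent with the false positive this bug describes. The paths below are the ones used in this report.

# Compare what "heal info split-brain" reports with the raw AFR xattrs on both bricks.
gluster volume heal r2 info split-brain
getfattr -d -m trusted.afr -e hex /home/gfs/r2_0
getfattr -d -m trusted.afr -e hex /home/gfs/r2_1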
Looks good to me.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240