[Migrated from savannah BTS] - bug 25982 [https://savannah.nongnu.org/bugs/?25982]

Tue 24 Mar 2009 04:23:20 AM GMT, original submission by Mathijs Krullaars <mathijsk>:

I use GlusterFS with server-side replication on XFS volumes. The test setup consists of 2 servers that also act as clients, so the exported directories are also mounted locally with the GlusterFS client. When I echo a string to a test file, take one server down, change the string, bring the server back up and cat the file, I do not always get the latest string. With rc1 through rc4, the contents of the file revert to the wrong string, or the file even gets corrupted. With rc6, if I first cat the file through a mount connected to the server that holds the correct copy, I get the right version and the file is updated on the other server, so everything works as it should. But if I first cat it on a client connected to the server that was taken down, it returns the old string and overwrites the newer copy with the old contents. As long as all servers stay up everything works fine, but when one goes down the results can be unexpected.

-------------------------------------------------------------------------------

Tue 24 Mar 2009 06:18:51 PM GMT, comment #1 by Mathijs Krullaars <mathijsk>:

I just checked rc7, and it seems this issue is solved.

-------------------------------------------------------------------------------

Tue 24 Mar 2009 06:49:23 PM GMT, comment #2 by Mathijs Krullaars <mathijsk>:

Checked some more: in a split-brain situation the behaviour is not perfect yet. The server on which the file is accessed first replicates its copy to the other one, whether or not it is the latest version.

-------------------------------------------------------------------------------

Fri 27 Mar 2009 09:07:41 PM GMT, comment #3 by Michael Taggart <mikeytag>:

Mathijs, I am also seeing this on my rc7 setup. I have 4 bricks with distribute and replicate: bricks 1 and 2 are replicated, bricks 3 and 4 are replicated, and the two replica pairs are distributed. If I bring a brick down for maintenance and then bring it back up, my logs fill with the dreaded "can't replicate file, please consider option favorite-child" message. I will say that rc7 seems to be getting better in this regard, but I am still seeing problems now and then.

Mike

-------------------------------------------------------------------------------

Fri 27 Mar 2009 09:12:22 PM GMT, comment #4 by Michael Taggart <mikeytag>:

FWIW, I am also getting these errors:

    2009-03-27 13:58:24 W [dht-common.c:244:dht_revalidate_cbk] distribute: mismatching layouts for <directory>
    2009-03-27 14:10:18 W [fuse-bridge.c:301:need_fresh_lookup] fuse-bridge: revalidate of <directory> (Stale NFS file handle)

For the stale problem, I found an old post where a dev said to enable "option lookup-unhashed yes" and then do a stat on the files from the client. So I did:

    find /san/ -exec stat {} \;

However, this doesn't seem to have remedied the errors. Not sure if this is even related to this bug. Do you think I should run stat on the server instead of the client?
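
-------------------------------------------------------------------------------

For reference, both options mentioned above belong in the cluster translators of the volfile. A minimal sketch of where they would sit, with placeholder volume, host, and brick names rather than anyone's actual configuration:

    # client-side sketch: two replicated pairs distributed together (all names illustrative)
    volume remote1
      type protocol/client
      option transport-type tcp
      option remote-host server1
      option remote-subvolume brick1
    end-volume

    # ... remote2, remote3 and remote4 defined the same way ...

    volume repl1
      type cluster/replicate
      subvolumes remote1 remote2
      # option favorite-child remote1   # resolves split-brain toward remote1; use with care
    end-volume

    volume repl2
      type cluster/replicate
      subvolumes remote3 remote4
    end-volume

    volume dist
      type cluster/distribute
      option lookup-unhashed yes        # the workaround referred to in comment #4
      subvolumes repl1 repl2
    end-volume

As comment #6 notes below, favorite-child always resolves toward the named subvolume, which is only safe when that subvolume can be trusted to hold the correct copy.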

-------------------------------------------------------------------------------

Sat 28 Mar 2009 03:21:15 PM GMT, comment #5 by Mathijs Krullaars <mathijsk>:

Yes, I had the same experience yesterday when I realized one of my servers was not running and tried to sync it afterwards with ls -lR. It created a 0 KB file and then gave me this in the log:

    Unable to resolve conflicting data of **. Please resolve manually by deleting the file ** from all but the preferred subvolume. Please consider 'option favorite-child <>'

For the stale NFS thing, first read this: http://www.gluster.org/docs/index.php/GlusterFS_cookbook#Exporting_over_NFS
I would not use NFS at all.

-------------------------------------------------------------------------------

Mon 30 Mar 2009 03:34:27 AM GMT, comment #6 by Michael Taggart <mikeytag>:

I agree. I wouldn't touch NFS with a 10-foot pole on a production setup (I have had some horrible experiences in the past that leave me biased). The funny thing is that I am NOT using NFS at all; glusterfs.log just complains about a stale NFS handle, whatever that is. The favorite-child problem seems like a big issue. I can't just have my servers pick one child all the time, because it is not always the same server that has the right copy. When I notice this happen I usually see that the file is blank on one storage node and fully there on another node. Which one is anyone's guess, depending on who went down first, etc.

-------------------------------------------------------------------------------

Mon 30 Mar 2009 12:30:04 PM GMT, comment #7 by Vikas Gorur <vikasgp>:

RC7 has a major fix in replicate that solves many of the spurious 'split-brain' situations. Could you please check with RC7?

-------------------------------------------------------------------------------

Mon 30 Mar 2009 08:56:42 PM GMT, comment #8 by Michael Taggart <mikeytag>:

Thanks Vikas. I am running rc7, and here is a snippet of the log from one of my webservers:

    2009-03-30 13:55:31 W [dht-common.c:244:dht_revalidate_cbk] distribute: mismatching layouts for <dir>
    2009-03-30 13:55:31 W [fuse-bridge.c:301:need_fresh_lookup] fuse-bridge: revalidate of <dir> failed (Stale NFS file handle)
    2009-03-30 13:55:32 W [dht-common.c:244:dht_revalidate_cbk] distribute: mismatching layouts for <dir>
    2009-03-30 13:55:32 W [fuse-bridge.c:301:need_fresh_lookup] fuse-bridge: revalidate of <dir> failed (Stale NFS file handle)

-------------------------------------------------------------------------------

Mon 30 Mar 2009 09:53:10 PM GMT, comment #9 by Michael Taggart <mikeytag>:

Vikas, I just noticed something else very interesting. The original poster said that he is using Gluster on XFS volumes. I am doing the same thing, although he is using server-side replication and I am using client-side. Not sure if it matters, but it is an interesting fact.

-------------------------------------------------------------------------------

Thu 09 Apr 2009 02:27:12 PM GMT, comment #10 by Vikas Gorur <vikasgp>:

A few fixes to the self-heal algorithm have gone into the repository which should fix this problem. Can you please check with the latest git (or pre36)?
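
-------------------------------------------------------------------------------

The manual resolution asked for by the "Unable to resolve conflicting data" message amounts to picking the good copy by hand. A rough sketch, assuming the backend export directories are /export/brick1 and /export/brick2 and the client mount is /mnt/glusterfs (all paths illustrative):

    # on each server, inspect the replicate changelog attributes of the backend copy
    getfattr -d -m trusted.afr -e hex /export/brick1/path/to/file

    # after deciding which copy is current, delete the stale copy directly from the
    # backend directory of every other subvolume ...
    rm /export/brick2/path/to/file

    # ... then look the file up through a client mount to trigger self-heal
    stat /mnt/glusterfs/path/to/file

-------------------------------------------------------------------------------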
Fixed in release 2.0 and later. Commit fb034ba3036fadc7cf35edc5cae7481149a67ca0.
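
For anyone re-verifying against 2.0, the reporter's original sequence boils down to the following steps on a two-server, server-side-replicated setup (hostnames, mount point, and service commands are illustrative):

    # both servers up: write the first version through the local mount on server1
    echo "version 1" > /mnt/glusterfs/testfile

    # take server2 down, then write a newer version while it is offline
    ssh server2 'killall glusterfsd'            # or however the service is managed
    echo "version 2" > /mnt/glusterfs/testfile

    # bring server2 back and read the file through its mount first
    ssh server2 'glusterfsd -f /etc/glusterfs/glusterfsd.vol'
    ssh server2 'cat /mnt/glusterfs/testfile'   # buggy versions returned "version 1" here
    cat /mnt/glusterfs/testfile                 # and the stale copy could overwrite "version 2"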