Bug 763664 (GLUSTER-1932)

Summary: Self healing did not happen
Product: [Community] GlusterFS
Reporter: Jacob Shucart <jacob>
Component: core
Assignee: Amar Tumballi <amarts>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: urgent
Version: 3.1-beta
CC: gluster-bugs, platform, vraman
Hardware: All
OS: Windows
Doc Type: Bug Fix

Description Jacob Shucart 2010-10-12 15:37:44 UTC
As an additional piece of information: I went to the Gluster mount on the client system and ran "rm" to clean up the directory.  On the GSP side, I noticed that diskA is now empty, diskB has all of the data, and the client-side mount is now hung.

Comment 1 Jacob Shucart 2010-10-12 15:51:14 UTC
I unmounted and remounted the volume using NFS.  I started up my script again.  Data now goes to:

diskA
diskC
diskD

Data should be mirrored between diskA and diskB, but no data gets written to diskB (it is still written to diskA).

Now when I go into Platform to look at my volume, it shows up as "red", and when I try to open it, I get a timeout.

I rebooted the GSP, remounted the volume on the client system, and now everything seems to work as expected.

Comment 2 Jacob Shucart 2010-10-12 18:35:53 UTC
Servers in a mirror pair have different data following dynamic expansion of the volume.  Self-healing did not occur as expected.

1. Created a VM with 4 hard disks.
2. Created a Gluster volume in a distributed mirror setup with 2 of the hard disks.
3. Started a script that creates 100MB files for testing on a client system that mounted the Gluster volume using Gluster's native NFS.
4. While the script was running, I added the other 2 servers to the distribute+mirror setup and hit update (the equivalent gluster CLI steps are sketched after this list).
5. The NFS mount point was hung and did not resume.
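
For reference, a rough sketch of the gluster CLI steps that the Platform operations above correspond to; the volume name is taken from the xattrs below, but the hostname and brick paths are illustrative, since the actual changes were made through the GSP interface:

# create the initial 2-way mirror across the first two disks
gluster volume create jacobtest1 replica 2 gsp1:/export/diskA gsp1:/export/diskB
gluster volume start jacobtest1

# later, while the client workload is running, expand to distribute+mirror
gluster volume add-brick jacobtest1 gsp1:/export/diskC gsp1:/export/diskD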

I then looked at the filesystem in GSP and saw:

diskA: -rw-r--r-- 1 root root 100M Oct 12 18:16 test4
diskB: -rw-r--r-- 1 root root  71M Oct 12 18:15 test4

Their extended attributes appear the same:

[root@jacobgsp310 sys2]# attr -l test4
Attribute "gfid" has a 16 byte value for test4
Attribute "afr.jacobtest1-client-0" has a 12 byte value for test4
Attribute "afr.jacobtest1-client-1" has a 12 byte value for test4

[root@jacobgsp310 sys1]# attr -l test4
Attribute "gfid" has a 16 byte value for test4
Attribute "afr.jacobtest1-client-0" has a 12 byte value for test4
Attribute "afr.jacobtest1-client-1" has a 12 byte value for test4

I would have thought that looking at the file through the Gluster mount point would have triggered a self-heal and fixed the file, but it did not.  Is there any other information that you need from me?
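
In case it is useful, my understanding is that a heal is triggered by forcing a lookup of the file from a client mount; a minimal sketch, assuming the volume is mounted at /mnt/glusterfs on the client (the mount point is illustrative):

# stat every file through the mount to force lookups, which should trigger self-heal
find /mnt/glusterfs -noleaf -print0 | xargs --null stat > /dev/null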

Comment 3 Balamurugan Arumugam 2010-10-27 09:10:15 UTC
This happened only in an interim QA build and did not happen with the 3.1.0 GA. It needs to be marked as resolved.