Bug 764301 (GLUSTER-2569)

Summary: Peer briefly unreachable in the middle of a data copy forces a heal at the end of the copy
Product: [Community] GlusterFS
Reporter: raf <milanraf>
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: medium
Version: 3.1.3
CC: gluster-bugs, pkarampu
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Mount Type: fuse

Description raf 2011-03-21 10:16:56 UTC
Environment:
[192.168.0.1]#gluster volume create san replica 2 transport tcp 192.168.0.1:/var/gluster 192.168.0.2:/var/gluster
[192.168.0.1]#gluster volume start san
[192.168.0.1]#gluster volume set san network.ping-timeout 10
[192.168.0.3]#mount -t glusterfs 192.168.0.1:/san /mnt/gluster
Share /mnt/gluster using Samba.
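
For reference, the export might look like the following minimal Samba share definition (a sketch; the share name "san" and the guest-access setting are assumptions, since the report only states that the FUSE mount is shared via Samba):

  # /etc/samba/smb.conf (fragment)
  # The share name "san" and guest access are assumptions; the report
  # only says /mnt/gluster (the GlusterFS FUSE mount) is exported.
  [san]
      path = /mnt/gluster
      read only = no
      guest ok = yes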

Start copying two files of about 2 GB each from a Windows XP client.

In the middle of the first file's copy, run:
[192.168.0.1]#killall glusterd glusterfsd glusterfs

Wait about 10 seconds (the copy of the first file is still running) and enter:
[192.168.0.1]#service glusterd start
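
Putting the timing in one place, the server-side steps amount to the following sketch (the sleep stands in for the manual wait; note it matches the network.ping-timeout of 10 seconds set above, which is long enough for the client to declare the brick down and continue with the surviving replica alone):

[192.168.0.1]#killall glusterd glusterfsd glusterfs
[192.168.0.1]#sleep 10
[192.168.0.1]#service glusterd start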

The peer's absence did not affect the data copy in any way (it kept running smoothly).
Now the peer is back, but the first file has only been written on 192.168.0.2 (this is normal behaviour).
When the copy of the first file finishes, before the second one starts, a self-heal of the first file is somehow forced, and the second copy is postponed until the self-heal completes.
It seems that the file-close operation after the data copy triggers the self-heal process.
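
One way to observe this pending state on the backend brick is via replicate's changelog extended attributes (a sketch; the file name file1.dat is hypothetical):

[192.168.0.2]#getfattr -d -m trusted.afr -e hex /var/gluster/file1.dat

Non-zero trusted.afr.san-client-0 counters on the surviving brick mark operations that still have to be healed onto 192.168.0.1 once it returns; the heal triggered at file close is what clears them.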

I think this represents a serious limitation of GlusterFS performance.
Isn't the self-heal process supposed to run in the background with minimal use of system resources?

Comment 1 Pranith Kumar K 2011-03-22 03:47:49 UTC
With the current design it is supposed to work this way, but there is a plan to implement what you have suggested, so it is not a bug but a feature enhancement.

Comment 2 Pranith Kumar K 2011-08-23 03:37:48 UTC
The design change brought in as part of 3182 fixes this.