Bug 761782 (GLUSTER-50)

Summary: glusterfsd blocks working cluster
Product: [Community] GlusterFS
Reporter: Basavanagowda Kanur <gowda>
Component: core
Assignee: Vijay Bellur <vbellur>
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: low
Version: mainline
CC: gluster-bugs, vijay
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Basavanagowda Kanur 2009-06-24 18:13:37 UTC
[Migrated from savannah BTS] - bug 25977 [https://savannah.nongnu.org/bugs/?25977]
Tue 24 Mar 2009 12:21:15 AM GMT, comment #1:

1.3.8 is quite old. Please use the latest release from the 2.0 series.
Krishna Srinivas <krishnasrinivas>
Project Member
Mon 23 Mar 2009 03:59:59 PM GMT, original submission:

I have 2 nodes in an HA cluster (using Heartbeat) with the resources httpd and nginx (on both nodes) and mysql (on node1). When I reboot node1, the glusterfsd process on node2 consumes 100% CPU (per "top" output), and /mnt/cluster becomes unreadable (when I try to read it, my console hangs). After its reboot, node1 cannot read /mnt/cluster until node2 is also rebooted.
Sometimes the reboot goes fine. The situation can also occur when node1 deletes some files.
If I reboot node2, everything works correctly; if I move mysql to node2, both nodes reboot fine.

glusterfs-client.vol

volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1local
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2local
  option remote-subvolume brick
end-volume

volume mirror0
  type cluster/afr
  subvolumes remote1 remote2
end-volume

glusterfs-server.vol

volume posix
  type storage/posix
  option directory /mnt/data/localhost
end-volume

volume brick
  type features/posix-locks
  subvolumes posix
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.brick.allow 192.168.0.*
  subvolumes brick
end-volume
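For context, a 2.0-era setup with volfiles like the ones above would typically be brought up by pointing the daemons at those files. The volfile paths below are assumptions for illustration, not taken from the report:

```shell
# On each server node: start glusterfsd with the server volfile
# (the /etc/glusterfs path is hypothetical; use wherever the .vol files live).
glusterfsd -f /etc/glusterfs/glusterfs-server.vol

# On each client: mount the replicated (cluster/afr) volume at /mnt/cluster
# using the client volfile.
glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/cluster
```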
--------------------------------------------------------------------------------
Tue 24 Mar 2009 01:21:37 PM GMT, comment #2 by Anonymous:

I can't find the 2.0 series of glusterfs in the Fedora 9 repository :(

I found a discussion of my problem in "[Gluster-devel] Effect of AFR's known file re-open issue on MySQL"
http://www.mail-archive.com/gluster-devel@nongnu.org/msg05669.html

Do I have to use DRBD for my database data?
--------------------------------------------------------------------------------
Tue 24 Mar 2009 02:48:52 PM GMT, comment #3 by Krishna Srinivas <krishnasrinivas>:

Can you try compiling the 2.0 release from sources?
http://ftp.zresearch.com/pub/gluster/glusterfs/qa-releases/glusterfs-2.0.0rc6.tar.gz

--------------------------------------------------------------------------------
Wed 25 Mar 2009 01:56:37 PM GMT, comment #4 by Anonymous:

Here
http://www.mail-archive.com/gluster-devel@nongnu.org/msg05669.html

I read this:
"File re-open support will be available in 2.0.1 (say another month)"

Do you think my problem is solved in 2.0.0rc6? Also, my principle is not to use unstable versions in production.

PS: I have narrowed down my problem: it is triggered by restarting the MySQL server. If I run /etc/init.d/mysqld restart 2-3 times on node2 (with Heartbeat stopped on both nodes), glusterfsd blocks the whole glusterfs mount and consumes 100% CPU.

Comment 1 Vijay Bellur 2009-11-16 08:21:59 UTC
Fixed with the latest version of glusterfs.