Bug 761782 (GLUSTER-50) - glusterfsd block working cluster
Summary: glusterfsd block working cluster
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-50
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Vijay Bellur
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-06-24 18:13 UTC by Basavanagowda Kanur
Modified: 2009-11-16 11:21 UTC (History)
2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Basavanagowda Kanur 2009-06-24 18:13:37 UTC
[Migrated from savannah BTS] - bug 25977 [https://savannah.nongnu.org/bugs/?25977]
Tue 24 Mar 2009 12:21:15 AM GMT, comment #1:

1.3.8 is quite old. Please use the latest release - from 2.0 series.
Krishna Srinivas <krishnasrinivas>, Project Member
Mon 23 Mar 2009 03:59:59 PM GMT, original submission:

I have a 2-node HA cluster (using heartbeat) with the resources httpd and nginx on both nodes, and mysql on node1. When I reboot node1, the glusterfsd process on node2 takes 100% CPU (per "top" output), and /mnt/cluster becomes unreadable (when I try to read it, my console hangs). And after the reboot, node1 cannot read /mnt/cluster until node2 is also rebooted.
Sometimes the reboot is fine. The same situation can also appear when node1 deletes some files.
If I reboot node2, everything works correctly; if I move mysql to node2, both nodes reboot fine.

glusterfs-client.vol

volume remote1
type protocol/client
option transport-type tcp/client
option remote-host node1local
option remote-subvolume brick
end-volume

volume remote2
type protocol/client
option transport-type tcp/client
option remote-host node2local
option remote-subvolume brick
end-volume

volume mirror0
type cluster/afr
subvolumes remote1 remote2
end-volume

glusterfs-server.vol

volume posix
type storage/posix
option directory /mnt/data/localhost
end-volume

volume brick
type features/posix-locks
subvolumes posix
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.brick.allow 192.168.0.*
subvolumes brick
end-volume 
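
For reference, the 2.0-series releases recommended in comment #1 use a slightly different option syntax from 1.3.8: the transport type is written as plain "tcp" on both client and server, and IP-based access control moved from the auth.ip namespace to auth.addr. A minimal sketch of the equivalent 2.0-style server volume, assuming the same brick layout (option names should be verified against the documentation shipped with the release):

volume server
type protocol/server
option transport-type tcp
option auth.addr.brick.allow 192.168.0.*
subvolumes brick
end-volume

The protocol/client volumes would likewise use "option transport-type tcp" in place of "tcp/client".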
--------------------------------------------------------------------------------
Tue 24 Mar 2009 01:21:37 PM GMT, comment #2 by Anonymous:

I can't find a 2.0-series glusterfs package in the Fedora 9 repository :(

I found my problem described in "[Gluster-devel] Effect of AFR's known file re-open issue on MySQL"
http://www.mail-archive.com/gluster-devel@nongnu.org/msg05669.html

Do you think I must use drbd for my database data?
--------------------------------------------------------------------------------
Tue 24 Mar 2009 02:48:52 PM GMT, comment #3 by Krishna Srinivas <krishnasrinivas>:

Can you try compiling 2.0 release from sources?
http://ftp.zresearch.com/pub/gluster/glusterfs/qa-releases/glusterfs-2.0.0rc6.tar.gz

--------------------------------------------------------------------------------
Wed 25 Mar 2009 01:56:37 PM GMT, comment #4 by Anonymous:

Here
http://www.mail-archive.com/gluster-devel@nongnu.org/msg05669.html

I read this:
"File re-open support will be available in 2.0.1 (say another month)"

Do you think my problem is solved in 2.0.0rc6? And my principle is to not use unstable versions in production.

PS: I have localized the problem: it is restarting the mysql server. If I run /etc/init.d/mysqld restart 2-3 times on node2 (with heartbeat stopped on both nodes), glusterfsd blocks the whole glusterfs mount and takes 100% of the processor time.

Comment 1 Vijay Bellur 2009-11-16 08:21:59 UTC
Fixed with latest version of glusterfs.

