Bug 842955

Summary:	"gluster volume status inode" command blocks glusterd and glusterfsd
Product:	[Community] GlusterFS	Reporter:	Joe Julian <joe>
Component:	core	Assignee:	Kaushal <kaushal>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	medium	Docs Contact:
Priority:	high
Version:	3.3.0	CC:	amarts, gluster-bugs
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	glusterfs-3.4.0	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	853211		Environment:
Last Closed:	2013-07-24 17:56:33 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	853211

Description Joe Julian 2012-07-25 05:12:53 UTC

Description of problem:
I have a volume with about 10 clients and about 500 open fds. I ran "gluster volume status home inode" and all my clients hit ping-timeout. To get my volume back, I had to kill glusterd and the glusterfsd processes for the bricks associated with that volume on all my servers, then restart glusterd.

Version-Release number of selected component (if applicable):
3.3.0

How reproducible:
Couldn't risk reproducing it more than once. This was on production servers.

Steps to Reproduce:
1. Have a busy volume
2. gluster volume status $VOL inode
  
Actual results:
All the clients hit ping-timeout. All other cli commands hang after that.

Additional info:
Had to kill the bricks and glusterd and restart glusterd (which restarted the bricks) on all the servers to get the volume back.

Comment 1 Amar Tumballi 2012-07-25 05:51:20 UTC

Kaushal,

I guess this is the same issue you figured out earlier: dict serialize and unserialize taking a lot of time.

The timeout happens because these brick-ops from glusterd to glusterfsd are handled in the main thread itself, which blocks further network requests (mostly from glusterfs) from reaching glusterfsd. We need to create a thread to handle these brick-ops, so that other calls can pass through to the xlators below.
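
To illustrate the blocking described above, here is a minimal Python sketch. It is not GlusterFS code: `slow_brick_op`, `serve`, and the request names are hypothetical stand-ins, and it uses a plain worker thread rather than any GlusterFS mechanism. It only shows why running a slow operation inline on the request-handling thread delays a client ping, and why handing it to another thread does not.

```python
import threading
import time


def slow_brick_op(duration):
    """Hypothetical stand-in for an expensive brick-op (e.g. a large dict serialize)."""
    time.sleep(duration)


def serve(requests, offload, op_duration=0.2):
    """Handle requests in order on the calling ("main") thread.

    Returns, for each "ping" request, the number of seconds that passed
    between the start of the loop and the moment the ping was handled.
    If offload is True, brick-ops run on a worker thread instead of inline.
    """
    start = time.monotonic()
    ping_latencies = []
    workers = []
    for req in requests:
        if req == "brick-op":
            if offload:
                # Hand the slow work to a worker so the loop keeps going.
                worker = threading.Thread(target=slow_brick_op, args=(op_duration,))
                worker.start()
                workers.append(worker)
            else:
                # Inline: nothing else is handled until this returns.
                slow_brick_op(op_duration)
        else:
            # A client ping: record how long it had to wait.
            ping_latencies.append(time.monotonic() - start)
    for worker in workers:
        worker.join()
    return ping_latencies


inline = serve(["brick-op", "ping"], offload=False)
offloaded = serve(["brick-op", "ping"], offload=True)
print(f"ping delay, brick-op inline:    {inline[0]:.3f}s")
print(f"ping delay, brick-op offloaded: {offloaded[0]:.3f}s")
```

In the inline case the ping waits for the whole brick-op. With a real ping-timeout and a brick-op that runs long enough, that wait is what the clients would see as a timeout.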

Comment 2 Amar Tumballi 2012-10-23 10:45:32 UTC

http://review.gluster.org/4096 should fix the issue.

Comment 3 Vijay Bellur 2012-11-19 08:35:19 UTC

CHANGE: http://review.gluster.org/4096 (glusterfsd-mgmt: make brick-ops work in synctask) merged in master by Vijay Bellur (vbellur)