Bug 842955 - "gluster volume status inode" command blocks glusterd and glusterfsd
Product: GlusterFS
Classification: Community
Component: core
Hardware: Unspecified Unspecified
Priority: high Severity: medium
Assigned To: Kaushal
Depends On:
Blocks: 853211
Reported: 2012-07-25 01:12 EDT by Joe Julian
Modified: 2013-07-24 13:56 EDT (History)

See Also:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 853211
Last Closed: 2013-07-24 13:56:33 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description Joe Julian 2012-07-25 01:12:53 EDT
Description of problem:
I have a volume with about 10 clients and about 500 open fds. I ran "gluster volume status home inode" and all my clients hit ping-timeout. I had to kill glusterfsd for the bricks associated with that volume and glusterd on all my servers and restart glusterd to get my volume back.

Version-Release number of selected component (if applicable):

How reproducible:
Couldn't risk reproducing it more than once. This was on production servers.

Steps to Reproduce:
1. Have a busy volume
2. gluster volume status $VOL inode

Actual results:
All the clients hit ping-timeout. All other cli commands hang after that.

Additional info:
Had to kill the bricks and glusterd and restart glusterd (which restarted the bricks) on all the servers to get the volume back.
Comment 1 Amar Tumballi 2012-07-25 01:51:20 EDT

I guess this is the same issue you figured out earlier: dict serialize and unserialize taking a lot of time.

The timeout happens because these brick-ops from glusterd to glusterfsd are handled in the main thread itself, which blocks further network requests (mostly from glusterfs) from reaching glusterfsd. We need to create a thread to handle these brick-ops, so that other calls can pass through to the xlators below.
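
The effect described above can be illustrated with a small sketch. This is not GlusterFS code: it is a Python asyncio model in which slow_brick_op, answer_pings and run are hypothetical names, the sleep durations are arbitrary, and a thread pool stands in for whatever mechanism the real fix uses. It only shows that a long operation run on the event-loop thread stalls every other request, while the same operation handed to a worker leaves the loop free to keep answering.

```python
import asyncio
import time


def slow_brick_op(duration: float) -> str:
    """Hypothetical stand-in for a long brick-op (e.g. serializing a big dict)."""
    time.sleep(duration)  # blocks whichever thread runs it
    return "inode status ready"


async def answer_pings(count: int, interval: float, answered: list) -> None:
    """Hypothetical stand-in for client requests the main loop must keep serving."""
    for _ in range(count):
        await asyncio.sleep(interval)
        answered.append(time.monotonic())


async def run(offload: bool) -> float:
    """Return the longest gap (seconds) between two answered pings."""
    loop = asyncio.get_running_loop()
    answered = [time.monotonic()]
    pings = asyncio.create_task(answer_pings(10, 0.02, answered))
    await asyncio.sleep(0)  # let the ping task start before the brick-op
    if offload:
        # Worker thread handles the brick-op; the loop keeps serving pings.
        await loop.run_in_executor(None, slow_brick_op, 0.3)
    else:
        # Brick-op runs on the event-loop thread and stalls all pings.
        slow_brick_op(0.3)
    await pings
    return max(b - a for a, b in zip(answered, answered[1:]))


if __name__ == "__main__":
    print("longest ping gap, handled in main thread:", asyncio.run(run(False)))
    print("longest ping gap, handled in worker:     ", asyncio.run(run(True)))
```

In the first case the longest gap is at least as long as the brick-op itself, which is the analogue of clients hitting ping-timeout; in the second it stays close to the ping interval.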
Comment 2 Amar Tumballi 2012-10-23 06:45:32 EDT
http://review.gluster.org/4096 should fix the issue.
Comment 3 Vijay Bellur 2012-11-19 03:35:19 EST
CHANGE: http://review.gluster.org/4096 (glusterfsd-mgmt: make brick-ops work in synctask) merged in master by Vijay Bellur (vbellur@redhat.com)
