Bug 765426 (GLUSTER-3694) - What do the "bailing out frame type" messages mean?
Summary: What do the "bailing out frame type" messages mean?
Keywords:
Status: CLOSED NOTABUG
Alias: GLUSTER-3694
Product: GlusterFS
Classification: Community
Component: glusterd
Version: pre-release
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-10-05 06:43 UTC by KentaroNishizawa
Modified: 2015-11-03 23:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-06 13:38:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description KentaroNishizawa 2011-10-05 06:43:12 UTC
Hi,

I'm using glusterfs-3.2.3 in the following environment, and I have a couple of questions.

glusterfs server  CentOS-5.5 x86_64
glusterfs client  Debian-5.0 Lenny  Reading data via fuse

I see many of the following messages in my gluster client logs.
When these messages appear, the CPU load on some of my gluster client servers
gets very high, while on the other client servers it stays normal.

What does this error message mean?
And when, and under what conditions, does it appear?

--------------------------------------------------------------------
Error Message
--------------------------------------------------------------------
I have 20 glusterfs client servers, and the error messages are the same on all of them.

[2011-10-05 15:12:14.840485] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53471x sent = 2011-10-05 14:42:13.652077. timeout = 1800
[2011-10-05 15:12:14.840547] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53470x sent = 2011-10-05 14:42:13.652037. timeout = 1800
[2011-10-05 15:12:14.840611] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53469x sent = 2011-10-05 14:42:13.651900. timeout = 1800
[2011-10-05 15:12:14.840670] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53468x sent = 2011-10-05 14:42:13.651862. timeout = 1800
[2011-10-05 15:12:14.840733] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53467x sent = 2011-10-05 14:42:13.651821. timeout = 1800
[2011-10-05 15:12:24.841077] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53493x sent = 2011-10-05 14:42:23.652976. timeout = 1800
[2011-10-05 15:12:34.841460] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53567x sent = 2011-10-05 14:42:33.655952. timeout = 1800
[2011-10-05 15:12:34.841567] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53566x sent = 2011-10-05 14:42:33.655919. timeout = 1800
[2011-10-05 15:12:34.841638] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53565x sent = 2011-10-05 14:42:33.655886. timeout = 1800
[2011-10-05 15:12:34.841694] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53564x sent = 2011-10-05 14:42:33.655853. timeout = 1800
[2011-10-05 15:12:34.841744] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53563x sent = 2011-10-05 14:42:33.655821. timeout = 1800
[2011-10-05 15:12:34.841795] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53562x sent = 2011-10-05 14:42:33.655787. timeout = 1800
[2011-10-05 15:12:34.841848] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53561x sent = 2011-10-05 14:42:33.655753. timeout = 1800
[2011-10-05 15:12:34.841912] E [rpc-clnt.c:197:call_bail] 0-disk2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53560x sent = 2011-10-05 14:42:33.655719. timeout = 1800

Comment 1 krishnan parthasarathi 2011-10-13 14:45:00 UTC
Kentaro,

The "bailing out.." messages mean that an operation on a file did not get any response from the server within the last 1800 seconds. The operation in this case is FINODELK.
We cannot keep this bug open to answer general queries about GlusterFS. The best place for those would be http://community.gluster.org/
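
As a quick check of the 1800-second reading, here is a minimal Python sketch that parses one of the call_bail lines from the description and compares the gap between the "sent" time and the log timestamp with the reported timeout. The regular expression and the name parse_bail_line are assumptions based only on the lines pasted in this bug, not on the rpc-clnt.c source, so other GlusterFS versions may format the message differently.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines in this bug report (an assumption,
# not taken from the GlusterFS source).
BAIL_RE = re.compile(
    r"\[(?P<logged>[\d\- :.]+)\] E \[rpc-clnt\.c:\d+:call_bail\] (?P<xlator>\S+): "
    r"bailing out frame type\((?P<prog>[^)]*)\) op\((?P<op>\w+)\((?P<opnum>\d+)\)\) "
    r"xid = (?P<xid>\S+) sent = (?P<sent>[\d\- :.]+)\. timeout = (?P<timeout>\d+)"
)

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# First log line from the description, copied verbatim.
SAMPLE = (
    "[2011-10-05 15:12:14.840485] E [rpc-clnt.c:197:call_bail] 0-disk2: "
    "bailing out frame type(GlusterFS 3.1) op(FINODELK(30)) xid = 0x53471x "
    "sent = 2011-10-05 14:42:13.652077. timeout = 1800"
)


def parse_bail_line(line):
    """Return the fields of a call_bail line, or None if it does not match."""
    match = BAIL_RE.search(line)
    if match is None:
        return None
    logged = datetime.strptime(match.group("logged"), TIME_FORMAT)
    sent = datetime.strptime(match.group("sent"), TIME_FORMAT)
    return {
        "xlator": match.group("xlator"),
        "op": match.group("op"),
        "xid": match.group("xid"),
        "timeout": int(match.group("timeout")),
        # Seconds between sending the request and logging the bail-out.
        "elapsed": (logged - sent).total_seconds(),
    }


if __name__ == "__main__":
    info = parse_bail_line(SAMPLE)
    print(info["op"], info["elapsed"], info["timeout"])
```

For the sample line the request had been outstanding for roughly 1801 seconds when it was bailed out, which is consistent with "timeout = 1800".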

It would help us investigate the issue if you could provide the following information:

- What is the volume configuration (gluster volume info <volname>)?
- What kind of 'workload' was seen on the glusterfs client?
- Did any of the servers 'go down', or were there any network outages,
  when you saw these messages?
- Attach the log files of the client and server(s).
- When you observe a hang, send the USR1 signal to the glusterfs server process(es):
  'kill -s USR1 <pid>'
  This writes a dump of the process state to '/tmp/glusterdump.<pid>'.
  Attach the resulting (glusterdump.<pid>) file(s).
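
If several server processes need to be signalled, the last step can be scripted. The following is a minimal Python sketch, equivalent to 'kill -s USR1 <pid>'. The function name request_statedump and its dump_dir and send_signal parameters are hypothetical and exist only for this example. The dump location is the one mentioned above, and finding the right <pid> values is left to you.

```python
import os
import signal


def request_statedump(pid, dump_dir="/tmp", send_signal=os.kill):
    """Send USR1 to a glusterfs process and return where the dump is expected.

    dump_dir defaults to /tmp because the dump is described above as
    '/tmp/glusterdump.<pid>'. send_signal is injectable (an assumption of
    this sketch) so the function can be exercised without a real process.
    """
    send_signal(pid, signal.SIGUSR1)
    return os.path.join(dump_dir, "glusterdump.%d" % pid)


if __name__ == "__main__":
    import sys

    # Usage: python statedump.py <pid> [<pid> ...]
    for arg in sys.argv[1:]:
        print(request_statedump(int(arg)))
```

The script only prints the expected path. It does not wait for the file to appear or check that it was written, so please confirm that the file exists before attaching it.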

