Description of problem:
When the FSCT tool is run against the Gluster Samba share from the Windows client, the FUSE mount on the server node crashes.

Note: This testing was done on RHEL 6.4 + glusterfs 3.4.0.1rhs build. I will retry it on the latest ISO.

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.1rhs built on Apr 9 2013 12:37:53

How reproducible:
2/2

Setup:
4 node cluster
2 Windows clients
1 Windows controller

Steps to Reproduce:
1. On a 4 node cluster, create and start a distributed-replicate volume.
2. Do the FSCT test setup as mentioned in the link below:
   https://home.corp.redhat.com/node/69962
3. After the setup, run the tool on the controller machine.

The tests run fine for a load of 100 users. When the load is increased to 200 users, the FUSE mount on the server crashes, making the Samba share unavailable to FSCT.

Actual results:
The FUSE mount crashes, making the Samba share unavailable.

Expected results:
The FUSE mount should not crash.

Additional info:
Backtrace:
(gdb) bt
#0  0x00007f0f9db2c9c8 in ioc_open_cbk (frame=0x7f0fa20721a4, cookie=<value optimized out>, this=0x131e830, op_ret=0, op_errno=117, fd=0x13c56dc, xdata=0x0) at io-cache.c:554
#1  0x00007f0f9dd3bd74 in ra_open_cbk (frame=0x7f0fa2072250, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=117, fd=0x13c56dc, xdata=0x0) at read-ahead.c:103
#2  0x00007f0f9e18437b in dht_open_cbk (frame=0x7f0fa20712dc, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=117, fd=<value optimized out>, xdata=0x0) at dht-inode-read.c:55
#3  0x00007f0f9e3bd91e in afr_open_cbk (frame=0x7f0fa207102c, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=<value optimized out>, fd=<value optimized out>, xdata=0x0) at afr-open.c:178
#4  0x00007f0f9e62253b in client3_3_open_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f0fa2072500) at client-rpc-fops.c:474
#5  0x0000003cb5c0ddf5 in rpc_clnt_handle_reply (clnt=0x138f520, pollin=0x1310c70) at rpc-clnt.c:771
#6  0x0000003cb5c0e9d7 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x138f550, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:890
#7  0x0000003cb5c0a338 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:495
#8  0x00007f0f9f8872d4 in socket_event_poll_in (this=0x139ef50) at socket.c:2118
#9  0x00007f0f9f88742d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x139ef50, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
#10 0x0000003cb545b3e7 in event_dispatch_epoll_handler (event_pool=0x12f46d0) at event-epoll.c:384
#11 event_dispatch_epoll (event_pool=0x12f46d0) at event-epoll.c:445
#12 0x0000000000406676 in main (argc=4, argv=0x7fffc7e57788) at glusterfsd.c:1902
This issue is also reproducible on the RHS 2.1 ISO. I have uploaded the sosreport to the rhsqe repo.

FUSE mount log snippet:
========================
pending frames:
frame : type(1) op(OPENDIR)
frame : type(1) op(READ)
frame : type(1) op(OPEN)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-04-22 05:34:45
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.1rhs
/lib64/libc.so.6[0x324d832920]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/performance/io-cache.so(ioc_open_cbk+0x98)[0x7f100ab519c8]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/performance/read-ahead.so(ra_open_cbk+0x1d4)[0x7f100ad60d74]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/cluster/distribute.so(dht_open_cbk+0xfb)[0x7f100b1a937b]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/cluster/replicate.so(afr_open_cbk+0x2de)[0x7f100b3e291e]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/protocol/client.so(client3_3_open_cbk+0x18b)[0x7f100b64753b]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x324f00ddf5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x127)[0x324f00e9d7]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x324f00a338]
/usr/lib64/glusterfs/3.4.0.1rhs/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f100c8ac2d4]
/usr/lib64/glusterfs/3.4.0.1rhs/rpc-transport/socket.so(socket_event_handler+0x13d)[0x7f100c8ac42d]
/usr/lib64/libglusterfs.so.0[0x324e85b3e7]
/usr/sbin/glusterfs(main+0x5c6)[0x406676]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x324d81ecdd]
/usr/sbin/glusterfs[0x404559]
The FUSE mount is also crashing for me. I was running arequal, iozone, and glusterfs_build on a FUSE mount from a RHEL 6.4 client.

signal received: 11
time of crash: 2013-04-26 05:43:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.1rhs
/lib64/libc.so.6[0x3387432920]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/performance/io-cache.so(ioc_open_cbk+0x98)[0x7fa7932009c8]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/performance/read-ahead.so(ra_open_cbk+0x1d4)[0x7fa79340fd74]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/cluster/distribute.so(dht_open_cbk+0xfb)[0x7fa79385837b]
/usr/lib64/glusterfs/3.4.0.1rhs/xlator/protocol/client.so(client3_3_open_cbk+0x18b)[0x7fa793a8753b]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x3c4640ddf5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x127)[0x3c4640e9d7]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x3c4640a338]
/usr/lib64/glusterfs/3.4.0.1rhs/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7fa794ace2d4]
/usr/lib64/glusterfs/3.4.0.1rhs/rpc-transport/socket.so(socket_event_handler+0x13d)[0x7fa794ace42d]
/usr/lib64/libglusterfs.so.0[0x3c46c5b3e7]
/usr/sbin/glusterfs(main+0x5c6)[0x406676]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x338741ecdd]
/usr/sbin/glusterfs[0x404559]

Always reproducible.
*** Bug 957657 has been marked as a duplicate of this bug. ***
Lookups are not being sent for the files, so io-cache never gets a chance to fill in the inode context. This is likely caused by the new FUSE module, which supports the readdirp fop: as part of readdirp, the inode gets linked into the inode table. Having obtained the inode via readdirp, the FUSE module sends the open call directly without sending a lookup first. io-cache, which builds its inode context only in the lookup callback, then dereferences the NULL context in ioc_open_cbk and crashes. A possible fix is to make io-cache populate the inode context when the readdirp reply arrives, similar to what it already does for lookup.
Tested it on the build glusterfs 3.4.0.2rhs built on May 2 2013 06:08:46. I don't see the crash in either FSCT or smbtorture testing.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html