Bug 815242

Summary: locktests hangs on fuse mount
Product: [Community] GlusterFS
Component: core
Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Assignee: Anand Avati <aavati>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Version: mainline
CC: chrisw, gluster-bugs, vbellur
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-07-24 17:33:22 UTC
Bug Blocks: 817967
Attachments: Fuse mount log file

Description Shwetha Panduranga 2012-04-23 08:00:19 UTC
Created attachment 579439 [details]
Fuse mount log file

Description of problem:
The following is the backtrace of the brick process:

(gdb) info threads
  8 Thread 0x7f01171a6700 (LWP 27681)  0x0000003638a0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
  7 Thread 0x7f01167a5700 (LWP 27682)  0x0000003638a0b3dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x7f0115da4700 (LWP 27683)  0x0000003638a0b3dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 0x7f0114f75700 (LWP 27686)  0x0000003638a0eccd in nanosleep () from /lib64/libpthread.so.0
  4 Thread 0x7f0114139700 (LWP 27705)  0x0000003638a0b3dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 0x7f010ec1a700 (LWP 27706)  0x0000003638a0b75b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x7f010eb19700 (LWP 27709)  0x0000003638a0b75b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 0x7f01184ab700 (LWP 27680)  0x0000003638a0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x0000003638a0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003638a09328 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003638a091f7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f011892411e in inode_unref (inode=0x7f010dec80e0) at inode.c:448
#4  0x00007f011893abb6 in fd_destroy (fd=0x14a4aa8) at fd.c:525
#5  0x00007f011893acad in fd_unref (fd=0x14a4aa8) at fd.c:552
#6  0x00007f011893a4d6 in gf_fd_put (fdtable=0x146cc70, fd=417) at fd.c:354
#7  0x00007f010f3633e3 in server_release (req=0x7f010ee2d800) at server3_1-fops.c:3444
#8  0x00007f01186dd102 in rpcsvc_handle_rpc_call (svc=0x147c550, trans=0x14b6130, msg=0x17542b0) at rpcsvc.c:520
#9  0x00007f01186dd4a5 in rpcsvc_notify (trans=0x14b6130, mydata=0x147c550, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x17542b0) at rpcsvc.c:616
#10 0x00007f01186e2e84 in rpc_transport_notify (this=0x14b6130, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x17542b0) at rpc-transport.c:498
#11 0x00007f0115198280 in socket_event_poll_in (this=0x14b6130) at socket.c:1686
#12 0x00007f0115198804 in socket_event_handler (fd=13, idx=9, data=0x14b6130, poll_in=1, poll_out=0, poll_err=0) at socket.c:1801
#13 0x00007f011893e664 in event_dispatch_epoll_handler (event_pool=0x14523a0, events=0x146b7f0, i=0) at event.c:794
#14 0x00007f011893e887 in event_dispatch_epoll (event_pool=0x14523a0) at event.c:856
#15 0x00007f011893ec12 in event_dispatch (event_pool=0x14523a0) at event.c:956
#16 0x0000000000408067 in main (argc=19, argv=0x7fff886ba2d8) at glusterfsd.c:1651
(gdb) t 8
[Switching to thread 8 (Thread 0x7f01171a6700 (LWP 27681))]#0  0x0000003638a0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x0000003638a0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003638a09328 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003638a091f7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f011892411e in inode_unref (inode=0x7f010dec80e0) at inode.c:448
#4  0x00007f011893abb6 in fd_destroy (fd=0x149e450) at fd.c:525
#5  0x00007f011893acad in fd_unref (fd=0x149e450) at fd.c:552
#6  0x00007f0118926a24 in inode_dump (inode=0x7f010dec80e0, prefix=0x7f01171a28d0 "conn.1.bound_xl./export1/dstore1.active.1") at inode.c:1601
#7  0x00007f0118926d28 in inode_table_dump (itable=0x1498860, prefix=0x7f01171a3940 "conn.1.bound_xl./export1/dstore1") at inode.c:1644
#8  0x00007f010f3480cd in server_inode (this=0x1479700) at server.c:482
#9  0x00007f0118945c61 in gf_proc_dump_xlator_info (top=0x1479700) at statedump.c:422
#10 0x00007f0118946644 in gf_proc_dump_info (signum=10) at statedump.c:668
#11 0x0000000000407736 in glusterfs_sigwaiter (arg=0x7fff886ba0d0) at glusterfsd.c:1389
#12 0x0000003638a077f1 in start_thread () from /lib64/libpthread.so.0
#13 0x00000036386e570d in clone () from /lib64/libc.so.6
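
The two traces suggest a self-deadlock on the inode table lock: the statedump thread (thread 8) appears to take the table lock in inode_table_dump() and then, via inode_dump() -> fd_unref() -> fd_destroy() -> inode_unref(), tries to take the same non-recursive mutex again, while the epoll thread (thread 1) blocks behind it in server_release(). A minimal standalone sketch of that pattern follows; it is not the GlusterFS source, and the names table_lock, obj_unref(), table_dump_buggy() and table_dump_fixed() are made up for illustration.

/* Illustration only (hypothetical names, not the GlusterFS code): a
 * non-recursive mutex protecting a table, re-acquired from inside a
 * dump routine that already holds it. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

static void
obj_unref (void)
{
        /* like inode_unref(): unconditionally takes the table lock */
        pthread_mutex_lock (&table_lock);
        /* ... drop a reference, possibly destroy the object ... */
        pthread_mutex_unlock (&table_lock);
}

/* Shape of the hang in thread 8: the dump walks the table while holding
 * table_lock and ends up in obj_unref(), which tries to take the same
 * mutex again, so the dumping thread blocks on itself and every other
 * thread (thread 1 here) queues up behind it. */
static void
table_dump_buggy (void)
{
        pthread_mutex_lock (&table_lock);
        obj_unref ();                   /* re-locks table_lock: hangs */
        pthread_mutex_unlock (&table_lock);
}

/* One common fix shape: finish the walk under the lock, release it,
 * and only then drop the references taken during the walk. */
static void
table_dump_fixed (void)
{
        pthread_mutex_lock (&table_lock);
        /* ... record what needs to be dumped / unreffed ... */
        pthread_mutex_unlock (&table_lock);

        obj_unref ();                   /* safe: table_lock not held */
        printf ("dump finished without deadlocking\n");
}

int
main (void)
{
        (void) table_dump_buggy;        /* not called: it would hang */
        table_dump_fixed ();
        return 0;
}

The merged fix may take a different route; the sketch is only meant to show why the two threads above wedge on the same lock.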

Version-Release number of selected component (if applicable):
3.3.0qa37

How reproducible:
often

locktest.sh:-
------------
#!/bin/bash

while true; do
	locktests -f ./lock_test_file -n 500
done

Steps to Reproduce:
---------------------
1. Create a 1x4 replicate volume.
2. Create one FUSE mount and one NFS mount (each from a different machine).
3. Execute "locktest.sh" on both the FUSE and NFS mounts.
4. Execute "kill -USR1 <pid_of_glusterfs_process>" to trigger a statedump.
  
Actual results:
locktests hung on the fuse mount.

Additional info:
--------------------
[04/23/12 - 18:19:41 root@APP-SERVER2 ~]# gluster volume info
 
Volume Name: dstore
Type: Replicate
Volume ID: 884c1aa1-a0b0-4b41-9c93-92620b15fda8
Status: Started
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export1/dstore1
Brick2: 192.168.2.36:/export1/dstore1
Brick3: 192.168.2.35:/export2/dstore1
Brick4: 192.168.2.36:/export2/dstore1

Comment 1 Amar Tumballi 2012-04-23 11:24:00 UTC
Avati already sent a patch to fix this @ http://review.gluster.com/3210

Comment 2 Anand Avati 2012-04-24 16:44:46 UTC
CHANGE: http://review.gluster.com/3210 (statedump: fix deadlock during state dump of fds) merged in master by Vijay Bellur (vijay)

Comment 3 Shwetha Panduranga 2012-05-14 06:02:08 UTC
Bug is fixed. Verified on 3.3.0qa41.