Bug 799861 - nfs-nlm: cthon lock test hangs then crashes the server
Summary: nfs-nlm: cthon lock test hangs then crashes the server
Keywords:
Status: CLOSED DUPLICATE of bug 798222
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: pre-release
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Vinayaga Raman
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-05 09:53 UTC by Saurabh
Modified: 2016-01-19 06:09 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-03-06 10:19:29 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Saurabh 2012-03-05 09:53:02 UTC
Description of problem:
(gdb) bt
#0  0x000000340ae87f40 in memcpy () from /lib64/libc.so.6
#1  0x00007fad1d5545cd in nlm_copy_lkowner (dst=0x7fff3ba9114c, src=0x7fad1ba92a08) at nlm4.c:182
#2  0x00007fad1d555995 in nlm4_lock_to_gf_flock (flock=0x7fff3ba91130, lock=0x7fad1ba929f0, excl=0) at nlm4.c:606
#3  0x00007fad1d5582e1 in nlm4_unlock_fd_resume (carg=0x7fad1ba92508) at nlm4.c:1427
#4  0x00007fad1d558579 in nlm4_cancel_resume (carg=0x7fad1ba92508) at nlm4.c:1470
#5  0x00007fad1d552474 in nfs3_fh_resolve_inode_done (cs=0x7fad1ba92508, inode=0x7fad1c51d174) at nfs3-helpers.c:3545
#6  0x00007fad1d553951 in nfs3_fh_resolve_inode (cs=0x7fad1ba92508) at nfs3-helpers.c:3971
#7  0x00007fad1d5539e5 in nfs3_fh_resolve_resume (cs=0x7fad1ba92508) at nfs3-helpers.c:4003
#8  0x00007fad1d553c10 in nfs3_fh_resolve_root (cs=0x7fad1ba92508) at nfs3-helpers.c:4057
#9  0x00007fad1d553e50 in nfs3_fh_resolve_and_resume (cs=0x7fad1ba92508, fh=0x7fff3ba91fb0, entry=0x0, 
    resum_fn=0x7fad1d55840c <nlm4_cancel_resume>) at nfs3-helpers.c:4104
#10 0x00007fad1d558abd in nlm4svc_cancel (req=0x7fad1d1d2930) at nlm4.c:1524
#11 0x00007fad219df0b9 in rpcsvc_handle_rpc_call (svc=0x1cb0b10, trans=0x1d83140, msg=0x1ea7aa0) at rpcsvc.c:514
#12 0x00007fad219df45c in rpcsvc_notify (trans=0x1d83140, mydata=0x1cb0b10, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1ea7aa0)
    at rpcsvc.c:610
#13 0x00007fad219e4db8 in rpc_transport_notify (this=0x1d83140, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1ea7aa0)
    at rpc-transport.c:498
#14 0x00007fad1baa4270 in socket_event_poll_in (this=0x1d83140) at socket.c:1686
#15 0x00007fad1baa47f4 in socket_event_handler (fd=18, idx=8, data=0x1d83140, poll_in=1, poll_out=0, poll_err=0) at socket.c:1801
#16 0x00007fad21c3f030 in event_dispatch_epoll_handler (event_pool=0x1c97290, events=0x1d7a470, i=0) at event.c:794
#17 0x00007fad21c3f253 in event_dispatch_epoll (event_pool=0x1c97290) at event.c:856
#18 0x00007fad21c3f5de in event_dispatch (event_pool=0x1c97290) at event.c:956
#19 0x0000000000407dcc in main (argc=7, argv=0x7fff3ba92548) at glusterfsd.c:1612
(gdb) 
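
The backtrace above ends in memcpy(), called from nlm_copy_lkowner() on the NLM cancel path. The report does not say why the copy faults. As a minimal sketch only, assuming the fault comes from a NULL or oversized lock-owner handle, a defensive copy of a length-prefixed owner could look like the following. Every type and function name here (sketch_netobj, sketch_lkowner, sketch_copy_lkowner, SKETCH_MAX_LKOWNER_LEN) is hypothetical and is not taken from the GlusterFS source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
 * GlusterFS definitions of the lock-owner types. */
#define SKETCH_MAX_LKOWNER_LEN 1024

struct sketch_netobj {          /* length-prefixed opaque bytes from the wire */
    unsigned int len;
    const char  *data;
};

struct sketch_lkowner {         /* fixed-size destination buffer */
    int  len;
    char data[SKETCH_MAX_LKOWNER_LEN];
};

/* Copy an owner handle, refusing any input that would make memcpy()
 * read from NULL or write past the destination buffer.
 * Returns 0 on success and -1 on rejected input; when dst is non-NULL
 * it is left empty (len == 0) on failure. */
int
sketch_copy_lkowner(struct sketch_lkowner *dst, const struct sketch_netobj *src)
{
    if (dst == NULL)
        return -1;

    dst->len = 0;

    if (src == NULL)
        return -1;
    if (src->len > sizeof(dst->data))       /* would overflow dst->data */
        return -1;
    if (src->len > 0 && src->data == NULL)  /* would read from NULL */
        return -1;

    if (src->len > 0)
        memcpy(dst->data, src->data, src->len);
    dst->len = (int)src->len;
    return 0;
}
```

With a check like this, a caller on the cancel or unlock path could fail the request instead of taking SIGSEGV. Whether that matches the actual fix is not stated in this report; see bug 798222.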


This time the cthon lock test hangs at test #7.


nfs.log reports connection failures:

[root@RHSSA1 ~]# tail -f /root/330/inst/var/log/glusterfs/nfs.log 
[2012-03-05 09:10:58.082552] E [nlm4.c:471:nsm_monitor] 0-nfs-NLM: clnt_call(): RPC: Success
[2012-03-05 09:10:58.083030] E [socket.c:1724:socket_connect_finish] 0-NLM-client: connection to  failed (Connection refused)
[2012-03-05 09:11:28.083139] E [nlm4.c:471:nsm_monitor] 0-nfs-NLM: clnt_call(): RPC: Success
[2012-03-05 09:11:28.083624] E [socket.c:1724:socket_connect_finish] 0-NLM-client: connection to  failed (Connection refused)
[2012-03-05 09:11:58.082547] E [nlm4.c:471:nsm_monitor] 0-nfs-NLM: clnt_call(): RPC: Success
[2012-03-05 09:11:58.083041] E [socket.c:1724:socket_connect_finish] 0-NLM-client: connection to  failed (Connection refused)
[2012-03-05 09:12:28.083477] E [nlm4.c:471:nsm_monitor] 0-nfs-NLM: clnt_call(): RPC: Success
[2012-03-05 09:12:28.083959] E [socket.c:1724:socket_connect_finish] 0-NLM-client: connection to  failed (Connection refused)
[2012-03-05 09:12:58.083569] E [nlm4.c:471:nsm_monitor] 0-nfs-NLM: clnt_call(): RPC: Success
[2012-03-05 09:12:58.084020] E [socket.c:1724:socket_connect_finish] 0-NLM-client: connection to  failed (Connection refused)


Version-Release number of selected component (if applicable):
3.3.0qa25


How reproducible:
always

Steps to Reproduce:
1. Start the cthon lock test on a distribute-replicate volume.
2. Test #7 hangs.
3. Press Ctrl+C to kill the cthon test.
  
Actual results:
The server crashes.

Expected results:

Additional info:

Comment 1 Saurabh 2012-03-05 11:28:31 UTC
nfs.log info


pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-03-05 09:47:21
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa25
/lib64/libc.so.6[0x340ae32980]
/lib64/libc.so.6(memcpy+0xa0)[0x340ae87f40]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nlm_copy_lkowner+0x42)[0x7f900b2445cd]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nlm4_lock_to_gf_flock+0x88)[0x7f900b245995]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nlm4_unlock_fd_resume+0xdc)[0x7f900b2482e1]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nlm4_cancel_resume+0x16d)[0x7f900b248579]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nfs3_fh_resolve_inode_done+0x116)[0x7f900b242474]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nfs3_fh_resolve_inode+0xeb)[0x7f900b243951]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nfs3_fh_resolve_resume+0x4c)[0x7f900b2439e5]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nfs3_fh_resolve_root+0x7d)[0x7f900b243c10]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nfs3_fh_resolve_and_resume+0xe8)[0x7f900b243e50]
/root/330/inst/lib/glusterfs/3.3.0qa25/xlator/nfs/server.so(nlm4svc_cancel+0x4a6)[0x7f900b248abd]
/root/330/inst/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x360)[0x7f900f6cf0b9]
/root/330/inst/lib/libgfrpc.so.0(rpcsvc_notify+0x181)[0x7f900f6cf45c]
/root/330/inst/lib/libgfrpc.so.0(rpc_transport_notify+0x130)[0x7f900f6d4db8]
/root/330/inst/lib/glusterfs/3.3.0qa25/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x7f9009794270]
/root/330/inst/lib/glusterfs/3.3.0qa25/rpc-transport/socket.so(socket_event_handler+0x21d)[0x7f90097947f4]
/root/330/inst/lib/libglusterfs.so.0(+0x4d030)[0x7f900f92f030]
/root/330/inst/lib/libglusterfs.so.0(+0x4d253)[0x7f900f92f253]
/root/330/inst/lib/libglusterfs.so.0(event_dispatch+0x88)[0x7f900f92f5de]
/root/330/inst/sbin/glusterfs(main+0x238)[0x407dcc]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x340ae1ecdd]
/root/330/inst/sbin/glusterfs[0x403f79]
---------

Comment 2 Saurabh 2012-03-05 11:29:03 UTC
Found the same crash with dbench during sanity runs.

Comment 3 Saurabh 2012-03-05 17:19:13 UTC
NFS sanity keeps hanging and crashing; please fix this bug.

Comment 4 Krishna Srinivas 2012-03-06 10:19:29 UTC

*** This bug has been marked as a duplicate of bug 798222 ***

