Bug 804489

Summary: nfs server crashed when unwinding the frames
Product: [Community] GlusterFS
Component: nfs
Version: mainline
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Assignee: Vinayaga Raman <vraman>
CC: ccoleman, gluster-bugs, rwheeler
Doc Type: Bug Fix
Last Closed: 2012-03-19 08:16:16 EDT
Attachments: nfs server log (flags: none)

Description Shwetha Panduranga 2012-03-19 00:53:29 EDT
Created attachment 571000: nfs server log

Description of problem:
(gdb) bt
#0  0x000000350f232885 in raise () from /lib64/libc.so.6
#1  0x000000350f234065 in abort () from /lib64/libc.so.6
#2  0x000000350f22b9fe in __assert_fail_base () from /lib64/libc.so.6
#3  0x000000350f22bac0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fcb618cf08f in __gf_free (free_ptr=0x7fcb18002cc0) at mem-pool.c:278
#5  0x00007fcb61676b73 in saved_frames_destroy (frames=0x7fcb18002cc0) at rpc-clnt.c:407
#6  0x00007fcb61679699 in rpc_clnt_destroy (rpc=0x7fcb20000fd0) at rpc-clnt.c:1578
#7  0x00007fcb6167975c in rpc_clnt_unref (rpc=0x7fcb20000fd0) at rpc-clnt.c:1604
#8  0x00007fcb5ccd4706 in nlm_set_rpc_clnt (rpc_clnt=0x7fcb18000fd0, caller_name=0x25c3b70 "APP-CLIENT1") at nlm4.c:319
#9  0x00007fcb5ccd6985 in nlm4_establish_callback (csarg=0x7fcb572948b4) at nlm4.c:945
#10 0x000000350f6077f1 in start_thread () from /lib64/libpthread.so.0
#11 0x000000350f2e570d in clone () from /lib64/libc.so.6
(gdb) bt full
#0  0x000000350f232885 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x000000350f234065 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x000000350f22b9fe in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x000000350f22bac0 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007fcb618cf08f in __gf_free (free_ptr=0x7fcb18002cc0) at mem-pool.c:278
        req_size = 0
        ptr = 0x7fcb18002cb4 ""
        type = 0
        xl = 0x0
        __PRETTY_FUNCTION__ = "__gf_free"
#5  0x00007fcb61676b73 in saved_frames_destroy (frames=0x7fcb18002cc0) at rpc-clnt.c:407
No locals.
#6  0x00007fcb61679699 in rpc_clnt_destroy (rpc=0x7fcb20000fd0) at rpc-clnt.c:1578
No locals.
#7  0x00007fcb6167975c in rpc_clnt_unref (rpc=0x7fcb20000fd0) at rpc-clnt.c:1604
        count = 0
#8  0x00007fcb5ccd4706 in nlm_set_rpc_clnt (rpc_clnt=0x7fcb18000fd0, caller_name=0x25c3b70 "APP-CLIENT1") at nlm4.c:319
        nlmclnt = 0x25c3fc0
        nlmclnt_found = 1
        ret = 0
        rpc_clnt_old = 0x7fcb20000fd0
        old_name = 0x7fcb200029f0 "APP-CLIENT1"
        __FUNCTION__ = "nlm_set_rpc_clnt"
#9  0x00007fcb5ccd6985 in nlm4_establish_callback (csarg=0x7fcb572948b4) at nlm4.c:945
        cs = 0x7fcb572948b4
        sa = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}
        sockaddr = 0x7fcb2bffede0
        options = 0x24f7d30
        peerip = "192.168.2.34", '\000' <repeats 34 times>
        portstr = 0x7fcb18000aa0 "33878"
        rpc_clnt = 0x7fcb18000fd0
        port = 33878
        ret = -1
        caller_name = 0x25c3b70 "APP-CLIENT1"
        __FUNCTION__ = "nlm4_establish_callback"
        cm = 0x7fcb18002920
#10 0x000000350f6077f1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#11 0x000000350f2e570d in clone () from /lib64/libc.so.6
No symbol table info available.


Version-Release number of selected component (if applicable):
3.3.0qa29

How reproducible:


Steps to Reproduce:
1. Create a replicate volume (1x3).
2. Create NFS and FUSE mounts of the volume.
3. Start writes from both the NFS and FUSE mounts.
4. Bring down a brick.
5. Bring the brick back online.
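
The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the volume name, host names, brick path and mount points are placeholders that do not come from this report, and the report does not say how the brick was taken down or restarted. Killing the brick's glusterfsd process and running "gluster volume start <volume> force" is one common way to do it and is assumed here. The NFS mount uses NFSv3 with locking left enabled, since the backtrace goes through the NLM code. The script only builds and prints the commands; it does not run them.

```python
import shlex


def repro_commands(volume="repvol",
                   hosts=("server1", "server2", "server3"),
                   brick_dir="/bricks/repvol",
                   fuse_mnt="/mnt/fuse",
                   nfs_mnt="/mnt/nfs"):
    """Return (step, argv) pairs mirroring the five steps in the report.

    Every name and path is a hypothetical placeholder, not a value
    taken from the report.
    """
    bricks = [f"{host}:{brick_dir}" for host in hosts]
    server = hosts[0]
    return [
        # Step 1: 1x3 replicate volume (one replica set of three bricks).
        ("create", ["gluster", "volume", "create", volume,
                    "replica", str(len(bricks)), *bricks]),
        ("start", ["gluster", "volume", "start", volume]),
        # Step 2: one FUSE mount and one NFSv3 mount of the same volume.
        ("fuse-mount", ["mount", "-t", "glusterfs",
                        f"{server}:/{volume}", fuse_mnt]),
        ("nfs-mount", ["mount", "-t", "nfs", "-o", "vers=3",
                       f"{server}:/{volume}", nfs_mnt]),
        # Step 3: writes from both mounts (dd is an assumed workload).
        ("fuse-write", ["dd", "if=/dev/zero", f"of={fuse_mnt}/fuse.dat",
                        "bs=1M", "count=1024"]),
        ("nfs-write", ["dd", "if=/dev/zero", f"of={nfs_mnt}/nfs.dat",
                       "bs=1M", "count=1024"]),
        # Step 4 (assumed method): kill the brick process on one server.
        ("brick-down", ["pkill", "-f", f"glusterfsd.*{brick_dir}"]),
        # Step 5 (assumed method): restart any brick process that is down.
        ("brick-up", ["gluster", "volume", "start", volume, "force"]),
    ]


if __name__ == "__main__":
    for step, argv in repro_commands():
        print(f"{step:>10}: {shlex.join(argv)}")
```

Steps 4 and 5 are presumably meant to happen while the writes from step 3 are still in progress, so in practice the two dd commands would run in the background on the client machines.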
  
Actual results:
The NFS server crashed.

Expected results:


Additional info:

[2012-03-19 15:35:53.257246] E [rpc-clnt.c:382:saved_frames_unwind] (-->/usr/local/lib/libgfrpc.so.0(+0x13676) [0x7fcb61679676] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x155) [0x7fcb61677121] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7fcb61676b67]))) 0-NLM-client: forced unwinding frame type(NLMv4) op(GRANTED(5)) called at 2012-03-19 15:35:53.191691 (xid=0x11x)
[2012-03-19 15:35:53.257365] E [socket.c:1966:socket_disconnect] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_unref+0x6f) [0x7fcb6167975c] (-->/usr/local/lib/libgfrpc.so.0(+0x13656) [0x7fcb61679656] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_disconnect+0x88) [0x7fcb61673a7c]))) 0-socket: invalid argument: this->private
[2012-03-19 15:35:53.257395] I [mem-pool.c:585:mem_pool_destroy] 0-nfs-server: size=2236 max=2 total=11
[2012-03-19 15:35:53.257417] I [mem-pool.c:585:mem_pool_destroy] 0-nfs-server: size=124 max=2 total=11
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 2012-03-19 15:35:53
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa29
/lib64/libc.so.6[0x350f232900]
/lib64/libc.so.6(gsignal+0x35)[0x350f232885]
/lib64/libc.so.6(abort+0x175)[0x350f234065]
/lib64/libc.so.6[0x350f22b9fe]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x350f22bac0]
/usr/local/lib/libglusterfs.so.0(__gf_free+0xa3)[0x7fcb618cf08f]
/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0x2b)[0x7fcb61676b73]
/usr/local/lib/libgfrpc.so.0(+0x13699)[0x7fcb61679699]
/usr/local/lib/libgfrpc.so.0(rpc_clnt_unref+0x6f)[0x7fcb6167975c]
/usr/local/lib/glusterfs/3.3.0qa29/xlator/nfs/server.so(nlm_set_rpc_clnt+0x221)[0x7fcb5ccd4706]
/usr/local/lib/glusterfs/3.3.0qa29/xlator/nfs/server.so(nlm4_establish_callback+0x5a0)[0x7fcb5ccd6985]

/lib64/libpthread.so.0[0x350f6077f1]
/lib64/libc.so.6(clone+0x6d)[0x350f2e570d]

Comment 1 Shwetha Panduranga 2012-03-19 08:16:16 EDT

*** This bug has been marked as a duplicate of bug 802403 ***

Comment 2 Clayton Coleman 2012-03-23 18:47:51 EDT

*** Bug 806025 has been marked as a duplicate of this bug. ***