Bug 770536

Summary: [glusterfs-3.3.0qa18]: nfs server crashed since frame->local was NULL
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED WORKSFORME QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: mainline CC: gluster-bugs, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-11 10:54:46 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Raghavendra Bhat 2011-12-27 09:54:01 UTC
Description of problem:

The NFS server crashed in afr_readdirp_cbk while dereferencing local (taken from frame->local), which was NULL.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:
Backtrace from the generated core:

Core was generated by `/usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ffbf9531e2c in afr_readdirp_cbk (frame=0x7ffbfb1081cc, cookie=0x1, this=0xb3b700, op_ret=3, op_errno=2, entries=0x7ffff5217050)
    at ../../../../../xlators/cluster/afr/src/afr-dir-read.c:534
534             fresh_children = local->fresh_children;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x00007ffbf9531e2c in afr_readdirp_cbk (frame=0x7ffbfb1081cc, cookie=0x1, this=0xb3b700, op_ret=3, op_errno=2, entries=0x7ffff5217050)
    at ../../../../../xlators/cluster/afr/src/afr-dir-read.c:534
#1  0x00007ffbf97bc3df in client3_1_readdirp_cbk (req=0x7ffbf5e01c2c, iov=0x7ffbf5e01c6c, count=1, myframe=0x7ffbfb12b62c)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1955
#2  0x00007ffbfc2a57a0 in rpc_clnt_handle_reply (clnt=0xb54bb0, pollin=0xce4d60) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789
#3  0x00007ffbfc2a5b27 in rpc_clnt_notify (trans=0xb54ee0, mydata=0xb54be0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0xce4d60)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:908
#4  0x00007ffbfc2a1d04 in rpc_transport_notify (this=0xb54ee0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0xce4d60)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#5  0x00007ffbf6a9723d in socket_event_poll_in (this=0xb54ee0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675
#6  0x00007ffbf6a977c1 in socket_event_handler (fd=14, idx=6, data=0xb54ee0, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790
#7  0x00007ffbfc4f6808 in event_dispatch_epoll_handler (event_pool=0xb30b80, events=0xb5a8a0, i=0) at ../../../libglusterfs/src/event.c:794
#8  0x00007ffbfc4f6a2b in event_dispatch_epoll (event_pool=0xb30b80) at ../../../libglusterfs/src/event.c:856
#9  0x00007ffbfc4f6db6 in event_dispatch (event_pool=0xb30b80) at ../../../libglusterfs/src/event.c:956
#10 0x0000000000407abe in main (argc=7, argv=0x7ffff5217688) at ../../../glusterfsd/src/glusterfsd.c:1601
(gdb) f 0
#0  0x00007ffbf9531e2c in afr_readdirp_cbk (frame=0x7ffbfb1081cc, cookie=0x1, this=0xb3b700, op_ret=3, op_errno=2, entries=0x7ffff5217050)
    at ../../../../../xlators/cluster/afr/src/afr-dir-read.c:534
534             fresh_children = local->fresh_children;
(gdb) p local
$1 = (afr_local_t *) 0x0
(gdb) l
529
530             local = frame->local;
531
532             read_child = (long) cookie;
533             last_index = &local->cont.readdir.last_index;
534             fresh_children = local->fresh_children;
535
536             /* the value of the last_index changes if afr_next_call_child is
537              * called. So to find the call_child of this callback use last_index
538              * before the next_call_child call.
(gdb) p frame->local
$2 = (void *) 0x0
(gdb) p *frame
$3 = {root = 0x7ffbfae891d0, parent = 0x7ffbfb126434, next = 0x7ffbfb11205c, prev = 0x7ffbfae89258, local = 0x0, this = 0xb3b700, 
  ret = 0x7ffbf92faf19 <dht_readdirp_cbk>, ref_count = 0, lock = 1, cookie = 0x7ffbfb1081cc, complete = _gf_false, op = GF_FOP_NULL, 
  begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7ffbf931bee0 "dht_readdirp_cbk", 
  wind_to = 0x7ffbf931b5ae "next_subvol->fops->readdirp", unwind_from = 0x0, unwind_to = 0x7ffbf931b5ca "dht_readdirp_cbk"}
(gdb)  info thr
  5 Thread 0x7ffbf6a49700 (LWP 12866)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x7ffbfa3dd700 (LWP 12858)  0x00000034c2a0b3cc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 0x7ffbf5307700 (LWP 12867)  0x00000034c2a0ecbd in nanosleep () from /lib64/libpthread.so.0
  2 Thread 0x7ffbfadde700 (LWP 12857)  0x00000034c2a0f235 in sigwait () from /lib64/libpthread.so.0
* 1 Thread 0x7ffbfc06f700 (LWP 12856)  0x00007ffbf9531e2c in afr_readdirp_cbk (frame=0x7ffbfb1081cc, cookie=0x1, this=0xb3b700, op_ret=3, 
    op_errno=2, entries=0x7ffff5217050) at ../../../../../xlators/cluster/afr/src/afr-dir-read.c:534
(gdb) 




Expected results:


Additional info:
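
For illustration only: the fault at afr-dir-read.c:534 is the first actual load through local, since line 533 only computes a member address. Below is a minimal sketch of the defensive pattern, assuming simplified stand-in types. The names demo_frame_t, demo_local_t and demo_readdirp_cbk are hypothetical and are not the real GlusterFS definitions. This is not the AFR code or a proposed patch. A guard like this would only avoid the segfault; it would not explain why frame->local was NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for afr_local_t; only the fields the sketch needs. */
typedef struct {
    int  last_index;
    int *fresh_children;
} demo_local_t;

/* Hypothetical stand-in for call_frame_t; the real struct has many more fields. */
typedef struct {
    void *local;
} demo_frame_t;

/*
 * Sketch of a callback that validates frame->local before using it.
 * Returns 0 on success, or -1 with *op_errno set to EINVAL when the
 * frame or its local is missing.
 */
int demo_readdirp_cbk(demo_frame_t *frame, int *op_errno)
{
    demo_local_t *local = (frame != NULL) ? frame->local : NULL;

    if (local == NULL) {
        /* Fail the operation instead of dereferencing a NULL pointer. */
        *op_errno = EINVAL;
        return -1;
    }

    /* Safe from here on: both accesses go through a non-NULL local. */
    int *last_index     = &local->last_index;
    int *fresh_children = local->fresh_children;

    (void) fresh_children;  /* unused in this sketch */
    *last_index += 1;       /* placeholder for the real per-child bookkeeping */

    return 0;
}
```

Calling demo_readdirp_cbk with a frame whose local is NULL returns -1 and sets the errno value, rather than crashing as in the core above.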

Comment 1 Pranith Kumar K 2011-12-29 05:24:07 UTC
Johnny, can you provide steps to reproduce the bug? Logs would be helpful too.

Comment 2 Amar Tumballi 2012-03-12 09:47:01 UTC
Please update these bugs with respect to 3.3.0qa27; they need to be worked on as per the target milestone set.

Comment 3 Vijay Bellur 2012-03-28 15:04:51 UTC
Not reproducible right now.

Comment 4 Raghavendra Bhat 2012-04-20 06:31:34 UTC
Not observed in recent times. Fine to close it. Will reopen if seen again.

Comment 5 Vijay Bellur 2012-05-04 12:08:10 UTC
Removing target milestone as it is no longer reproducible.

Comment 6 Pranith Kumar K 2012-06-11 10:54:46 UTC
I was never able to re-create the problem. Feel free to reopen the bug if you can reproduce it.