Description of problem: ===================== was performing some non-functional tests for testing rpc fixes and found that one of the fuse clients crashed as below [2019-03-01 13:37:02.398653] I [dict.c:471:dict_get] (-->/usr/lib64/glusterfs/3.12.2/xlator/cluster/replicate.so(+0x6228d) [0x7f5147de128d] -->/usr/lib64/glusterfs/3.12.2/xlator/cluster/distribute.so(+0x202f7) [0x7f5147afc2f7] -->/lib64/libglusterfs.so.0(dict_get+0x10c) [0x7f515a5b2d3c] ) 13-dict: !this || key=trusted.glusterfs.dht.mds [Invalid argument] [2019-03-01 13:37:02.711187] W [inode.c:197:__is_dentry_hashed] (-->/lib64/libglusterfs.so.0(__inode_path+0x68) [0x7f515a5cd1c8] -->/lib64/libglusterfs.so.0(+0x3add4) [0x7f515a5cadd4] -->/lib64/libglusterfs.so.0(+0x3ad7e) [0x7f515a5cad7e] ) 0-fuse: dentry not found pending frames: frame : type(1) op(UNLINK) frame : type(1) op(UNLINK) frame : type(1) op(READDIRP) frame : type(1) op(OPEN) frame : type(1) op(STAT) frame : type(0) op(0) frame : type(0) op(0) patchset: git://git.gluster.org/glusterfs.git signal received: 11 time of crash: 2019-03-01 13:37:02 configuration details: argp 1 backtrace 1 dlfcn 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3.12.2 /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7f515a5bbb9d] /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f515a5c6114] /lib64/libc.so.6(+0x36280)[0x7f5158bf8280] /lib64/libglusterfs.so.0(+0x3adc2)[0x7f515a5cadc2] /lib64/libglusterfs.so.0(__inode_path+0x68)[0x7f515a5cd1c8] /lib64/libglusterfs.so.0(inode_path+0x31)[0x7f515a5cd551] /lib64/libglusterfs.so.0(loc_touchup+0x7a)[0x7f515a5b9dba] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x6f8b)[0x7f515196df8b] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x7665)[0x7f515196e665] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x7bbd)[0x7f515196ebbd] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x7c8e)[0x7f515196ec8e] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x7cd0)[0x7f515196ecd0] /usr/lib64/glusterfs/3.12.2/xlator/mount/fuse.so(+0x1f702)[0x7f5151986702] /lib64/libpthread.so.0(+0x7dd5)[0x7f51593f7dd5] /lib64/libc.so.6(clone+0x6d)[0x7f5158cbfead] --------- warning: core file may not match specified executable file. Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done. done. Missing separate debuginfo for Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/73/3d1c681cfbd8bbeb11e8b7f80876a9aed6bb74 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `/usr/sbin/glusterfs --volfile-server=rhs-client19.lab.eng.blr.redhat.com --volf'. Program terminated with signal 11, Segmentation fault. #0 __dentry_search_arbit (inode=inode@entry=0x7f50ec000e98) at inode.c:1450 1450 list_for_each_entry (trav, &inode->dentry_list, inode_list) { Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.3.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libuuid-2.23.2-59.el7.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64 (gdb) bt #0 __dentry_search_arbit (inode=inode@entry=0x7f50ec000e98) at inode.c:1450 #1 0x00007f515a5cd1c8 in __inode_path (inode=inode@entry=0x7f50dc01dfc8, name=name@entry=0x0, bufp=bufp@entry=0x7f5145ed6d30) at inode.c:1551 #2 0x00007f515a5cd551 in inode_path (inode=0x7f50dc01dfc8, name=name@entry=0x0, bufp=bufp@entry=0x7f5145ed6d30) at inode.c:1642 #3 0x00007f515a5b9dba in loc_touchup (loc=0x7f5069ee43c0, name=<optimized out>) at xlator.c:880 #4 0x00007f515196df8b in fuse_resolve_loc_touchup (state=0x7f5069ee43a0) at fuse-resolve.c:33 #5 fuse_resolve_continue (state=0x7f5069ee43a0) at fuse-resolve.c:704 #6 0x00007f515196e665 in fuse_resolve_inode (state=0x7f5069ee43a0) at fuse-resolve.c:364 #7 0x00007f515196ebbd in fuse_resolve (state=0x7f5069ee43a0) at fuse-resolve.c:651 #8 0x00007f515196ec8e in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:679 #9 0x00007f515196ecd0 in fuse_resolve_and_resume (state=0x7f5069ee43a0, fn=0x7f5151972c10 <fuse_getattr_resume>) at fuse-resolve.c:718 #10 0x00007f5151986702 in fuse_thread_proc (data=0x563689a35f10) at fuse-bridge.c:5781 #11 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #12 0x00007f5158cbfead in clone () from /lib64/libc.so.6 (gdb) t a a bt Thread 11 (Thread 0x7f515aaa6780 (LWP 4229)): #0 0x00007f51593f8f47 in pthread_join () from /lib64/libpthread.so.0 #1 0x00007f515a61af78 in event_dispatch_epoll (event_pool=0x563689a2e250) at event-epoll.c:846 #2 0x0000563687bed538 in main (argc=4, argv=<optimized out>) at glusterfsd.c:2692 Thread 10 (Thread 0x7f51456d6700 (LWP 4257)): #0 0x00007f51593fb965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f515197039b in timed_response_loop (data=0x563689a35f10) at fuse-bridge.c:4660 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 9 (Thread 0x7f514cc97700 (LWP 4251)): #0 0x00007f5158cc0483 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f515a61a6e8 in event_dispatch_epoll_worker (data=0x563689a8bd60) at event-epoll.c:749 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 8 (Thread 0x7f514d498700 (LWP 4250)): #0 0x00007f5158cc0483 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f515a61a6e8 in event_dispatch_epoll_worker (data=0x563689a8ba90) at event-epoll.c:749 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 7 (Thread 0x7f5150163700 (LWP 4234)): #0 0x00007f51593fbd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f515a5f7dd8 in syncenv_task (proc=proc@entry=0x563689a49b20) at syncop.c:603 #2 0x00007f515a5f8ca0 in syncenv_processor (thdata=0x563689a49b20) at syncop.c:695 #3 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f5151165700 (LWP 4231)): #0 0x00007f51593ff361 in sigwait () from /lib64/libpthread.so.0 #1 0x0000563687bf0c7b in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2242 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f51322ff700 (LWP 22247)): #0 0x00007f51593fbd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f515a5f7dd8 in syncenv_task (proc=proc@entry=0x563689a4a2a0) at syncop.c:603 #2 0x00007f515a5f8ca0 in syncenv_processor (thdata=0x563689a4a2a0) at syncop.c:695 #3 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f5150964700 (LWP 4232)): #0 0x00007f5158c86e2d in nanosleep () from /lib64/libc.so.6 #1 0x00007f5158c86cc4 in sleep () from /lib64/libc.so.6 #2 0x00007f515a5e4b9d in pool_sweeper (arg=<optimized out>) at mem-pool.c:470 #3 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x7f5144ed5700 (LWP 4258)): #0 0x00007f51593fb965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f51519708d3 in notify_kernel_loop (data=<optimized out>) at fuse-bridge.c:4584 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f5151966700 (LWP 4230)): #0 0x00007f51593fee3d in nanosleep () from /lib64/libpthread.so.0 #1 0x00007f515a5c9f86 in gf_timer_proc (data=0x563689a49300) at timer.c:174 #2 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f5158cbfead in clone () from /lib64/libc.so.6 ---Type <return> to continue, or q <return> to quit--- Thread 1 (Thread 0x7f5145ed7700 (LWP 4256)): #0 __dentry_search_arbit (inode=inode@entry=0x7f50ec000e98) at inode.c:1450 #1 0x00007f515a5cd1c8 in __inode_path (inode=inode@entry=0x7f50dc01dfc8, name=name@entry=0x0, bufp=bufp@entry=0x7f5145ed6d30) at inode.c:1551 #2 0x00007f515a5cd551 in inode_path (inode=0x7f50dc01dfc8, name=name@entry=0x0, bufp=bufp@entry=0x7f5145ed6d30) at inode.c:1642 #3 0x00007f515a5b9dba in loc_touchup (loc=0x7f5069ee43c0, name=<optimized out>) at xlator.c:880 #4 0x00007f515196df8b in fuse_resolve_loc_touchup (state=0x7f5069ee43a0) at fuse-resolve.c:33 #5 fuse_resolve_continue (state=0x7f5069ee43a0) at fuse-resolve.c:704 #6 0x00007f515196e665 in fuse_resolve_inode (state=0x7f5069ee43a0) at fuse-resolve.c:364 #7 0x00007f515196ebbd in fuse_resolve (state=0x7f5069ee43a0) at fuse-resolve.c:651 #8 0x00007f515196ec8e in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:679 #9 0x00007f515196ecd0 in fuse_resolve_and_resume (state=0x7f5069ee43a0, fn=0x7f5151972c10 <fuse_getattr_resume>) at fuse-resolve.c:718 #10 0x00007f5151986702 in fuse_thread_proc (data=0x563689a35f10) at fuse-bridge.c:5781 #11 0x00007f51593f7dd5 in start_thread () from /lib64/libpthread.so.0 #12 0x00007f5158cbfead in clone () from /lib64/libc.so.6 (gdb) q [root@dhcp35-64 glusterfs]# rpm -qa|grep gluster glusterfs-3.12.2-43.el7.x86_64 glusterfs-libs-3.12.2-43.el7.x86_64 glusterfs-client-xlators-3.12.2-43.el7.x86_64 glusterfs-fuse-3.12.2-43.el7.x86_64 glusterfs-debuginfo-3.12.2-43.el7.x86_64 ################# on another client too hit same crash ########## sing host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `/usr/sbin/glusterfs --acl --volfile-server=rhs-client38.lab.eng.blr.redhat.com'. Program terminated with signal 11, Segmentation fault. #0 __dentry_search_arbit (inode=inode@entry=0x7f0a5002eab8) at inode.c:1450 1450 list_for_each_entry (trav, &inode->dentry_list, inode_list) { Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.12.2-43.el7.x86_64 (gdb) bt #0 __dentry_search_arbit (inode=inode@entry=0x7f0a5002eab8) at inode.c:1450 #1 0x00007f0ac23a01c8 in __inode_path (inode=inode@entry=0x7f0a50009e68, name=name@entry=0x7f0a028f1b50 "dhcp35-37.lab.eng.blr.redhat.com-fresh_top.log", bufp=bufp@entry=0x7f0985ba36e8) at inode.c:1551 #2 0x00007f0ac23a0551 in inode_path (inode=0x7f0a50009e68, name=0x7f0a028f1b50 "dhcp35-37.lab.eng.blr.redhat.com-fresh_top.log", bufp=bufp@entry=0x7f0985ba36e8) at inode.c:1642 #3 0x00007f0ab9740489 in fuse_resolve_entry (state=0x7f0985ba3570) at fuse-resolve.c:99 #4 0x00007f0ab974162d in fuse_resolve_parent (state=state@entry=0x7f0985ba3570) at fuse-resolve.c:312 #5 0x00007f0ab9741998 in fuse_resolve (state=0x7f0985ba3570) at fuse-resolve.c:647 #6 0x00007f0ab9741c8e in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:679 #7 0x00007f0ab9741cd0 in fuse_resolve_and_resume (state=0x7f0985ba3570, fn=0x7f0ab9744e40 <fuse_lookup_resume>) at fuse-resolve.c:718 ---Type <return> to continue, or q <return> to quit--- #8 0x00007f0ab9759702 in fuse_thread_proc (data=0x559b1ebcf0c0) at fuse-bridge.c:5781 #9 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #10 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 (gdb) t a a bt Thread 11 (Thread 0x7f0ac2879780 (LWP 4246)): #0 0x00007f0ac11cbf47 in pthread_join () from /lib64/libpthread.so.0 #1 0x00007f0ac23edf78 in event_dispatch_epoll (event_pool=0x559b1ebc7250) at event-epoll.c:846 #2 0x0000559b1de33538 in main (argc=5, argv=<optimized out>) at glusterfsd.c:2692 Thread 10 (Thread 0x7f0aad2ae700 (LWP 4260)): #0 0x00007f0ac11ce965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f0ab974339b in timed_response_loop (data=0x559b1ebcf0c0) at fuse-bridge.c:4660 #2 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 9 (Thread 0x7f0ab7f36700 (LWP 4251)): ---Type <return> to continue, or q <return> to quit--- #0 0x00007f0ac11ced12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f0ac23cadd8 in syncenv_task (proc=proc@entry=0x559b1ebe2e40) at syncop.c:603 #2 0x00007f0ac23cbca0 in syncenv_processor (thdata=0x559b1ebe2e40) at syncop.c:695 #3 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 8 (Thread 0x7f0ab8737700 (LWP 4250)): #0 0x00007f0ac0a59e2d in nanosleep () from /lib64/libc.so.6 #1 0x00007f0ac0a59cc4 in sleep () from /lib64/libc.so.6 #2 0x00007f0ac23b7b9d in pool_sweeper (arg=<optimized out>) at mem-pool.c:470 #3 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 ---Type <return> to continue, or q <return> to quit--- Thread 7 (Thread 0x7f0ab9739700 (LWP 4248)): #0 0x00007f0ac11d1e3d in nanosleep () from /lib64/libpthread.so.0 #1 0x00007f0ac239cf86 in gf_timer_proc (data=0x559b1ebe2620) at timer.c:174 #2 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f0ab4a6a700 (LWP 4256)): #0 0x00007f0ac11d14ed in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f0ac11ccdcb in _L_lock_883 () from /lib64/libpthread.so.0 #2 0x00007f0ac11ccc98 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f0ac239eee9 in inode_unref (inode=0x7f09850f0d98) at inode.c:668 #4 0x00007f0ac238ca02 in loc_wipe (loc=loc@entry=0x7f09865af178) at xlator.c:768 #5 0x00007f0ab40231ee in client_local_wipe (local=local@entry=0x7f09865af178) at client-helpers.c:127 ---Type <return> to continue, or q <return> to quit--- #6 0x00007f0ab4032f0d in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f098676f668) at client-rpc-fops.c:2872 #7 0x00007f0ac2134a00 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f097ca19cb0, pollin=pollin@entry=0x7f097ccffd50) at rpc-clnt.c:778 #8 0x00007f0ac2134d6b in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f097ca19ce0, event=<optimized out>, data=0x7f097ccffd50) at rpc-clnt.c:971 #9 0x00007f0ac2130ae3 in rpc_transport_notify (this=this@entry=0x7f098e451610, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f097ccffd50) at rpc-transport.c:557 #10 0x00007f0ab6d22586 in socket_event_poll_in (this=this@entry=0x7f098e451610, notify_handled=<optimized out>) at socket.c:2322 #11 0x00007f0ab6d24bca in socket_event_handler (fd=33, idx=26, gen=4, data=0x7f098e451610, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2482 #12 0x00007f0ac23ed870 in event_dispatch_epoll_handler (event=0x7f0ab4a69e70, event_pool=0x559b1ebc7250) ---Type <return> to continue, or q <return> to quit--- at event-epoll.c:643 #13 event_dispatch_epoll_worker (data=0x559b1ec250a0) at event-epoll.c:759 #14 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #15 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f0ab526b700 (LWP 4255)): #0 0x00007f0ac11d14ed in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f0ac11ccdcb in _L_lock_883 () from /lib64/libpthread.so.0 #2 0x00007f0ac11ccc98 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f0ac23a1819 in inode_is_linked (inode=inode@entry=0x7f0a4c03b8f8) at inode.c:2490 #4 0x00007f0aafdd2693 in afr_read_subvol_select_by_policy (inode=inode@entry=0x7f0a4c03b8f8, this=this@entry=0x7f097d715a70, readable=readable@entry=0x7f0ab526a420 "\001\001\001\265\n\177", args=args@entry=0x0) at afr-common.c:1685 ---Type <return> to continue, or q <return> to quit--- #5 0x00007f0aafdd29d6 in afr_read_subvol_get (inode=inode@entry=0x7f0a4c03b8f8, this=0x7f097d715a70, subvol_p=subvol_p@entry=0x0, readables=readables@entry=0x0, event_p=event_p@entry=0x0, type=type@entry=AFR_DATA_TRANSACTION, args=args@entry=0x0) at afr-common.c:1771 #6 0x00007f0aafde1050 in afr_get_parent_read_subvol (readable=0x7f0ab526a540 "\001\001\001|\t\177", replies=<optimized out>, parent=0x7f0a4c03b8f8, this=<optimized out>) at afr-common.c:2167 #7 afr_lookup_done (frame=frame@entry=0x7f098664dcb8, this=this@entry=0x7f097d715a70) at afr-common.c:2319 #8 0x00007f0aafde2058 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f098664dcb8, this=this@entry=0x7f097d715a70) at afr-common.c:2771 #9 0x00007f0aafde2a5b in afr_lookup_entry_heal (frame=frame@entry=0x7f098664dcb8, this=this@entry=0x7f097d715a70) at afr-common.c:2920 #10 0x00007f0aafde2e3d in afr_lookup_cbk (frame=frame@entry=0x7f098664dcb8, cookie=<optimized out>, this=0x7f097d715a70, op_ret=<optimized out>, op_errno=<optimized out>, inode=inode@entry=0x7f09850f0d98, buf=buf@entry=0x7f0ab526a900, xdata=0x7f0993075e88, postparent=postparent@entry=0x7f0ab526a970) ---Type <return> to continue, or q <return> to quit--- at afr-common.c:2968 #11 0x00007f0ab4032efd in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f0985df1b58) at client-rpc-fops.c:2872 #12 0x00007f0ac2134a00 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f097c49ee60, pollin=pollin@entry=0x7f09937ef6c0) at rpc-clnt.c:778 #13 0x00007f0ac2134d6b in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f097c49ee90, event=<optimized out>, data=0x7f09937ef6c0) at rpc-clnt.c:971 #14 0x00007f0ac2130ae3 in rpc_transport_notify (this=this@entry=0x7f097e3ea6a0, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f09937ef6c0) at rpc-transport.c:557 #15 0x00007f0ab6d22586 in socket_event_poll_in (this=this@entry=0x7f097e3ea6a0, notify_handled=<optimized out>) at socket.c:2322 #16 0x00007f0ab6d24bca in socket_event_handler (fd=29, idx=20, gen=7, data=0x7f097e3ea6a0, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000') at socket.c:2482 ---Type <return> to continue, or q <return> to quit--- #17 0x00007f0ac23ed870 in event_dispatch_epoll_handler (event=0x7f0ab526ae70, event_pool=0x559b1ebc7250) at event-epoll.c:643 #18 event_dispatch_epoll_worker (data=0x559b1ec24dd0) at event-epoll.c:759 #19 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #20 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f0ab7735700 (LWP 4252)): #0 0x00007f0ac11ced12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f0ac23cadd8 in syncenv_task (proc=proc@entry=0x559b1ebe3200) at syncop.c:603 #2 0x00007f0ac23cbca0 in syncenv_processor (thdata=0x559b1ebe3200) at syncop.c:695 #3 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 ---Type <return> to continue, or q <return> to quit--- Thread 3 (Thread 0x7f0aacaad700 (LWP 4261)): #0 0x00007f0ac11ce965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f0ab97438d3 in notify_kernel_loop (data=<optimized out>) at fuse-bridge.c:4584 #2 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f0ab8f38700 (LWP 4249)): #0 0x00007f0ac11d2361 in sigwait () from /lib64/libpthread.so.0 #1 0x0000559b1de36c7b in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2242 #2 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f0aadaaf700 (LWP 4259)): ---Type <return> to continue, or q <return> to quit--- #0 __dentry_search_arbit (inode=inode@entry=0x7f0a5002eab8) at inode.c:1450 #1 0x00007f0ac23a01c8 in __inode_path (inode=inode@entry=0x7f0a50009e68, name=name@entry=0x7f0a028f1b50 "dhcp35-37.lab.eng.blr.redhat.com-fresh_top.log", bufp=bufp@entry=0x7f0985ba36e8) at inode.c:1551 #2 0x00007f0ac23a0551 in inode_path (inode=0x7f0a50009e68, name=0x7f0a028f1b50 "dhcp35-37.lab.eng.blr.redhat.com-fresh_top.log", bufp=bufp@entry=0x7f0985ba36e8) at inode.c:1642 #3 0x00007f0ab9740489 in fuse_resolve_entry (state=0x7f0985ba3570) at fuse-resolve.c:99 #4 0x00007f0ab974162d in fuse_resolve_parent (state=state@entry=0x7f0985ba3570) at fuse-resolve.c:312 #5 0x00007f0ab9741998 in fuse_resolve (state=0x7f0985ba3570) at fuse-resolve.c:647 #6 0x00007f0ab9741c8e in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:679 #7 0x00007f0ab9741cd0 in fuse_resolve_and_resume (state=0x7f0985ba3570, fn=0x7f0ab9744e40 <fuse_lookup_resume>) at fuse-resolve.c:718 ---Type <return> to continue, or q <return> to quit--- #8 0x00007f0ab9759702 in fuse_thread_proc (data=0x559b1ebcf0c0) at fuse-bridge.c:5781 #9 0x00007f0ac11cadd5 in start_thread () from /lib64/libpthread.so.0 #10 0x00007f0ac0a92ead in clone () from /lib64/libc.so.6 (gdb) Version-Release number of selected component (if applicable): ================= 3.12.2-43 How reproducible: ================== hit it on 2 clients Steps to Reproduce: ================== was running tests to stress the system for RPC bug fixes more details available @ https://docs.google.com/spreadsheets/d/17Yf9ZRWnWOpbRyFQ2ZYxAAlp9I_yarzKZdjN8idBJM0/edit#gid=1472913705 1. 3x3 volume 2. IOs triggered from 8 different clients both as root and non root user for about a week, with quotas/uss enabled and set after 2 days 3. after about a week, did a add-brick with 3 replica sets to make it 6x3 and triggered rebalance and left it over weekend Actual results: =========== saw the crash on 2 clients
> [2019-03-01 13:37:02.711187] W [inode.c:197:__is_dentry_hashed] (-->/lib64/libglusterfs.so.0(__inode_path+0x68) [0x7f515a5cd1c8] -->/lib64/libglusterfs.so.0(+0x3add4) [0x7f515a5cadd4] -->/lib64/libglusterfs.so.0(+0x3ad7e) [0x7f515a5cad7e] ) 0-fuse: dentry not found > pending frames: > frame : type(1) op(UNLINK) > frame : type(1) op(UNLINK) > frame : type(1) op(READDIRP) ---- (gdb) bt #0 __dentry_search_arbit (inode=inode@entry=0x7f0a5002eab8) at inode.c:1450 (gdb) l 1445 dentry_t *trav = NULL; 1446 1447 if (!inode) 1448 return NULL; 1449 1450 list_for_each_entry (trav, &inode->dentry_list, inode_list) { 1451 if (__is_dentry_hashed (trav)) { 1452 dentry = trav; 1453 break; ---- Looks like we need to see if trav is null, and break the loop. Mainly here, __is_dentry_hashed() has given 0 output, and we still continue to traverse the list. I guess, that should have stopped. Still checking.
We will mark this for release once there is a fix upstream.
cleared needinfo accidentally