Description of problem:
When a share is connected to a Windows client and disconnected multiple times, with md-cache and client-io-thread enabled on the volume, multiple crashes were seen on the server and the other share became inaccessible.

(gdb) bt
#0 0x00007f75f3ef25f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f75f3ef3ce8 in __GI_abort () at abort.c:90
#2 0x00007f75f5853beb in dump_core () at ../source3/lib/dumpcore.c:322
#3 0x00007f75f5846fe7 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:814
#4 0x00007f75f7d3957f in smb_panic (why=why@entry=0x7f75f7d8054a "internal error") at ../lib/util/fault.c:166
#5 0x00007f75f7d39796 in fault_report (sig=<optimized out>) at ../lib/util/fault.c:83
#6 sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7 <signal handler called>
#8 list_del_init (old=0x75) at ../../../../libglusterfs/src/list.h:87
#9 __iot_dequeue (conf=conf@entry=0x7f75a401ea90, pri=pri@entry=0x7f75b0191d6c, sleep=sleep@entry=0x7f75b0191d80) at io-threads.c:126
#10 0x00007f75d4f03727 in iot_worker (data=0x7f75a401ea90) at io-threads.c:199
#11 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b0192700) at pthread_create.c:308
#12 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64

How reproducible:
Tried once

Steps to Reproduce:
1. Connect a Samba share to a Windows client and disconnect it. Repeat this multiple times.
2. Access another share from the client.
3.

Actual results:
Another share is not accessible and there are a few crashes on the server.

Expected results:
There should not be any crashes, and the share should be accessible.

Additional info:
Thread 20 (Thread 0x7f7586ffd700 (LWP 13160)):
#0 0x00007f75f3fb42c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f75dd4da1c0 in event_dispatch_epoll_worker (data=0x7f75a40098a0) at event-epoll.c:664
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f7586ffd700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 19 (Thread 0x7f75d82dd700 (LWP 11901)):
#0 0x00007f75f7f9996d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f75dd48dbb6 in gf_timer_proc (data=0x7f75f88cc480) at timer.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d82dd700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 18 (Thread 0x7f7590ff9700 (LWP 13158)):
#0 0x00007f75f7f9996d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f75dd48dbb6 in gf_timer_proc (data=0x7f75f8f2bda0) at timer.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f7590ff9700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 17 (Thread 0x7f75d78d9700 (LWP 11902)):
#0 0x00007f75f7f93ef7 in pthread_join (threadid=140144095889152, thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1 0x00007f75dd4da768 in event_dispatch_epoll (event_pool=0x7f75f88bb660) at event-epoll.c:758
#2 0x00007f75ddb88c64 in glfs_poller (data=<optimized out>) at glfs.c:612
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d78d9700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 16 (Thread 0x7f75d9696700 (LWP 11900)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f88bec10) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f88bec10) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d9696700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 15 (Thread 0x7f75f8354880 (LWP 11512)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f75ddb89cf3 in glfs_lock (fs=fs@entry=0x7f75f8f2c620) at glfs-internal.h:296
#2 glfs_init_wait (fs=fs@entry=0x7f75f8f2c620) at glfs.c:887
#3 0x00007f75ddb8a1c0 in pub_glfs_init (fs=fs@entry=0x7f75f8f2c620) at glfs.c:997
#4 0x00007f75dddac2c8 in vfs_gluster_connect (handle=0x7f75f8838eb0, service=<optimized out>, user=<optimized out>) at ../source3/modules/vfs_glusterfs.c:237
#5 0x00007f75c40236d4 in connect_acl_xattr (handle=0x7f75f95359f0, service=0x7f75f890edf0 "gluster-vol2", user=<optimized out>) at ../source3/modules/vfs_acl_xattr.c:182
#6 0x00007f75f78edbc0 in make_connection_snum (xconn=0x7f75f882daf0, conn=conn@entry=0x7f75f89214c0, snum=snum@entry=3, pdev=pdev@entry=0x7f75f7a15f1f "???", vuser=0x7f75f8839280, vuser=0x7f75f8839280) at ../source3/smbd/service.c:678
#7 0x00007f75f78ee961 in make_connection_smb2 (req=req@entry=0x7f75f882ee40, tcon=0x7f75f8f2ba80, snum=snum@entry=3, vuser=0x7f75f8839280, pdev=pdev@entry=0x7f75f7a15f1f "???", pstatus=pstatus@entry=0x7ffe0c3d2660) at ../source3/smbd/service.c:991
#8 0x00007f75f790538f in smbd_smb2_tree_connect (disconnect=0x7f75f882f4cc, out_tree_id=0x7f75f882f4c8, out_maximal_access=0x7f75f882f4c4, out_capabilities=0x7f75f882f4c0, out_share_flags=0x7f75f882f4bc, out_share_type=0x7f75f882f4b8 "", in_path=<optimized out>, req=0x7f75f882ee40) at ../source3/smbd/smb2_tcon.c:308
#9 smbd_smb2_tree_connect_send (in_path=<optimized out>, smb2req=0x7f75f882ee40, ev=0x7f75f880f030, mem_ctx=0x7f75f882ee40) at ../source3/smbd/smb2_tcon.c:412
#10 smbd_smb2_request_process_tcon (req=req@entry=0x7f75f882ee40) at ../source3/smbd/smb2_tcon.c:93
#11 0x00007f75f78fe0d3 in smbd_smb2_request_dispatch (req=req@entry=0x7f75f882ee40) at ../source3/smbd/smb2_server.c:2564
#12 0x00007f75f78ff8f2 in smbd_smb2_io_handler (fde_flags=<optimized out>, xconn=0x7f75f882daf0) at ../source3/smbd/smb2_server.c:3861
#13 smbd_smb2_connection_handler (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/smb2_server.c:3899
#14 0x00007f75f585c39c in run_events_poll (ev=0x7f75f880f030, pollrtn=<optimized out>, pfds=0x7f75f882c760, num_pfds=5) at ../source3/lib/events.c:257
#15 0x00007f75f585c5f0 in s3_event_loop_once (ev=0x7f75f880f030, location=<optimized out>) at ../source3/lib/events.c:326
#16 0x00007f75f428340d in _tevent_loop_once (ev=ev@entry=0x7f75f880f030, location=location@entry=0x7f75f7a36a80 "../source3/smbd/process.c:4117") at ../tevent.c:533
#17 0x00007f75f42835ab in tevent_common_loop_wait (ev=0x7f75f880f030, location=0x7f75f7a36a80 "../source3/smbd/process.c:4117") at ../tevent.c:637
#18 0x00007f75f78ec651 in smbd_process (ev_ctx=ev_ctx@entry=0x7f75f880f030, msg_ctx=msg_ctx@entry=0x7f75f880f120, sock_fd=sock_fd@entry=39, interactive=interactive@entry=false) at ../source3/smbd/process.c:4117
#19 0x00007f75f83d7304 in smbd_accept_connection (ev=0x7f75f880f030, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:762
#20 0x00007f75f585c39c in run_events_poll (ev=0x7f75f880f030, pollrtn=<optimized out>, pfds=0x7f75f882c760, num_pfds=7) at ../source3/lib/events.c:257
#21 0x00007f75f585c5f0 in s3_event_loop_once (ev=0x7f75f880f030, location=<optimized out>) at ../source3/lib/events.c:326
#22 0x00007f75f428340d in _tevent_loop_once (ev=ev@entry=0x7f75f880f030, location=location@entry=0x7f75f83da776 "../source3/smbd/server.c:1127") at ../tevent.c:533
#23 0x00007f75f42835ab in tevent_common_loop_wait (ev=0x7f75f880f030, location=0x7f75f83da776 "../source3/smbd/server.c:1127") at ../tevent.c:637
#24 0x00007f75f83d2ad4 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f75f880f030) at ../source3/smbd/server.c:1127
#25 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1780

Thread 14 (Thread 0x7f75d9e97700 (LWP 11899)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f88be850) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f88be850) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d9e97700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 13 (Thread 0x7f75dbe50700 (LWP 11898)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f8878890) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f8878890) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75dbe50700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 12 (Thread 0x7f75d70d8700 (LWP 11903)):
#0 0x00007f75f3fb42c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f75dd4da1c0 in event_dispatch_epoll_worker (data=0x7f75d0000920) at event-epoll.c:664
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d70d8700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 11 (Thread 0x7f75d41c6700 (LWP 11908)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75d4f036f3 in iot_worker (data=0x7f75c8029640) at io-threads.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75d41c6700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 10 (Thread 0x7f75dc651700 (LWP 11897)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f88784d0) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f88784d0) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75dc651700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 9 (Thread 0x7f75c70b0700 (LWP 11909)):
#0 0x00007f75f3fb42c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f75dd4da1c0 in event_dispatch_epoll_worker (data=0x7f75c8062be0) at event-epoll.c:664
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75c70b0700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 8 (Thread 0x7f75b1475700 (LWP 13017)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75d4f036f3 in iot_worker (data=0x7f75ada29ed0) at io-threads.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b1475700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 7 (Thread 0x7f75877fe700 (LWP 13159)):
#0 0x00007f75f7f93ef7 in pthread_join (threadid=140142752814848, thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1 0x00007f75dd4da768 in event_dispatch_epoll (event_pool=0x7f75f8946c70) at event-epoll.c:758
#2 0x00007f75ddb88c64 in glfs_poller (data=<optimized out>) at glfs.c:612
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f75877fe700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 6 (Thread 0x7f7587fff700 (LWP 13157)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f8953e00) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f8953e00) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f7587fff700) at pthread_create.c:308
#4 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 5 (Thread 0x7f7591ffb700 (LWP 13156)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75dd4b8d98 in syncenv_task (proc=proc@entry=0x7f75f8953a40) at syncop.c:603
#2 0x00007f75dd4b9be0 in syncenv_processor (thdata=0x7f75f8953a40) at syncop.c:695
#3 0x00007f75f7f92dc5 in start_thread (arg=0x7f7591ffb700) at pthread_create.c:308

Thread 4 (Thread 0x7f75b1576700 (LWP 12763)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75d4f036f3 in iot_worker (data=0x7f75a46d5650) at io-threads.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b1576700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7f75b1677700 (LWP 12614)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75d4f036f3 in iot_worker (data=0x7f759c91ecf0) at io-threads.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b1677700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 2 (Thread 0x7f75b1778700 (LWP 12357)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f75d4f036f3 in iot_worker (data=0x7f75ac91ecf0) at io-threads.c:176
#2 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b1778700) at pthread_create.c:308
#3 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 1 (Thread 0x7f75b0192700 (LWP 12052)):
#0 0x00007f75f3ef25f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f75f3ef3ce8 in __GI_abort () at abort.c:90
#2 0x00007f75f5853beb in dump_core () at ../source3/lib/dumpcore.c:322
#3 0x00007f75f5846fe7 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:814
#4 0x00007f75f7d3957f in smb_panic (why=why@entry=0x7f75f7d8054a "internal error") at ../lib/util/fault.c:166
#5 0x00007f75f7d39796 in fault_report (sig=<optimized out>) at ../lib/util/fault.c:83
#6 sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7 <signal handler called>
#8 list_del_init (old=0x75) at ../../../../libglusterfs/src/list.h:87
#9 __iot_dequeue (conf=conf@entry=0x7f75a401ea90, pri=pri@entry=0x7f75b0191d6c, sleep=sleep@entry=0x7f75b0191d80) at io-threads.c:126
#10 0x00007f75d4f03727 in iot_worker (data=0x7f75a401ea90) at io-threads.c:199
#11 0x00007f75f7f92dc5 in start_thread (arg=0x7f75b0192700) at pthread_create.c:308
#12 0x00007f75f3fb3ced in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
I see that there are io-threads on the client side. The crash is related to the client io-threads.
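For anyone reading frame #8 of the crashing thread: a list_del_init()-style unlink has to dereference the node to reach its neighbours, so a bogus node pointer such as old=0x75 faults on the first access. Below is a minimal sketch of that pattern. It is modeled on the usual kernel-style intrusive list, not copied from libglusterfs/src/list.h, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive list node; an empty list is a head that
 * points at itself in both directions. */
struct demo_list {
    struct demo_list *next;
    struct demo_list *prev;
};

void demo_list_init(struct demo_list *head)
{
    head->next = head;
    head->prev = head;
}

int demo_list_is_empty(const struct demo_list *head)
{
    return head->next == head;
}

void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
    node->next = head;
    node->prev = head->prev;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node and leave it self-linked. The very first statement reads
 * node->prev, so a garbage node pointer (for example one loaded from
 * freed or overwritten memory) crashes right here. */
void demo_list_del_init(struct demo_list *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node;
    node->prev = node;
}

/* Dequeue the oldest entry, or return NULL when the queue is empty.
 * This is only safe while the head itself is still valid memory. */
struct demo_list *demo_queue_pop(struct demo_list *head)
{
    struct demo_list *first;

    if (demo_list_is_empty(head))
        return NULL;
    first = head->next;
    demo_list_del_init(first);
    return first;
}
```

The emptiness check cannot protect against a corrupted head: if head->next holds garbage, the check passes and the unlink is attempted on the garbage address.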
Pranith/Ravi - could you check this crash?
https://code.engineering.redhat.com/gerrit/#/c/87972/ is the fix, which is already merged. I think we are waiting for Surabhi to update the status of this issue. If Samba does glfs_init() and glfs_fini() on connect/disconnect, it should be the same issue. We already have confirmation that the related bug https://bugzilla.redhat.com/show_bug.cgi?id=1382065 is fixed with the io-threads patch.
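To illustrate the class of problem that an init/fini cycle per connect/disconnect can expose: if worker threads are still running when the state they poll is freed, a later dequeue can follow a garbage pointer. This thread does not spell out the root cause or the contents of the patch, so the following is only a sketch of one safe teardown order (signal, join, then free) using plain pthreads. It does not use libgfapi or io-threads code, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_WORKERS 4

/* Hypothetical pool state, standing in for per-connection worker config. */
struct demo_pool {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;  /* queued work items */
    int             done;     /* items processed so far */
    int             down;     /* set once teardown has started */
    pthread_t       threads[DEMO_WORKERS];
};

static void *demo_worker(void *arg)
{
    struct demo_pool *pool = arg;

    pthread_mutex_lock(&pool->lock);
    for (;;) {
        while (pool->pending == 0 && !pool->down)
            pthread_cond_wait(&pool->cond, &pool->lock);
        if (pool->pending == 0)
            break;              /* teardown requested and queue drained */
        pool->pending--;
        pool->done++;
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

struct demo_pool *demo_pool_init(void)
{
    struct demo_pool *pool = calloc(1, sizeof(*pool));
    int i;

    if (pool == NULL)
        return NULL;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->cond, NULL);
    for (i = 0; i < DEMO_WORKERS; i++) {
        if (pthread_create(&pool->threads[i], NULL, demo_worker, pool) != 0)
            abort();            /* keep the sketch simple on failure */
    }
    return pool;
}

void demo_pool_submit(struct demo_pool *pool, int items)
{
    assert(pool != NULL);
    pthread_mutex_lock(&pool->lock);
    pool->pending += items;
    pthread_cond_broadcast(&pool->cond);
    pthread_mutex_unlock(&pool->lock);
}

/* Teardown: tell workers to stop, wait for every one of them to exit,
 * and only then free the state they were using. Returns items processed. */
int demo_pool_fini(struct demo_pool *pool)
{
    int i, done;

    pthread_mutex_lock(&pool->lock);
    pool->down = 1;
    pthread_cond_broadcast(&pool->cond);
    pthread_mutex_unlock(&pool->lock);

    for (i = 0; i < DEMO_WORKERS; i++)
        pthread_join(pool->threads[i], NULL);

    done = pool->done;
    pthread_cond_destroy(&pool->cond);
    pthread_mutex_destroy(&pool->lock);
    free(pool);
    return done;
}

/* Mimics repeated connect/disconnect: one init/fini pair per cycle. */
int demo_connect_disconnect_cycles(int cycles, int items_per_cycle)
{
    int total = 0, i;

    for (i = 0; i < cycles; i++) {
        struct demo_pool *pool = demo_pool_init();

        if (pool == NULL)
            return -1;
        demo_pool_submit(pool, items_per_cycle);
        total += demo_pool_fini(pool);
    }
    return total;
}
```

Skipping the join step in demo_pool_fini() would leave workers waking up on memory that has already been freed, which is the kind of failure the sketch is meant to contrast with.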
Surabhi, please reopen the bug if you find an io-threads crash even after the fix. So far, with nfs-ganesha and with Samba mount/umount in a loop, things looked good.
Pranith
Tried the test with the latest builds using the following steps, and the crash is not seen: connect a share to a Windows client and disconnect it multiple times, with md-cache and client-io-thread enabled on the volume. As no crashes are seen with client-io-thread, moving the BZ to verified with build: glusterfs-3.8.4-3.el7rhgs.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html