+++ This bug was initially created as a clone of Bug #1410701 +++
+++ This bug was initially created as a clone of Bug #1409563 +++

Description of problem:
On an SSL-enabled setup, when multiple mounts and unmounts are performed, the share hangs on the Windows client.

Version-Release number of selected component (if applicable):
samba-client-libs-4.4.6-4.el7rhgs.x86_6
glusterfs-client-xlators-3.8.4-10.el7rhgs.x86_64
Windows 10

How reproducible:
Always

Steps to Reproduce:
1. Set up a 4-node SSL-enabled Gluster cluster with CTDB and Samba.
2. Run a script that mounts and unmounts the volume share using the public IP (VIP) in a loop.
3. Observe the share; it will hang.
4. Check the pstack output of the process ID.

Actual results:
Mount/share hangs.

Expected results:
Mount/share should not hang.

Additional info:

Thread 9 (Thread 0x7f361f44f700 (LWP 7627)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2 0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3 0x00007f363b9fddc5 in start_thread (arg=0x7f361f44f700) at pthread_create.c:308
#4 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 8 (Thread 0x7f361ec4e700 (LWP 7628)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2 0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3 0x00007f363b9fddc5 in start_thread (arg=0x7f361ec4e700) at pthread_create.c:308
#4 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 7 (Thread 0x7f361cc35700 (LWP 7629)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2 0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3 0x00007f363b9fddc5 in start_thread (arg=0x7f361cc35700) at pthread_create.c:308
#4 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 6 (Thread 0x7f361c434700 (LWP 7630)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1 0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2 0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3 0x00007f363b9fddc5 in start_thread (arg=0x7f361c434700) at pthread_create.c:308
#4 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 5 (Thread 0x7f361b07b700 (LWP 7631)):
#0 0x00007f363ba04bdd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f36202ebd06 in gf_timer_proc () from /lib64/libglusterfs.so.0
#2 0x00007f363b9fddc5 in start_thread (arg=0x7f361b07b700) at pthread_create.c:308
#3 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 4 (Thread 0x7f361a677700 (LWP 7632)):
#0 0x00007f363b9feef7 in pthread_join (threadid=139870339557120, thread_return=0x0) at pthread_join.c:92
#1 0x00007f3620338ad8 in event_dispatch_epoll () from /lib64/libglusterfs.so.0
#2 0x00007f36209e6fd4 in glfs_poller () from /lib64/libgfapi.so.0
#3 0x00007f363b9fddc5 in start_thread (arg=0x7f361a677700) at pthread_create.c:308
#4 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7f3619e76700 (LWP 7633)):
#0 0x00007f3637a20d13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f3620338530 in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#2 0x00007f363b9fddc5 in start_thread (arg=0x7f3619e76700) at pthread_create.c:308
#3 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 2 (Thread 0x7f360ac93700 (LWP 7752)):
#0 0x00007f3637a20d13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f3620338530 in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#2 0x00007f363b9fddc5 in start_thread (arg=0x7f360ac93700) at pthread_create.c:308
#3 0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 1 (Thread 0x7f363bdbd8c0 (LWP 7626)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f363b9ffd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007f363b9ffc08 in __GI___pthread_mutex_lock (mutex=0x7f360c059650) at pthread_mutex_lock.c:64
#3 0x00007f36194615d1 in socket_poller_mayday () from /usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#4 0x00007f362034c9c8 in _gf_ref_put () from /lib64/libglusterfs.so.0
#5 0x00007f3619461f35 in socket_disconnect () from /usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#6 0x00007f36207d416e in rpc_clnt_disable () from /lib64/libgfrpc.so.0
#7 0x00007f3619215f1e in notify () from /usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so
#8 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#9 0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#10 0x00007f360b9dc858 in notify () from /usr/lib64/glusterfs/3.8.4/xlator/features/snapview-client.so
#11 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#12 0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#13 0x00007f360b7c225a in notify () from /usr/lib64/glusterfs/3.8.4/xlator/debug/io-stats.so
#14 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#15 0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#16 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#17 0x00007f36209e88dd in glfs_fini () from /lib64/libgfapi.so.0
#18 0x00007f3620c0b1a6 in glfs_clear_preopened (fs=0x7f363cdac7b0) at ../source3/modules/vfs_glusterfs.c:153
#19 vfs_gluster_disconnect (handle=<optimized out>) at ../source3/modules/vfs_glusterfs.c:374
#20 0x00007f363b35a271 in close_cnum (conn=0x7f363cdb3450, vuid=2319896939) at ../source3/smbd/service.c:1154
#21 0x00007f363b388894 in smbXsrv_tcon_disconnect (tcon=0x7f363cda61c0, vuid=2319896939) at ../source3/smbd/smbXsrv_tcon.c:983
#22 0x00007f363b36fcaf in smbd_smb2_tdis_wait_done (subreq=0x7f363cdb67d0) at ../source3/smbd/smb2_tcon.c:631
#23 0x00007f3637ceec34 in tevent_common_loop_immediate (ev=ev@entry=0x7f363cd83da0) at ../tevent_immediate.c:135
#24 0x00007f36392c72ac in run_events_poll (ev=0x7f363cd83da0, pollrtn=0, pfds=0x0, num_pfds=0) at ../source3/lib/events.c:192
#25 0x00007f36392c7594 in s3_event_loop_once (ev=0x7f363cd83da0, location=<optimized out>) at ../source3/lib/events.c:303
#26 0x00007f3637cee40d in _tevent_loop_once (ev=ev@entry=0x7f363cd83da0, location=location@entry=0x7f363b4a1ce0 "../source3/smbd/process.c:4117") at ../tevent.c:533
#27 0x00007f3637cee5ab in tevent_common_loop_wait (ev=0x7f363cd83da0, location=0x7f363b4a1ce0 "../source3/smbd/process.c:4117") at ../tevent.c:637
#28 0x00007f363b3577b1 in smbd_process (ev_ctx=ev_ctx@entry=0x7f363cd83da0, msg_ctx=msg_ctx@entry=0x7f363cd83e90, sock_fd=sock_fd@entry=39, interactive=interactive@entry=false) at ../source3/smbd/process.c:4117
#29 0x00007f363be40304 in smbd_accept_connection (ev=0x7f363cd83da0, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:762
#30 0x00007f36392c73dc in run_events_poll (ev=0x7f363cd83da0, pollrtn=<optimized out>, pfds=0x7f363cd9c9c0, num_pfds=7) at ../source3/lib/events.c:257
#31 0x00007f36392c7630 in s3_event_loop_once (ev=0x7f363cd83da0, location=<optimized out>) at ../source3/lib/events.c:326
#32 0x00007f3637cee40d in _tevent_loop_once (ev=ev@entry=0x7f363cd83da0, location=location@entry=0x7f363be43776 "../source3/smbd/server.c:1127") at ../tevent.c:533
#33 0x00007f3637cee5ab in tevent_common_loop_wait (ev=0x7f363cd83da0, location=0x7f363be43776 "../source3/smbd/server.c:1127") at ../tevent.c:637
#34 0x00007f363be3bad4 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f363cd83da0) at ../source3/smbd/server.c:1127
#35 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1780

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-01-02 08:30:13 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'. If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Vivek Das on 2017-01-02 08:37:06 EST ---

Sosreports & samba logs: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1409563/

--- Additional comment from Rejy M Cyriac on 2017-01-04 08:16:57 EST ---

At the 'RHGS 3.2.0 - Blocker Bug Triage' meeting on 04 January, it was decided that this BZ is ACCEPTED AS BLOCKER for the RHGS 3.2.0 release.

--- Additional comment from Worker Ant on 2017-01-06 01:44:51 EST ---

REVIEW: http://review.gluster.org/16343 (socket: GF_REF_PUT should be called outside lock) posted (#1) for review on master by Rajesh Joseph (rjoseph)

--- Additional comment from Worker Ant on 2017-02-06 00:02:05 EST ---

REVIEW: https://review.gluster.org/16343 (socket: GF_REF_PUT should be called outside lock) posted (#2) for review on master by Rajesh Joseph (rjoseph)

--- Additional comment from Worker Ant on 2017-02-06 06:13:50 EST ---

COMMIT: https://review.gluster.org/16343 committed in master by Raghavendra G (rgowdapp)

------

commit b3188c61d248526a070b1b18df1ea1d181b349d6
Author: Rajesh Joseph <rjoseph>
Date:   Thu Jan 5 23:58:21 2017 +0530

    socket: GF_REF_PUT should be called outside lock

    GF_REF_PUT was called inside a lock, which can call socket_poller_mayday,
    which in turn tries to take the same lock. This can lead to a deadlock.
    BUG: 1410701
    Change-Id: Ib3b161bcfeac810bd3593dc04c10ef984f996b17
    Signed-off-by: Rajesh Joseph <rjoseph>
    Reviewed-on: https://review.gluster.org/16343
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
REVIEW: https://review.gluster.org/16548 (socket: GF_REF_PUT should be called outside lock) posted (#1) for review on release-3.10 by Atin Mukherjee (amukherj)
COMMIT: https://review.gluster.org/16548 committed in release-3.10 by Shyamsundar Ranganathan (srangana)

------

commit 2e176d46b574af2672688410393ba20a9ad72acf
Author: Rajesh Joseph <rjoseph>
Date:   Thu Jan 5 23:58:21 2017 +0530

    socket: GF_REF_PUT should be called outside lock

    GF_REF_PUT was called inside a lock, which can call socket_poller_mayday,
    which in turn tries to take the same lock. This can lead to a deadlock.

    >Reviewed-on: https://review.gluster.org/16343
    >Reviewed-by: Raghavendra G <rgowdapp>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >Smoke: Gluster Build System <jenkins.org>
    >NetBSD-regression: NetBSD Build System <jenkins.org>

    BUG: 1419503
    Change-Id: Ib3b161bcfeac810bd3593dc04c10ef984f996b17
    Signed-off-by: Rajesh Joseph <rjoseph>
    Reviewed-on: https://review.gluster.org/16548
    Tested-by: Atin Mukherjee <amukherj>
    Reviewed-by: Raghavendra G <rgowdapp>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/