Bug 1419503 - [SAMBA-SSL] Volume Share hungs when multiple mount & unmount is performed over a windows client on a SSL enabled cluster
Summary: [SAMBA-SSL] Volume Share hungs when multiple mount & unmount is performed ove...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rpc
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1409563 1410701
Blocks:
Reported: 2017-02-06 11:23 UTC by Atin Mukherjee
Modified: 2017-03-06 17:45 UTC

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1410701
Environment:
Last Closed: 2017-03-06 17:45:31 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Atin Mukherjee 2017-02-06 11:23:13 UTC
+++ This bug was initially created as a clone of Bug #1410701 +++

+++ This bug was initially created as a clone of Bug #1409563 +++

Description of problem:
On an SSL-enabled setup, when the share is mounted and unmounted repeatedly, it hangs on the Windows client.

Version-Release number of selected component (if applicable):
Samba-client-libs-4.4.6-4.el7rhgs.x86_6
glusterfs-client-xlators-3.8.4-10.el7rhgs.x86_64
Windows10

How reproducible:
Always

Steps to Reproduce:
1. Set up a 4-node SSL-enabled Gluster cluster with CTDB and Samba.
2. Run a script that mounts and unmounts the volume share in a loop, using the public IP (VIP).
3. Observe that the share hangs.
4. Check the pstack output for the process ID.

Actual results:
Mount/share hangs

Expected results:
Mount/share should not hang

Additional info:
Thread 9 (Thread 0x7f361f44f700 (LWP 7627)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2  0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3  0x00007f363b9fddc5 in start_thread (arg=0x7f361f44f700) at pthread_create.c:308
#4  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 8 (Thread 0x7f361ec4e700 (LWP 7628)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2  0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3  0x00007f363b9fddc5 in start_thread (arg=0x7f361ec4e700) at pthread_create.c:308
#4  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 7 (Thread 0x7f361cc35700 (LWP 7629)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2  0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3  0x00007f363b9fddc5 in start_thread (arg=0x7f361cc35700) at pthread_create.c:308
#4  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 6 (Thread 0x7f361c434700 (LWP 7630)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f3620317108 in syncenv_task () from /lib64/libglusterfs.so.0
#2  0x00007f3620317f50 in syncenv_processor () from /lib64/libglusterfs.so.0
#3  0x00007f363b9fddc5 in start_thread (arg=0x7f361c434700) at pthread_create.c:308
#4  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 5 (Thread 0x7f361b07b700 (LWP 7631)):
#0  0x00007f363ba04bdd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f36202ebd06 in gf_timer_proc () from /lib64/libglusterfs.so.0
#2  0x00007f363b9fddc5 in start_thread (arg=0x7f361b07b700) at pthread_create.c:308
#3  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 4 (Thread 0x7f361a677700 (LWP 7632)):
#0  0x00007f363b9feef7 in pthread_join (threadid=139870339557120, thread_return=0x0) at pthread_join.c:92
#1  0x00007f3620338ad8 in event_dispatch_epoll () from /lib64/libglusterfs.so.0
#2  0x00007f36209e6fd4 in glfs_poller () from /lib64/libgfapi.so.0
#3  0x00007f363b9fddc5 in start_thread (arg=0x7f361a677700) at pthread_create.c:308
#4  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7f3619e76700 (LWP 7633)):
#0  0x00007f3637a20d13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f3620338530 in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#2  0x00007f363b9fddc5 in start_thread (arg=0x7f3619e76700) at pthread_create.c:308
#3  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 2 (Thread 0x7f360ac93700 (LWP 7752)):
#0  0x00007f3637a20d13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f3620338530 in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#2  0x00007f363b9fddc5 in start_thread (arg=0x7f360ac93700) at pthread_create.c:308
#3  0x00007f3637a2073d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 1 (Thread 0x7f363bdbd8c0 (LWP 7626)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f363b9ffd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f363b9ffc08 in __GI___pthread_mutex_lock (mutex=0x7f360c059650) at pthread_mutex_lock.c:64
#3  0x00007f36194615d1 in socket_poller_mayday () from /usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#4  0x00007f362034c9c8 in _gf_ref_put () from /lib64/libglusterfs.so.0
#5  0x00007f3619461f35 in socket_disconnect () from /usr/lib64/glusterfs/3.8.4/rpc-transport/socket.so
#6  0x00007f36207d416e in rpc_clnt_disable () from /lib64/libgfrpc.so.0
#7  0x00007f3619215f1e in notify () from /usr/lib64/glusterfs/3.8.4/xlator/protocol/client.so
#8  0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#9  0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#10 0x00007f360b9dc858 in notify () from /usr/lib64/glusterfs/3.8.4/xlator/features/snapview-client.so
#11 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#12 0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#13 0x00007f360b7c225a in notify () from /usr/lib64/glusterfs/3.8.4/xlator/debug/io-stats.so
#14 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#15 0x00007f3620375ec7 in default_notify () from /lib64/libglusterfs.so.0
#16 0x00007f36202dc416 in xlator_notify () from /lib64/libglusterfs.so.0
#17 0x00007f36209e88dd in glfs_fini () from /lib64/libgfapi.so.0
#18 0x00007f3620c0b1a6 in glfs_clear_preopened (fs=0x7f363cdac7b0) at ../source3/modules/vfs_glusterfs.c:153
#19 vfs_gluster_disconnect (handle=<optimized out>) at ../source3/modules/vfs_glusterfs.c:374
#20 0x00007f363b35a271 in close_cnum (conn=0x7f363cdb3450, vuid=2319896939) at ../source3/smbd/service.c:1154
#21 0x00007f363b388894 in smbXsrv_tcon_disconnect (tcon=0x7f363cda61c0, vuid=2319896939) at ../source3/smbd/smbXsrv_tcon.c:983
#22 0x00007f363b36fcaf in smbd_smb2_tdis_wait_done (subreq=0x7f363cdb67d0) at ../source3/smbd/smb2_tcon.c:631
#23 0x00007f3637ceec34 in tevent_common_loop_immediate (ev=ev@entry=0x7f363cd83da0) at ../tevent_immediate.c:135
#24 0x00007f36392c72ac in run_events_poll (ev=0x7f363cd83da0, pollrtn=0, pfds=0x0, num_pfds=0) at ../source3/lib/events.c:192
#25 0x00007f36392c7594 in s3_event_loop_once (ev=0x7f363cd83da0, location=<optimized out>) at ../source3/lib/events.c:303
#26 0x00007f3637cee40d in _tevent_loop_once (ev=ev@entry=0x7f363cd83da0, location=location@entry=0x7f363b4a1ce0 "../source3/smbd/process.c:4117") at ../tevent.c:533
#27 0x00007f3637cee5ab in tevent_common_loop_wait (ev=0x7f363cd83da0, location=0x7f363b4a1ce0 "../source3/smbd/process.c:4117") at ../tevent.c:637
#28 0x00007f363b3577b1 in smbd_process (ev_ctx=ev_ctx@entry=0x7f363cd83da0, msg_ctx=msg_ctx@entry=0x7f363cd83e90, sock_fd=sock_fd@entry=39, interactive=interactive@entry=false) at ../source3/smbd/process.c:4117
#29 0x00007f363be40304 in smbd_accept_connection (ev=0x7f363cd83da0, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:762
#30 0x00007f36392c73dc in run_events_poll (ev=0x7f363cd83da0, pollrtn=<optimized out>, pfds=0x7f363cd9c9c0, num_pfds=7) at ../source3/lib/events.c:257
#31 0x00007f36392c7630 in s3_event_loop_once (ev=0x7f363cd83da0, location=<optimized out>) at ../source3/lib/events.c:326
#32 0x00007f3637cee40d in _tevent_loop_once (ev=ev@entry=0x7f363cd83da0, location=location@entry=0x7f363be43776 "../source3/smbd/server.c:1127") at ../tevent.c:533
#33 0x00007f3637cee5ab in tevent_common_loop_wait (ev=0x7f363cd83da0, location=0x7f363be43776 "../source3/smbd/server.c:1127") at ../tevent.c:637
#34 0x00007f363be3bad4 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f363cd83da0) at ../source3/smbd/server.c:1127
#35 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1780

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-01-02 08:30:13 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs‑3.2.0' to '?'. 

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Vivek Das on 2017-01-02 08:37:06 EST ---

Sosreports & samba logs : http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1409563/

--- Additional comment from Rejy M Cyriac on 2017-01-04 08:16:57 EST ---

At the 'RHGS 3.2.0 - Blocker Bug Triage' meeting on 04 January, it was decided that this BZ is being ACCEPTED AS BLOCKER at the RHGS 3.2.0 release

--- Additional comment from Worker Ant on 2017-01-06 01:44:51 EST ---

REVIEW: http://review.gluster.org/16343 (socket: GF_REF_PUT should be called outside lock) posted (#1) for review on master by Rajesh Joseph (rjoseph)

--- Additional comment from Worker Ant on 2017-02-06 00:02:05 EST ---

REVIEW: https://review.gluster.org/16343 (socket: GF_REF_PUT should be called outside lock) posted (#2) for review on master by Rajesh Joseph (rjoseph)

--- Additional comment from Worker Ant on 2017-02-06 06:13:50 EST ---

COMMIT: https://review.gluster.org/16343 committed in master by Raghavendra G (rgowdapp) 
------
commit b3188c61d248526a070b1b18df1ea1d181b349d6
Author: Rajesh Joseph <rjoseph>
Date:   Thu Jan 5 23:58:21 2017 +0530

    socket: GF_REF_PUT should be called outside lock
    
    GF_REF_PUT was called inside lock which can call
    socket_poller_mayday which inturn tries to take the
    same lock. This can lead to deadlock scenario.
    
    BUG: 1410701
    Change-Id: Ib3b161bcfeac810bd3593dc04c10ef984f996b17
    Signed-off-by: Rajesh Joseph <rjoseph>
    Reviewed-on: https://review.gluster.org/16343
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
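
The deadlock described in the commit message can be illustrated with a small stand-alone sketch. This is not the GlusterFS code: struct demo_sock, demo_release, demo_ref_put and the two disconnect_* functions are hypothetical stand-ins for the socket private data, socket_poller_mayday, GF_REF_PUT and socket_disconnect. The sketch assumes an error-checking POSIX mutex so that the self-relock returns EDEADLK instead of hanging the way Thread 1 does in the backtrace.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket private data (not the real struct). */
struct demo_sock {
    pthread_mutex_t lock;
    atomic_int refcount;
    int release_rc; /* result of the lock attempt made by the release callback */
};

/* Stand-in for socket_poller_mayday: runs on the last put and needs the lock. */
static void demo_release(struct demo_sock *s)
{
    s->release_rc = pthread_mutex_lock(&s->lock);
    if (s->release_rc == 0)
        pthread_mutex_unlock(&s->lock);
}

/* Stand-in for GF_REF_PUT: drop one reference, release on the last one. */
static void demo_ref_put(struct demo_sock *s)
{
    if (atomic_fetch_sub(&s->refcount, 1) == 1)
        demo_release(s);
}

static void demo_init(struct demo_sock *s)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* Error-checking type: relocking from the owner thread fails with EDEADLK. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&s->lock, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
    atomic_init(&s->refcount, 1);
    s->release_rc = -1;
}

/* Pattern before the fix: the reference is dropped while the lock is held. */
static int disconnect_put_inside_lock(struct demo_sock *s)
{
    pthread_mutex_lock(&s->lock);
    demo_ref_put(s); /* release callback tries to take s->lock again */
    pthread_mutex_unlock(&s->lock);
    return s->release_rc;
}

/* Pattern after the fix: decide under the lock, put after unlocking. */
static int disconnect_put_outside_lock(struct demo_sock *s)
{
    bool need_put = false;
    pthread_mutex_lock(&s->lock);
    need_put = true; /* state changes that need the lock would go here */
    pthread_mutex_unlock(&s->lock);
    if (need_put)
        demo_ref_put(s);
    return s->release_rc;
}

/* Runs one variant and returns what the release callback saw when locking. */
int run_demo(bool put_outside_lock)
{
    struct demo_sock s;
    demo_init(&s);
    int rc = put_outside_lock ? disconnect_put_outside_lock(&s)
                              : disconnect_put_inside_lock(&s);
    pthread_mutex_destroy(&s.lock);
    return rc;
}
```

Calling run_demo(false) returns EDEADLK because the release callback relocks a mutex its own thread already holds; run_demo(true) returns 0. With a default (non-error-checking) mutex the first case would typically block forever instead, which matches the __lll_lock_wait frame under socket_poller_mayday in the backtrace.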

Comment 1 Worker Ant 2017-02-06 11:24:15 UTC
REVIEW: https://review.gluster.org/16548 (socket: GF_REF_PUT should be called outside lock) posted (#1) for review on release-3.10 by Atin Mukherjee (amukherj)

Comment 2 Worker Ant 2017-02-07 11:49:47 UTC
COMMIT: https://review.gluster.org/16548 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit 2e176d46b574af2672688410393ba20a9ad72acf
Author: Rajesh Joseph <rjoseph>
Date:   Thu Jan 5 23:58:21 2017 +0530

    socket: GF_REF_PUT should be called outside lock
    
    GF_REF_PUT was called inside lock which can call
    socket_poller_mayday which inturn tries to take the
    same lock. This can lead to deadlock scenario.
    
    >Reviewed-on: https://review.gluster.org/16343
    >Reviewed-by: Raghavendra G <rgowdapp>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >Smoke: Gluster Build System <jenkins.org>
    >NetBSD-regression: NetBSD Build System <jenkins.org>
    
    BUG: 1419503
    Change-Id: Ib3b161bcfeac810bd3593dc04c10ef984f996b17
    Signed-off-by: Rajesh Joseph <rjoseph>
    Reviewed-on: https://review.gluster.org/16548
    Tested-by: Atin Mukherjee <amukherj>
    Reviewed-by: Raghavendra G <rgowdapp>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>

Comment 3 Shyamsundar 2017-03-06 17:45:31 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

