Bug 1386177
| Summary: | SMB[md-cache]:While multiple connect and disconnect of samba share hang is seen and other share becomes inaccessible | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | surabhi <sbhaloth> |
| Component: | md-cache | Assignee: | Poornima G <pgurusid> |
| Status: | CLOSED ERRATA | QA Contact: | Vivek Das <vdas> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.2 | CC: | amukherj, madam, pgurusid, rhinduja, rhs-bugs, rjoseph, sbhaloth |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-6 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-23 06:11:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1351528 | | |
| 
 
        
Description: surabhi, 2016-10-18 10:54:52 UTC

I see that the main thread is hung on the synctask thread. Could you please provide the bt of all the threads in the smbd process that is hung?

    # pstack 19338

    Thread 14 (Thread 0x7f74811e2700 (LWP 19341)):
    #0 0x00007f749d5d5a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f7482049d98 in syncenv_task (proc=proc@entry=0x7f749e1b43d0) at syncop.c:603
    #2 0x00007f748204abe0 in syncenv_processor (thdata=0x7f749e1b43d0) at syncop.c:695
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 13 (Thread 0x7f74809e1700 (LWP 19342)):
    #0 0x00007f749d5d5a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f7482049d98 in syncenv_task (proc=proc@entry=0x7f749e1b4790) at syncop.c:603
    #2 0x00007f748204abe0 in syncenv_processor (thdata=0x7f749e1b4790) at syncop.c:695
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 12 (Thread 0x7f747ea28700 (LWP 19343)):
    #0 0x00007f749d5d5a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f7482049d98 in syncenv_task (proc=proc@entry=0x7f749e1fa0a0) at syncop.c:603
    #2 0x00007f748204abe0 in syncenv_processor (thdata=0x7f749e1fa0a0) at syncop.c:695
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 11 (Thread 0x7f747e227700 (LWP 19344)):
    #0 0x00007f749d5d5a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f7482049d98 in syncenv_task (proc=proc@entry=0x7f749e1fa460) at syncop.c:603
    #2 0x00007f748204abe0 in syncenv_processor (thdata=0x7f749e1fa460) at syncop.c:695
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 10 (Thread 0x7f747ce6e700 (LWP 19345)):
    #0 0x00007f749d5d896d in nanosleep () from /lib64/libpthread.so.0
    #1 0x00007f748201ebb6 in gf_timer_proc (data=0x7f749e207cd0) at timer.c:176
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 9 (Thread 0x7f747c46a700 (LWP 19346)):
    #0 0x00007f749d5d2ef7 in pthread_join () from /lib64/libpthread.so.0
    #1 0x00007f748206b768 in event_dispatch_epoll (event_pool=0x7f749e1f6eb0) at event-epoll.c:758
    #2 0x00007f7482719c64 in glfs_poller (data=<optimized out>) at glfs.c:612
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 8 (Thread 0x7f747bc69700 (LWP 19347)):
    #0 0x00007f74995f32c3 in epoll_wait () from /lib64/libc.so.6
    #1 0x00007f748206b1c0 in event_dispatch_epoll_worker (data=0x7f7474000920) at event-epoll.c:664
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 7 (Thread 0x7f746bad4700 (LWP 19356)):
    #0 0x00007f74995f32c3 in epoll_wait () from /lib64/libc.so.6
    #1 0x00007f748206b1c0 in event_dispatch_epoll_worker (data=0x7f746c05e240) at event-epoll.c:664
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 6 (Thread 0x7f7459a17700 (LWP 19815)):
    #0 0x00007f749d5d5a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f7482049d98 in syncenv_task (proc=proc@entry=0x7f749e878e00) at syncop.c:603
    #2 0x00007f748204abe0 in syncenv_processor (thdata=0x7f749e878e00) at syncop.c:695
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 5 (Thread 0x7f744bad4700 (LWP 19816)):
    #0 0x00007f749d5d896d in nanosleep () from /lib64/libpthread.so.0
    #1 0x00007f748201ebb6 in gf_timer_proc (data=0x7f749e860150) at timer.c:176
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 4 (Thread 0x7f7458916700 (LWP 19817)):
    #0 0x00007f749d5d2ef7 in pthread_join () from /lib64/libpthread.so.0
    #1 0x00007f748206b768 in event_dispatch_epoll (event_pool=0x7f749e875850) at event-epoll.c:758
    #2 0x00007f7482719c64 in glfs_poller (data=<optimized out>) at glfs.c:612
    #3 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #4 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 3 (Thread 0x7f744b2d3700 (LWP 19818)):
    #0 0x00007f74995f32c3 in epoll_wait () from /lib64/libc.so.6
    #1 0x00007f748206b1c0 in event_dispatch_epoll_worker (data=0x7f7450000920) at event-epoll.c:664
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 2 (Thread 0x7f744aad2700 (LWP 19837)):
    #0 0x00007f74995f32c3 in epoll_wait () from /lib64/libc.so.6
    #1 0x00007f748206b1c0 in event_dispatch_epoll_worker (data=0x7f74559a31d0) at event-epoll.c:664
    #2 0x00007f749d5d1dc5 in start_thread () from /lib64/libpthread.so.0
    #3 0x00007f74995f2ced in clone () from /lib64/libc.so.6

    Thread 1 (Thread 0x7f749d993880 (LWP 19338)):
    #0 0x00007f749d5d56d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1 0x00007f748204ac3b in syncenv_destroy (env=0x7f749e878a40) at syncop.c:779
    #2 0x00007f748271b605 in pub_glfs_fini (fs=fs@entry=0x7f749e248b10) at glfs.c:1215
    #3 0x00007f748293d076 in glfs_clear_preopened (fs=0x7f749e248b10) at ../source3/modules/vfs_glusterfs.c:153
    #4 vfs_gluster_disconnect (handle=<optimized out>) at ../source3/modules/vfs_glusterfs.c:270
    #5 0x00007f749cf2e111 in close_cnum (conn=0x7f749e16d050, vuid=3120287205) at ../source3/smbd/service.c:1154
    #6 0x00007f749cf5c714 in smbXsrv_tcon_disconnect (tcon=0x7f749e87cf90, vuid=3120287205) at ../source3/smbd/smbXsrv_tcon.c:983
    #7 0x00007f749cf43b1f in smbd_smb2_tdis_wait_done (subreq=0x7f749e243db0) at ../source3/smbd/smb2_tcon.c:631
    #8 0x00007f74998c2c34 in tevent_common_loop_immediate () from /lib64/libtevent.so.0
    #9 0x00007f749ae9b26c in run_events_poll (ev=0x7f749e14a030, pollrtn=0, pfds=0x0, num_pfds=0) at ../source3/lib/events.c:192
    #10 0x00007f749ae9b554 in s3_event_loop_once (ev=0x7f749e14a030, location=<optimized out>) at ../source3/lib/events.c:303
    #11 0x00007f74998c240d in _tevent_loop_once () from /lib64/libtevent.so.0
    #12 0x00007f74998c25ab in tevent_common_loop_wait () from /lib64/libtevent.so.0
    #13 0x00007f749cf2b651 in smbd_process (ev_ctx=ev_ctx@entry=0x7f749e14a030, msg_ctx=msg_ctx@entry=0x7f749e14a120, sock_fd=sock_fd@entry=39, interactive=interactive@entry=false) at ../source3/smbd/process.c:4117
    #14 0x00007f749da16304 in smbd_accept_connection (ev=0x7f749e14a030, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:762
    #15 0x00007f749ae9b39c in run_events_poll (ev=0x7f749e14a030, pollrtn=<optimized out>, pfds=0x7f749e158c10, num_pfds=7) at ../source3/lib/events.c:257
    #16 0x00007f749ae9b5f0 in s3_event_loop_once (ev=0x7f749e14a030, location=<optimized out>) at ../source3/lib/events.c:326
    #17 0x00007f74998c240d in _tevent_loop_once () from /lib64/libtevent.so.0
    #18 0x00007f74998c25ab in tevent_common_loop_wait () from /lib64/libtevent.so.0
    #19 0x00007f749da11ad4 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f749e14a030) at ../source3/smbd/server.c:1127
    #20 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1780

Fix posted upstream: http://review.gluster.org/#/c/15764/2

Patch is available upstream --> POST

Poornima - given we have identified the fix, can we devel_ack this BZ for 3.2.0?

Upstream master: http://review.gluster.org/15764

Fix posted:
Downstream 3.2: https://code.engineering.redhat.com/gerrit/#/c/90692/
Master: http://review.gluster.org/#/c/15764/
3.9: http://review.gluster.org/#/c/15890/

Versions
---------
glusterfs-server-3.8.4-8.el7rhgs.x86_64
samba-client-4.4.6-2.el7rhgs.x86_64

Not reproducible with the steps below.

Steps to Reproduce:
1. Create 2 Samba shares and map them on a Windows client.
2. Run a script that connects and disconnects share 1 around 30 times.
3. After 1 or 2 runs, observe the script output.
4. Try to access the second share.

Marking it as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html  |
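The connect/disconnect loop from the reproduction steps above could be sketched roughly as follows. This is not the script used in the report, which is not included in this bug; the function names (`run_cmd`, `cycle_share`), the 60-second timeout, and the idea of treating a timed-out command as a likely hang are all assumptions made for illustration. The commands to connect and disconnect the share are passed in by the caller.

```python
import subprocess
from typing import Callable, List, Sequence


def run_cmd(cmd: Sequence[str], timeout_s: float) -> bool:
    """Run one command; return True on exit code 0, False on failure or timeout."""
    try:
        result = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
        return result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # A timeout is treated as a probable hang (assumption of this sketch).
        return False


def cycle_share(
    connect_cmd: Sequence[str],
    disconnect_cmd: Sequence[str],
    iterations: int = 30,
    timeout_s: float = 60.0,
    runner: Callable[[Sequence[str], float], bool] = run_cmd,
) -> List[int]:
    """Connect and disconnect a share repeatedly.

    Returns the 1-based iteration numbers in which either step failed
    or timed out.
    """
    failures: List[int] = []
    for i in range(1, iterations + 1):
        if not runner(connect_cmd, timeout_s):
            failures.append(i)
            continue  # nothing to disconnect if the connect step failed
        if not runner(disconnect_cmd, timeout_s):
            failures.append(i)
    return failures
```

On a Windows client the commands might be, for example, `["net", "use", "Z:", r"\\server\share1"]` and `["net", "use", "Z:", "/delete"]`, where the server name, share name, and drive letter are placeholders. A non-empty result from `cycle_share(...)` would be the cue to check whether the second share is still accessible.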