Bug 1708926 - Invalid memory access while executing cleanup_and_exit
Summary: Invalid memory access while executing cleanup_and_exit
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1716626
 
Reported: 2019-05-11 17:57 UTC by Mohammed Rafi KC
Modified: 2019-06-03 19:22 UTC
2 users

Fixed In Version:
Clone Of:
Clones: 1716626
Environment:
Last Closed: 2019-05-31 11:28:15 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 22709 0 None Merged glusterfsd/cleanup: Protect graph object under a lock 2019-05-31 11:28:13 UTC
Gluster.org Gerrit 22743 0 None Merged afr/frame: Destroy frame after afr_selfheal_entry_granular 2019-05-21 11:37:11 UTC

Description Mohammed Rafi KC 2019-05-11 17:57:17 UTC
Description of problem:

When executing cleanup_and_exit, the shd daemon crashed. This is because a parallel graph-free thread might be executing another cleanup at the same time, leading to an invalid memory access.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. run ./tests/bugs/glusterd/reset-brick-and-daemons-follow-quorum.t in a loop
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Worker Ant 2019-05-11 17:59:31 UTC
REVIEW: https://review.gluster.org/22709 (glusterfsd/cleanup: Protect graph object under a lock) posted (#1) for review on master by Mohammed Rafi KC

Comment 2 Pranith Kumar K 2019-05-14 07:09:23 UTC
Rafi,
      Could you share the bt of the core so that it is easier to understand why exactly it crashed?

Pranith

Comment 3 Mohammed Rafi KC 2019-05-14 16:01:36 UTC
          Stack trace of thread 30877:
                #0  0x0000000000406a07 cleanup_and_exit (glusterfsd)
                #1  0x0000000000406b5d glusterfs_sigwaiter (glusterfsd)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30879:
                #0  0x00007f51000d3a7a futex_abstimed_wait_cancelable (libpthread.so.0)
                #1  0x00007f51003b8616 syncenv_task (libglusterfs.so.0)
                #2  0x00007f51003b9240 syncenv_processor (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30881:
                #0  0x00007f50ffd14cdf __GI___select (libc.so.6)
                #1  0x00007f51003ef1cd runner (libglusterfs.so.0)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30880:
                #0  0x00007f51000d3a7a futex_abstimed_wait_cancelable (libpthread.so.0)
                #1  0x00007f51003b8616 syncenv_task (libglusterfs.so.0)
                #2  0x00007f51003b9240 syncenv_processor (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30876:
                #0  0x00007f51000d7500 __GI___nanosleep (libpthread.so.0)
                #1  0x00007f510038a346 gf_timer_proc (libglusterfs.so.0)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30882:
                #0  0x00007f50ffd1e06e epoll_ctl (libc.so.6)
                #1  0x00007f51003d931e event_handled_epoll (libglusterfs.so.0)
                #2  0x00007f50eed9a781 socket_event_poll_in (socket.so)
                #3  0x00007f51003d8c9b event_dispatch_epoll_handler (libglusterfs.so.0)
                #4  0x00007f51000cd58e start_thread (libpthread.so.0)
                #5  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30875:
                #0  0x00007f51000cea6d __GI___pthread_timedjoin_ex (libpthread.so.0)
                #1  0x00007f51003d8387 event_dispatch_epoll (libglusterfs.so.0)
                #2  0x0000000000406592 main (glusterfsd)
                #3  0x00007f50ffc44413 __libc_start_main (libc.so.6)
                #4  0x00000000004067de _start (glusterfsd)
                
                Stack trace of thread 30878:
                #0  0x00007f50ffce97f8 __GI___nanosleep (libc.so.6)
                #1  0x00007f50ffce96fe __sleep (libc.so.6)
                #2  0x00007f51003a4f5a pool_sweeper (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)
                
                Stack trace of thread 30883:
                #0  0x00007f51000d6b8d __lll_lock_wait (libpthread.so.0)
                #1  0x00007f51000cfda9 __GI___pthread_mutex_lock (libpthread.so.0)
                #2  0x00007f510037cd1f _gf_msg_plain_internal (libglusterfs.so.0)
                #3  0x00007f510037ceb3 _gf_msg_plain (libglusterfs.so.0)
                #4  0x00007f5100382d43 gf_log_dump_graph (libglusterfs.so.0)
                #5  0x00007f51003b514f glusterfs_process_svc_attach_volfp (libglusterfs.so.0)
                #6  0x000000000040b16d mgmt_process_volfile (glusterfsd)
                #7  0x0000000000410792 mgmt_getspec_cbk (glusterfsd)
                #8  0x00007f51003256b1 rpc_clnt_handle_reply (libgfrpc.so.0)
                #9  0x00007f5100325a53 rpc_clnt_notify (libgfrpc.so.0)
                #10 0x00007f5100322973 rpc_transport_notify (libgfrpc.so.0)
                #11 0x00007f50eed9a45c socket_event_poll_in (socket.so)
                #12 0x00007f51003d8c9b event_dispatch_epoll_handler (libglusterfs.so.0)
                #13 0x00007f51000cd58e start_thread (libpthread.so.0)
                #14 0x00007f50ffd1d683 __clone (libc.so.6)

Comment 4 Pranith Kumar K 2019-05-15 05:34:33 UTC
(In reply to Mohammed Rafi KC from comment #3)
> [stack traces from comment #3 snipped]

Was graph->active NULL? What led to the crash?

Comment 5 Worker Ant 2019-05-17 18:08:44 UTC
REVIEW: https://review.gluster.org/22743 (afr/frame: Destroy frame after afr_selfheal_entry_granular) posted (#1) for review on master by Mohammed Rafi KC

Comment 6 Worker Ant 2019-05-21 11:37:12 UTC
REVIEW: https://review.gluster.org/22743 (afr/frame: Destroy frame after afr_selfheal_entry_granular) merged (#3) on master by Pranith Kumar Karampuri

Comment 7 Worker Ant 2019-05-31 11:28:15 UTC
REVIEW: https://review.gluster.org/22709 (glusterfsd/cleanup: Protect graph object under a lock) merged (#10) on master by Amar Tumballi

