Description of problem:
On a three-node cluster (N1, N2, N3), "gluster vol status rep3_3 detail" was being executed continuously on N1 while glusterd was restarted on one of the other nodes (N2 or N3; it is not known which one). glusterd dumped core on both N2 and N3.

Version-Release number of selected component (if applicable):
glusterfs-3.12.2-27.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Form a three-node cluster with brick-mux enabled.
2. Create and start three replica (1X3) volumes.
3. Create 300 more volumes of type replicate (1X3) without starting them.
4. On N1, execute "gluster vol status rep3_3 detail" continuously to check for memory leaks.
5. Restart glusterd on one of the nodes N2/N3 (not sure on which node glusterd was restarted).

Actual results:
glusterd core dumps on two nodes.

Node 2: "bt" and "t a a bt" output
######################################################################
t a a bt

warning: core file may not match specified executable file.
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.
Missing separate debuginfo for
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/16/3c2dc43405427478788bad0afd537a7acf7a13
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f04e815f0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 device-mapper-event-libs-1.02.149-10.el7_6.2.x86_64 device-mapper-libs-1.02.149-10.el7_6.2.x86_64 elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64 glibc-2.17-260.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-59.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libsepol-2.5-10.el7.x86_64 libuuid-2.23.2-59.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 lvm2-libs-2.02.180-10.el7_6.2.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 sssd-client-1.16.2-13.el7.x86_64 systemd-libs-219-62.el7.x86_64 userspace-rcu-0.7.9-2.el7rhgs.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) t a a bt

Thread 9 (Thread 0x7f04eaa67700 (LWP 25849)):
#0  0x00007f04e6f03410 in dm_get_suspended_counter@plt () from /lib64/libdevmapper.so.1.02
#1  0x00007f04e6f0382a in dm_lib_exit () from /lib64/libdevmapper.so.1.02
#2  0x00007f04f3f4efca in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#3  0x00007f04f22ccb69 in __run_exit_handlers () from /lib64/libc.so.6
#4  0x00007f04f22ccbb7 in exit () from /lib64/libc.so.6
#5  0x00005616d710447f in cleanup_and_exit (signum=15) at glusterfsd.c:1423
#6  0x00005616d7104575 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2145
#7  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7f04eb268700 (LWP 25848)):
#0  0x00007f04f2acfe3d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f04f3c77c96 in gf_timer_proc (data=0x5616d74af270) at timer.c:174
#2  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f04f414d780 (LWP 25847)):
#0  0x00007f04f2ac9f47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f04f3cc7e78 in event_dispatch_epoll (event_pool=0x5616d74a7a30) at event-epoll.c:746
#2  0x00005616d7101247 in main (argc=5, argv=<optimized out>) at glusterfsd.c:2550

Thread 6 (Thread 0x7f04e3d86700 (LWP 26041)):
#0  0x00007f04f2acc965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f04e87b19bb in hooks_worker (args=<optimized out>) at glusterd-hooks.c:529
#2  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f04e2984700 (LWP 17711)):
#0  0x00007f04f238b1c9 in syscall () from /lib64/libc.so.6
#1  0x00007f04e815ec14 in call_rcu_thread () from /lib64/liburcu-bp.so.1
#2  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f04e3585700 (LWP 26042)):
#0  0x00007f04f2391483 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f04f3cc7712 in event_dispatch_epoll_worker (data=0x5616d7a79aa0) at event-epoll.c:649
#2  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f04e9a65700 (LWP 25851)):
#0  0x00007f04f2accd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f04f3ca5178 in syncenv_task (proc=proc@entry=0x5616d74afa90) at syncop.c:603
#2  0x00007f04f3ca6040 in syncenv_processor (thdata=0x5616d74afa90) at syncop.c:695
#3  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f04ea266700 (LWP 25850)):
#0  0x00007f04f2357e2d in nanosleep () from /lib64/libc.so.6
#1  0x00007f04f2357cc4 in sleep () from /lib64/libc.so.6
#2  0x00007f04f3c9250d in pool_sweeper (arg=<optimized out>) at mem-pool.c:481
#3  0x00007f04f2ac8dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f04f2390ead in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f04e9264700 (LWP 25852)):
#0  0x00007f04e815f0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007f04e87e3aa7 in glusterd_peerinfo_find_by_uuid (uuid=uuid@entry=0x7f04d5021580 "\030.\272\310$\257K\366\231=\bo\251\347\022;>:S\240p\353G\355\275\322\335q\254\250\315\374\022") at glusterd-peer-utils.c:193
#2  0x00007f04e87da510 in glusterd_handle_mgmt_v3_lock_fn (req=req@entry=0x7f04d4b74ad0) at glusterd-mgmt-handler.c:157
#3  0x00007f04e86f0b7e in glusterd_big_locked_handler (req=0x7f04d4b74ad0, actor_fn=0x7f04e87da430 <glusterd_handle_mgmt_v3_lock_fn>) at glusterd-handler.c:82
#4  0x00007f04f3ca2ba0 in synctask_wrap () at syncop.c:375
#5  0x00007f04f22db010 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()
(gdb)
######################################################################
(gdb) bt
#0  0x00007f04e815f0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007f04e87e3aa7 in glusterd_peerinfo_find_by_uuid (uuid=uuid@entry=0x7f04d5021580 "\030.\272\310$\257K\366\231=\bo\251\347\022;>:S\240p\353G\355\275\322\335q\254\250\315\374\022") at glusterd-peer-utils.c:193
#2  0x00007f04e87da510 in glusterd_handle_mgmt_v3_lock_fn (req=req@entry=0x7f04d4b74ad0) at glusterd-mgmt-handler.c:157
#3  0x00007f04e86f0b7e in glusterd_big_locked_handler (req=0x7f04d4b74ad0, actor_fn=0x7f04e87da430 <glusterd_handle_mgmt_v3_lock_fn>) at glusterd-handler.c:82
#4  0x00007f04f3ca2ba0 in synctask_wrap () at syncop.c:375
#5  0x00007f04f22db010 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()
######################################################################
[2018-11-26 07:09:31.325762] W [MSGID: 106118] [glusterd-handler.c:6458:__glusterd_peer_rpc_notify] 0-management: Lock not released for testvol_99
[2018-11-26 07:09:37.286847] W [glusterfsd.c:1367:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f04f2ac8dd5] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xe5) [0x5616d7104575] -->/usr/sbin/glusterd(cleanup_and_exit+0x6b) [0x5616d71043eb] ) 0-: received signum (15), shutting down
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-11-26 07:09:37
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7f04f3c69dfd]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f04f3c73ec4]
/lib64/libc.so.6(+0x36280)[0x7f04f22c9280]
/lib64/liburcu-bp.so.1(rcu_read_lock_bp+0x2d)[0x7f04e815f0ad]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x116aa7)[0x7f04e87e3aa7]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x10d510)[0x7f04e87da510]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x23b7e)[0x7f04e86f0b7e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f04f3ca2ba0]
/lib64/libc.so.6(+0x48010)[0x7f04f22db010]
---------

######################################################################
Node 3 core dump
######################################################################
t a a bt output

warning: core file may not match specified executable file.
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from /usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.
Missing separate debuginfo for
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/16/3c2dc43405427478788bad0afd537a7acf7a13
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f034298b0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 device-mapper-event-libs-1.02.149-10.el7_6.2.x86_64 device-mapper-libs-1.02.149-10.el7_6.2.x86_64 elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64 glibc-2.17-260.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-59.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libsepol-2.5-10.el7.x86_64 libuuid-2.23.2-59.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 lvm2-libs-2.02.180-10.el7_6.2.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 sssd-client-1.16.2-13.el7.x86_64 systemd-libs-219-62.el7.x86_64 userspace-rcu-0.7.9-2.el7rhgs.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) t a a bt

Thread 9 (Thread 0x7f033df32700 (LWP 4870)):
#0  0x00007f034cbbd483 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f034e4f3712 in event_dispatch_epoll_worker (data=0x55f8535cba30) at event-epoll.c:649
#2  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7f033e733700 (LWP 4869)):
#0  0x00007f034d2f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f0342fdd9bb in hooks_worker (args=<optimized out>) at glusterd-hooks.c:529
#2  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f0344291700 (LWP 4630)):
#0  0x00007f034d2f8d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f034e4d1178 in syncenv_task (proc=proc@entry=0x55f853430a90) at syncop.c:603
#2  0x00007f034e4d2040 in syncenv_processor (thdata=0x55f853430a90) at syncop.c:695
#3  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7f0345293700 (LWP 4628)):
#0  0x00007f0341983880 in __do_global_dtors_aux () from /lib64/libblkid.so.1
#1  0x00007f034e77afca in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#2  0x00007f034caf8b69 in __run_exit_handlers () from /lib64/libc.so.6
#3  0x00007f034caf8bb7 in exit () from /lib64/libc.so.6
#4  0x000055f8518ca47f in cleanup_and_exit (signum=15) at glusterfsd.c:1423
#5  0x000055f8518ca575 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2145
#6  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f0344a92700 (LWP 4629)):
#0  0x00007f034cb83e2d in nanosleep () from /lib64/libc.so.6
#1  0x00007f034cb83cc4 in sleep () from /lib64/libc.so.6
#2  0x00007f034e4be50d in pool_sweeper (arg=<optimized out>) at mem-pool.c:481
#3  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f033d5b0700 (LWP 28689)):
#0  0x00007f034cbb71c9 in syscall () from /lib64/libc.so.6
#1  0x00007f034298ac14 in call_rcu_thread () from /lib64/liburcu-bp.so.1
#2  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f034e979780 (LWP 4626)):
#0  0x00007f034d2f5f47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f034e4f3e78 in event_dispatch_epoll (event_pool=0x55f853428a30) at event-epoll.c:746
#2  0x000055f8518c7247 in main (argc=5, argv=<optimized out>) at glusterfsd.c:2550

Thread 2 (Thread 0x7f0345a94700 (LWP 4627)):
#0  0x00007f034d2fbe3d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f034e4a3c96 in gf_timer_proc (data=0x55f853430270) at timer.c:174
#2  0x00007f034d2f4dd5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f034cbbcead in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f0343a90700 (LWP 4631)):
#0  0x00007f034298b0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007f0342f14a08 in __glusterd_handle_stage_op (req=req@entry=0x7f0330060bb0) at glusterd-handler.c:1062
#2  0x00007f0342f1cb7e in glusterd_big_locked_handler (req=0x7f0330060bb0, actor_fn=0x7f0342f14870 <__glusterd_handle_stage_op>) at glusterd-handler.c:82
#3  0x00007f034e4ceba0 in synctask_wrap () at syncop.c:375
#4  0x00007f034cb07010 in ?? () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
(gdb)
######################################################################
(gdb) bt
#0  0x00007f034298b0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#1  0x00007f0342f14a08 in __glusterd_handle_stage_op (req=req@entry=0x7f0330060bb0) at glusterd-handler.c:1062
#2  0x00007f0342f1cb7e in glusterd_big_locked_handler (req=0x7f0330060bb0, actor_fn=0x7f0342f14870 <__glusterd_handle_stage_op>) at glusterd-handler.c:82
#3  0x00007f034e4ceba0 in synctask_wrap () at syncop.c:375
#4  0x00007f034cb07010 in ?? () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
(gdb) q
######################################################################
[2018-11-26 06:21:50.100207] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe49fa) [0x7f0342fdd9fa] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe44bd) [0x7f0342fdd4bd] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f034e4e5225] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=rep3_2 --first=no --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2018-11-26 07:09:26.653582] W [glusterfsd.c:1367:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f034d2f4dd5] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xe5) [0x55f8518ca575] -->/usr/sbin/glusterd(cleanup_and_exit+0x6b) [0x55f8518ca3eb] ) 0-: received signum (15), shutting down
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2018-11-26 07:09:26
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7f034e495dfd]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f034e49fec4]
/lib64/libc.so.6(+0x36280)[0x7f034caf5280]
/lib64/liburcu-bp.so.1(rcu_read_lock_bp+0x2d)[0x7f034298b0ad]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1ba08)[0x7f0342f14a08]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x23b7e)[0x7f0342f1cb7e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f034e4ceba0]
/lib64/libc.so.6(+0x48010)[0x7f034cb07010]
---------

Expected results:
No crash/core should be generated.

Additional info:
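One pattern can be read directly off the two backtraces: on each node, one thread is inside exit() (reached from cleanup_and_exit with signum=15) and is running exit handlers, while Thread 1, the faulting thread, is in rcu_read_lock_bp(). The Python sketch below is not part of the original report; it shows one way to pull that pattern out of a saved "t a a bt" dump that has one frame per line. The helper names threads_by_function and exiting_threads are hypothetical, and the sample is abridged from the Node 2 output.

```python
import re

# Header of each thread in gdb's "thread apply all bt" output.
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP \d+\)\):")
# Frame line: "#N  0xADDR in function (args) ..."; the address part is optional.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?([^\s(]+) \(")


def threads_by_function(dump):
    """Map gdb thread number -> list of function names, innermost frame first."""
    threads = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def exiting_threads(threads):
    """Return the threads that have libc's exit() somewhere on their stack."""
    return sorted(num for num, frames in threads.items() if "exit" in frames)


# Abridged from the Node 2 dump in this report (hypothetical test input).
SAMPLE = """\
Thread 9 (Thread 0x7f04eaa67700 (LWP 25849)):
#3  0x00007f04f22ccb69 in __run_exit_handlers () from /lib64/libc.so.6
#4  0x00007f04f22ccbb7 in exit () from /lib64/libc.so.6
#5  0x00005616d710447f in cleanup_and_exit (signum=15) at glusterfsd.c:1423
Thread 1 (Thread 0x7f04e9264700 (LWP 25852)):
#0  0x00007f04e815f0ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1
#3  0x00007f04e86f0b7e in glusterd_big_locked_handler (req=0x7f04d4b74ad0, actor_fn=0x7f04e87da430 <glusterd_handle_mgmt_v3_lock_fn>) at glusterd-handler.c:82
"""

threads = threads_by_function(SAMPLE)
print("threads inside exit():", exiting_threads(threads))
print("innermost frame of thread 1:", threads[1][0])
```

The sketch only reports which threads were exiting when the core was taken; it makes no claim about the root cause.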
upstream patch: https://review.gluster.org/#/c/glusterfs/+/21743
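For anyone trying to reproduce this, the steps in the description can be sketched as a generated command list. This is a minimal sketch under stated assumptions, not the reporter's actual script: the hostnames n1, n2, n3, the brick root /bricks, and the testvol_N naming of the 300 unstarted volumes (only testvol_99 appears in the logs) are all hypothetical, and the function build_repro_commands is an illustrative helper that only builds strings and executes nothing.

```python
def build_repro_commands(nodes=("n1", "n2", "n3"), brick_root="/bricks",
                         extra_volumes=300):
    """Return the gluster CLI commands for steps 1-3, to be run on the first node."""
    cmds = []

    # Step 1: form the cluster from the first node and enable brick multiplexing.
    for peer in nodes[1:]:
        cmds.append(f"gluster peer probe {peer}")
    cmds.append("gluster volume set all cluster.brick-multiplex on")

    def create(name):
        # One brick per node; the brick paths are assumptions, not from the report.
        bricks = " ".join(f"{node}:{brick_root}/{name}" for node in nodes)
        return f"gluster volume create {name} replica 3 {bricks}"

    # Step 2: three replica volumes, created and started.
    for i in range(1, 4):
        name = f"rep3_{i}"
        cmds.append(create(name))
        cmds.append(f"gluster volume start {name}")

    # Step 3: additional replica volumes, created but left stopped.
    for i in range(extra_volumes):
        cmds.append(create(f"testvol_{i}"))
    return cmds


# Step 4 (on the first node) and step 5 (on the second or third node) are run by hand.
STATUS_LOOP = "while true; do gluster vol status rep3_3 detail; done"
RESTART_CMD = "systemctl restart glusterd"
```

Printing the returned list gives a script that can be reviewed before it is run on N1. STATUS_LOOP is then started on N1, and RESTART_CMD is issued on N2 or N3 while the loop is running.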
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0658