Description of problem:
=======================

Ran basic geo-rep cases with changelog, xsync and history crawl. Found the cores on all the master nodes.

Master Node:1
=============

[root@rhsqe-vm01 ~]# ls -lrt /core*
-rw-------. 1 root root 125153280 Jun 16 23:14 /core.16155
-rw-------. 1 root root 133541888 Jun 17 00:57 /core.9695
-rw-------. 1 root root 132493312 Jun 17 02:46 /core.14005
-rw-------. 1 root root 133541888 Jun 17 02:59 /core.8089
-rw-------. 1 root root 133541888 Jun 17 04:04 /core.27626
-rw-------. 1 root root 132493312 Jun 17 07:55 /core.16584
-rw-------. 1 root root 132513792 Jun 17 09:25 /core.29550
-rw-------. 1 root root 123850752 Jun 17 12:07 /core.26792
-rw-------. 1 root root 124919808 Jun 17 13:23 /core.3604
-rw-------. 1 root root 127275008 Jun 17 14:39 /core.22976
-rw-------. 1 root root 133566464 Jun 17 15:13 /core.25537
-rw-------. 1 root root 131469312 Jun 17 15:45 /core.1220
[root@rhsqe-vm01 ~]#
[root@rhsqe-vm01 ~]# file /core*
/core.1220: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14005: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16155: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16584: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.22976: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.25537: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26792: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.27626: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29550: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3604: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.8089: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.9695: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm01 ~]#

[New LWP 1271]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0 __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
50 unsigned int type = PTHREAD_MUTEX_TYPE (mutex);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-11.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-21.el7.x86_64 openssl-libs-1.0.1e-42.el7.x86_64 pcre-8.32-14.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0 __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
#1 0x00007fd71cbfa6f8 in gf_changelog_process (data=0x7fd7140589a0) at gf-changelog-journal-handler.c:649
#2 0x00007fd72ae8adf5 in start_thread (arg=0x7fd6feffd700) at pthread_create.c:308
#3 0x00007fd72a4af1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)

Master Node:2
=============

[root@rhsqe-vm02 ~]# ls -lrt /core*
-rw-------. 1 root root 123850752 Jun 16 23:16 /core.14536
-rw-------. 1 root root 133541888 Jun 17 00:17 /core.19738
-rw-------. 1 root root 125153280 Jun 17 00:44 /core.30244
-rw-------. 1 root root 133562368 Jun 17 01:39 /core.20706
-rw-------. 1 root root 124919808 Jun 17 02:29 /core.5475
-rw-------. 1 root root 131444736 Jun 17 02:47 /core.6491
-rw-------. 1 root root 132493312 Jun 17 03:55 /core.26122
-rw-------. 1 root root 124952576 Jun 17 04:26 /core.28572
-rw-------. 1 root root 131469312 Jun 17 05:41 /core.1853
-rw-------. 1 root root 133541888 Jun 17 08:39 /core.19311
-rw-------. 1 root root 133562368 Jun 17 10:24 /core.29696
-rw-------. 1 root root 123871232 Jun 17 10:56 /core.14069
-rw-------. 1 root root 123056128 Jun 17 11:46 /core.14790
-rw-------. 1 root root 125173760 Jun 17 13:01 /core.23855
-rw-------. 1 root root 124899328 Jun 17 15:51 /core.3993
-rw-------. 1 root root 118661120 Jun 17 16:31 /core.21503
-rw-------. 1 root root 123904000 Jun 17 17:48 /core.31272
[root@rhsqe-vm02 ~]# file /core.*
/core.14069: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14536: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14790: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.1853: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19311: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19738: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.20706: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.21503: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.23855: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26122: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.28572: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29696: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.30244: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.31272: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3993: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.5475: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.6491: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm02 ~]#
[root@rhsqe-vm02 ~]# gdb python /core.31272
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python2.7...Reading symbols from /usr/bin/python2.7...(no debugging symbols found)...done.
(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 31324]
[New LWP 31272]
[New LWP 31282]
[New LWP 31284]
[New LWP 31318]
[New LWP 31285]
[New LWP 31316]
[New LWP 31317]
[New LWP 31320]
[New LWP 31321]
[New LWP 31319]
[New LWP 31322]
[New LWP 31323]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000000000 in ?? ()
Missing separate debuginfos, use: debuginfo-install python-2.7.5-16.el7.x86_64
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007f23e016187c in gf_changelog_callback_invoker (arg=0x7f23cc0587e0) at gf-changelog-reborp.c:293
#2 0x00007f23ed3ecdf5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f23eca111ad in clone () from /lib64/libc.so.6
(gdb) quit
[root@rhsqe-vm02 ~]#

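RCA: The arguments passed to the thread were assigned after the thread was created. If the new thread is scheduled before the assignment happens and accesses these variables, it crashes with a segmentation fault. Both backtraces are consistent with this: gf_changelog_process locks a mutex that is still NULL (mutex=0x0), and gf_changelog_callback_invoker ends up at address 0x0.

Solution: The arguments are now assigned before the thread is created.

Patch Posted Upstream: http://review.gluster.org/#/c/11273/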
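
For illustration only, the ordering described in the RCA can be sketched with plain pthreads. This is not the libgfchangelog code from the patch; `struct worker_arg`, `worker` and `start_worker_and_join` are hypothetical names made up for this sketch. It only shows why every field the thread reads has to be set before `pthread_create()` is called.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread argument; NOT the real libgfchangelog structure. */
struct worker_arg {
    pthread_mutex_t *lock;      /* must be valid before the thread runs */
    int              processed; /* set by the worker under the lock */
};

/* Thread body: takes the lock handed in through the argument. */
static void *worker(void *data)
{
    struct worker_arg *arg = data;

    /* If arg->lock were still NULL here, this call would dereference a
     * NULL mutex, much like frame #0 of the first backtrace. */
    pthread_mutex_lock(arg->lock);
    arg->processed = 1;
    pthread_mutex_unlock(arg->lock);
    return NULL;
}

/* Starts one worker and waits for it. Returns 0 on success, -1 on failure. */
int start_worker_and_join(void)
{
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    struct worker_arg arg = { .lock = NULL, .processed = 0 };
    pthread_t tid;

    /* Fixed ordering: populate everything the thread will read first... */
    arg.lock = &lock;

    /* ...and only then create the thread. The racy ordering would be
     * pthread_create(...) followed by arg.lock = &lock, which lets the
     * new thread run before the assignment. */
    if (pthread_create(&tid, NULL, worker, &arg) != 0)
        return -1;

    /* arg lives on this stack frame, so join before returning. */
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return arg.processed == 1 ? 0 : -1;
}
```

Calling `start_worker_and_join()` from a small `main` and building with `-pthread` should return 0. Moving the `arg.lock = &lock;` line below `pthread_create()` gives the racy ordering, which may or may not crash on a given run depending on scheduling.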
Upstream master Patch: http://review.gluster.org/11273
Upstream 3.7 Patch: http://review.gluster.org/11308
Downstream Patch: https://code.engineering.redhat.com/gerrit/#/c/51069/
Verified with build: glusterfs-3.7.1-7.el7rhgs.x86_64

Ran the cases for changelog, xsync and history crawl. No segmentation fault was seen. Moving this bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html