Bug 1233044 - [geo-rep]: Segmentation faults are observed on all the master nodes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 3.7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On: 1232609 1232666
Blocks:
Reported: 2015-06-18 06:51 UTC by Kotresh HR
Modified: 2015-06-20 09:51 UTC (History)

Fixed In Version: glusterfs-3.7.2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1232666
Environment:
Last Closed: 2015-06-20 09:51:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Kotresh HR 2015-06-18 06:51:00 UTC
+++ This bug was initially created as a clone of Bug #1232666 +++

+++ This bug was initially created as a clone of Bug #1232609 +++

Description of problem:
=======================

Ran the basic geo-rep cases with changelog, xsync, and history crawl. Core dumps from gsyncd.py were found on all the master nodes.

Master Node:1
=============
[root@rhsqe-vm01 ~]# ls -lrt /core*
-rw-------. 1 root root 125153280 Jun 16 23:14 /core.16155
-rw-------. 1 root root 133541888 Jun 17 00:57 /core.9695
-rw-------. 1 root root 132493312 Jun 17 02:46 /core.14005
-rw-------. 1 root root 133541888 Jun 17 02:59 /core.8089
-rw-------. 1 root root 133541888 Jun 17 04:04 /core.27626
-rw-------. 1 root root 132493312 Jun 17 07:55 /core.16584
-rw-------. 1 root root 132513792 Jun 17 09:25 /core.29550
-rw-------. 1 root root 123850752 Jun 17 12:07 /core.26792
-rw-------. 1 root root 124919808 Jun 17 13:23 /core.3604
-rw-------. 1 root root 127275008 Jun 17 14:39 /core.22976
-rw-------. 1 root root 133566464 Jun 17 15:13 /core.25537
-rw-------. 1 root root 131469312 Jun 17 15:45 /core.1220
[root@rhsqe-vm01 ~]# 

[root@rhsqe-vm01 ~]# file /core*
/core.1220:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14005: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16155: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16584: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.22976: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.25537: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26792: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.27626: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29550: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3604:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.8089:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.9695:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm01 ~]# 

[New LWP 1271]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
50	  unsigned int type = PTHREAD_MUTEX_TYPE (mutex);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-11.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-21.el7.x86_64 openssl-libs-1.0.1e-42.el7.x86_64 pcre-8.32-14.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
#1  0x00007fd71cbfa6f8 in gf_changelog_process (data=0x7fd7140589a0)
    at gf-changelog-journal-handler.c:649
#2  0x00007fd72ae8adf5 in start_thread (arg=0x7fd6feffd700) at pthread_create.c:308
#3  0x00007fd72a4af1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) 



Master Node:2
=============

[root@rhsqe-vm02 ~]# ls -lrt /core*
-rw-------. 1 root root 123850752 Jun 16 23:16 /core.14536
-rw-------. 1 root root 133541888 Jun 17 00:17 /core.19738
-rw-------. 1 root root 125153280 Jun 17 00:44 /core.30244
-rw-------. 1 root root 133562368 Jun 17 01:39 /core.20706
-rw-------. 1 root root 124919808 Jun 17 02:29 /core.5475
-rw-------. 1 root root 131444736 Jun 17 02:47 /core.6491
-rw-------. 1 root root 132493312 Jun 17 03:55 /core.26122
-rw-------. 1 root root 124952576 Jun 17 04:26 /core.28572
-rw-------. 1 root root 131469312 Jun 17 05:41 /core.1853
-rw-------. 1 root root 133541888 Jun 17 08:39 /core.19311
-rw-------. 1 root root 133562368 Jun 17 10:24 /core.29696
-rw-------. 1 root root 123871232 Jun 17 10:56 /core.14069
-rw-------. 1 root root 123056128 Jun 17 11:46 /core.14790
-rw-------. 1 root root 125173760 Jun 17 13:01 /core.23855
-rw-------. 1 root root 124899328 Jun 17 15:51 /core.3993
-rw-------. 1 root root 118661120 Jun 17 16:31 /core.21503
-rw-------. 1 root root 123904000 Jun 17 17:48 /core.31272
[root@rhsqe-vm02 ~]# file /core.*
/core.14069: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14536: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14790: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.1853:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19311: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19738: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.20706: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.21503: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.23855: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26122: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.28572: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29696: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.30244: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.31272: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3993:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.5475:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.6491:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm02 ~]# 
[root@rhsqe-vm02 ~]# gdb python /core.31272
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python2.7...Reading symbols from /usr/bin/python2.7...(no debugging symbols found)...done.
(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 31324]
[New LWP 31272]
[New LWP 31282]
[New LWP 31284]
[New LWP 31318]
[New LWP 31285]
[New LWP 31316]
[New LWP 31317]
[New LWP 31320]
[New LWP 31321]
[New LWP 31319]
[New LWP 31322]
[New LWP 31323]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
Missing separate debuginfos, use: debuginfo-install python-2.7.5-16.el7.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f23e016187c in gf_changelog_callback_invoker (arg=0x7f23cc0587e0)
    at gf-changelog-reborp.c:293
#2  0x00007f23ed3ecdf5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f23eca111ad in clone () from /lib64/libc.so.6
(gdb) quit
[root@rhsqe-vm02 ~]#

--- Additional comment from Anand Avati on 2015-06-17 05:37:32 EDT ---

REVIEW: http://review.gluster.org/11273 (libgfchangelog: Fix crash in gf_changelog_process) posted (#1) for review on master by Kotresh HR (khiremat)

--- Additional comment from Anand Avati on 2015-06-17 13:59:14 EDT ---

REVIEW: http://review.gluster.org/11273 (libgfchangelog: Fix crash in gf_changelog_process) posted (#2) for review on master by Kotresh HR (khiremat)

--- Additional comment from Anand Avati on 2015-06-18 02:34:15 EDT ---

COMMIT: http://review.gluster.org/11273 committed in master by Venky Shankar (vshankar) 
------
commit ba7d5d914b2c897aef0616f3d95beb4d17bc51a8
Author: Kotresh HR <khiremat>
Date:   Wed Jun 17 14:39:26 2015 +0530

    libgfchangelog: Fix crash in gf_changelog_process
    
    Problem:
        Crash observed in gf_changelog_process and
        gf_changelog_callback_invoker.
    
    Cause:
        Assignments to arguments passed to thread is done
        post thread creation. If the thread created gets
        scheduled before the assignment and access these
        variables, it would crash with segmentation fault.
    
    Solution:
        Assignments to arguments are done prior to the thread
        creation.
    
    Change-Id: I6afc8ccedd050cf4b50b967fef8287a0c834177b
    BUG: 1232666
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/11273
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Venky Shankar <vshankar>
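
The ordering problem described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not the libgfchangelog code: struct worker_arg, worker, bump, start_worker and run_demo are hypothetical names chosen for illustration. The sketch assumes the thread argument carries a mutex pointer and a callback pointer, which would be consistent with the two backtraces in the description (pthread_mutex_lock called with mutex=0x0, and a jump to address 0x0 from the callback invoker).

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread argument block; the real
 * libgfchangelog structures are not reproduced here. */
struct worker_arg {
    pthread_mutex_t *lock;          /* NULL until assigned */
    void (*callback)(int *count);   /* NULL until assigned */
    int *count;
};

static void bump(int *count) { (*count)++; }

static void *worker(void *data)
{
    struct worker_arg *arg = data;

    /* If arg->lock or arg->callback were still NULL at this point, these
     * calls would fault: a NULL mutex in pthread_mutex_lock(), or a call
     * through a NULL function pointer. */
    pthread_mutex_lock(arg->lock);
    arg->callback(arg->count);
    pthread_mutex_unlock(arg->lock);
    return NULL;
}

/* Returns 0 on success, the pthread error code otherwise. */
static int start_worker(pthread_t *tid, struct worker_arg *arg,
                        pthread_mutex_t *lock, int *count)
{
    /* Fixed ordering: populate every field first... */
    arg->lock = lock;
    arg->callback = bump;
    arg->count = count;

    /* ...and only then create the thread. The racy ordering would call
     * pthread_create() first and assign the fields afterwards, so a thread
     * scheduled immediately could read the zeroed fields. */
    return pthread_create(tid, NULL, worker, arg);
}

/* Runs one worker and returns the counter value, or -1 on failure. */
int run_demo(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    struct worker_arg *arg = calloc(1, sizeof(*arg));
    pthread_t tid;
    int count = 0;

    if (arg == NULL)
        return -1;
    if (start_worker(&tid, arg, &lock, &count) != 0) {
        free(arg);
        return -1;
    }
    pthread_join(tid, NULL);
    free(arg);
    return count;
}
```

With the assignments placed before pthread_create(), the new thread cannot observe the zero-initialized fields regardless of how soon it is scheduled, which matches the solution stated in the commit message. Build with -pthread.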

Comment 1 Anand Avati 2015-06-18 07:01:15 UTC
REVIEW: http://review.gluster.org/11308 (libgfchangelog: Fix crash in gf_changelog_process) posted (#1) for review on release-3.7 by Kotresh HR (khiremat)

Comment 2 Anand Avati 2015-06-19 07:12:38 UTC
COMMIT: http://review.gluster.org/11308 committed in release-3.7 by Venky Shankar (vshankar) 
------
commit d37920661fa36aa1c77de20351d79f7378222e80
Author: Kotresh HR <khiremat>
Date:   Wed Jun 17 14:39:26 2015 +0530

    libgfchangelog: Fix crash in gf_changelog_process
    
    Problem:
        Crash observed in gf_changelog_process and
        gf_changelog_callback_invoker.
    
    Cause:
        Assignments to arguments passed to thread is done
        post thread creation. If the thread created gets
        scheduled before the assignment and access these
        variables, it would crash with segmentation fault.
    
    Solution:
        Assignments to arguments are done prior to the thread
        creation.
    
    BUG: 1233044
    Change-Id: I520599ab43026d25f4064ce71bd5a8b8e0d4b90a
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/11273
    Reviewed-on: http://review.gluster.org/11308
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>

Comment 3 Niels de Vos 2015-06-20 09:51:39 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.2, please reopen this bug report.

glusterfs-3.7.2 has been announced on the Gluster Packaging mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2015-June/000006.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

