Bug 1232609 - [geo-rep]: RHEL7.1 segmentation faults are observed on all the master nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Kotresh HR
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1202842 1232666 1233044
Reported: 2015-06-17 07:25 UTC by Rahul Hinduja
Modified: 2015-07-29 05:05 UTC (History)

Fixed In Version: glusterfs-3.7.1-5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1232666
Environment:
Last Closed: 2015-07-29 05:05:21 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2015:1495 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Gluster Storage 3.1 update | 2015-07-29 08:26:26 UTC

Description Rahul Hinduja 2015-06-17 07:25:21 UTC
Description of problem:
=======================

Ran basic geo-rep cases with changelog, xsync, and history crawl. Found core dumps from gsyncd.py on all the master nodes.

Master Node:1
=============
[root@rhsqe-vm01 ~]# ls -lrt /core*
-rw-------. 1 root root 125153280 Jun 16 23:14 /core.16155
-rw-------. 1 root root 133541888 Jun 17 00:57 /core.9695
-rw-------. 1 root root 132493312 Jun 17 02:46 /core.14005
-rw-------. 1 root root 133541888 Jun 17 02:59 /core.8089
-rw-------. 1 root root 133541888 Jun 17 04:04 /core.27626
-rw-------. 1 root root 132493312 Jun 17 07:55 /core.16584
-rw-------. 1 root root 132513792 Jun 17 09:25 /core.29550
-rw-------. 1 root root 123850752 Jun 17 12:07 /core.26792
-rw-------. 1 root root 124919808 Jun 17 13:23 /core.3604
-rw-------. 1 root root 127275008 Jun 17 14:39 /core.22976
-rw-------. 1 root root 133566464 Jun 17 15:13 /core.25537
-rw-------. 1 root root 131469312 Jun 17 15:45 /core.1220
[root@rhsqe-vm01 ~]# 

[root@rhsqe-vm01 ~]# file /core*
/core.1220:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14005: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16155: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.16584: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.22976: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.25537: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26792: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.27626: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29550: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3604:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.8089:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.9695:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm01 ~]# 

[New LWP 1271]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
50	  unsigned int type = PTHREAD_MUTEX_TYPE (mutex);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-11.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-21.el7.x86_64 openssl-libs-1.0.1e-42.el7.x86_64 pcre-8.32-14.el7.x86_64 xz-libs-5.1.2-9alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x0) at pthread_mutex_lock.c:50
#1  0x00007fd71cbfa6f8 in gf_changelog_process (data=0x7fd7140589a0)
    at gf-changelog-journal-handler.c:649
#2  0x00007fd72ae8adf5 in start_thread (arg=0x7fd6feffd700) at pthread_create.c:308
#3  0x00007fd72a4af1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) 



Master Node:2
=============

[root@rhsqe-vm02 ~]# ls -lrt /core*
-rw-------. 1 root root 123850752 Jun 16 23:16 /core.14536
-rw-------. 1 root root 133541888 Jun 17 00:17 /core.19738
-rw-------. 1 root root 125153280 Jun 17 00:44 /core.30244
-rw-------. 1 root root 133562368 Jun 17 01:39 /core.20706
-rw-------. 1 root root 124919808 Jun 17 02:29 /core.5475
-rw-------. 1 root root 131444736 Jun 17 02:47 /core.6491
-rw-------. 1 root root 132493312 Jun 17 03:55 /core.26122
-rw-------. 1 root root 124952576 Jun 17 04:26 /core.28572
-rw-------. 1 root root 131469312 Jun 17 05:41 /core.1853
-rw-------. 1 root root 133541888 Jun 17 08:39 /core.19311
-rw-------. 1 root root 133562368 Jun 17 10:24 /core.29696
-rw-------. 1 root root 123871232 Jun 17 10:56 /core.14069
-rw-------. 1 root root 123056128 Jun 17 11:46 /core.14790
-rw-------. 1 root root 125173760 Jun 17 13:01 /core.23855
-rw-------. 1 root root 124899328 Jun 17 15:51 /core.3993
-rw-------. 1 root root 118661120 Jun 17 16:31 /core.21503
-rw-------. 1 root root 123904000 Jun 17 17:48 /core.31272
[root@rhsqe-vm02 ~]# file /core.*
/core.14069: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14536: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.14790: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.1853:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19311: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.19738: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.20706: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.21503: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.23855: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.26122: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.28572: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.29696: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.30244: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.31272: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.3993:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.5475:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
/core.6491:  ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'
[root@rhsqe-vm02 ~]# 
[root@rhsqe-vm02 ~]# gdb python /core.31272
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python2.7...Reading symbols from /usr/bin/python2.7...(no debugging symbols found)...done.
(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 31324]
[New LWP 31272]
[New LWP 31282]
[New LWP 31284]
[New LWP 31318]
[New LWP 31285]
[New LWP 31316]
[New LWP 31317]
[New LWP 31320]
[New LWP 31321]
[New LWP 31319]
[New LWP 31322]
[New LWP 31323]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --path=/bricks/brick0'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
Missing separate debuginfos, use: debuginfo-install python-2.7.5-16.el7.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f23e016187c in gf_changelog_callback_invoker (arg=0x7f23cc0587e0)
    at gf-changelog-reborp.c:293
#2  0x00007f23ed3ecdf5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f23eca111ad in clone () from /lib64/libc.so.6
(gdb) quit
[root@rhsqe-vm02 ~]#

Comment 3 Kotresh HR 2015-06-17 09:53:25 UTC
RCA:

The arguments passed to the thread are assigned
after the thread is created. If the new thread is
scheduled before the assignment and accesses these
variables, it crashes with a segmentation fault.

Solution:
Assign the arguments before creating the thread.
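
The ordering described in the RCA can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the actual libgfchangelog code or the posted patch: struct worker_ctx, worker(), bump() and run_worker_demo() are hypothetical names. The two fields are chosen to mirror what the backtraces above suggest, a NULL mutex pointer reaching pthread_mutex_lock() and a jump to address 0x0 from the callback invoker.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical worker context; not the real libgfchangelog structures. */
struct worker_ctx {
    pthread_mutex_t *lock;          /* must be non-NULL before the thread runs */
    void (*callback)(int *counter); /* must be non-NULL before the thread runs */
    int *counter;
};

/* Hypothetical callback: increments the shared counter. */
static void bump(int *counter)
{
    (*counter)++;
}

static void *worker(void *data)
{
    struct worker_ctx *ctx = data;

    /* Holds only because every field was assigned before pthread_create(). */
    assert(ctx->lock != NULL && ctx->callback != NULL);

    pthread_mutex_lock(ctx->lock);
    ctx->callback(ctx->counter);
    pthread_mutex_unlock(ctx->lock);
    return NULL;
}

/* Returns the final counter value (1) on success, -1 on failure. */
int run_worker_demo(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    int counter = 0;
    pthread_t tid;
    struct worker_ctx *ctx = calloc(1, sizeof(*ctx));

    if (ctx == NULL)
        return -1;

    /* Fix pattern: populate the thread argument completely first ... */
    ctx->lock = &lock;
    ctx->callback = bump;
    ctx->counter = &counter;

    /* ... and only then create the thread that reads it. */
    if (pthread_create(&tid, NULL, worker, ctx) != 0) {
        free(ctx);
        return -1;
    }

    /*
     * Racy ordering (what the RCA describes): calling pthread_create()
     * first and assigning ctx->lock / ctx->callback afterwards lets the
     * new thread see zeroed fields if it is scheduled before the
     * assignments complete.
     */

    pthread_join(tid, NULL);
    free(ctx);
    return counter;
}
```

Calling run_worker_demo() from a small main() and building with -pthread should return 1. Because the racy ordering only fails when the scheduler runs the new thread early, it may pass many times before crashing.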

Patch Posted Upstream:
http://review.gluster.org/#/c/11273/

Comment 5 Kotresh HR 2015-06-18 18:57:22 UTC
Upstream master Patch:
http://review.gluster.org/11273

Upstream 3.7 Patch:
http://review.gluster.org/11308

Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/51069/

Comment 6 Rahul Hinduja 2015-07-06 11:31:02 UTC
Verified with build:  glusterfs-3.7.1-7.el7rhgs.x86_64

Ran the cases for changelog, xsync, and history crawl. Did not see any segmentation faults. Moving this bug to the verified state.

Comment 7 errata-xmlrpc 2015-07-29 05:05:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

