Bug 1570586

Summary: Glusterd crashed on a few (master) nodes
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterd
Version: rhgs-3.4
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: Reopened
Reporter: Rochelle <rallan>
Assignee: Kotresh HR <khiremat>
QA Contact: Rochelle <rallan>
CC: amukherj, atumball, khiremat, nbalacha, rallan, rhinduja, rhs-bugs, sankarshan, sheggodu, storage-qa-internal, vbellur, vdas
Fixed In Version: glusterfs-3.12.2-10
Doc Type: If docs needed, set a value
Clones: 1576392 (view as bug list)
Bug Blocks: 1503137, 1576392, 1577868, 1611106
Type: Bug
Last Closed: 2018-09-04 06:47:18 UTC

Description Rochelle 2018-04-23 09:45:52 UTC
Description of problem:
=======================

Glusterd crashed on a few (master) nodes.
The geo-replication session status showed Created/Active instead of the expected Active/Passive.

The geo-replication session was started, and the following status was reported for the session:
----------------------------------------------------------------------------------------------
[root@dhcp41-226 scripts]# gluster volume geo-replication master 10.70.41.160::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.41.226    master        /rhs/brick3/b7    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.226    master        /rhs/brick1/b1    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.230    master        /rhs/brick2/b5    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.229    master        /rhs/brick2/b4    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.219    master        /rhs/brick2/b6    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.227    master        /rhs/brick3/b8    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.227    master        /rhs/brick1/b2    root          10.70.41.160::slave    N/A             Created    N/A                N/A                          
10.70.41.228    master        /rhs/brick3/b9    root          10.70.41.160::slave    10.70.41.160    Active     Changelog Crawl    2018-04-23 06:13:53          
10.70.41.228    master        /rhs/brick1/b3    root          10.70.41.160::slave    10.70.42.79     Active     Changelog Crawl    2018-04-23 06:13:53        
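
The rows reporting Created with SLAVE NODE N/A are most likely the master nodes on which glusterd went down. A quick way to confirm that from the same shell is sketched below; the hostnames are taken from the status output above, and the /core.* path for the dump is an assumption.

# Sketch: check glusterd on each master node reporting "Created"/"N/A" above.
# Hostnames come from the status output; the core-file path (/core.*) is an assumption.
for node in 10.70.41.226 10.70.41.230 10.70.41.229 10.70.41.219 10.70.41.227; do
    echo "== ${node} =="
    ssh "${node}" 'pgrep -x glusterd >/dev/null && echo glusterd running || echo glusterd DOWN; ls -l /core.* 2>/dev/null'
done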


glusterd logs:
-------------
[2018-04-23 07:34:16.850166] E [mem-pool.c:307:__gf_free] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf) [0x7f98a9e619cf] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5) [0x7f98a9e64ca5] -->/lib64/libglusterfs.so.0(__gf_free+0xac) [0x7f98b53e268c] ) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == header->magic
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash: 
2018-04-23 07:34:16
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f98b53ba4d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f98b53c4414]
/lib64/libc.so.6(+0x36280)[0x7f98b3a19280]
/lib64/libc.so.6(gsignal+0x37)[0x7f98b3a19207]
/lib64/libc.so.6(abort+0x148)[0x7f98b3a1a8f8]
/lib64/libc.so.6(+0x78cc7)[0x7f98b3a5bcc7]
/lib64/libc.so.6(+0x7f574)[0x7f98b3a62574]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x44ca5)[0x7f98a9e64ca5]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x419cf)[0x7f98a9e619cf]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1bdc2)[0x7f98a9e3bdc2]
/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x23b6e)[0x7f98a9e43b6e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f98b53f3250]
/lib64/libc.so.6(+0x47fc0)[0x7f98b3a2afc0]
---------
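
The glusterd.so frames in the trace (+0x44ca5, +0x419cf, etc.) can be mapped to functions and source lines once the matching debuginfo is installed; a sketch, assuming the crashed node still runs the 3.12.2-7 build and has elfutils and yum-utils available:

# Sketch: resolve the glusterd.so offsets from the backtrace above.
# Offsets are copied from the trace; debuginfo must match glusterfs-3.12.2-7.el7rhgs.
debuginfo-install -y glusterfs
eu-addr2line -f -e /usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so 0x44ca5 0x419cf 0x1bdc2 0x23b6e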



Version-Release number of selected component (if applicable):
=============================================================
[root@dhcp41-226 ~]# rpm -qa | grep gluster
glusterfs-fuse-3.12.2-7.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-7.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-libs-3.12.2-7.el7rhgs.x86_64
glusterfs-cli-3.12.2-7.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.2.x86_64
glusterfs-rdma-3.12.2-7.el7rhgs.x86_64
glusterfs-events-3.12.2-7.el7rhgs.x86_64
glusterfs-3.12.2-7.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-7.el7rhgs.x86_64
glusterfs-server-3.12.2-7.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
python2-gluster-3.12.2-7.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-api-3.12.2-7.el7rhgs.x86_64



How reproducible:
=================
1/1


Steps to Reproduce:
===================
1. Create a master and a slave cluster of 6 nodes each
2. Create and start the master volume (tiered: cold tier 1x(4+2), hot tier 1x3)
3. Create and start the slave volume (tiered: cold tier 1x(4+2), hot tier 1x3)
4. Enable quota on the master volume
5. Enable shared storage on the master volume
6. Set up a geo-rep session between the master and slave volumes
7. Mount the master volume on a client
8. Create data from the master client
   (See the command sketch below.)
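
A minimal sketch of the commands the steps above translate to, run from one master node. Node names (node1..node6), brick paths, and the mount point are placeholders, and the slave volume is created the same way on the slave cluster:

# 2/3. 1x(4+2) disperse cold tier, then attach a 1x3 replica hot tier
gluster volume create master disperse 6 redundancy 2 node{1..6}:/rhs/brick1/b1
gluster volume start master
gluster volume tier master attach replica 3 node{1..3}:/rhs/brick3/hot
# repeat the create/start/tier attach on the slave cluster for the "slave" volume
# 4. quota and 5. shared storage
gluster volume quota master enable
gluster volume set all cluster.enable-shared-storage enable
# 6. geo-rep session (passwordless root SSH from this node to 10.70.41.160 is assumed)
gluster volume geo-replication master 10.70.41.160::slave create push-pem
gluster volume geo-replication master 10.70.41.160::slave start
# 7/8. mount the master volume on a client and generate data under the mount
mount -t glusterfs 10.70.41.226:/master /mnt/master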

Actual results:
================
Glusterd crashed on a few nodes.
The geo-rep session was in the Created/Active state.

Expected results:
=================
Glusterd should not crash.
A geo-rep session that has been started should be in the Active/Passive state.

Comment 16 errata-xmlrpc 2018-09-04 06:47:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607