Bug 1219894

Summary: [georep]: Creating geo-rep session kills all the brick processes
Product: [Community] GlusterFS
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: unspecified
Keywords: Regression, Reopened
Reporter: Kotresh HR <khiremat>
Assignee: Kotresh HR <khiremat>
QA Contact: Rahul Hinduja <rhinduja>
CC: bugs, gluster-bugs, rhinduja
Type: Bug
Clone Of: 1219823
Bug Blocks: 1219823
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Last Closed: 2016-06-16 12:59:33 UTC

Comment 1 Kotresh HR 2015-05-08 15:41:38 UTC
+++ This bug was initially created as a clone of Bug #1219823 +++

Description of problem:
=======================

With the latest nightly build "glusterfs-3.7.0beta1-0.69.git1a32479.el6.x86_64", all the brick processes crash as soon as the geo-rep session is created, with the following logs:

pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
time of crash: 
2015-05-08 11:15:04
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.0beta1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f693cbc4576]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7f693cbe2eaf]
/lib64/libc.so.6[0x3683a326a0]
/lib64/libc.so.6(gsignal+0x35)[0x3683a32625]
/lib64/libc.so.6(abort+0x175)[0x3683a33e05]
/lib64/libc.so.6[0x3683a70537]
/lib64/libc.so.6(__fortify_fail+0x37)[0x3683b02697]
/lib64/libc.so.6[0x3683b00580]
/lib64/libc.so.6[0x3683affc7b]
/lib64/libc.so.6(__snprintf_chk+0x7a)[0x3683affb4a]
/usr/lib64/glusterfs/3.7.0beta1/xlator/features/changelog.so(htime_create+0x16d)[0x7f6931133ecd]
/usr/lib64/glusterfs/3.7.0beta1/xlator/features/changelog.so(reconfigure+0x486)[0x7f693112aa46]
/usr/lib64/libglusterfs.so.0(+0x75cfa)[0x7f693cc15cfa]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(+0x75c8c)[0x7f693cc15c8c]
/usr/lib64/libglusterfs.so.0(glusterfs_volfile_reconfigure+0x1a2)[0x7f693cc024a2]
/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x2f3)[0x40d0b3]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f693c994d75]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x142)[0x7f693c996212]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f693c9918e8]
/usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so(+0x9bcd)[0x7f69329d1bcd]
/usr/lib64/glusterfs/3.7.0beta1/rpc-transport/socket.so(+0xb6fd)[0x7f69329d36fd]
/usr/lib64/libglusterfs.so.0(+0x807c0)[0x7f693cc207c0]
/lib64/libpthread.so.0[0x3683e079d1]
/lib64/libc.so.6(clone+0x6d)[0x3683ae89dd]
---------


No core file was found.
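
The __snprintf_chk -> __fortify_fail -> abort frames above are glibc's _FORTIFY_SOURCE protection firing inside changelog's htime_create(): the size argument handed to snprintf() was larger than the destination buffer the compiler could see, so the fortified wrapper aborted the brick process (SIGABRT, matching "signal received: 6") instead of letting the write overflow. A minimal sketch of that failure mode, using an illustrative buffer and path rather than the actual changelog code:

/* Sketch only: reproduces the __snprintf_chk/__fortify_fail abort pattern
   seen in the brick backtrace.  Buffer name, size and path are illustrative.
   Build with: gcc -O2 -D_FORTIFY_SOURCE=2 crash-sketch.c && ./a.out */
#include <stdio.h>
#include <limits.h>

int
main (void)
{
        char htime_file[64];    /* real destination: 64 bytes */

        /* The size argument (PATH_MAX = 4096) is larger than the buffer,
           so glibc's fortified __snprintf_chk() calls __fortify_fail()
           and abort() before anything is written -- killing the process
           with SIGABRT (signal 6). */
        snprintf (htime_file, PATH_MAX,
                  "%s/.glusterfs/changelogs/htime", "/rhs/brick1/b1");
        return 0;
}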


Version-Release number of selected component (if applicable):
=============================================================

How reproducible:
=================
always


Steps to Reproduce:
===================

[root@georep1 ~]# gluster peer probe 10.70.46.97
peer probe: success. 
[root@georep1 ~]# sleep 2
[root@georep1 ~]# gluster peer probe 10.70.46.93
peer probe: success. 
[root@georep1 ~]# 
[root@georep1 ~]# 
[root@georep1 ~]# # Master volume creation
[root@georep1 ~]# 
[root@georep1 ~]# gluster volume create master replica 3 10.70.46.96:/rhs/brick1/b1 10.70.46.97:/rhs/brick1/b1 10.70.46.93:/rhs/brick1/b1 10.70.46.96:/rhs/brick2/b2 10.70.46.97:/rhs/brick2/b2 10.70.46.93:/rhs/brick2/b2 
volume create: master: success: please start the volume to access data
[root@georep1 ~]# 
[root@georep1 ~]# gluster volume start master
volume start: master: success
[root@georep1 ~]# gluster system:: execute gsec_create
Common secret pub file present at /var/lib/glusterd/geo-replication/common_secret.pem.pub
[root@georep1 ~]# gluster v status
Status of volume: master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.96:/rhs/brick1/b1            49152     0          Y       24030
Brick 10.70.46.97:/rhs/brick1/b1            49152     0          Y       9656 
Brick 10.70.46.93:/rhs/brick1/b1            49152     0          Y       11957
Brick 10.70.46.96:/rhs/brick2/b2            49153     0          Y       24047
Brick 10.70.46.97:/rhs/brick2/b2            49153     0          Y       9673 
Brick 10.70.46.93:/rhs/brick2/b2            49153     0          Y       11974
NFS Server on localhost                     2049      0          Y       24067
Self-heal Daemon on localhost               N/A       N/A        Y       24074
NFS Server on 10.70.46.97                   2049      0          Y       9692 
Self-heal Daemon on 10.70.46.97             N/A       N/A        Y       9701 
NFS Server on 10.70.46.93                   2049      0          Y       11994
Self-heal Daemon on 10.70.46.93             N/A       N/A        Y       12001
 
Task Status of Volume master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@georep1 ~]# gluster volume geo-replication master 10.70.46.154::slave create push-pem
Creating geo-replication session between master & 10.70.46.154::slave has been successful
[root@georep1 ~]# gluster v status
Status of volume: master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.96:/rhs/brick1/b1            N/A       N/A        N       24030
Brick 10.70.46.97:/rhs/brick1/b1            49152     0          Y       9656 
Brick 10.70.46.93:/rhs/brick1/b1            49152     0          Y       11957
Brick 10.70.46.96:/rhs/brick2/b2            N/A       N/A        N       24047
Brick 10.70.46.97:/rhs/brick2/b2            N/A       N/A        N       9673 
Brick 10.70.46.93:/rhs/brick2/b2            49153     0          Y       11974
NFS Server on localhost                     2049      0          Y       24401
Self-heal Daemon on localhost               N/A       N/A        Y       24412
NFS Server on 10.70.46.93                   2049      0          Y       12205
Self-heal Daemon on 10.70.46.93             N/A       N/A        Y       12214
NFS Server on 10.70.46.97                   2049      0          Y       9904 
Self-heal Daemon on 10.70.46.97             N/A       N/A        Y       9912 
 
Task Status of Volume master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@georep1 ~]# gluster v status
Status of volume: master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.96:/rhs/brick1/b1            N/A       N/A        N       24030
Brick 10.70.46.97:/rhs/brick1/b1            N/A       N/A        N       9656 
Brick 10.70.46.93:/rhs/brick1/b1            N/A       N/A        N       11957
Brick 10.70.46.96:/rhs/brick2/b2            N/A       N/A        N       24047
Brick 10.70.46.97:/rhs/brick2/b2            N/A       N/A        N       9673 
Brick 10.70.46.93:/rhs/brick2/b2            N/A       N/A        N       11974
NFS Server on localhost                     2049      0          Y       24401
Self-heal Daemon on localhost               N/A       N/A        Y       24412
NFS Server on 10.70.46.93                   2049      0          Y       12205
Self-heal Daemon on 10.70.46.93             N/A       N/A        Y       12214
NFS Server on 10.70.46.97                   2049      0          Y       9904 
Self-heal Daemon on 10.70.46.97             N/A       N/A        Y       9912 
 
Task Status of Volume master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@georep1 ~]#

Comment 2 Kotresh HR 2015-05-08 15:52:07 UTC
Patch sent:
http://review.gluster.org/#/c/10687/

Comment 3 Anand Avati 2015-05-09 03:31:35 UTC
COMMIT: http://review.gluster.org/10687 committed in master by Vijay Bellur (vbellur) 
------
commit bbff9e1ef72e2eab63e5d7ecd5dfa36497b642ed
Author: Kotresh HR <khiremat>
Date:   Fri May 8 21:03:09 2015 +0530

    features/changelog: Fix buffer overflow in snprintf
    
    Change-Id: Ie7e7c6028c7bffe47e60a2e93827e0e8767a3d66
    BUG: 1219894
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/10687
    Reviewed-by: Aravinda VK <avishwan>
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System
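
The commit title names the class of fix rather than the exact diff; as a hedged illustration of the safe pattern (the function and variable names below are hypothetical, not the actual htime_create() code), snprintf() is bounded by the real destination size and checked for truncation instead of being passed a larger constant such as PATH_MAX:

/* Hypothetical helper illustrating the "bound snprintf by the destination
   size" pattern; not the actual xlators/features/changelog code. */
#include <stdio.h>

static int
build_htime_path (char *dest, size_t dest_size, const char *brick_path)
{
        /* snprintf() can never write past dest_size bytes; the return
           value tells us whether the result would have been truncated. */
        int ret = snprintf (dest, dest_size,
                            "%s/.glusterfs/changelogs/htime", brick_path);

        if (ret < 0 || (size_t) ret >= dest_size)
                return -1;      /* error or truncated: caller must handle */
        return 0;
}

int
main (void)
{
        char path[4096];

        if (build_htime_path (path, sizeof (path), "/rhs/brick1/b1") == 0)
                printf ("%s\n", path);
        return 0;
}

With the bound tied to the destination, the fortified __snprintf_chk() check seen in the backtrace can no longer trip, and an over-long brick path surfaces as an error return instead of a SIGABRT that takes down the brick.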

Comment 4 Aravinda VK 2015-05-18 10:46:36 UTC
This bug is getting closed because a release has been made available that
should address the reported issue. In case the problem is still not fixed with
glusterfs-3.7.0, please open a new bug report.

Comment 5 Niels de Vos 2016-06-16 12:59:33 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user