Bug 1107665

Summary: Dist-geo-rep : Glusterd crashed while resetting use-tarssh config option in geo-rep.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Vijaykumar Koppad <vkoppad>
Component: glusterd Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA QA Contact: Bhaskar Bandari <bbandari>
Severity: urgent Docs Contact:
Priority: urgent    
Version: rhgs-3.0 CC: avishwan, bbandari, david.macdonald, nlevinki, nsathyan, ssamanta, vagarwal, vbellur
Target Milestone: --- Keywords: Regression
Target Release: RHGS 3.0.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.6.0.18-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1107984 Environment:
Last Closed: 2014-09-22 19:40:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1107984    

Description Vijaykumar Koppad 2014-06-10 12:30:23 UTC
Description of problem: Glusterd crashed while resetting geo-rep config options. 

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# gluster v geo master 10.70.43.170::slave config \!use-tarssh     
Connection failed. Please check if gluster daemon is operational.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Backtrace from the glusterd log:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-06-10 11:50:37.850691] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0c14822d-ca1a-4145-9d4b-34714fa3f15a
[2014-06-10 11:50:38.840641] I [glusterd-geo-rep.c:1833:glusterd_get_statefile_name] 0-: Using passed config template(/var/lib/glusterd/geo-replication/master_10.70.43.170_slave/gsyncd.conf).
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2014-06-10 11:50:40
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.15
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7fddbd74be56]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x7fddbd76628f]
/lib64/libc.so.6[0x3f3f4329a0]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_gsync_op_already_set+0x1bb)[0x7fddafd6f45b]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_op_gsync_set+0xf3f)[0x7fddafd7619f]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x276)[0x7fddafd1b806]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(gd_commit_op_phase+0xbe)[0x7fddafd9168e]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x7f2)[0x7fddafd933b2]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7fddafd9340b]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(__glusterd_handle_gsync_set+0x171)[0x7fddafd76fc1]
/usr/lib64/glusterfs/3.6.0.15/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fddafcf8e2f]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7fddbd787742]
/lib64/libc.so.6[0x3f3f443bf0]
---------

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>


Although this bug looks similar to Bug 1025408, the developer says the backtraces are not the same, so a separate bug is being raised.


Version-Release number of selected component (if applicable): glusterfs-3.6.0.15-1.el6rhs


How reproducible: Happens every time.



Steps to Reproduce:
1. Create and start a geo-rep relationship between master and slave.
2. Set the geo-rep config option use-tarssh to true.
3. Reset it using the command "gluster volume geo {master-vol} {slave-url} config \!use-tarssh".

Actual results: glusterd crashes. 


Expected results: glusterd shouldn't crash


Additional info:

Backtrace from the core file (gdb):
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(gdb) bt
#0  0x00007fddafd6f45b in glusterd_gsync_op_already_set (master=0x1def430 "master", slave=0x7fdda4000cd0 "10.70.43.170::slave", conf_path=<value optimized out>, 
    op_name=0x7fdda4001e80 "use-tarssh", op_value=0x0) at glusterd-geo-rep.c:2985
#1  0x00007fddafd7619f in glusterd_gsync_configure (dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-geo-rep.c:3085
#2  glusterd_op_gsync_set (dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-geo-rep.c:4459
#3  0x00007fddafd1b806 in glusterd_op_commit_perform (op=GD_OP_GSYNC_SET, dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-op-sm.c:4876
#4  0x00007fddafd9168e in gd_commit_op_phase (peers=0x1de5bb0, op=GD_OP_GSYNC_SET, op_ctx=0x7fddbbf83140, req_dict=0x7fddbbf831cc, op_errstr=0x2040510, npeers=3)
    at glusterd-syncop.c:1272
#5  0x00007fddafd933b2 in gd_sync_task_begin (op_ctx=0x7fddbbf83140, req=0x1de02fc) at glusterd-syncop.c:1676
#6  0x00007fddafd9340b in glusterd_op_begin_synctask (req=0x1de02fc, op=<value optimized out>, dict=0x7fddbbf83140) at glusterd-syncop.c:1728
#7  0x00007fddafd76fc1 in __glusterd_handle_gsync_set (req=0x1de02fc) at glusterd-geo-rep.c:332
#8  0x00007fddafcf8e2f in glusterd_big_locked_handler (req=0x1de02fc, actor_fn=0x7fddafd76e50 <__glusterd_handle_gsync_set>) at glusterd-handler.c:80
#9  0x00007fddbd787742 in synctask_wrap (old_task=<value optimized out>) at syncop.c:333
#10 0x0000003f3f443bf0 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) f 0 
#0  0x00007fddafd6f45b in glusterd_gsync_op_already_set (master=0x1def430 "master", slave=0x7fdda4000cd0 "10.70.43.170::slave", conf_path=<value optimized out>, 
    op_name=0x7fdda4001e80 "use-tarssh", op_value=0x0) at glusterd-geo-rep.c:2985
2985                    if (!strcmp(op_value,"true") || !strcmp(op_value,"1")
(gdb) f 1 
#1  0x00007fddafd7619f in glusterd_gsync_configure (dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-geo-rep.c:3085
3085                    ret = glusterd_gsync_op_already_set(master,slave,conf_path,
(gdb) f 3 
#3  0x00007fddafd1b806 in glusterd_op_commit_perform (op=GD_OP_GSYNC_SET, dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-op-sm.c:4876
4876                            ret = glusterd_op_gsync_set (dict, op_errstr, rsp_dict);
(gdb) f 2 
#2  glusterd_op_gsync_set (dict=0x7fddbbf831cc, op_errstr=0x2040510, rsp_dict=0x7fddbbf83258) at glusterd-geo-rep.c:4459
4459                    ret = glusterd_gsync_configure (volinfo, slave, path_list,
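
Frame 0 above shows op_value=0x0 reaching strcmp() at glusterd-geo-rep.c:2985, which is consistent with the SIGSEGV (signal 11) in the log: passing a NULL pointer to strcmp() is undefined behaviour. The following is only a minimal sketch of a NULL-safe comparison, not the actual patch from the review linked in this bug. The names value_is_true, op_already_set and self_check are hypothetical and do not exist in glusterd, and the return convention (0 = already set, 1 = not set) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a glusterd function): true if the value spells
 * an enabled boolean. A NULL value, as seen on a reset, is "not true". */
static bool
value_is_true (const char *value)
{
        if (value == NULL)
                return false;   /* never hand NULL to strcmp() */

        return strcmp (value, "true") == 0 || strcmp (value, "1") == 0;
}

/* Hypothetical stand-in for an "already set" check on a boolean option.
 * Assumed convention: 0 if the requested value matches the current one,
 * 1 otherwise. */
static int
op_already_set (const char *current, const char *requested)
{
        if (requested == NULL)
                return 1;       /* reset request: no value to compare */

        return value_is_true (current) == value_is_true (requested) ? 0 : 1;
}

/* Small self-check covering the reset case from this bug. */
void
self_check (void)
{
        assert (op_already_set ("true", NULL) == 1);    /* reset: no crash */
        assert (op_already_set ("true", "1") == 0);     /* same meaning */
        assert (op_already_set ("true", "false") == 1); /* differs */
}
```

With a guard of this kind, a reset such as "config \!use-tarssh" (where no value accompanies the option name) would take the early return instead of dereferencing a NULL op_value.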

Comment 4 Kotresh HR 2014-06-11 09:33:00 UTC
Upstream Patch:
http://review.gluster.org/8032/

Downstream Patch:
https://code.engineering.redhat.com/gerrit/#/c/26664/

Comment 6 Vijaykumar Koppad 2014-06-18 06:52:52 UTC
Verified on build glusterfs-3.6.0.18-1.

Comment 10 errata-xmlrpc 2014-09-22 19:40:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html