Bug 1028732 - dist-geo-rep: glusterd crashed while running geo-rep status detail
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Hardware: x86_64 Linux
Priority: high
Severity: high
Assigned To: Avra Sengupta
QA Contact: M S Vishwanath Bhat
Keywords: ZStream
Reported: 2013-11-10 02:56 EST by M S Vishwanath Bhat
Modified: 2016-05-31 21:56 EDT

Fixed In Version: glusterfs-
Doc Type: Bug Fix
Last Closed: 2013-11-27 10:47:20 EST
Type: Bug

Attachments
glusterd log file from the node where it crashed (2.20 MB, text/x-log)
2013-11-10 02:56 EST, M S Vishwanath Bhat

Description M S Vishwanath Bhat 2013-11-10 02:56:10 EST
Created attachment 822041
glusterd log file from the node where it crashed

Description of problem:
I was running geo-rep status detail in a while loop, and after some time glusterd crashed. glusterd had been started with -LDEBUG (debug log level).

Version-Release number of selected component (if applicable):

How reproducible:
Hit twice in 2 tries.

Steps to Reproduce:
1. Create and start a geo-rep session between a 2*2 dist-rep master and 2*2 slave nodes.
2. Turn on the use_tarssh option to sync the files.
3. In a while loop, keep running geo-rep status detail with a 20-30 second sleep between runs.
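
Step 3 above can be sketched as a small shell helper. This is only an illustration, not the script used in the original test run: the master volume name "master" and the slave "falcon::slave" are assumptions inferred from the backtrace (slave="falcon::slave", conf path ".../master_falcon_slave/gsyncd.conf"), and the 25-second interval is an arbitrary value inside the reported 20-30 second range. The GLUSTER variable is a hypothetical hook so the loop can be dry-run without a cluster.

```shell
#!/bin/sh
# Assumed names, inferred from the backtrace; adjust for the actual session.
MASTER_VOL="${MASTER_VOL:-master}"
SLAVE="${SLAVE:-falcon::slave}"
INTERVAL="${INTERVAL:-25}"      # seconds; the report used 20-30
GLUSTER="${GLUSTER:-gluster}"   # hypothetical override, e.g. GLUSTER=echo for a dry run

# poll_status N: run "status detail" N times, sleeping INTERVAL seconds
# after each run. Stops early if the CLI exits with a non-zero status.
poll_status() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        if ! "$GLUSTER" volume geo-replication "$MASTER_VOL" "$SLAVE" status detail; then
            echo "status detail failed on iteration $i" >&2
            return 1
        fi
        i=$((i + 1))
        sleep "$INTERVAL"
    done
}
```

After sourcing the file, something like `poll_status 100` would run the loop; bounding the iteration count keeps the sketch from running forever if the crash does not reproduce.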

Actual results:

Core was generated by `glusterd -LDEBUG'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003b3a081361 in __strlen_sse2 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install device-mapper-event-libs-1.02.77-9.el6.x86_64 device-mapper-libs-1.02.77-9.el6.x86_64 glibc-2.12-1.107.el6_4.5.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.6.x86_64 libcom_err-1.41.12-14.el6_4.2.x86_64 libgcc-4.4.7-3.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 libsepol-2.0.41-4.el6.x86_64 libudev-147-2.46.el6.x86_64 libxml2-2.7.6-12.el6_4.1.x86_64 lvm2-libs-2.02.98-9.el6.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x0000003b3a081361 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00007f20832919be in glusterd_parse_gsync_status (volinfo=0x7f207c017730, slave=0x7f206c116f00 "falcon::slave",
    conf_path=<value optimized out>, dict=0x7f20854cb7d4, node=0x1ce35d0 "spitfire.blr.redhat.com") at glusterd-geo-rep.c:2576
#2  glusterd_read_status_file (volinfo=0x7f207c017730, slave=0x7f206c116f00 "falcon::slave", conf_path=<value optimized out>, dict=0x7f20854cb7d4,
    node=0x1ce35d0 "spitfire.blr.redhat.com") at glusterd-geo-rep.c:2733
#3  0x00007f2083293a66 in glusterd_get_gsync_status_mst_slv (volinfo=0x7f207c017730, slave=0x7f206c116f00 "falcon::slave",
    conf_path=0x7f206c10c040 "/var/lib/glusterd/geo-replication/master_falcon_slave/gsyncd.conf", rsp_dict=0x7f20854cb7d4,
    node=0x1ce35d0 "spitfire.blr.redhat.com") at glusterd-geo-rep.c:2966
#4  0x00007f2083293d9c in glusterd_get_gsync_status (dict=0x7f20854cb400, op_errstr=0x1ce42b8, rsp_dict=0x7f20854cb7d4) at glusterd-geo-rep.c:3067
#5  0x00007f20832943b6 in glusterd_op_gsync_set (dict=0x7f20854cb400, op_errstr=0x1ce42b8, rsp_dict=0x7f20854cb7d4) at glusterd-geo-rep.c:3518
#6  0x00007f2083254066 in glusterd_op_commit_perform (op=GD_OP_GSYNC_SET, dict=0x7f20854cb400, op_errstr=0x1ce42b8, rsp_dict=0x7f20854cb7d4)
    at glusterd-op-sm.c:3933
#7  0x00007f20832acc7e in gd_commit_op_phase (peers=0x18da740, op=GD_OP_GSYNC_SET, op_ctx=0x7f20854cc120, req_dict=0x7f20854cb400,
    op_errstr=0x1ce42b8, npeers=3) at glusterd-syncop.c:958
#8  0x00007f20832ae8c2 in gd_sync_task_begin (op_ctx=0x7f20854cc120, req=0x7f2082ce1920) at glusterd-syncop.c:1240
#9  0x00007f20832ae9fb in glusterd_op_begin_synctask (req=0x7f2082ce1920, op=<value optimized out>, dict=0x7f20854cc120) at glusterd-syncop.c:1274
#10 0x00007f2083295d6f in __glusterd_handle_gsync_set (req=0x7f2082ce1920) at glusterd-geo-rep.c:319
#11 0x00007f208323ae7f in glusterd_big_locked_handler (req=0x7f2082ce1920, actor_fn=0x7f2083295c00 <__glusterd_handle_gsync_set>)
    at glusterd-handler.c:77
#12 0x00007f2086cfaad2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:132
#13 0x0000003b3a043bb0 in ?? () from /lib64/libc.so.6
#14 0x0000000000000000 in ?? ()

Expected results:
glusterd should not crash.

Additional info:

Part of log file

pending frames:
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-11-08 22:39:46
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs

I have attached the glusterd log file.
Comment 2 Avra Sengupta 2013-11-12 01:47:17 EST
Fixed with https://code.engineering.redhat.com/gerrit/#/c/15501/
Comment 3 M S Vishwanath Bhat 2013-11-12 03:47:45 EST
I ran the same steps again many times. So far glusterd hasn't crashed. So moving the bug to verified.
Comment 4 errata-xmlrpc 2013-11-27 10:47:20 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

