Bug 1262324 - Data Tiering:glusterd crashes while attaching tier
Product: GlusterFS
Classification: Community
Component: tiering
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Assigned To: Satish Mohan
Keywords: Triaged
Depends On:
Reported: 2015-09-11 08:22 EDT by nchilaka
Modified: 2017-03-08 06:03 EST (History)
5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2017-03-08 06:03:38 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description nchilaka 2015-09-11 08:22:18 EDT
Description of problem:
I was doing an attach-tier without any I/O in progress, and glusterd crashed.

The following is the backtrace (BT):
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f96812593e9 in rpc_transport_submit_request ()
   from /lib64/libgfrpc.so.0
Missing separate debuginfos, use: debuginfo-install glusterfs-server-3.7.4-0.27.git143f0f9.el7.centos.x86_64
(gdb) bt
#0  0x00007f96812593e9 in rpc_transport_submit_request ()
   from /lib64/libgfrpc.so.0
#1  0x00007f9681254a49 in rpcsvc_callback_submit () from /lib64/libgfrpc.so.0
#2  0x00007f9675f4b26d in glusterd_fetchspec_notify ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#3  0x00007f9675ffc2ea in glusterd_op_perform_add_bricks ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#4  0x00007f9675ffe3c0 in glusterd_op_add_brick ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#5  0x00007f9675f7b09b in glusterd_op_commit_perform ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#6  0x00007f9676004a49 in gd_commit_op_phase ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#7  0x00007f967600602d in gd_sync_task_begin ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#8  0x00007f9676006300 in glusterd_op_begin_synctask ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#9  0x00007f9675ff9c68 in __glusterd_handle_add_brick ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#10 0x00007f9675f65de0 in glusterd_big_locked_handler ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#11 0x00007f96814d0d72 in synctask_wrap () from /lib64/libglusterfs.so.0
#12 0x00007f967fb8f0f0 in ?? () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()

Version-Release number of selected component (if applicable):
[root@zod log]# rpm -qa|grep gluster
[root@zod log]# gluster --version
glusterfs 3.7.4 built on Sep  9 2015 01:30:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Steps to Reproduce:
1. Created a tiered volume with a hot tier of a 2-brick distribute over a cold tier of 2x2, across two nodes.
2. Started the volume.
3. Mounted it over both FUSE and NFS.
4. Created a 1 GB file from each mount point.
5. Attached a tier; the command asked for confirmation, and I answered "yes".

Then glusterd crashed immediately on the localhost where the command was being executed.
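The steps above roughly correspond to a command sequence like the following. The host names (node1, node2), volume name, brick paths, and mount points are all hypothetical, and the attach-tier syntax is as documented for GlusterFS 3.7; this is a sketch of the reproduction, not the reporter's exact commands:

```shell
# 2x2 cold tier: distribute over two replica pairs across two nodes.
gluster volume create tiervol replica 2 \
    node1:/bricks/cold1 node2:/bricks/cold2 \
    node1:/bricks/cold3 node2:/bricks/cold4
gluster volume start tiervol

# Mount over both FUSE and NFS, then write a 1 GB file from each.
mount -t glusterfs node1:/tiervol /mnt/fuse
mount -t nfs node1:/tiervol /mnt/nfs
dd if=/dev/urandom of=/mnt/fuse/file1 bs=1M count=1024
dd if=/dev/urandom of=/mnt/nfs/file2 bs=1M count=1024

# Attach a 2-brick distribute hot tier; glusterd crashed at this step.
gluster volume attach-tier tiervol \
    node1:/bricks/hot1 node2:/bricks/hot2
```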
Comment 1 nchilaka 2015-09-11 08:33:26 EDT
core and Sosreports @rhsqe-repo.lab.eng.blr.redhat.com
Comment 2 nchilaka 2015-09-11 08:49:54 EDT

etc log:
26d] -->/lib64/libgfrpc.so.0(rpcsvc_callback_submit+0x169) [0x7f9681254a49] -->/usr/lib64/glusterfs/3.7.4/rpc-transport/socket.so(+0x61c3) [0x7f9673cea1c3] ) 0-socket: invalid argument: this->private [Invalid argument]
[2015-09-11 12:10:38.843106] W [rpcsvc.c:1085:rpcsvc_callback_submit] 0-rpcsvc: transmission of rpc-request failed
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-09-11 12:10:38
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.4
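The "invalid argument: this->private" message above comes from the socket transport rejecting a submission whose private pointer is unset; the segfault then occurs in rpc_transport_submit_request one frame up. Purely as a schematic illustration of that defensive-check pattern (this is not GlusterFS source; the struct and function names here are invented), the idea is:

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a transport object; the real
 * rpc_transport_t carries a private pointer owned by the
 * socket layer, which was NULL in this crash. */
struct transport {
    void *private_data;
};

/* Validate arguments before dereferencing: dereferencing a NULL
 * private pointer would segfault, as in the backtrace above, so
 * the function refuses with EINVAL and logs instead. */
int submit_request(struct transport *t)
{
    if (t == NULL || t->private_data == NULL) {
        fprintf(stderr, "invalid argument: this->private\n");
        return -EINVAL;
    }
    /* ... actual request submission would happen here ... */
    return 0;
}
```

The log shows the socket layer catching the bad argument and rpcsvc reporting the failed transmission, but a later dereference still crashed glusterd, which is why the check alone was not sufficient here.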
Comment 3 Gaurav Kumar Garg 2015-09-14 06:02:41 EDT
Hi Nag Pavan,

Could you give me some details on how I can access the log information for this bug? I tried both ssh and http @rhsqe-repo.lab.eng.blr.redhat.com.

Comment 4 Gaurav Kumar Garg 2015-09-28 03:06:37 EDT

This is a known issue, caused by multi-threaded epoll in glusterd. As a workaround, could you change the default epoll thread count to 1 by modifying the
/usr/local/etc/glusterfs/glusterd.vol file?

Add/modify the following options in /usr/local/etc/glusterfs/glusterd.vol:

option ping-timeout 0
option event-threads 1

This makes the default epoll thread count 1.
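For context, these options sit inside the volume management block of glusterd.vol. A minimal sketch of the resulting file, with other default options elided (note the path may be /etc/glusterfs/glusterd.vol on packaged installs rather than /usr/local/etc/):

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option ping-timeout 0
    option event-threads 1
end-volume
```

glusterd reads this file only at startup, so restart it (e.g. systemctl restart glusterd) for the change to take effect.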
Comment 5 Vivek Agarwal 2015-09-28 04:18:59 EDT
Thanks Gaurav.

Nag, can you test with the changes suggested by Gaurav?
Comment 6 nchilaka 2015-09-30 06:37:28 EDT
This bug is not yet fixed; only a workaround was given. Hence, moving it back to ASSIGNED.
Comment 8 Kaushal 2017-03-08 06:03:38 EST
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
