1262324 – Data Tiering:glusterd crashes while attaching tier

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	tiering
Sub Component:
Version:	3.7.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Assignee:	Satish Mohan
QA Contact:	bugs@gluster.org
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-09-11 12:22 UTC by Nag Pavan Chilakam
Modified:	2017-03-08 11:03 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2017-03-08 11:03:38 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Nag Pavan Chilakam 2015-09-11 12:22:18 UTC

Description of problem:
=====================
I was doing an attach tier without any I/O running, and glusterd crashed.

The backtrace follows:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f96812593e9 in rpc_transport_submit_request ()
   from /lib64/libgfrpc.so.0
Missing separate debuginfos, use: debuginfo-install glusterfs-server-3.7.4-0.27.git143f0f9.el7.centos.x86_64
(gdb) bt
#0  0x00007f96812593e9 in rpc_transport_submit_request ()
   from /lib64/libgfrpc.so.0
#1  0x00007f9681254a49 in rpcsvc_callback_submit () from /lib64/libgfrpc.so.0
#2  0x00007f9675f4b26d in glusterd_fetchspec_notify ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#3  0x00007f9675ffc2ea in glusterd_op_perform_add_bricks ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#4  0x00007f9675ffe3c0 in glusterd_op_add_brick ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#5  0x00007f9675f7b09b in glusterd_op_commit_perform ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#6  0x00007f9676004a49 in gd_commit_op_phase ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#7  0x00007f967600602d in gd_sync_task_begin ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#8  0x00007f9676006300 in glusterd_op_begin_synctask ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#9  0x00007f9675ff9c68 in __glusterd_handle_add_brick ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#10 0x00007f9675f65de0 in glusterd_big_locked_handler ()
   from /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so
#11 0x00007f96814d0d72 in synctask_wrap () from /lib64/libglusterfs.so.0
#12 0x00007f967fb8f0f0 in ?? () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()



Version-Release number of selected component (if applicable):
============================================================
[root@zod log]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-cli-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-libs-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-api-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-server-3.7.4-0.27.git143f0f9.el7.centos.x86_64
glusterfs-3.7.4-0.27.git143f0f9.el7.centos.x86_64
[root@zod log]# gluster --version
glusterfs 3.7.4 built on Sep  9 2015 01:30:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.





Steps to Reproduce:
=====================
1. Created a tier volume with a 2-brick distribute hot tier over a 2x2 cold tier, across two nodes
2. Started the volume
3. Mounted it over both FUSE and NFS
4. Created a 1 GB file from each mount point
5. Attached a tier; the command asked for confirmation, and I answered "yes"

glusterd then crashed immediately on the local host where the command was being executed.

Comment 1 Nag Pavan Chilakam 2015-09-11 12:33:26 UTC

Core and sosreports are at rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/bug.1262324

Comment 2 Nag Pavan Chilakam 2015-09-11 12:49:54 UTC


etc log:
26d] -->/lib64/libgfrpc.so.0(rpcsvc_callback_submit+0x169) [0x7f9681254a49] -->/usr/lib64/glusterfs/3.7.4/rpc-transport/socket.so(+0x61c3) [0x7f9673cea1c3] ) 0-socket: invalid argument: this->private [Invalid argument]
[2015-09-11 12:10:38.843106] W [rpcsvc.c:1085:rpcsvc_callback_submit] 0-rpcsvc: transmission of rpc-request failed
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-09-11 12:10:38
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.4
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f968148efd2]
/lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f96814ab45d]
/lib64/libc.so.6(+0x35650)[0x7f967fb7d650]
/lib64/libgfrpc.so.0(rpc_transport_submit_request+0x9)[0x7f96812593e9]
/lib64/libgfrpc.so.0(rpcsvc_callback_submit+0x169)[0x7f9681254a49]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_fetchspec_notify+0x5d)[0x7f9675f4b26d]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_op_perform_add_bricks+0x6da)[0x7f9675ffc2ea]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_op_add_brick+0x1f0)[0x7f9675ffe3c0]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x77b)[0x7f9675f7b09b]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(gd_commit_op_phase+0xb9)[0x7f9676004a49]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x77d)[0x7f967600602d]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x30)[0x7f9676006300]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(__glusterd_handle_add_brick+0x888)[0x7f9675ff9c68]
/usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x30)[0x7f9675f65de0]
/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7f96814d0d72]
/lib64/libc.so.6(+0x470f0)[0x7f967fb8f0f0]
---------

Comment 3 Gaurav Kumar Garg 2015-09-14 10:02:41 UTC

Hi Nag Pavan,

Could you give me some details on how I can access the log information for this bug? I tried both ssh and http at rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/bug.1262324


~Gaurav

Comment 4 Gaurav Kumar Garg 2015-09-28 07:06:37 UTC

Hi,

This is a known issue. It is caused by multi-threaded epoll in glusterd. Could you change the default epoll thread count to 1 by adding or modifying the following options in the /usr/local/etc/glusterfs/glusterd.vol file:

option ping-timeout 0
option event-threads 1

With event-threads set to 1, glusterd runs with a single epoll thread.
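
For reference, a sketch of where those two options would sit in the file. It assumes the stock glusterd.vol layout shipped with 3.7.x; every line other than the two options above is an assumption about a default install and may differ on your system. On RPM installs such as the one in this report, the file is normally /etc/glusterfs/glusterd.vol; /usr/local/etc/glusterfs/glusterd.vol is the location for a source build.

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
# Workaround suggested in this comment: disable the ping timer
# and run glusterd with a single epoll thread.
    option ping-timeout 0
    option event-threads 1
end-volume
```

glusterd reads this file at startup, so it has to be restarted for the change to take effect.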

Comment 5 Vivek Agarwal 2015-09-28 08:18:59 UTC

Thanks Gaurav.

Nag, can you test with the changes suggested by Gaurav?

Comment 6 Nag Pavan Chilakam 2015-09-30 10:37:28 UTC

This bug is not yet fixed; only a workaround was given. Hence moving it back to ASSIGNED.

Comment 8 Kaushal 2017-03-08 11:03:38 UTC

This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
