1260003 – Data Tiering:Regression:NFS crashed due to dht readdirp after attach tier

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	tiering
Sub Component:
Version:	3.7.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Assignee:	Mohammed Rafi KC
QA Contact:	bugs@gluster.org
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1278345 1278346

Reported:	2015-09-04 06:59 UTC by Nag Pavan Chilakam
Modified:	2017-03-08 10:55 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Clones:	1278345 1278346
Environment:
Last Closed:	2017-03-08 10:55:48 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Attachments:
QE bug raising cli log (32.44 KB, text/plain), 2015-09-04 07:23 UTC, Nag Pavan Chilakam

Description Nag Pavan Chilakam 2015-09-04 06:59:51 UTC

Description of problem:
===========================
I had an existing volume mounted over NFS.
While I was doing I/O (creating files), I attached a tier to see whether the I/O would go to the hot tier after the attach. It did not, and hence I raised a bug#

After the tier attach completed, I created more files to see whether at least this new set of files would go to the hot tier, but they did not either.
I then wanted to see whether a lookup would make writes go to the hot tier, so I opened a second connection to the mount point and issued an "ls".
This crashed the NFS process.

[2015-09-04 11:10:00.830391] E [nfs3.c:341:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.7.4/xlator/nfs/server.so(nfs3_getattr_reply+0x29) [0x7f498efaa9e9] -->/usr/lib64/glusterfs/3.7.4/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x78) [0x7f498efa93c8] -->/usr/lib64/glusterfs/3.7.4/xlator/nfs/server.so(__nfs3_get_volume_id+0xae) [0x7f498efa930e] ) 0-nfs-nfsv3: invalid argument: xl [Invalid argument]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-09-04 11:10:42
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.4
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x33db025936]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x33db04549f]
/lib64/libc.so.6[0x340e8326a0]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/distribute.so(dht_layout_search+0x19)[0x7f498f893419]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/distribute.so(dht_readdirp_cbk+0x4b1)[0x7f498f8c16f1]
/usr/lib64/glusterfs/3.7.4/xlator/protocol/client.so(client3_3_readdirp_cbk+0x1a0)[0x7f498fb11830]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x33db80f4a5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a1)[0x33db8109d1]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x33db80bb28]
/usr/lib64/glusterfs/3.7.4/rpc-transport/socket.so(+0xabd5)[0x7f4990958bd5]
/usr/lib64/glusterfs/3.7.4/rpc-transport/socket.so(+0xc7bd)[0x7f499095a7bd]
/usr/lib64/libglusterfs.so.0[0x33db08b0a0]
/lib64/libpthread.so.0[0x340ec07a51]
/lib64/libc.so.6(clone+0x6d)[0x340e8e89ad]
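
The crash-log frames above follow a regular `library(symbol+offset)[address]` shape, so the first frame outside the crash handler can be picked out mechanically. The following is a minimal sketch under stated assumptions: the helper names (`parse_frames`, `first_faulting_frame`) are hypothetical and not part of GlusterFS, and the rule "skip the trace-printing frames and any frame without a symbol" is a heuristic based only on the frames shown in this report.

```python
import re

# One frame per line: /path/lib.so(symbol+0xOFF)[0xADDR]; the (...) part is optional.
FRAME_RE = re.compile(
    r"^(?P<lib>\S+?)"
    r"(?:\((?P<sym>[^+)]*)\+(?P<off>0x[0-9a-f]+)\))?"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)

# Frames that belong to the crash handler itself (names taken from the log above).
HANDLER_SYMBOLS = {"_gf_msg_backtrace_nomem", "gf_print_trace"}


def parse_frames(log_text):
    """Return a list of dicts for every line that looks like a stack frame."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append({
                "lib": m.group("lib").rsplit("/", 1)[-1],
                "symbol": m.group("sym") or None,  # None when no symbol is printed
                "offset": m.group("off"),
                "address": m.group("addr"),
            })
    return frames


def first_faulting_frame(frames):
    """Heuristic (assumption): first named frame that is not a crash-handler frame."""
    for frame in frames:
        if frame["symbol"] and frame["symbol"] not in HANDLER_SYMBOLS:
            return frame
    return None


SAMPLE = """\
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x33db025936]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x33db04549f]
/lib64/libc.so.6[0x340e8326a0]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/distribute.so(dht_layout_search+0x19)[0x7f498f893419]
/usr/lib64/glusterfs/3.7.4/xlator/cluster/distribute.so(dht_readdirp_cbk+0x4b1)[0x7f498f8c16f1]
"""

if __name__ == "__main__":
    frame = first_faulting_frame(parse_frames(SAMPLE))
    print(frame["lib"], frame["symbol"], frame["offset"])
```

Applied to the log in this report, the heuristic points at `dht_layout_search` in `distribute.so`, called from `dht_readdirp_cbk`, which matches the bug title.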

 
Version-Release number of selected component (if applicable):
============================================================
[root@nag-manual-node1 glusterfs]# gluster --version
glusterfs 3.7.4 built on Sep  2 2015 18:06:07
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@nag-manual-node1 glusterfs]# rpm -qa|grep gluster
glusterfs-libs-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-api-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-client-xlators-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-fuse-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-cli-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-3.7.4-0.16.git9f27ef9.el6.x86_64
glusterfs-server-3.7.4-0.16.git9f27ef9.el6.x86_64




Steps to Reproduce:
=====================
1. Create a regular volume.
2. Mount the volume and start I/O; while the I/O is in progress, attach a tier.
3. After the attach-tier completes, open another connection to the same client and issue an ls.
4. The NFS process crashes.
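
The steps above can be written down as a command plan. This is a dry-run sketch only: it builds the command lines and prints them, and executes nothing. The volume name, host name, brick paths, mount point, file-creation loop, and the `vers=3` mount option are all assumptions for illustration; the report does not give the exact commands used, and the `gluster volume attach-tier` syntax shown is the 3.7-era form and may differ in other releases.

```python
import shlex


def repro_commands(volume, server, cold_bricks, hot_bricks, mountpoint):
    """Return the reproduction steps as (description, argv) pairs; nothing is run."""
    # Hypothetical I/O generator: the report only says "creating files".
    io_loop = f"for i in $(seq 1 1000); do echo data > {mountpoint}/file.$i; done"
    return [
        ("create a regular volume",
         ["gluster", "volume", "create", volume, *cold_bricks]),
        ("start the volume",
         ["gluster", "volume", "start", volume]),
        ("mount it over NFS (vers=3 is an assumption)",
         ["mount", "-t", "nfs", "-o", "vers=3", f"{server}:/{volume}", mountpoint]),
        ("start I/O and leave it running in the background",
         ["sh", "-c", io_loop]),
        ("attach a tier while the I/O is still running",
         ["gluster", "volume", "attach-tier", volume, *hot_bricks]),
        ("from a second session on the same client, list the mount",
         ["ls", mountpoint]),
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    steps = repro_commands(
        volume="tiervol",
        server="server1",
        cold_bricks=["server1:/bricks/cold1", "server1:/bricks/cold2"],
        hot_bricks=["server1:/bricks/hot1"],
        mountpoint="/mnt/tiervol",
    )
    for number, (description, argv) in enumerate(steps, start=1):
        print(f"{number}. {description}")
        print("   " + " ".join(shlex.quote(arg) for arg in argv))
```

The ordering matters more than the exact commands: the attach has to happen while writes are in flight, and the `ls` has to come from a separate session after the attach finishes.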



I will be failing the ON_QA bug 1259081 (I/O failure on attaching tier) because of this.

Comment 1 Nag Pavan Chilakam 2015-09-04 07:13:12 UTC

Backtrace:
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.1.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-42.el6.x86_64 libacl-2.2.49-6.el6.x86_64 libattr-2.4.44-7.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libgcc-4.4.7-16.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64 libuuid-2.17.2-12.18.el6.x86_64 openssl-1.0.1e-42.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  dht_layout_search (this=0x7f49880176f0, layout=0x0, name=0x7f497810da68 ".") at dht-layout.c:171
#1  0x00007f498f8c16f1 in dht_readdirp_cbk (frame=0x7f49999c6a34, cookie=0x7f49999c7198, this=0x7f49880176f0, op_ret=2, op_errno=2, orig_entries=0x7f498673ea80, 
    xdata=0x0) at dht-common.c:4542
#2  0x00007f498fb11830 in client3_3_readdirp_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f49999c7198)
    at client-rpc-fops.c:2668
#3  0x00000033db80f4a5 in rpc_clnt_handle_reply (clnt=0x7f4988151e50, pollin=0x7f49781113f0) at rpc-clnt.c:766
#4  0x00000033db8109d1 in rpc_clnt_notify (trans=<value optimized out>, mydata=0x7f4988151e80, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f49781113f0) at rpc-clnt.c:907
#5  0x00000033db80bb28 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:544
#6  0x00007f4990958bd5 in socket_event_poll_in (this=0x7f4988161ac0) at socket.c:2236
#7  0x00007f499095a7bd in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x7f4988161ac0, poll_in=1, poll_out=0, poll_err=0)
    at socket.c:2349
#8  0x00000033db08b0a0 in event_dispatch_epoll_handler (data=0x7f4988078860) at event-epoll.c:575
#9  event_dispatch_epoll_worker (data=0x7f4988078860) at event-epoll.c:678
#10 0x000000340ec07a51 in start_thread () from /lib64/libpthread.so.0
#11 0x000000340e8e89ad in clone () from /lib64/libc.so.6
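
In the backtrace above, frame #0 receives `layout=0x0`, which is consistent with a NULL pointer dereference inside `dht_layout_search`. A small scanner can flag such arguments in gdb output. This is a sketch, assuming gdb's usual `name=value` argument formatting; the function name `null_pointer_args` is hypothetical. A NULL argument is only a hint, since some parameters (for example `xdata` in frame #1) may legitimately be NULL.

```python
import re

# "#N  [0xADDR in ]function (" at the start of a gdb frame line.
FRAME_START = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")
# An argument printed as exactly 0x0 (not 0x0000..., which is a real address).
NULL_ARG = re.compile(r"(\w+)=0x0\b")


def null_pointer_args(bt_text):
    """Map (frame number, function) to the argument names printed as 0x0."""
    frames = []
    for line in bt_text.splitlines():
        if line.startswith("#"):
            frames.append(line.strip())
        elif frames and line.strip():
            # gdb wraps long frames; join continuation lines to the frame above.
            frames[-1] += " " + line.strip()

    report = {}
    for frame in frames:
        m = FRAME_START.match(frame)
        if not m:
            continue
        nulls = NULL_ARG.findall(frame)
        if nulls:
            report[(int(m.group(1)), m.group(2))] = nulls
    return report


SAMPLE_BT = """\
#0  dht_layout_search (this=0x7f49880176f0, layout=0x0, name=0x7f497810da68 ".") at dht-layout.c:171
#1  0x00007f498f8c16f1 in dht_readdirp_cbk (frame=0x7f49999c6a34, cookie=0x7f49999c7198, this=0x7f49880176f0, op_ret=2, op_errno=2, orig_entries=0x7f498673ea80,
    xdata=0x0) at dht-common.c:4542
#3  0x00000033db80f4a5 in rpc_clnt_handle_reply (clnt=0x7f4988151e50, pollin=0x7f49781113f0) at rpc-clnt.c:766
"""

if __name__ == "__main__":
    for (number, function), names in sorted(null_pointer_args(SAMPLE_BT).items()):
        print(f"#{number} {function}: NULL arguments {names}")
```

For this report, the interesting entry is `layout` in `dht_layout_search`: the caller `dht_readdirp_cbk` passed a NULL layout while processing the `"."` entry.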

Comment 2 Nag Pavan Chilakam 2015-09-04 07:23:38 UTC

Created attachment 1070175
QE bug raising cli log

Comment 3 Nag Pavan Chilakam 2015-09-04 07:24:44 UTC

sosreports and cores at rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/bug.1260003

Comment 7 Vijay Bellur 2015-10-19 09:46:18 UTC

REVIEW: http://review.gluster.org/12375 (Revert "fuse: resolve complete path after a graph switch") posted (#2) for review on master by mohammed rafi  kc (rkavunga)

Comment 8 Kaushal 2017-03-08 10:55:48 UTC

This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.
