Description of problem:

I have physical nodes with a glusterfs build installed on them. Created a 2x2 glusterfs volume and executed some test cases after mounting the volume over nfs-ganesha. The infra includes 2 physical nodes.

Version-Release number of selected component (if applicable):

glusterfs-3.7dev-0.611.git729428a.el6.x86_64
nfs-ganesha-debuginfo-2.2-0.rc2.el6.x86_64
nfs-ganesha-gluster-2.2-0.rc2.el6.x86_64
nfs-ganesha-2.2-0.rc2.el6.x86_64

How reproducible:

Seen for the first time.

Steps to Reproduce:
1. Create a volume of type 2x2, transport type RDMA.
2. Enable nfs-ganesha, and mount the volume with vers=3 on a different node.
3. Start executing tests such as arequal, compile_kernel, ltp, etc.

Actual results:

The last test under execution was ltp; a coredump was seen.

pstree glimpse from the client:

├─sshd─┬─sshd───bash───screen───screen─┬─bash───pstree
│      │                               └─bash───run.sh───ltp.sh───ltp_run.sh───fsstress───13*[fsstress]
│      └─sshd───bash───screen

Brick logs from the server:

pending frames:
frame : type(0) op(13)
frame : type(0) op(13)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-02-27 13:32:59
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7dev
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3258c20ad6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x3258c3bdff]
/lib64/libc.so.6[0x3315a326a0]
/lib64/libc.so.6(xdr_string+0xa7)[0x3315b17db7]
/usr/lib64/libgfxdr.so.0(xdr_gfs3_symlink_req+0x6e)[0x325940915e]
/usr/lib64/libgfxdr.so.0(xdr_to_generic+0x75)[0x325940ea35]
/usr/lib64/glusterfs/3.7dev/xlator/protocol/server.so(server3_3_symlink+0x96)[0x7f09a2340c16]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295)[0x3259009c65]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x3259009ea3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x325900b5f8]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_pollin_notify+0xd4)[0x7f09a11b20b4]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_handle_successful_send_completion+0xcf)[0x7f09a11b22ff]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(+0xaeab)[0x7f09a11b2eab]
/lib64/libpthread.so.0[0x3315e079d1]
/lib64/libc.so.6(clone+0x6d)[0x3315ae88fd]

glusterfsd bt:

(gdb) bt
#0  0x0000003315b17db7 in xdr_string_internal () from /lib64/libc.so.6
#1  0x000000325940915e in xdr_gfs3_symlink_req (xdrs=0x7f0999470190, objp=0x7f0999471a80) at glusterfs3-xdr.c:466
#2  0x000000325940ea35 in xdr_to_generic (inmsg=..., args=0x7f0999471a80, proc=0x32594090f0 <xdr_gfs3_symlink_req>) at xdr-generic.c:51
#3  0x00007f09a2340c16 in server3_3_symlink (req=0x7f09a13bec7c) at server-rpc-fops.c:5644
#4  0x0000003259009c65 in rpcsvc_handle_rpc_call (svc=<value optimized out>, trans=<value optimized out>, msg=0x7f0968001310) at rpcsvc.c:690
#5  0x0000003259009ea3 in rpcsvc_notify (trans=0x7f099165a070, mydata=<value optimized out>, event=<value optimized out>, data=0x7f0968001310) at rpcsvc.c:784
#6  0x000000325900b5f8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:543
#7  0x00007f09a11b20b4 in gf_rdma_pollin_notify (peer=0x7f0991656ea0, post=<value optimized out>) at rdma.c:3722
#8  0x00007f09a11b22ff in gf_rdma_handle_successful_send_completion (peer=0x7f0991656ea0, wc=<value optimized out>) at rdma.c:4196
#9  0x00007f09a11b2eab in gf_rdma_send_completion_proc (data=0x7f0990019bb0) at rdma.c:4270
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6
nfsd bt:

(gdb) bt
#0  __glfs_entry_fd (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-internal.h:195
#1  pub_glfs_pwritev (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-fops.c:841
#2  0x000000325980caca in pub_glfs_pwrite (glfd=<value optimized out>, buf=<value optimized out>, count=<value optimized out>, offset=<value optimized out>, flags=<value optimized out>) at glfs-fops.c:952
#3  0x00007f215a9b2c28 in file_write (obj_hdl=0x7f2108a30518, seek_descriptor=0, buffer_size=1048576, buffer=0x7f20dc1c1000, write_amount=0x7f21197f9dc8, fsal_stable=0x7f21197f986f) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1118
#4  0x00000000004e79bd in cache_inode_rdwr_plus (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6, info=0x0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:169
#5  0x00000000004e8819 in cache_inode_rdwr (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:304
#6  0x000000000046190e in nfs3_write (arg=0x7f20dc00cf70, worker=0x7f20f80008c0, req=0x7f20dc00ceb8, res=0x7f20f923a5a0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/Protocols/NFS/nfs3_write.c:234
#7  0x000000000045737f in nfs_rpc_execute (req=0x7f20dc00efe0, worker_data=0x7f20f80008c0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1268
#8  0x0000000000458119 in worker_run (ctx=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1535
#9  0x000000000051e6b2 in fridgethr_start_routine (arg=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/support/fridgethr.c:562
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6

gluster volume status:

Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.44.106:/rhs/brick1/d1r1       N/A       N/A        N       7738
Brick 192.168.44.108:/rhs/brick1/d1r1       0         49154      Y       2378
Brick 192.168.44.106:/rhs/brick1/d2r1       N/A       N/A        N       7752
Brick 192.168.44.108:/rhs/brick1/d2r2       0         49155      Y       2392
Self-heal Daemon on localhost               N/A       N/A        Y       7775
Quota Daemon on localhost                   N/A       N/A        Y       8095
Self-heal Daemon on 192.168.44.107          N/A       N/A        Y       2972
Quota Daemon on 192.168.44.107              N/A       N/A        Y       3201
Self-heal Daemon on 192.168.44.108          N/A       N/A        Y       2417
Quota Daemon on 192.168.44.108              N/A       N/A        Y       2635

Task Status of Volume vol0
------------------------------------------------------------------------------
There are no active volume tasks

Expected results:

The ltp test should finish properly; there should not be any coredump from the glusterfsd or nfsd processes.

Additional info:
gluster volume info:

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 25f4b031-f68e-4e43-9d2a-ce99abaf39ca
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: rdma
Bricks:
Brick1: 192.168.44.106:/rhs/brick1/d1r1
Brick2: 192.168.44.108:/rhs/brick1/d1r1
Brick3: 192.168.44.106:/rhs/brick1/d2r1
Brick4: 192.168.44.108:/rhs/brick1/d2r2
Options Reconfigured:
features.quota: on
nfs-ganesha.enable: on
nfs-ganesha.host: 192.168.44.106
nfs.disable: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
Created attachment 996930 [details] brick1-coredump
Created attachment 996931 [details] brick2-coredump
CC'ed Raghavendra, Rafi and Jiffin to check if they have seen any similar issue while using RDMA.
I have not looked at the core dump yet, but looking at the backtrace in the bug I see nomem messages. What was the configuration of the machine being used as the nfs host (192.168.44.106)? Its RAM does not seem to be sufficient. Will continue to look at it; it may be a different root cause.
In the ltp test suite, the crash was caused by the fsstress test. When this test was run alone on the nfsv3 mount, the ganesha server didn't crash, but the bricks went down with the same backtrace. Similarly, on nfsv4 it completed successfully without any crash.

The test command used:

fsstress -d <mount point> -l 22 -n 22 -p 22 2
Root cause identified: when doing an RDMA vectored read from the remote end point, the calculation of the remote address goes wrong from the second vector onward. Regardless of the number of remote buffers, we always set the first buffer as the remote address for every RDMA remote read.
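To illustrate the failure mode: the sketch below shows the pattern in plain libibverbs terms. This is a minimal illustration, not the actual rdma.c code; the helper post_rdma_reads, the remote_buf struct, and the argument layout are all assumptions made for the example. The point is only that each work request must target its own remote buffer, while the buggy code reused the first one.

/* Minimal sketch of the bug, assuming libibverbs-style structures.
 * post_rdma_reads and remote_buf are illustrative, not GlusterFS code. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

struct remote_buf {
    uint64_t addr;   /* remote virtual address of this vector */
    uint32_t rkey;   /* remote memory registration key */
};

static int
post_rdma_reads(struct ibv_qp *qp, struct ibv_sge *sge,
                struct remote_buf *remote, int count)
{
    struct ibv_send_wr wr, *bad_wr = NULL;
    int i, ret;

    for (i = 0; i < count; i++) {
        memset(&wr, 0, sizeof(wr));
        wr.opcode     = IBV_WR_RDMA_READ;
        wr.sg_list    = &sge[i];   /* i-th local scatter/gather entry */
        wr.num_sge    = 1;
        wr.send_flags = IBV_SEND_SIGNALED;

        /* BUG (as described above): every work request used the first
         * remote buffer, so reads after the first pulled data from the
         * wrong remote offset:
         *     wr.wr.rdma.remote_addr = remote[0].addr;
         *     wr.wr.rdma.rkey        = remote[0].rkey;
         *
         * FIX: advance to the i-th remote buffer for the i-th request. */
        wr.wr.rdma.remote_addr = remote[i].addr;
        wr.wr.rdma.rkey        = remote[i].rkey;

        ret = ibv_post_send(qp, &wr, &bad_wr);
        if (ret)
            return ret;
    }
    return 0;
}

With the buggy addressing, the data landing in the local buffers is garbage from the second vector onward, which matches the XDR decode crash in xdr_gfs3_symlink_req seen in the brick backtrace.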
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#2) for review on master by mohammed rafi kc (rkavunga)
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#3) for review on master by Humble Devassy Chirammal (humble.devassy)
REVIEW: http://review.gluster.org/9794 (rdma:setting wrong remote memory.) posted (#4) for review on master by mohammed rafi kc (rkavunga)
COMMIT: http://review.gluster.org/9794 committed in master by Raghavendra G (rgowdapp)
------
commit e08aea2fd67a06275423ded157431305a7925cf6
Author: Mohammed Rafi KC <rkavunga>
Date:   Wed Mar 4 14:37:05 2015 +0530

    rdma:setting wrong remote memory.

    when we send more than one work request in a single call,
    the remote addr is always setting as the first address of
    the vector.

    Change-Id: I55aea7bd6542abe22916719a139f7c8f73334d26
    BUG: 1197548
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/9794
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user