Bug 1197548
Summary: RDMA: crash during sanity test

Product: [Community] GlusterFS
Component: rdma
Version: mainline
Hardware: x86_64
OS: Linux
Severity: high
Priority: unspecified
Status: CLOSED CURRENTRELEASE
Keywords: Triaged
Reporter: Saurabh <saujain>
Assignee: Mohammed Rafi KC <rkavunga>
CC: bugs, gluster-bugs, jthottan, mmadhusu, mzywusko, ndevos, rkavunga, rtalur, skoduri
Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Clones: 1198562 (view as bug list)
Bug Blocks: 1198562
Last Closed: 2015-05-14 17:29:14 UTC
Type: Bug

Attachments:
brick1-coredump
brick2-coredump

Description Saurabh 2015-03-02 01:42:23 UTC
Description of problem:
I have physical nodes with a glusterfs build installed on them. I created a 2x2 glusterfs volume and executed some test cases after mounting the volume over nfs-ganesha.

The infrastructure consists of 2 physical nodes.

Version-Release number of selected component (if applicable):
glusterfs-3.7dev-0.611.git729428a.el6.x86_64
nfs-ganesha-debuginfo-2.2-0.rc2.el6.x86_64
nfs-ganesha-gluster-2.2-0.rc2.el6.x86_64
nfs-ganesha-2.2-0.rc2.el6.x86_64

How reproducible:
Seen for the first time.

Steps to Reproduce:
1. Create a 2x2 volume with transport type RDMA.
2. Enable nfs-ganesha and mount the volume with vers=3 on a different node.
3. Start executing tests such as arequal, compile_kernel, ltp, etc.

Actual results:
The last test under execution was ltp; a coredump was seen.

From the client, a glimpse of pstree:
     ├─sshd─┬─sshd───bash───screen───screen─┬─bash───pstree
     │      │                               └─bash───run.sh───ltp.sh───ltp_run.sh───fsstress───13*[fsstress]
     │      └─sshd───bash───screen


From the server, the brick log shows:
pending frames:
frame : type(0) op(13)
frame : type(0) op(13)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-02-27 13:32:59
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7dev
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3258c20ad6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x3258c3bdff]
/lib64/libc.so.6[0x3315a326a0]
/lib64/libc.so.6(xdr_string+0xa7)[0x3315b17db7]
/usr/lib64/libgfxdr.so.0(xdr_gfs3_symlink_req+0x6e)[0x325940915e]
/usr/lib64/libgfxdr.so.0(xdr_to_generic+0x75)[0x325940ea35]
/usr/lib64/glusterfs/3.7dev/xlator/protocol/server.so(server3_3_symlink+0x96)[0x7f09a2340c16]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295)[0x3259009c65]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x3259009ea3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x325900b5f8]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_pollin_notify+0xd4)[0x7f09a11b20b4]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_handle_successful_send_completion+0xcf)[0x7f09a11b22ff]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(+0xaeab)[0x7f09a11b2eab]
/lib64/libpthread.so.0[0x3315e079d1]
/lib64/libc.so.6(clone+0x6d)[0x3315ae88fd]


glusterfsd bt,
(gdb) bt
#0  0x0000003315b17db7 in xdr_string_internal () from /lib64/libc.so.6
#1  0x000000325940915e in xdr_gfs3_symlink_req (xdrs=0x7f0999470190, objp=0x7f0999471a80) at glusterfs3-xdr.c:466
#2  0x000000325940ea35 in xdr_to_generic (inmsg=..., args=0x7f0999471a80, proc=0x32594090f0 <xdr_gfs3_symlink_req>) at xdr-generic.c:51
#3  0x00007f09a2340c16 in server3_3_symlink (req=0x7f09a13bec7c) at server-rpc-fops.c:5644
#4  0x0000003259009c65 in rpcsvc_handle_rpc_call (svc=<value optimized out>, trans=<value optimized out>, msg=0x7f0968001310) at rpcsvc.c:690
#5  0x0000003259009ea3 in rpcsvc_notify (trans=0x7f099165a070, mydata=<value optimized out>, event=<value optimized out>, data=0x7f0968001310)
    at rpcsvc.c:784
#6  0x000000325900b5f8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>)
    at rpc-transport.c:543
#7  0x00007f09a11b20b4 in gf_rdma_pollin_notify (peer=0x7f0991656ea0, post=<value optimized out>) at rdma.c:3722
#8  0x00007f09a11b22ff in gf_rdma_handle_successful_send_completion (peer=0x7f0991656ea0, wc=<value optimized out>) at rdma.c:4196
#9  0x00007f09a11b2eab in gf_rdma_send_completion_proc (data=0x7f0990019bb0) at rdma.c:4270
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6

nfsd bt,
(gdb) bt
#0  __glfs_entry_fd (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-internal.h:195
#1  pub_glfs_pwritev (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-fops.c:841
#2  0x000000325980caca in pub_glfs_pwrite (glfd=<value optimized out>, buf=<value optimized out>, count=<value optimized out>, 
    offset=<value optimized out>, flags=<value optimized out>) at glfs-fops.c:952
#3  0x00007f215a9b2c28 in file_write (obj_hdl=0x7f2108a30518, seek_descriptor=0, buffer_size=1048576, buffer=0x7f20dc1c1000, 
    write_amount=0x7f21197f9dc8, fsal_stable=0x7f21197f986f) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1118
#4  0x00000000004e79bd in cache_inode_rdwr_plus (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, 
    bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6, info=0x0)
    at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:169
#5  0x00000000004e8819 in cache_inode_rdwr (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, 
    bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6)
    at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:304
#6  0x000000000046190e in nfs3_write (arg=0x7f20dc00cf70, worker=0x7f20f80008c0, req=0x7f20dc00ceb8, res=0x7f20f923a5a0)
    at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/Protocols/NFS/nfs3_write.c:234
#7  0x000000000045737f in nfs_rpc_execute (req=0x7f20dc00efe0, worker_data=0x7f20f80008c0)
    at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1268
#8  0x0000000000458119 in worker_run (ctx=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1535
#9  0x000000000051e6b2 in fridgethr_start_routine (arg=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/support/fridgethr.c:562
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6


gluster volume status
Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.44.106:/rhs/brick1/d1r1       N/A       N/A        N       7738 
Brick 192.168.44.108:/rhs/brick1/d1r1       0         49154      Y       2378 
Brick 192.168.44.106:/rhs/brick1/d2r1       N/A       N/A        N       7752 
Brick 192.168.44.108:/rhs/brick1/d2r2       0         49155      Y       2392 
Self-heal Daemon on localhost               N/A       N/A        Y       7775 
Quota Daemon on localhost                   N/A       N/A        Y       8095 
Self-heal Daemon on 192.168.44.107          N/A       N/A        Y       2972 
Quota Daemon on 192.168.44.107              N/A       N/A        Y       3201 
Self-heal Daemon on 192.168.44.108          N/A       N/A        Y       2417 
Quota Daemon on 192.168.44.108              N/A       N/A        Y       2635 
 
Task Status of Volume vol0
------------------------------------------------------------------------------
There are no active volume tasks


Expected results:
The ltp test should finish properly; there should be no coredump of the glusterfsd or nfsd processes.

Additional info:

Comment 1 Saurabh 2015-03-02 01:46:12 UTC
 gluster volume info
 
Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 25f4b031-f68e-4e43-9d2a-ce99abaf39ca
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: rdma
Bricks:
Brick1: 192.168.44.106:/rhs/brick1/d1r1
Brick2: 192.168.44.108:/rhs/brick1/d1r1
Brick3: 192.168.44.106:/rhs/brick1/d2r1
Brick4: 192.168.44.108:/rhs/brick1/d2r2
Options Reconfigured:
features.quota: on
nfs-ganesha.enable: on
nfs-ganesha.host: 192.168.44.106
nfs.disable: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256

Comment 2 Saurabh 2015-03-02 02:18:12 UTC
Created attachment 996930 [details]
brick1-coredump

Comment 3 Saurabh 2015-03-02 02:24:01 UTC
Created attachment 996931 [details]
brick2-coredump

Comment 5 Soumya Koduri 2015-03-02 04:49:27 UTC
CC'ed Raghavendra, Rafi and Jiffin to check if they have seen any similar issue while using RDMA.

Comment 6 Raghavendra Talur 2015-03-02 12:59:49 UTC
I have not looked at the core dump yet, but looking at the backtrace in the bug I see nomem messages. What was the configuration of the machine that was being used as the NFS host (192.168.44.106)?

The RAM does not seem to be sufficient.

I will continue to look at it; it may be a different root cause.

Comment 7 Jiffin 2015-03-03 12:27:56 UTC
In the ltp test suite, the crash was caused by the fsstress test.

When this test was run alone on the nfsv3 mount, the ganesha server didn't crash, but the bricks went down with the same backtrace.

Similarly, on an nfsv4 mount it completed successfully without any crash.

The test command used:

fsstress -d <mount point> -l 22 -n 22 -p 22  2

Comment 8 Mohammed Rafi KC 2015-03-04 11:36:14 UTC
Root cause identified: when doing an RDMA vectored read from the remote end point, the calculation of the remote address goes wrong from the second vector onward. Regardless of the number of remote buffers, the first buffer's address is always used as the remote address for every RDMA remote read, so the later fragments are fetched from the wrong remote memory; decoding that payload would explain the segfault in xdr_gfs3_symlink_req seen in the brick backtrace.
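
A minimal sketch of the addressing pattern described above, using hypothetical structure and function names rather than the actual gf_rdma code in rdma.c, only to illustrate why the second and later reads land on the wrong remote memory and what per-vector addressing looks like:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one remote buffer advertised by the peer;
   the real gf_rdma structures differ. */
struct remote_buf {
        uint64_t addr;  /* remote virtual address of this buffer */
        uint32_t len;   /* number of bytes to read from it */
        uint32_t rkey;  /* remote key authorizing the RDMA read */
};

/* Placeholder for posting one RDMA-read work request. */
static void post_rdma_read(uint64_t remote_addr, uint32_t len, uint32_t rkey)
{
        (void)remote_addr; (void)len; (void)rkey;
}

/* Buggy pattern (pre-fix): every work request in the batch reads from
   bufs[0].addr, so the second and later vectors fetch bytes from the
   wrong remote memory. */
void post_reads_buggy(const struct remote_buf *bufs, size_t count)
{
        for (size_t i = 0; i < count; i++)
                post_rdma_read(bufs[0].addr, bufs[i].len, bufs[i].rkey);
}

/* Corrected pattern: each work request uses its own vector's remote address. */
void post_reads_fixed(const struct remote_buf *bufs, size_t count)
{
        for (size_t i = 0; i < count; i++)
                post_rdma_read(bufs[i].addr, bufs[i].len, bufs[i].rkey);
}

The actual fix landed via http://review.gluster.org/9794 (see comment 12 below).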

Comment 9 Anand Avati 2015-03-04 11:39:00 UTC
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#2) for review on master by mohammed rafi  kc (rkavunga)

Comment 10 Anand Avati 2015-03-04 14:06:06 UTC
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#3) for review on master by Humble Devassy Chirammal (humble.devassy)

Comment 11 Anand Avati 2015-03-05 05:27:48 UTC
REVIEW: http://review.gluster.org/9794 (rdma:setting wrong remote memory.) posted (#4) for review on master by mohammed rafi  kc (rkavunga)

Comment 12 Anand Avati 2015-03-05 06:08:10 UTC
COMMIT: http://review.gluster.org/9794 committed in master by Raghavendra G (rgowdapp) 
------
commit e08aea2fd67a06275423ded157431305a7925cf6
Author: Mohammed Rafi KC <rkavunga>
Date:   Wed Mar 4 14:37:05 2015 +0530

    rdma:setting wrong remote memory.
    
    when we send more than one work request in a single call,
    the remote addr is always setting as the first address of
    the vector.
    
    Change-Id: I55aea7bd6542abe22916719a139f7c8f73334d26
    BUG: 1197548
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/9794
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>

Comment 13 Niels de Vos 2015-05-14 17:29:14 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
