Description of problem:

I have physical nodes with a glusterfs build installed on them. Created a 2x2 glusterfs volume and executed some test cases after mounting the volume over nfs-ganesha. The infra includes 2 physical nodes.

Version-Release number of selected component (if applicable):

glusterfs-3.7dev-0.611.git729428a.el6.x86_64
nfs-ganesha-debuginfo-2.2-0.rc2.el6.x86_64
nfs-ganesha-gluster-2.2-0.rc2.el6.x86_64
nfs-ganesha-2.2-0.rc2.el6.x86_64

How reproducible:

Seen for the first time.

Steps to Reproduce:
1. Create a volume of type 2x2, transport type RDMA.
2. Enable nfs-ganesha, and mount the volume with vers=3 on a different node.
3. Start executing tests such as arequal, compile_kernel, ltp, etc.

Actual results:

The last test under execution was ltp; a coredump was seen.

pstree glimpse from the client:

├─sshd─┬─sshd───bash───screen───screen─┬─bash───pstree
│      │                               └─bash───run.sh───ltp.sh───ltp_run.sh───fsstress───13*[fsstress]
│      └─sshd───bash───screen

Brick logs from the server:

pending frames:
frame : type(0) op(13)
frame : type(0) op(13)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-02-27 13:32:59
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7dev
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x3258c20ad6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x3258c3bdff]
/lib64/libc.so.6[0x3315a326a0]
/lib64/libc.so.6(xdr_string+0xa7)[0x3315b17db7]
/usr/lib64/libgfxdr.so.0(xdr_gfs3_symlink_req+0x6e)[0x325940915e]
/usr/lib64/libgfxdr.so.0(xdr_to_generic+0x75)[0x325940ea35]
/usr/lib64/glusterfs/3.7dev/xlator/protocol/server.so(server3_3_symlink+0x96)[0x7f09a2340c16]
/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x295)[0x3259009c65]
/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103)[0x3259009ea3]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x325900b5f8]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_pollin_notify+0xd4)[0x7f09a11b20b4]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(gf_rdma_handle_successful_send_completion+0xcf)[0x7f09a11b22ff]
/usr/lib64/glusterfs/3.7dev/rpc-transport/rdma.so(+0xaeab)[0x7f09a11b2eab]
/lib64/libpthread.so.0[0x3315e079d1]
/lib64/libc.so.6(clone+0x6d)[0x3315ae88fd]

glusterfsd bt:

(gdb) bt
#0  0x0000003315b17db7 in xdr_string_internal () from /lib64/libc.so.6
#1  0x000000325940915e in xdr_gfs3_symlink_req (xdrs=0x7f0999470190, objp=0x7f0999471a80) at glusterfs3-xdr.c:466
#2  0x000000325940ea35 in xdr_to_generic (inmsg=..., args=0x7f0999471a80, proc=0x32594090f0 <xdr_gfs3_symlink_req>) at xdr-generic.c:51
#3  0x00007f09a2340c16 in server3_3_symlink (req=0x7f09a13bec7c) at server-rpc-fops.c:5644
#4  0x0000003259009c65 in rpcsvc_handle_rpc_call (svc=<value optimized out>, trans=<value optimized out>, msg=0x7f0968001310) at rpcsvc.c:690
#5  0x0000003259009ea3 in rpcsvc_notify (trans=0x7f099165a070, mydata=<value optimized out>, event=<value optimized out>, data=0x7f0968001310) at rpcsvc.c:784
#6  0x000000325900b5f8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:543
#7  0x00007f09a11b20b4 in gf_rdma_pollin_notify (peer=0x7f0991656ea0, post=<value optimized out>) at rdma.c:3722
#8  0x00007f09a11b22ff in gf_rdma_handle_successful_send_completion (peer=0x7f0991656ea0, wc=<value optimized out>) at rdma.c:4196
#9  0x00007f09a11b2eab in gf_rdma_send_completion_proc (data=0x7f0990019bb0) at rdma.c:4270
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6
nfsd bt:

(gdb) bt
#0  __glfs_entry_fd (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-internal.h:195
#1  pub_glfs_pwritev (glfd=0x7f20fd2824f0, iovec=0x7f21197f9770, iovcnt=1, offset=0, flags=1052672) at glfs-fops.c:841
#2  0x000000325980caca in pub_glfs_pwrite (glfd=<value optimized out>, buf=<value optimized out>, count=<value optimized out>, offset=<value optimized out>, flags=<value optimized out>) at glfs-fops.c:952
#3  0x00007f215a9b2c28 in file_write (obj_hdl=0x7f2108a30518, seek_descriptor=0, buffer_size=1048576, buffer=0x7f20dc1c1000, write_amount=0x7f21197f9dc8, fsal_stable=0x7f21197f986f) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/FSAL/FSAL_GLUSTER/handle.c:1118
#4  0x00000000004e79bd in cache_inode_rdwr_plus (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6, info=0x0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:169
#5  0x00000000004e8819 in cache_inode_rdwr (entry=0x7f21098cc2f0, io_direction=CACHE_INODE_WRITE, offset=0, io_size=1048576, bytes_moved=0x7f21197f9dc8, buffer=0x7f20dc1c1000, eof=0x7f21197f9dc7, sync=0x7f21197f9dc6) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/cache_inode/cache_inode_rdwr.c:304
#6  0x000000000046190e in nfs3_write (arg=0x7f20dc00cf70, worker=0x7f20f80008c0, req=0x7f20dc00ceb8, res=0x7f20f923a5a0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/Protocols/NFS/nfs3_write.c:234
#7  0x000000000045737f in nfs_rpc_execute (req=0x7f20dc00efe0, worker_data=0x7f20f80008c0) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1268
#8  0x0000000000458119 in worker_run (ctx=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1535
#9  0x000000000051e6b2 in fridgethr_start_routine (arg=0x3ca8b00) at /usr/src/debug/nfs-ganesha-2.2-rc2-0.1.1-Source/support/fridgethr.c:562
#10 0x0000003315e079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003315ae88fd in clone () from /lib64/libc.so.6

gluster volume status:

Status of volume: vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.44.106:/rhs/brick1/d1r1       N/A       N/A        N       7738
Brick 192.168.44.108:/rhs/brick1/d1r1       0         49154      Y       2378
Brick 192.168.44.106:/rhs/brick1/d2r1       N/A       N/A        N       7752
Brick 192.168.44.108:/rhs/brick1/d2r2       0         49155      Y       2392
Self-heal Daemon on localhost               N/A       N/A        Y       7775
Quota Daemon on localhost                   N/A       N/A        Y       8095
Self-heal Daemon on 192.168.44.107          N/A       N/A        Y       2972
Quota Daemon on 192.168.44.107              N/A       N/A        Y       3201
Self-heal Daemon on 192.168.44.108          N/A       N/A        Y       2417
Quota Daemon on 192.168.44.108              N/A       N/A        Y       2635

Task Status of Volume vol0
------------------------------------------------------------------------------
There are no active volume tasks

Expected results:

The ltp test should finish properly; there should not be any coredump from the glusterfsd or nfsd processes.

Additional info:
gluster volume info:

Volume Name: vol0
Type: Distributed-Replicate
Volume ID: 25f4b031-f68e-4e43-9d2a-ce99abaf39ca
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: rdma
Bricks:
Brick1: 192.168.44.106:/rhs/brick1/d1r1
Brick2: 192.168.44.108:/rhs/brick1/d1r1
Brick3: 192.168.44.106:/rhs/brick1/d2r1
Brick4: 192.168.44.108:/rhs/brick1/d2r2
Options Reconfigured:
features.quota: on
nfs-ganesha.enable: on
nfs-ganesha.host: 192.168.44.106
nfs.disable: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
Created attachment 996930 [details] brick1-coredump
Created attachment 996931 [details] brick2-coredump
CC'ed Raghavendra, Rafi and Jiffin to check if they have seen any similar issue while using RDMA.
I have not looked at the core dump yet, but looking at the backtrace in the bug I see nomem messages. What was the configuration of the machine being used as the nfs host (192.168.44.106)? Its RAM does not seem to be sufficient. Will continue to look at it; it may be a different root cause.
In the ltp test suite, the crash was caused by the fsstress test. When this test was run alone on the nfsv3 mount, the ganesha server didn't crash, but the bricks went down with the same backtrace. Similarly, on nfsv4 it completed successfully without any crash.

The test command used:

fsstress -d <mount point> -l 22 -n 22 -p 22 2
Root cause identified: when doing an RDMA vectored read from the remote end point, the calculation of the remote address goes wrong from the second vector onward. Regardless of the number of remote buffers, we always set the first buffer as the remote address for every RDMA remote read.
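To illustrate the failure mode: the sketch below shows the pattern in plain libibverbs terms. This is a minimal illustration, not the actual rdma.c code; the helper post_rdma_reads, the remote_buf struct, and the argument layout are all assumptions made for the example. The point is only that each work request must target its own remote buffer, while the buggy code reused the first one.

/* Minimal sketch of the bug, assuming libibverbs-style structures.
 * post_rdma_reads and remote_buf are illustrative, not GlusterFS code. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

struct remote_buf {
    uint64_t addr;   /* remote virtual address of this vector */
    uint32_t rkey;   /* remote memory registration key */
};

static int
post_rdma_reads(struct ibv_qp *qp, struct ibv_sge *sge,
                struct remote_buf *remote, int count)
{
    struct ibv_send_wr wr, *bad_wr = NULL;
    int i, ret;

    for (i = 0; i < count; i++) {
        memset(&wr, 0, sizeof(wr));
        wr.opcode     = IBV_WR_RDMA_READ;
        wr.sg_list    = &sge[i];   /* i-th local scatter/gather entry */
        wr.num_sge    = 1;
        wr.send_flags = IBV_SEND_SIGNALED;

        /* BUG (as described above): every work request used the first
         * remote buffer, so reads after the first pulled data from the
         * wrong remote offset:
         *     wr.wr.rdma.remote_addr = remote[0].addr;
         *     wr.wr.rdma.rkey        = remote[0].rkey;
         *
         * FIX: advance to the i-th remote buffer for the i-th request. */
        wr.wr.rdma.remote_addr = remote[i].addr;
        wr.wr.rdma.rkey        = remote[i].rkey;

        ret = ibv_post_send(qp, &wr, &bad_wr);
        if (ret)
            return ret;
    }
    return 0;
}

With the buggy addressing, the data landing in the local buffers is garbage from the second vector onward, which matches the XDR decode crash in xdr_gfs3_symlink_req seen in the brick backtrace.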
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#2) for review on master by mohammed rafi kc (rkavunga)
REVIEW: http://review.gluster.org/9794 (rdma: setting wrong remote memory.) posted (#3) for review on master by Humble Devassy Chirammal (humble.devassy)
REVIEW: http://review.gluster.org/9794 (rdma:setting wrong remote memory.) posted (#4) for review on master by mohammed rafi kc (rkavunga)
COMMIT: http://review.gluster.org/9794 committed in master by Raghavendra G (rgowdapp)
------
commit e08aea2fd67a06275423ded157431305a7925cf6
Author: Mohammed Rafi KC <rkavunga>
Date:   Wed Mar 4 14:37:05 2015 +0530

    rdma:setting wrong remote memory.

    when we send more than one work request in a single call,
    the remote addr is always setting as the first address of
    the vector.

    Change-Id: I55aea7bd6542abe22916719a139f7c8f73334d26
    BUG: 1197548
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: http://review.gluster.org/9794
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user