Bug 849124 - glusterd crashed when trying to mount a tcp,rdma volume via rdma transport
Summary: glusterd crashed when trying to mount a tcp,rdma volume via rdma transport
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rdma
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Raghavendra G
QA Contact: shylesh
URL:
Whiteboard:
Depends On: GLUSTER-3664
Blocks: 858449
 
Reported: 2012-08-17 11:45 UTC by Vidya Sakar
Modified: 2015-05-13 17:18 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: GLUSTER-3664
: 858449 (view as bug list)
Environment:
Last Closed: 2015-02-13 09:47:05 UTC
Embargoed:


Attachments:

Description Vidya Sakar 2012-08-17 11:45:01 UTC
+++ This bug was initially created as a clone of Bug #765396 +++

Created a volume with the tcp,rdma transport type. I can mount this volume via the tcp transport, but when I tried to mount the same volume via the rdma transport, glusterd crashed. A core was generated with the following backtrace.
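
For reference, a minimal reproduction would look something like the following (the volume name, brick path, server hostname, and mount points are placeholders, and the exact mount syntax can vary between GlusterFS versions):

# on the server: create and start a volume with both transports
gluster volume create testvol transport tcp,rdma server1:/bricks/brick1
gluster volume start testvol

# on the client: mounting over tcp works
mount -t glusterfs server1:/testvol /mnt/testvol-tcp

# mounting the same volume over rdma triggers the glusterd crash
mount -t glusterfs -o transport=rdma server1:/testvol /mnt/testvol-rdma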

Loaded symbols for /usr/lib64/libmthca-rdmav2.so
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_dns.so.2
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff9f1fc000
Core was generated by `glusterd'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002aaaab01d592 in rdma_decode_msg (peer=0x7901708, post=0x74cd7e0, readch=0x43b3f050, bytes_in_post=164) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:2804
2804                    memcpy (post->ctx.vector[0].iov_base, ptr,
(gdb) bt
#0  0x00002aaaab01d592 in rdma_decode_msg (peer=0x7901708, post=0x74cd7e0, readch=0x43b3f050, bytes_in_post=164) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:2804
#1  0x00002aaaab01d6d4 in rdma_decode_header (peer=0x7901708, post=0x74cd7e0, readch=0x43b3f050, bytes_in_post=164) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:2844
#2  0x00002aaaab01e4f6 in rdma_process_recv (peer=0x7901708, wc=0x43b3f0d0) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:3215
#3  0x00002aaaab01e93e in rdma_recv_completion_proc (data=0x5ea7420) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:3347
#4  0x000000328420673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003283ad40cd in clone () from /lib64/libc.so.6
(gdb) f 0
#0  0x00002aaaab01d592 in rdma_decode_msg (peer=0x7901708, post=0x74cd7e0, readch=0x43b3f050, bytes_in_post=164) at ../../../../../rpc/rpc-transport/rdma/src/rdma.c:2804
2804                    memcpy (post->ctx.vector[0].iov_base, ptr,
(gdb) 


I will upload the glusterd log file. I have archived the core file since it is too big to upload (201 MB).
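
For anyone picking up the archived core, the backtrace above was obtained roughly as follows (the glusterd binary path is an assumption and may differ on the affected build):

gdb /usr/sbin/glusterd /path/to/core
(gdb) bt
(gdb) frame 0
(gdb) print *post
(gdb) print post->ctx.vector[0]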

--- Additional comment from raghavendra on 2011-09-30 02:35:57 EDT ---

f78c8253d7fb7576 is causing some memory corruption.

--- Additional comment from amarts on 2012-02-27 05:35:50 EST ---

This is the priority for the immediate future (before the 3.3.0 GA release). We will bump the priority up once we take on RDMA-related tasks.

Comment 2 Amar Tumballi 2012-08-23 06:45:15 UTC
This bug is not seen in the current master branch (which will be branched as RHS 2.1.0 soon). To consider it for fixing, we want to make sure this bug still exists on RHS servers. If it cannot be reproduced, we would like to close this.

Comment 4 Sachidananda Urs 2013-08-08 05:43:42 UTC
Moving out of Big Bend (2.1), since RDMA support is not available in Big Bend.

Comment 7 Vivek Agarwal 2014-06-17 12:19:00 UTC
Removing RDMA-related bugs, as they are not in scope for Denali.

