Bug 591139

Summary: RDMA client shutdown is broken (client hangs)
Product: Red Hat Enterprise MRG
Reporter: Andrew Stitcher <astitcher>
Component: qpid-cpp
Assignee: Andrew Stitcher <astitcher>
Status: CLOSED ERRATA
QA Contact: Jan Sarenik <jsarenik>
Severity: high
Priority: urgent
Version: beta
CC: gsim, jsarenik
Target Milestone: 1.3
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Andrew Stitcher 2010-05-11 14:21:47 UTC
Description of problem:

Since the changes to client shutdown in trunk r934503, the rdma client shutdown() callback is no longer called when clients close an rdma connection.

This results in the client hanging whilst waiting for rdma connections to be completely closed.
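As a rough illustration of why a missing callback turns into a hang, here is a toy model in Python. It is not Qpid code: `ToyConnection`, its `fires_shutdown_callback` flag, and the bounded wait are all assumptions made for the sketch. The only point it makes is that a close which waits for a "completely closed" signal never returns if nothing delivers that signal.

```python
import threading


class ToyConnection:
    """Toy stand-in for a client connection (hypothetical, not the Qpid API)."""

    def __init__(self, fires_shutdown_callback):
        self._closed = threading.Event()
        self._fires_shutdown_callback = fires_shutdown_callback

    def _shutdown(self):
        # Plays the role of the shutdown() callback: marks the connection closed.
        self._closed.set()

    def close(self, timeout_s):
        # Simulated transport: it only invokes the callback if it is wired up.
        if self._fires_shutdown_callback:
            threading.Thread(target=self._shutdown).start()
        # A bounded wait is used so the sketch terminates; the hang described
        # above corresponds to waiting here indefinitely.
        return self._closed.wait(timeout_s)


if __name__ == "__main__":
    print("callback fired, closed:", ToyConnection(True).close(1.0))
    print("callback missing, closed:", ToyConnection(False).close(0.2))
```

With the callback wired up, `close()` returns as soon as the event is set; without it, the wait only ends because of the artificial timeout.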

How reproducible:

100%

Steps to Reproduce:
1. run "qpidd --auth no" (you will need to have a working rdma plugin)

To make sure that rdma is running, check that there is a line like:
    2010-05-10 17:55:04 notice Rdma: Listening on RDMA port 5672
in the output.

2. run perftest -Prdma -b <IP address of IB interface> 

3. After the test completes and prints its results, the client never exits (you wait forever).
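Rather than waiting forever in step 3, the check can be bounded with a timeout. The following is a minimal sketch using Python's standard `subprocess` module; the helper name `exits_within` and the timeout value are assumptions, not part of the original report.

```python
import subprocess


def exits_within(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, False otherwise.

    On timeout, subprocess.run() kills the child process and raises
    TimeoutExpired, which is treated here as "the client hung".
    """
    try:
        subprocess.run(cmd, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        return False
```

For the steps above, this could be called as `exits_within(["perftest", "-Prdma", "-b", ib_address], 300)`, where `ib_address` is the IP address of the IB interface and 300 seconds is an arbitrary bound chosen to be comfortably longer than a normal run.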

Comment 1 Andrew Stitcher 2010-06-14 15:25:35 UTC
This is fixed as of r954499 upstream

Comment 4 Jan Sarenik 2010-06-30 13:50:21 UTC
Verified on RHEL5 x86_64
  qpid-cpp-server-rdma-0.7.946106-4.el5
  qpid-cpp-client-rdma-0.7.946106-4.el5

Comment 5 Jan Sarenik 2010-06-30 14:23:51 UTC
However, I am unable to reproduce the described bug.

It is strange that I get the same (good, no hang) results on a version
which should be buggy:

bash-3.2# rpm -qa | grep qpid
qpidd-0.5.752581-34.el5
qpid-cpp-client-rdma-0.7.946106-2.el5
qpid-cpp-server-rdma-0.7.946106-2.el5
qpid-cpp-client-0.7.946106-2.el5
qpid-cpp-server-0.7.946106-2.el5
qpid-cpp-client-devel-0.7.946106-2.el5
bash-3.2# qpid-perftest -Prdma -b 192.168.55.26
Processing 1 messages from qpid-perftest_sub_ready . done.
Sending start 1 times to qpid-perftest_pub_start
Processing 1 messages from qpid-perftest_pub_done . done.
Processing 1 messages from qpid-perftest_sub_done . done.
... [SNIP] ...
Total transfers/sec:      122993
Total Mbytes/sec: 120.111
bash-3.2#

Comment 6 Andrew Stitcher 2010-07-04 20:55:44 UTC
I think that -2 already had most of the fix in, so you should try to reproduce with -1.

Comment 7 Jan Sarenik 2010-07-07 08:05:36 UTC
So I have tried
  qpid-cpp-mrg-0.7.946106-1.el5
  qpid-cpp-mrg-0.7.935473-1.el5
  qpid-cpp-mrg-0.7.929717-1.el5

None of the above reproduces the bug as you describe it.
I am sure I am using RDMA, as I get throughput of ~136 MB/s.
Could the bug be connected with something else (libibverbs,
a particular IB driver, etc.)? I ran the broker without
store or acl, focusing on RDMA.

On which machine did you experience the reported bug?
What hardware did you use?

I use mrg26.lab.bos.redhat.com, which has (according to lspci) a
Mellanox Technologies MT25204 [InfiniHost III Lx HCA],
i.e. the package containing its driver is libmthca.

Comment 8 Jan Sarenik 2010-07-07 08:11:29 UTC
More info: The machine (mrg26) I tried to reproduce this bug on
was running RHEL5 x86_64 with the following MRG packages installed:
  qpid-cpp-client-devel.x86_64
  qpid-cpp-client-rdma.x86_64
  qpid-cpp-client.x86_64
  qpid-cpp-server-rdma.x86_64
  qpid-cpp-server.x86_64

Comment 9 Jan Sarenik 2010-07-22 13:53:17 UTC
Though the bug is not reproducible, I am sure the described
bug is no longer valid in current versions. --> VERIFIED