Bug 484691 - Qpid RDMA support doesn't work with iWarp RNICs
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1.1
Hardware: x86_64 Linux
Priority: urgent  Severity: high
Target Milestone: 2.0
Assigned To: Andrew Stitcher
QA Contact: ppecka
Depends On: 674011 681313
Reported: 2009-02-09 10:26 EST by Steve Reichard
Modified: 2011-06-23 11:42 EDT

Fixed In Version: qpid-cpp-mrg-0.7.946106-26
Doc Type: Bug Fix
Doc Text:
Previously, the RDMA protocol transport for Qpid supported only Infiniband network interfaces. As a consequence of using Qpid RDMA with an iWarp network interface, the client process was unable to transmit more than 30-40 messages on a single connection due to lost flow control messages. Qpid's usage of RDMA now has changed to support iWarp network interfaces. Current users of RDMA must upgrade any brokers before upgrading their clients if the upgrade is staged. This upgrade order is necessary because the new brokers can detect both the old and new protocols and switch automatically but the new clients will only use the new protocol.
Last Closed: 2011-06-23 11:42:51 EDT


Attachments
The other detected hangs (4 cases out of hundreds) (6.26 KB, application/x-tbz)
2011-02-02 08:25 EST, Frantisek Reznicek
The hang reports from common reproducer (35.16 KB, application/x-tbz)
2011-02-02 08:59 EST, Frantisek Reznicek

Description Steve Reichard 2009-02-09 10:26:39 EST
Description of problem:

Attempts to run RDMA latencytest or perftest using the Chelsio iWARP driver appeared to hang.

Testbed
  servers - HP DL585  4 socket 2 core/socket AMD, 72G
            HP DL580  4 socket 4 core/socket Intel, 64G

  interconnect -  Chelsio Communications Inc T310 10GbE Single Port Adapter
     directly connected

Version-Release number of selected component (if applicable):

Kernel: 2.6.18-128.el5
openib-1.3.2-0.20080728.0355.3.el5
perftest-1.2-11.el5
qpidc-perftest-0.4.732838-1.el5
libcxgb3-1.2.2-1.el5

How reproducible:

Always on this testbed; no other testbeds attempted.

Steps to Reproduce:

1. start qpidd on one server
     /usr/sbin/qpidd --auth no --mgmt-enable no

2. start performance test on other server

   latencytest -b renoir-10g --count 100 --protocol rdma

   or

   perftest -b renoir-10g --count 100 --protocol rdma

  
Actual results:

  perftest output the following but did not complete:
Processing 1 messages from sub_ready . done.
Sending start 1 times to pub_start
Processing 1 messages from pub_done .

   latencytest did not output anything, nor did it complete

Expected results:

Completion and output of performance numbers.

Additional info:

Used rping to confirm that RDMA did work between the hosts.

client_test was tried; it did return and did not output any errors.
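
For reference, a typical rping check between two directly connected hosts looks something like the following (the address is a placeholder for the listening host's iWARP interface IP; the exact invocation used here was not recorded):

    # on the listening host
    rping -s -a 192.168.10.1 -v -C 10

    # on the connecting host
    rping -c -a 192.168.10.1 -v -C 10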
Comment 1 Doug Ledford 2009-02-09 11:09:59 EST
iWARP is different from InfiniBand in that the IB spec regards every connection made as an RDMA connection, while iWARP tries to walk the line between regular TCP/IP over ethernet and RDMA.  In order to preserve resources on the non-RDMA connections, all connections start out as non-RDMA and must be transitioned to RDMA.  However, the spec requires that the connecting party, not the listening party, be the one to send the first RDMA packet, and that send will automatically transition the listening side's socket to an RDMA socket in the iWARP resource tables.

So, since iWARP has this special restriction, you can run afoul of it and have things fail that work fine on IB.  The perftest programs in particular were not written with iWARP in mind.  Whether any given test works depends on the implementation and on whether the connecting side of the socket sends the first RDMA packet or waits for the listening side to send the first packet.  The qperf test package *was* written with iWARP in mind and should work reliably on iWARP.

This obviously has implications for the qpidd server as well.  So, I think you need to investigate the code and see if the qpidc-perftest programs you are referencing here run afoul of this iWARP restriction.  Let me know what you find out.  If they don't run afoul of the restrictions, then I'll investigate further.
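
To make the restriction concrete, here is a minimal active-side sketch using the simplified librdmacm API (rdma_create_ep / rdma_post_send).  It is an illustration only, not the qpid or perftest code; the address, port and message contents are placeholders.

    /* Minimal active-side sketch (illustration only): connect, then make
     * sure the *connecting* side posts the first send, which is what
     * transitions the listener's iWARP socket into RDMA mode. */
    #include <rdma/rdma_cma.h>
    #include <rdma/rdma_verbs.h>
    #include <cstring>
    #include <cstdio>

    int main() {
        char host[] = "192.168.10.1";   /* placeholder iWARP address */
        char port[] = "5672";           /* placeholder port */
        char msg[]  = "hello";          /* first payload comes from us */

        rdma_addrinfo hints, *res = 0;
        std::memset(&hints, 0, sizeof hints);
        hints.ai_port_space = RDMA_PS_TCP;
        if (rdma_getaddrinfo(host, port, &hints, &res)) { std::perror("rdma_getaddrinfo"); return 1; }

        ibv_qp_init_attr attr;
        std::memset(&attr, 0, sizeof attr);
        attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
        attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
        attr.sq_sig_all = 1;

        rdma_cm_id* id = 0;
        if (rdma_create_ep(&id, res, 0, &attr)) { std::perror("rdma_create_ep"); return 1; }

        ibv_mr* mr = rdma_reg_msgs(id, msg, sizeof msg);
        if (!mr) { std::perror("rdma_reg_msgs"); return 1; }

        if (rdma_connect(id, 0)) { std::perror("rdma_connect"); return 1; }

        /* The active (connecting) side sends first. */
        if (rdma_post_send(id, 0, msg, sizeof msg, mr, 0)) { std::perror("rdma_post_send"); return 1; }

        ibv_wc wc;
        while (rdma_get_send_comp(id, &wc) == 0)
            ;                           /* wait for the send completion */

        rdma_disconnect(id);
        rdma_dereg_mr(mr);
        rdma_destroy_ep(id);
        rdma_freeaddrinfo(res);
        return 0;
    }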
Comment 2 Andrew Stitcher 2009-02-09 15:35:00 EST
I don't think this special start-up sequence can be the issue, as the AMQP protocol is currently defined so that the connecting party is always the first to send a packet, containing a protocol identifier and the required protocol version.

So for AMQP the first packet should always come from the connector, not the listener.

This issue is *not* dependent on the AMQP programs being used, but on the protocol itself.
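
For reference, the very first bytes the connecting side writes are the AMQP protocol header; for AMQP 0-10 that is the 8-byte sequence below (a tiny illustrative snippet, not Qpid's source):

    // AMQP 0-10 protocol header sent by the connector before anything else:
    // 'A' 'M' 'Q' 'P', protocol class 1, instance 1, major version 0, minor version 10.
    const unsigned char amqp_0_10_header[8] = { 'A', 'M', 'Q', 'P', 1, 1, 0, 10 };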
Comment 3 Doug Ledford 2009-02-09 15:54:03 EST
It's not the first packet that matters, but the first RDMA packet.  When you initially open an iWARP connection, it's via standard TCP/IP over ethernet.  It isn't until you send an RDMA-specific packet that the connection is transitioned.  Are you using librdmacm to connect to the listening side of the connection?  And are you using ibv_post_send or something similar to actually send the first packet from the connector?  If so, you should be good.
Comment 4 Andrew Stitcher 2009-02-09 16:38:36 EST
The only APIs being used in the qpid rdma protocol driver are librdmacm and ibv APIs.

iWARP is being used simply by specifying the iWARP IP address for the rdma driver rather than an IPoIB IP address (or that's my understanding).

Actually, the fact that client_test completed means that basic transfer over the rdma link must be happening.
Comment 5 Andrew Stitcher 2009-02-09 16:42:04 EST
Actually, at this point I'd have to say that the rdma support is only tested and working over Mellanox cards, as that seems to be the only rdma card that we have working at all.

It is entirely possible that the issue here is not specific to iWARP, but rather some sequence of events coming from the Chelsio card that the driver doesn't cope well with.
Comment 6 Doug Ledford 2009-04-22 19:14:11 EDT
For rhel5.4, both perftest and qperf were updated, with specific mention of handling iWARP interfaces better.  It might be worth grabbing the updated packages and seeing if it helps any.
Comment 7 Andrew Stitcher 2009-05-05 11:14:53 EDT
Verified that there is an issue with the qpid RDMA code by demonstrating a lockup with the RDMAServer/RDMAClient test programs.
Comment 10 Gordon Sim 2010-08-05 11:37:34 EDT
Missed deadline for 1.3; moving to 1.3.1.
Comment 11 Andrew Stitcher 2010-12-23 14:40:54 EST
The way qpid was using RDMA to send AMQP turned out to be the problem: we were using Immediate Data to send flow credit, and iWarp does not support Immediate Data at all (a rough sketch of the difference follows the list of checkins below).

A new version of the RDMA AMQP protocol, which supports use by iWarp RNICs, has been checked into the trunk of qpid.

The relevant trunk checkins are:
r1052318, r1052319, r1052320, r1052321, r1052323, r1052324, r1052325, r1052326, r1052327, r1052328, r1052329, r1052330, r1052331

But these depend on some previous trunk checkins:
r1021822, r1021823, r1021831
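
A rough sketch of the difference follows.  This is an illustration only, not the checked-in code; the function names and the credit-as-4-byte-prefix layout are assumptions made for the example.

    #include <infiniband/verbs.h>
    #include <arpa/inet.h>
    #include <cstring>
    #include <stdint.h>

    // Old scheme (InfiniBand only): flow credit is carried as 32-bit
    // immediate data on the send work request. iWarp RNICs do not support
    // immediate data, so these credit updates never arrive.
    int postCreditWithImmediate(ibv_qp* qp, ibv_sge* sge, uint32_t credit) {
        ibv_send_wr wr, *bad = 0;
        std::memset(&wr, 0, sizeof wr);
        wr.sg_list = sge;
        wr.num_sge = 1;
        wr.opcode = IBV_WR_SEND_WITH_IMM;   // not supported by iWarp
        wr.imm_data = htonl(credit);
        return ibv_post_send(qp, &wr, &bad);
    }

    // New scheme (works on both transports): the credit is written into the
    // registered send buffer itself (shown here, purely illustratively, as a
    // 4-byte prefix) and the message goes out as a plain IBV_WR_SEND.
    int postCreditInPayload(ibv_qp* qp, ibv_sge* sge, uint32_t credit) {
        uint32_t netCredit = htonl(credit);
        std::memcpy(reinterpret_cast<void*>(sge->addr), &netCredit, sizeof netCredit);
        ibv_send_wr wr, *bad = 0;
        std::memset(&wr, 0, sizeof wr);
        wr.sg_list = sge;
        wr.num_sge = 1;
        wr.opcode = IBV_WR_SEND;            // no immediate data needed
        return ibv_post_send(qp, &wr, &bad);
    }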
Comment 14 Andrew Stitcher 2011-01-24 12:41:36 EST
Release note:
Due to an update in the RDMA protocol used to transport AMQP:
Previous users of RDMA over Infiniband will need to upgrade any brokers before upgrading their clients if the upgrade is staged. This is because the new brokers recognise both the old and new protocol and automatically switch, but the new clients will only use the new protocol.
Comment 15 Andrew Stitcher 2011-01-24 12:41:36 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously the native RDMA protocol transport for Qpid only worked for Infiniband type network interfaces. If you used Qpid RDMA with an iWarp network interface then usually the client process would hang.
The way that Qpid uses RDMA now to transport AMQP has changed incompatibly to fix this, this was necessary because before we used features of RDMA that are not supported by iWarp.
Note that previous users of RDMA over Infiniband will need to upgrade any brokers before upgrading their clients if the upgrade is staged. This is because the new brokers recognise both the old and new protocol and automatically switch, but the new clients will only use the new protocol.
Comment 24 Andrew Stitcher 2011-02-01 11:11:18 EST
I've changed the technical notes to better describe the previous behaviour.
Comment 25 Andrew Stitcher 2011-02-01 11:11:19 EST
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,3 +1,3 @@
-Previously the native RDMA protocol transport for Qpid only worked for Infiniband type network interfaces. If you used Qpid RDMA with an iWarp network interface then usually the client process would hang.
+Previously the native RDMA protocol transport for Qpid only worked for Infiniband type network interfaces. If you used Qpid RDMA with an iWarp network interface then the client process would be unable to send more 30-40 messages on a single connection because flow control messages got lost.
 The way that Qpid uses RDMA now to transport AMQP has changed incompatibly to fix this, this was necessary because before we used features of RDMA that are not supported by iWarp.
 Note that previous users of RDMA over Infiniband will need to upgrade any brokers before upgrading their clients if the upgrade is staged. This is because the new brokers recognise both the old and new protocol and automatically switch, but the new clients will only use the new protocol.
Comment 26 Frantisek Reznicek 2011-02-02 08:23:38 EST
I can confirm that current behavior is radically improved.

On the -22 broker (the stable pkgset at the moment), hangs and crashes are observed in nearly 100% of the test runs.


For completeness I dumped a few client hangs (which can be compared to bug 674011 and 674056).

One of the hangs looks like:


(gdb)   
  6 Thread 0x42472940 (LWP 28108)  0x0000003750ad44b8 in epoll_wait ()
   from /lib64/libc.so.6
  5 Thread 0x42e73940 (LWP 28109)  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 0x43874940 (LWP 28110)  0x0000003750ad44b8 in epoll_wait ()
   from /lib64/libc.so.6
  3 Thread 0x44275940 (LWP 28111)  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x44c76940 (LWP 28112)  0x0000003750ad44b8 in epoll_wait ()
   from /lib64/libc.so.6
* 1 Thread 0x2ac4ab0ab020 (LWP 28107)  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb)
Thread 6 (Thread 0x42472940 (LWP 28108)):
#0  0x0000003750ad44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ac4aab2e331 in qpid::sys::Poller::wait (this=0xbb7e6e0,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ac4aab2edc7 in qpid::sys::Poller::run (this=0xbb7e6e0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ac4aab2501a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000375160673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003750ad40cd in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x42e73940 (LWP 28109)):
#0  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ac4aa7b7a3f in wait (this=0xbb8fed0, id=...)
    at ../include/qpid/sys/posix/Condition.h:63
#2  qpid::client::SessionImpl::waitForCompletionImpl (this=0xbb8fed0, id=...)
    at qpid/client/SessionImpl.cpp:180
#3  0x00002ac4aa7bc690 in qpid::client::SessionImpl::waitForCompletion (
    this=0xbb8fed0, id=...) at qpid/client/SessionImpl.cpp:173
#4  0x00002ac4aa7a27e8 in qpid::client::Future::wait (this=0x42e72cd0,
    session=...) at qpid/client/Future.cpp:31
#5  0x00002ac4aa7b572f in qpid::client::SessionBase_0_10::sync (
    this=<value optimized out>) at qpid/client/SessionBase_0_10.cpp:50
#6  0x000000000041c948 in qpid::tests::PublishThread::run (this=0xbb7e650)
    at qpid-perftest.cpp:543
#7  0x00002ac4aab2501a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0xbb8ff6c) at qpid/sys/posix/Thread.cpp:35
#8  0x000000375160673d in start_thread () from /lib64/libpthread.so.0
#9  0x0000003750ad40cd in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x43874940 (LWP 28110)):
#0  0x0000003750ad44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ac4aab2e331 in qpid::sys::Poller::wait (this=0xbb7e6e0,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ac4aab2edc7 in qpid::sys::Poller::run (this=0xbb7e6e0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ac4aab2501a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000375160673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003750ad40cd in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x44275940 (LWP 28111)):
#0  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ac4aa7a7b8b in pop (this=0xbb95de0, result=...,
    timeout=<value optimized out>) at ../include/qpid/sys/posix/Condition.h:63
#2  qpid::client::LocalQueueImpl::get (this=0xbb95de0, result=...,
    timeout=<value optimized out>) at qpid/client/LocalQueueImpl.cpp:49
#3  0x00002ac4aa7a83f9 in qpid::client::LocalQueueImpl::get (this=0xbb95de0,
    timeout=...) at qpid/client/LocalQueueImpl.cpp:40
#4  0x00002ac4aa7a8599 in qpid::client::LocalQueueImpl::pop (this=0x80,
    timeout=...) at qpid/client/LocalQueueImpl.cpp:36
#5  0x00002ac4aa7a55ac in qpid::client::LocalQueue::pop (
    this=<value optimized out>, timeout=...) at qpid/client/LocalQueue.cpp:43
#6  0x000000000041b0f2 in qpid::tests::SubscribeThread::run (this=0xbb87070)
    at qpid-perftest.cpp:626
#7  0x00002ac4aab2501a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x2aaab0000a0c) at qpid/sys/posix/Thread.cpp:35
#8  0x000000375160673d in start_thread () from /lib64/libpthread.so.0
#9  0x0000003750ad40cd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x44c76940 (LWP 28112)):
#0  0x0000003750ad44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ac4aab2e331 in qpid::sys::Poller::wait (this=0xbb7e6e0,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ac4aab2edc7 in qpid::sys::Poller::run (this=0xbb7e6e0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ac4aab2501a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000375160673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003750ad40cd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2ac4ab0ab020 (LWP 28107)):
#0  0x000000375160aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ac4aa7a7b8b in pop (this=0xcb9d9e0, result=...,
    timeout=<value optimized out>) at ../include/qpid/sys/posix/Condition.h:63
#2  qpid::client::LocalQueueImpl::get (this=0xcb9d9e0, result=...,
    timeout=<value optimized out>) at qpid/client/LocalQueueImpl.cpp:49
#3  0x00002ac4aa7a83f9 in qpid::client::LocalQueueImpl::get (this=0xcb9d9e0,
    timeout=...) at qpid/client/LocalQueueImpl.cpp:40
#4  0x00002ac4aa7a8599 in qpid::client::LocalQueueImpl::pop (this=0x80,
    timeout=...) at qpid/client/LocalQueueImpl.cpp:36
#5  0x00002ac4aa7a55ac in qpid::client::LocalQueue::pop (
    this=<value optimized out>, timeout=...) at qpid/client/LocalQueue.cpp:43
#6  0x000000000040f7d7 in qpid::tests::Controller::process(size_t, qpid::client::LocalQueue, std::string, boost::function<void ()(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&),std::allocator<void> >) (
    this=<value optimized out>, n=1, lq=..., queue=<value optimized out>,
    msgFn=...) at qpid-perftest.cpp:382
#7  0x0000000000413f76 in qpid::tests::Controller::run (this=0x7ffffc582a60)
    at qpid-perftest.cpp:422
#8  0x000000000040cceb in main (argc=196636864, argv=<value optimized out>)
    at qpid-perftest.cpp:719
(gdb) Detaching from program: /usr/bin/qpid-perftest, process 28107
Comment 27 Frantisek Reznicek 2011-02-02 08:25:01 EST
Created attachment 476565 [details]
The other detected hangs (4 cases out of hundreds)
Comment 28 Frantisek Reznicek 2011-02-02 08:59:11 EST
Created attachment 476572 [details]
The hang reports from common reproducer

The issue has been retested once more, and a comparative analysis of the hangs was done to confirm the distinction claimed above.

If you look at the hangs dumped for the -22 and -27 c++ clients (qpid-perftest) you will see very much the same process states for both versions.

For instance compare following two dumps:
b.dump vs dump_pid_10926.20110202_083305.qpid-perftest


The above hang analysis confirms that the current hangs are observed in about the same places.
Comment 30 Andrew Stitcher 2011-02-02 10:37:59 EST
The important difference between the symptoms of this and bug 674011 is that 674011 hangs in pthread_join() in the main thread - this only occurs when the client is exiting for some reason. What the other threads are doing isn't relevant.

Bug 674056 doesn't have a complete set of stack traces for all threads so I cannot comment on it.
Comment 32 Frantisek Reznicek 2011-02-02 12:21:07 EST
The bug 674056 hang report is attached as attachment 476572 [details] (see comment 28).
I retested it as the complete dump was not attached.

If you look at the dump_pid_10926.20110202_083305.qpid-perftest you will see
this:

  5 Thread 0x4257e940 (LWP 10927)  0x00000036226d44b8 in epoll_wait ()
   from /lib64/libc.so.6
  4 Thread 0x419c4940 (LWP 10929)  0x00000036226d44b8 in epoll_wait ()
   from /lib64/libc.so.6
  3 Thread 0x43980940 (LWP 10938)  0x000000362320aee9 in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 0x44381940 (LWP 10939)  0x00000036226d44b8 in epoll_wait ()
   from /lib64/libc.so.6
* 1 Thread 0x2ad2a95ca020 (LWP 10926)  0x000000362320aee9 in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) 
Thread 5 (Thread 0x4257e940 (LWP 10927)):
#0  0x00000036226d44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad2a90434c1 in qpid::sys::Poller::wait (this=0x1e16eb40, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ad2a9043f57 in qpid::sys::Poller::run (this=0x1e16eb40)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ad2a903a1aa in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000362320673d in start_thread () from /lib64/libpthread.so.0
#5  0x00000036226d40cd in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x419c4940 (LWP 10929)):
#0  0x00000036226d44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad2a90434c1 in qpid::sys::Poller::wait (this=0x1e16eb40, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ad2a9043f57 in qpid::sys::Poller::run (this=0x1e16eb40)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ad2a903a1aa in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000362320673d in start_thread () from /lib64/libpthread.so.0
#5  0x00000036226d40cd in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x43980940 (LWP 10938)):
#0  0x000000362320aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ad2a8cc320b in pop (this=0x2aaab0001120, result=..., 
    timeout=<value optimized out>) at ../include/qpid/sys/posix/Condition.h:63
#2  qpid::client::LocalQueueImpl::get (this=0x2aaab0001120, result=..., 
    timeout=<value optimized out>) at qpid/client/LocalQueueImpl.cpp:49
#3  0x00002ad2a8cc3a79 in qpid::client::LocalQueueImpl::get (
    this=0x2aaab0001120, timeout=...) at qpid/client/LocalQueueImpl.cpp:40
#4  0x00002ad2a8cc3c19 in qpid::client::LocalQueueImpl::pop (this=0x80, 
    timeout=...) at qpid/client/LocalQueueImpl.cpp:36
#5  0x00002ad2a8cc0c2c in qpid::client::LocalQueue::pop (
    this=<value optimized out>, timeout=...) at qpid/client/LocalQueue.cpp:43
#6  0x000000000041b0f2 in qpid::tests::SubscribeThread::run (this=0x1e16eb90)
    at qpid-perftest.cpp:626
#7  0x00002ad2a903a1aa in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x2aaab080610c) at qpid/sys/posix/Thread.cpp:35
#8  0x000000362320673d in start_thread () from /lib64/libpthread.so.0
#9  0x00000036226d40cd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x44381940 (LWP 10939)):
#0  0x00000036226d44b8 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad2a90434c1 in qpid::sys::Poller::wait (this=0x1e16eb40, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002ad2a9043f57 in qpid::sys::Poller::run (this=0x1e16eb40)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002ad2a903a1aa in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x000000362320673d in start_thread () from /lib64/libpthread.so.0
#5  0x00000036226d40cd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2ad2a95ca020 (LWP 10926)):
#0  0x000000362320aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ad2a8cc320b in pop (this=0x1e9b45f0, result=..., 
    timeout=<value optimized out>) at ../include/qpid/sys/posix/Condition.h:63
#2  qpid::client::LocalQueueImpl::get (this=0x1e9b45f0, result=..., 
    timeout=<value optimized out>) at qpid/client/LocalQueueImpl.cpp:49
#3  0x00002ad2a8cc3a79 in qpid::client::LocalQueueImpl::get (this=0x1e9b45f0, 
    timeout=...) at qpid/client/LocalQueueImpl.cpp:40
#4  0x00002ad2a8cc3c19 in qpid::client::LocalQueueImpl::pop (this=0x80, 
    timeout=...) at qpid/client/LocalQueueImpl.cpp:36
#5  0x00002ad2a8cc0c2c in qpid::client::LocalQueue::pop (
    this=<value optimized out>, timeout=...) at qpid/client/LocalQueue.cpp:43
#6  0x000000000040f7d7 in qpid::tests::Controller::process(size_t,
qpid::client::LocalQueue, std::string, boost::function<void ()(const
std::basic_string<char, std::char_traits<char>, std::allocator<char>
>&),std::allocator<void> >) (
    this=<value optimized out>, n=1, lq=..., queue=<value optimized out>, 
    msgFn=...) at qpid-perftest.cpp:382
#7  0x0000000000413f76 in qpid::tests::Controller::run (this=0x7fff659fa910)
    at qpid-perftest.cpp:422
#8  0x000000000040cceb in main (argc=504818656, argv=<value optimized out>)
    at qpid-perftest.cpp:719

This case shows that perftest did not hang at the exit.

I'm in complete agreement about bug 674011 and this defect, and I'm going to remove the dependency on 674011 right away.
Comment 33 Andrew Stitcher 2011-02-03 11:00:13 EST
Yep, I agree this stacktrace can't distinguish between the two bugs [even though I know from a white-box perspective that the underlying causes must be entirely different].
Comment 37 Misha H. Ali 2011-06-01 19:53:15 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,3 +1,3 @@
-Previously the native RDMA protocol transport for Qpid only worked for Infiniband type network interfaces. If you used Qpid RDMA with an iWarp network interface then the client process would be unable to send more 30-40 messages on a single connection because flow control messages got lost.
+Previously, the RDMA protocol transport for Qpid supported only Infiniband network interfaces. As a consequence of using Qpid RDMA with an iWarp network interface, the client process was unable to transmit more than 30-40 messages on a single connection due to lost flow control messages. Qpid's usage of RDMA now has changed to support iWarp network interfaces. 
-The way that Qpid uses RDMA now to transport AMQP has changed incompatibly to fix this, this was necessary because before we used features of RDMA that are not supported by iWarp.
+
-Note that previous users of RDMA over Infiniband will need to upgrade any brokers before upgrading their clients if the upgrade is staged. This is because the new brokers recognise both the old and new protocol and automatically switch, but the new clients will only use the new protocol.+Current users of RDMA must upgrade any brokers before upgrading their clients if the upgrade is staged. This upgrade order is necessary because the new brokers can detect both the old and new protocols and switch automatically but the new clients will only use the new protocol.
Comment 38 Misha H. Ali 2011-06-05 23:16:32 EDT
Technical note can be viewed in the release notes for 2.0 at the documentation stage here:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/2.0/html-single/MRG_Release_Notes/index.html#tabl-MRG_Release_Notes-RHM_Update_Notes-RHM_Update_Notes
Comment 39 errata-xmlrpc 2011-06-23 11:42:51 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html
