While verifying bz484691, the following errors appeared in the qpidd log. Used iWARP over Chelsio S310-CR. Both qpid-perftest and qpid-latency-test were running simultaneously over rdma against qpidd on the other side.

Description of problem:
2011-01-28 08:41:38 error RDMA: qp=0x2aaab42d2cc0: Deleting queue before all write buffers finished
2011-01-28 08:49:38 error RDMA: qp=0x2aaab8aa1f50: Deleting queue before all write buffers finished
2011-01-28 09:33:02 error RDMA: qp=0x2aaab4964a20: Deleting queue before all write buffers finished
2011-01-28 09:36:06 error RDMA: qp=0x2aaab49e1610: Deleting queue before all write buffers finished
2011-01-28 09:42:22 error RDMA: qp=0x2aaab4b13c80: Deleting queue before all write buffers finished

Version-Release number of selected component (if applicable):
rpm -qa | grep qpid | sort -u
python-qpid-0.7.946106-15.el5
qpid-cpp-client-0.7.946106-27.el5
qpid-cpp-client-devel-0.7.946106-27.el5
qpid-cpp-client-devel-docs-0.7.946106-27.el5
qpid-cpp-client-rdma-0.7.946106-27.el5
qpid-cpp-client-ssl-0.7.946106-27.el5
qpid-cpp-server-0.7.946106-27.el5
qpid-cpp-server-cluster-0.7.946106-27.el5
qpid-cpp-server-devel-0.7.946106-27.el5
qpid-cpp-server-rdma-0.7.946106-27.el5
qpid-cpp-server-ssl-0.7.946106-27.el5
qpid-cpp-server-store-0.7.946106-27.el5
qpid-cpp-server-xml-0.7.946106-27.el5
qpid-java-client-0.7.946106-14.el5
qpid-java-common-0.7.946106-14.el5
qpid-java-example-0.7.946106-14.el5
qpid-tools-0.7.946106-12.el5
libcxgb3-1.2.5-2.el5
kernel-2.6.18-238.el5

How reproducible:

Steps to Reproduce:
HostA (192.168.1.5)
1. qpidd --auth no --mgmt-enable no --log-to-file /tmp/qpidd.log -d
HostB (192.168.1.4)
2. while true; do date; qpid-perftest -b 192.168.1.5 --count 100 --protocol rdma --log-to-file /tmp/qpid-perftest.log --log-to-stderr no --base-name "perf.$(date +%s%N)" 2>&1 ; sleep 0.5; done>>/tmp/qpid-perftest.log
3. while true; do date; qpid-latency-test -b 192.168.1.5 --count 100 --protocol rdma --log-to-file /tmp/qpid-latency-test.log --log-to-stderr no --queue-base-name "latency.$(date +%s%N)" 2>&1 ; sleep 0.5; done>>/tmp/qpid-latency-test.log

Actual results:
Error messages in qpid.log

Expected results:
No error messages

Additional info:
Is this expected?
These messages are expected if the peer disconnects abruptly without receiving all the buffered messages that it should have received. If the peer does not disconnect abruptly but shuts down normally, then these messages probably indicate a problem and should be investigated. If there is only one set of messages, they could well come from the final interruption of perftest/latencytest to stop the test. The message should probably be downgraded from error to warning, as it can happen without necessarily being an error (although it does look fishy in this case, as there should only have been orderly shutdowns here).
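One way to tell a single burst of messages (for example, from interrupting the test loops) apart from messages spread across the whole run is to count them per minute. The following is only a minimal sketch: the function name is hypothetical, and it assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as in the log excerpts in this report.

```shell
# Hypothetical helper (not part of qpid): count the RDMA teardown errors
# in a qpidd log, grouped by minute.
count_rdma_teardown_errors() {
    # $1: path to the qpidd log file
    # Keep only the matching lines, cut them down to "YYYY-MM-DD HH:MM"
    # (the first 16 characters), then count identical minutes.
    grep 'Deleting queue before all write buffers finished' "$1" \
        | cut -c1-16 \
        | sort \
        | uniq -c
}
```

With the reproduction steps in this report, it could be called as `count_rdma_teardown_errors /tmp/qpidd.log`. Counts scattered over many minutes of the run would suggest the messages are not only coming from the final interruption of the tests.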
The same issue was observed with qpid-0.18-14 on RHEL 6.4 with Mellanox InfiniBand devices (IPoIB). The scenario is the same. Clients report exit code 0, and as far as I could observe there was no message loss.

HW:
InfiniBand: Mellanox Technologies MT26428

Log messages:
2013-01-31 23:04:08 [System] error RDMA: qp=0x4bd32950: Deleting queue before all write buffers finished
2013-01-31 23:04:19 [System] error RDMA: qp=0x52e34b50: Deleting queue before all write buffers finished
2013-01-31 23:04:22 [System] error RDMA: qp=0x6b1544c0: Deleting queue before all write buffers finished

Packages:
python-qpid-0.18-4.el6.noarch
python-qpid-qmf-0.18-14.el6.x86_64
qpid-cpp-client-0.18-14.el6.x86_64
qpid-cpp-client-devel-0.18-14.el6.x86_64
qpid-cpp-client-devel-docs-0.18-14.el6.noarch
qpid-cpp-client-rdma-0.18-14.el6.x86_64
qpid-cpp-server-0.18-14.el6.x86_64
qpid-cpp-server-devel-0.18-14.el6.x86_64
qpid-cpp-server-rdma-0.18-14.el6.x86_64
qpid-cpp-server-store-0.18-14.el6.x86_64
qpid-cpp-server-xml-0.18-14.el6.x86_64
qpid-java-client-0.18-7.el6.noarch
qpid-java-common-0.18-7.el6.noarch
qpid-java-example-0.18-7.el6.noarch
qpid-qmf-0.18-14.el6.x86_64
qpid-tools-0.18-7.el6_3.noarch
libcxgb3-1.3.1-1.el6.x86_64
kernel-2.6.32-358.el6.x86_64
This issue is also present with Chelsio devices via iWARP:
Chelsio Communications Inc T320 10GbE Dual Port Adapter