Running qpid-perftest with a high thread count causes the test to hang intermittently. The probability of a hang increases with the number of threads.

Version: Trunk (1152240/4468)

Broker (running on blade, mrg43; /data is mounted Fusion-io SSD):

  rm -rf /data/store/*; ./qpidd --auth no --load-module /home/kpvdr/mrg/store.ref/lib/.libs/msgstore.so --store-dir /data/store --jfile-size 512 --num-jfiles 32 --log-enable info+

Client (running on blade, mrg42):

  ./qpid-perftest -b 20.0.10.43 -s --iterations 5 --count 500000 --durable no --npubs 1 --qt 20 --nsubs 1

Hangs start to occur at a --qt value of 8 and above. Setting --npubs and --nsubs to higher values makes the problem significantly worse.

Once the test has hung, stack traces of both the broker and the client show normal patterns. Checking the queue with qpid-stat after a hang shows that the messages bound for queue qpid-perftest17 ended up on queue qpid-perftest12 instead:

[kpvdr@mrg42 store.ref]$ /home/kpvdr/mrg/qpid.ref/tools/src/py/qpid-stat -b 20.0.10.43 -q
Queues
  queue                                        dur  autoDel  excl    msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
  ===========================================================================================================================
  qpid-perftest17                                                      0      0       0      0        0         0     1     1
  qpid-perftest16                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest15                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest14                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest13                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest12                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest11                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest10                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest_pub_done                                               0     20      20      0      349       349     1     1
  qpid-perftest19                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest18                                                      0   500k    500k      0     512m      512m     1     1
  qpid-perftest_sub_iteration                                          0      0       0      0        0         0    20     1
  qmfc-v2-ui-mrg42.lab.bos.redhat.com.20322.1             Y     Y      0      0       0      0        0         0     1     1
  topic-mrg42.lab.bos.redhat.com.20322.1                  Y     Y      0      0       0      0        0         0     1     4
  qmfc-v2-mrg42.lab.bos.redhat.com.20322.1                Y     Y      0     11      11      0    72.5k     72.5k     1     2
  qpid-perftest_sub_done                                               0     19      19      0      342       342     1     1
  reply-mrg42.lab.bos.redhat.com.20322.1                  Y     Y      0     58      58      0    22.4k     22.4k     1     2
  qpid-perftest9                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest8                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest7                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest6                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest5                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest4                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest3                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest2                                                    500k  1.00m    500k   512m    1.02g      512m     1     1
  qpid-perftest1                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest0                                                       0   500k    500k      0     512m      512m     1     1
  qpid-perftest_pub_start                                              0     20      20      0      100       100    20     1
  qmfc-v2-hb-mrg42.lab.bos.redhat.com.20322.1             Y     Y      0      0       0      0        0         0     1     2
  qpid-perftest_sub_ready                                              0     20      20      0      100       100     1     1

Could this be a race condition in which the publisher destination is being muddled or overwritten somehow?
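Because the hang is intermittent and depends on the --qt value, the reproduction can be scripted as a sweep. The following is only a rough sketch: the helper names (perftest_cmd, first_hanging_qt) and the timeout value are hypothetical and not part of the qpid tools; the client flags are copied from the command line in the description. A run that exceeds the timeout is treated as a hang. A single pass per --qt value can miss the problem, so each value may need to be repeated.

```python
import subprocess

def perftest_cmd(broker, qt, binary="./qpid-perftest"):
    """Build the client command line from the report, varying only --qt."""
    return [binary, "-b", broker, "-s", "--iterations", "5", "--count", "500000",
            "--durable", "no", "--npubs", "1", "--qt", str(qt), "--nsubs", "1"]

def first_hanging_qt(broker, qt_values, timeout_s, run=subprocess.run):
    """Return the first qt whose run exceeds timeout_s, or None if all finish.

    The timeout is an assumption: pick a value well above a normal run time.
    A non-zero exit status raises CalledProcessError and stops the sweep.
    """
    for qt in qt_values:
        try:
            run(perftest_cmd(broker, qt), timeout=timeout_s, check=True)
        except subprocess.TimeoutExpired:
            return qt  # treated as a hang
    return None
```

For example, first_hanging_qt("20.0.10.43", range(1, 21), timeout_s=600) would walk --qt from 1 to 20 against the broker address used in the description.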
From the Description above:

  Checking the queue with qpid-stat after a hang shows that the messages bound for queue qpid-perftest17 ended up on queue qpid-perftest12 instead:

This _should_ be:

  Checking the queue with qpid-stat after a hang shows that the messages bound for queue qpid-perftest17 ended up on queue qpid-perftest2 instead:
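Spotting the misplaced messages by eye in the qpid-stat listing is error-prone, as the correction above shows. The following is a minimal sketch of an automated check, assuming the whitespace-separated layout of the listing in the description and that the k/m/g suffixes are decimal (500k messages of 512m bytes suggests this). The function names are hypothetical and not part of qpid-stat. Because the displayed values are rounded, the comparison is only approximate.

```python
import re

# Assumed decimal multipliers for qpid-stat's abbreviated counts.
SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000, "g": 1_000_000_000}

def parse_count(token):
    """Convert a count such as '500k' or '1.00m' to an integer."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([kmg]?)", token)
    if not m:
        raise ValueError(f"not a count: {token!r}")
    return int(round(float(m.group(1)) * SUFFIX[m.group(2)]))

def find_misrouted(stat_text, expected_in, prefix="qpid-perftest"):
    """Return {queue: msgIn} for numbered test queues whose msgIn != expected_in."""
    suspects = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Only numbered data queues (qpid-perftest0, qpid-perftest1, ...).
        if not fields or not re.fullmatch(re.escape(prefix) + r"\d+", fields[0]):
            continue
        # Skip any 'Y' flags (dur/autoDel/excl); then come msg, msgIn, msgOut, ...
        numbers = [f for f in fields[1:] if f != "Y"]
        msg_in = parse_count(numbers[1])
        if msg_in != expected_in:
            suspects[fields[0]] = msg_in
    return suspects

sample = """\
qpid-perftest17  0  0  0  0  0  0  1  1
qpid-perftest16  0  500k  500k  0  512m  512m  1  1
qpid-perftest2  500k  1.00m  500k  512m  1.02g  512m  1  1
qpid-perftest_pub_done  0  20  20  0  349  349  1  1
"""
print(find_misrouted(sample, 500_000))
```

On the rows taken from the listing in the description, this flags qpid-perftest17 (no messages received) and qpid-perftest2 (twice the expected count), which matches the corrected statement.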
Created attachment 515913
qpid-stat for another case of mis-sent messages.

Additional example in which two queues had their messages misplaced in the same test: both qpid-perftest4 and qpid-perftest5 had their 500k messages sent to qpid-perftest0 and qpid-perftest2.
Fixed by Gordon r.1152825
This is in the 0.14 rebase
CLOSED/CRELEASE -> ASSIGNED -> ON_QA

The defect has to go through the QA process.
Tested on RHEL5.8 and RHEL6.2 on both main architectures (i386 and x86_64). This problem was fixed.

Packages used for testing:

RHEL5.8
qpid-cpp-client-0.14-14.el5
qpid-cpp-client-devel-0.14-14.el5
qpid-cpp-client-devel-docs-0.14-14.el5
qpid-cpp-client-ssl-0.14-14.el5
qpid-cpp-server-0.14-14.el5
qpid-cpp-server-cluster-0.14-14.el5
qpid-cpp-server-devel-0.14-14.el5
qpid-cpp-server-ssl-0.14-14.el5
qpid-cpp-server-store-0.14-14.el5
qpid-cpp-server-xml-0.14-14.el5

RHEL6.2
qpid-cpp-client-0.14-14.el6_2
qpid-cpp-client-devel-0.14-14.el6_2
qpid-cpp-client-devel-docs-0.14-14.el6_2
qpid-cpp-client-rdma-0.14-14.el6_2
qpid-cpp-client-ssl-0.14-14.el6_2
qpid-cpp-debuginfo-0.14-14.el6_2
qpid-cpp-server-0.14-14.el6_2
qpid-cpp-server-cluster-0.14-14.el6_2
qpid-cpp-server-devel-0.14-14.el6_2
qpid-cpp-server-rdma-0.14-14.el6_2
qpid-cpp-server-ssl-0.14-14.el6_2
qpid-cpp-server-store-0.14-14.el6_2
qpid-cpp-server-xml-0.14-14.el6_2
rh-qpid-cpp-tests-0.14-14.el6_2

-> VERIFIED