Description of problem: With HA enabled, 2 or more node cluster. If a message is enqueued in a transaction, and then dequeued with no transaction, the message is left on the backup brokers. Version-Release number of selected component (if applicable): Up to trunk r1615992, 0.22-mrg f22fae4 How reproducible: 100% Steps to Reproduce: qpid-send -a 'q;{create:always}' -b 20.0.20.200 --content-string foo --tx 1 qpid-receive -a 'q;{create:always}' -b 20.0.20.200 Then on each node: qpid-stat -q --ha-admin Actual results: Primary q is empty, backup q still has 1 message Message is enqueued twice on backup and dequeued only once. Expected results: Primary and backup q empty, message enqueued and dequeued once on primary and backup Additional info: With a tx dequeue (add --tx 1 to qpid-receive) the message is removed completely on the backup so at first glance things appear OK. However qpid-stat shows it is enqueued and dequeued _twice_ which is incorrect.
Using the python client there is still a message left on the backup, but there is only 1 enqueue and no dequeue, rather than 2 enqueues and 1 dequeue.
http://git.app.eng.bos.redhat.com/git/rh-qpid.git/commit/?h=0.22-mrg-aconway-bz1127089&id=4d4091e1b3ea961bb140ac0f8b88195372d9903c Bug 1127089 - QPID-5966: HA mixing tx enqueue and non-tx dequeue leaves extra messages on backup.0.22-mrg-aconway-bz1127089 There were several problems: 1. Positions of transactionally enqueued messages not known to QueueReplicator, so not dequeued on backup if dequeued outside a TX on primary. 2. Race condition if tx created immediately after queue could cause duplication of TX message. 3. Replication IDs were not being set during recovery from store (regression, store change?) Fix: 1. Update positions QueueReplicator positions via QueueObserver::enqueued to see all enqueues. 2. Check for duplicate replication-ids on backup in QueueReplicator::route. 3. Set replication-id in publish() if not already set in record().
The issue has been fixed, tested on qpid HAs on RHEL 6.5 i686 / x86_64 using packages: perl-qpid-debuginfo-0.22-13.el6.x86_64 python-qpid-0.22-17.el6.noarch python-qpid-qmf-0.22-37.el6.x86_64 qpid-cpp-client-0.22-46.el6.x86_64 qpid-cpp-client-devel-0.22-46.el6.x86_64 qpid-cpp-client-devel-docs-0.22-46.el6.noarch qpid-cpp-client-rdma-0.22-46.el6.x86_64 qpid-cpp-debuginfo-0.22-46.el6.x86_64 qpid-cpp-server-0.22-46.el6.x86_64 qpid-cpp-server-devel-0.22-46.el6.x86_64 qpid-cpp-server-ha-0.22-46.el6.x86_64 qpid-cpp-server-linearstore-0.22-46.el6.x86_64 qpid-cpp-server-rdma-0.22-46.el6.x86_64 qpid-cpp-server-xml-0.22-46.el6.x86_64 qpid-java-client-0.22-6.el6.noarch qpid-java-common-0.22-6.el6.noarch qpid-java-example-0.22-6.el6.noarch qpid-jca-0.22-2.el6.noarch qpid-jca-xarecovery-0.22-2.el6.noarch qpid-jca-zip-0.22-2.el6.noarch qpid-proton-c-0.7-3.el6.x86_64 qpid-proton-c-devel-0.7-3.el6.x86_64 qpid-proton-debuginfo-0.7-3.el6.x86_64 qpid-qmf-0.22-37.el6.x86_64 qpid-qmf-debuginfo-0.22-37.el6.x86_64 qpid-qmf-devel-0.22-37.el6.x86_64 qpid-snmpd-1.0.0-16.el6.x86_64 qpid-snmpd-debuginfo-1.0.0-16.el6.x86_64 qpid-tests-0.22-16.el6.noarch qpid-tools-0.22-14.el6.noarch perl-qpid-debuginfo-0.22-13.el6.i686 python-qpid-0.22-17.el6.noarch python-qpid-qmf-0.22-37.el6.i686 qpid-cpp-client-0.22-46.el6.i686 qpid-cpp-client-devel-0.22-46.el6.i686 qpid-cpp-client-devel-docs-0.22-46.el6.noarch qpid-cpp-client-rdma-0.22-46.el6.i686 qpid-cpp-debuginfo-0.22-46.el6.i686 qpid-cpp-server-0.22-46.el6.i686 qpid-cpp-server-devel-0.22-46.el6.i686 qpid-cpp-server-ha-0.22-46.el6.i686 qpid-cpp-server-linearstore-0.22-46.el6.i686 qpid-cpp-server-rdma-0.22-46.el6.i686 qpid-cpp-server-xml-0.22-46.el6.i686 qpid-java-client-0.22-6.el6.noarch qpid-java-common-0.22-6.el6.noarch qpid-java-example-0.22-6.el6.noarch qpid-jca-0.22-2.el6.noarch qpid-jca-xarecovery-0.22-2.el6.noarch qpid-jca-zip-0.22-2.el6.noarch qpid-proton-c-0.7-3.el6.i686 qpid-proton-c-devel-0.7-3.el6.i686 qpid-proton-debuginfo-0.7-3.el6.i686 qpid-qmf-0.22-37.el6.i686 qpid-qmf-debuginfo-0.22-37.el6.i686 qpid-qmf-devel-0.22-37.el6.i686 qpid-snmpd-1.0.0-16.el6.i686 qpid-snmpd-debuginfo-1.0.0-16.el6.i686 qpid-tests-0.22-16.el6.noarch qpid-tools-0.22-14.el6.noarch Non-transaction dequeue removs message from primary and all backups. -> VERIFIED Repro steps: # common variables M=192.168.6.251 # virt IP Q=BZ1127089 # on one node (one message sent, one received, in c++|python) /var/dtests/node_data/clients/qc2_spout -b $M --log-msgs dict --content foo --tx-batch-size 1 -c 1 --tx-action commit "${Q}_cpp;{create:always}" /var/dtests/node_data/clients/qc2_drain -b $M --log-msgs dict -c 1 "${Q}_cpp;{create:always}" /var/dtests/node_data/clients/qc2_spout.py -b $M --log-msgs dict --content foo --tx-batch-size 1 -c 1 --tx-action commit "${Q}_py;{create:always}" /var/dtests/node_data/clients/qc2_drain.py -b $M --log-msgs dict -c 1 "${Q}_py;{create:always}" # on all nodes (queue depth 0 "0 1 1") qpid-stat -q --ha-admin | grep "${Q}_"