Description of problem:
During a high-throughput longevity test of HA, several extended periods of throughput drop-outs were recorded.

Version-Release number of selected component (if applicable):
Qpid 0.18

How reproducible:
Frequently

Steps to Reproduce:
1. Bi-directionally federate a chain of several HA brokers with acks enabled
2. Have client apps pull messages from a queue tied to each federation bridge's destination exchange into the bridge queue of the next federation hop
3. Use ring queue policies on all queues; disable producer flow control and queue threshold events
5. Inject messages at high throughput concurrently from both sides of the broker chain

Actual results:
Witnessed multiple multi-minute periods where throughput in the system dropped to zero messages/sec

Expected results:
The throughput of the system remains consistent
Created attachment 628211
Proposed patch

Limit the window size for the HA backup bridge's queue subscription.
Improve the performance of the ring queue policy from O(n) to O(1).
Reduce lock contention on Queue::messageLock caused by Queue::UsageBarrier.
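For readers unfamiliar with ring queues, here is a minimal sketch of the O(1) idea: when the queue is full, the oldest message is dropped from the front instead of scanning for a victim. This is a generic illustration only, not the code in the attached patch; the class name RingQueue and its methods are hypothetical and do not correspond to Qpid's ring policy classes.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical bounded FIFO illustrating O(1) ring eviction.
// Not Qpid's actual ring queue policy implementation.
template <typename T>
class RingQueue {
public:
    explicit RingQueue(std::size_t capacity) : capacity_(capacity) {}

    // Enqueue msg. If the queue is already full, the oldest entry is
    // dropped first. Returns true if an eviction took place.
    bool push(const T& msg) {
        if (capacity_ == 0) return true;  // nothing can be stored; msg is dropped
        bool evicted = false;
        if (items_.size() == capacity_) {
            items_.pop_front();           // O(1): the oldest entry is always at the front
            evicted = true;
        }
        items_.push_back(msg);            // amortized O(1)
        return evicted;
    }

    // Dequeue the oldest message into out. Returns false if the queue is empty.
    bool pop(T& out) {
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::size_t capacity_;
    std::deque<T> items_;
};
```

With a capacity of 3, pushing a fourth message evicts the first one, and the cost of that eviction does not depend on the queue depth.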
Posted a fix on the mrg repo: 4 commits on branch aconway-ha-2, branched off 0.18-mrg:
http://mrg1.lab.bos.redhat.com/cgit/qpid.git/log/?h=aconway-ha-2

082dd81 Bug 867030 - QPID-4374: Use map instead of SequenceSet for QueueGuard::delayed.
acb459f Bug 867030 - QPID-4374: Use configurable credit window for HA backup subscriptions.
74ae16d Bug 867030 - QPID-4374: Improve performance of ring queue policy index.
5fc42e9 Bug 867030 - QPID-4374: Reduce contention on Queue::messageLock

Porting to trunk has caused a regression there; will post to trunk as soon as it's fixed.
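As background on the credit window commit, the sketch below shows the general mechanism of a bounded credit window: a sender may have at most a fixed number of unacknowledged messages outstanding, and acknowledgements restore credit. It is an assumption-laden illustration, not the code from acb459f; the class CreditWindow and its methods are hypothetical names, not part of the Qpid API.

```cpp
#include <cstddef>

// Hypothetical credit window; illustrates the flow-control idea only.
class CreditWindow {
public:
    explicit CreditWindow(std::size_t window) : window_(window), inFlight_(0) {}

    // Consume one credit if any is left. Returns false when the window
    // is exhausted and the sender must wait for acknowledgements.
    bool tryAcquire() {
        if (inFlight_ >= window_) return false;
        ++inFlight_;
        return true;
    }

    // The receiver acknowledged n messages; restore that many credits.
    void acknowledge(std::size_t n) {
        inFlight_ = (n >= inFlight_) ? 0 : inFlight_ - n;
    }

    // Number of messages that may still be sent without an acknowledgement.
    std::size_t available() const { return window_ - inFlight_; }

private:
    std::size_t window_;    // maximum unacknowledged messages
    std::size_t inFlight_;  // messages sent but not yet acknowledged
};
```

Making the window size configurable, as the commit title describes, would correspond here to choosing the constructor argument at runtime.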
Fixed on trunk by the following 3 commits:

r1399814 | Bug 867030 - QPID-4374: Make QueueGuard::cancel idempotent (Jason Dillaman)
r1399813 | Bug 867030 - QPID-4374: Use configurable credit window for HA backup subscriptions (Jason Dillaman)
r1399812 | Bug 867030 - QPID-4374: Reduce contention on Queue::messageLock (Jason Dillaman)