Bug 1076533 - qpidd's ha-replication hardcoded undocumented limit (65434 queues)
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: Messaging_Installation_and_Configuration_Guide
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 3.0
Target Release: ---
Assignee: Nobody
QA Contact: Messaging QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-03-14 13:55 UTC by Frantisek Reznicek
Modified: 2023-06-23 14:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 889552 0 high CLOSED HA broker deadlock after loss of primary broker 2021-02-22 00:41:40 UTC

Internal Links: 889552

Description Frantisek Reznicek 2014-03-14 13:55:10 UTC
Description of problem:

qpidd's ha-replicate=all has a hardcoded limit (65434 queues).

Testing revealed a hard limit of 65434 on the number of replicated queues.

When this threshold is exceeded, the broker logs:
2014-03-14 14:38:25 [HA] error Primary: Cannot create replicated queue test_a1_c1_mcL_msL_md0_txrxkillN_sigterm_brks_00 exceeds limit of 65434 replicated queues.
2014-03-14 14:38:25 [Broker] error Execution exception: resource-limit-exceeded: Exceeded replicated queue limit.
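
A minimal sketch of how these errors could be detected in a broker log, assuming the message wording matches the two lines quoted above (other qpidd versions may phrase it differently). The names LIMIT_RE and find_limit_errors are illustrative and not part of qpidd or its tools.

```python
import re

# Pattern modelled on the [HA] error line quoted above. This is an
# assumption about the log format, not a documented interface.
LIMIT_RE = re.compile(
    r"Cannot create replicated queue (?P<queue>\S+) "
    r"exceeds limit of (?P<limit>\d+) replicated queues"
)


def find_limit_errors(lines):
    """Return a (queue_name, limit) tuple for every limit error found."""
    hits = []
    for line in lines:
        match = LIMIT_RE.search(line)
        if match:
            hits.append((match.group("queue"), int(match.group("limit"))))
    return hits


# The two log lines from this report, used as sample input.
sample = [
    "2014-03-14 14:38:25 [HA] error Primary: Cannot create replicated queue "
    "test_a1_c1_mcL_msL_md0_txrxkillN_sigterm_brks_00 exceeds limit of 65434 "
    "replicated queues.",
    "2014-03-14 14:38:25 [Broker] error Execution exception: "
    "resource-limit-exceeded: Exceeded replicated queue limit.",
]

print(find_limit_errors(sample))
```

Only the [HA] line carries the queue name and the numeric limit, so the [Broker] line is deliberately not matched.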


This behavior is not reflected in documentation (MICG):

9.1.16. Controlling replication of queues and exchanges
By default, queues and exchanges are not replicated automatically. You can change the default behavior by setting the ha-replicate configuration option. It has one of the following values:

all
    Replicate everything automatically: queues, exchanges, bindings and messages. 
...


There are two problems:
 * Limit is non-configurable (hardcoded)
 * Limit is not documented

a] HA replication limit HAS TO be documented. (hard requirement)
b] HA replication limit should be configurable. (soft requirement)

Feel free to reassign to MICG if b] is going to be rejected.


Version-Release number of selected component (if applicable):
perl-qpid-0.22-11.el6.x86_64
perl-qpid-debuginfo-0.22-11.el6.x86_64
python-qpid-0.22-12.el6.noarch
python-qpid-qmf-0.22-28.el6.x86_64
qpid-cpp-client-0.22-36.el6.x86_64
qpid-cpp-client-devel-0.22-36.el6.x86_64
qpid-cpp-client-devel-docs-0.22-36.el6.noarch
qpid-cpp-client-rdma-0.22-36.el6.x86_64
qpid-cpp-debuginfo-0.22-36.el6.x86_64
qpid-cpp-server-0.22-36.el6.x86_64
qpid-cpp-server-devel-0.22-36.el6.x86_64
qpid-cpp-server-ha-0.22-36.el6.x86_64
qpid-cpp-server-linearstore-0.22-36.el6.x86_64
qpid-cpp-server-rdma-0.22-36.el6.x86_64
qpid-cpp-server-xml-0.22-36.el6.x86_64
qpid-java-client-0.22-6.el6.noarch
qpid-java-common-0.22-6.el6.noarch
qpid-java-example-0.22-6.el6.noarch
qpid-jca-0.22-2.el6.noarch
qpid-jca-xarecovery-0.22-2.el6.noarch
qpid-jca-zip-0.22-2.el6.noarch
qpid-proton-c-0.6-1.el6.x86_64
qpid-proton-c-devel-0.6-1.el6.x86_64
qpid-proton-debuginfo-0.6-1.el6.x86_64
qpid-qmf-0.22-28.el6.x86_64
qpid-qmf-debuginfo-0.22-28.el6.x86_64
qpid-qmf-devel-0.22-28.el6.x86_64
qpid-snmpd-1.0.0-16.el6.x86_64
qpid-snmpd-debuginfo-1.0.0-16.el6.x86_64
qpid-tools-0.22-9.el6.noarch
rh-qpid-cpp-tests-0.22-36.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Start qpidd in an HA environment.
2. Create enough queues to hit the limit.
3. Kill the broker; it will fail to start.
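
The queue-creation step can be sketched as a small driver loop. This is only an illustration under stated assumptions: FakeBroker is a hypothetical stand-in that models nothing but the limit, and it assumes exactly 65434 creations succeed before the error, which this report does not confirm (queues that already exist on the broker would also count). Against a real broker, the create_queue callable would wrap a client call, for example a qpid.messaging sender opened on an address with create: always.

```python
REPLICATED_QUEUE_LIMIT = 65434  # value taken from the broker log in this report


class FakeBroker:
    """Hypothetical stand-in for an HA primary; it only models the limit."""

    def __init__(self, limit=REPLICATED_QUEUE_LIMIT):
        self.limit = limit
        self.queues = set()

    def create_queue(self, name):
        # Assumption: the broker rejects the first queue beyond the limit.
        if len(self.queues) >= self.limit:
            raise RuntimeError(
                "resource-limit-exceeded: Exceeded replicated queue limit."
            )
        self.queues.add(name)


def create_until_failure(create_queue, count, prefix="limit_test_"):
    """Call create_queue for up to `count` names; return how many succeeded."""
    created = 0
    for i in range(count):
        try:
            create_queue("%s%05d" % (prefix, i))
        except Exception:
            # Stop at the first rejected queue.
            break
        created += 1
    return created


broker = FakeBroker()
print(create_until_failure(broker.create_queue, REPLICATED_QUEUE_LIMIT + 10))
```

Passing the creation call in as a parameter keeps the loop independent of any particular client library, so the same driver can be pointed at a real session once one is available.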

Actual results:
Qpid HA replication is hard-limited to a certain number of queues, and the documentation does not mention it.

Expected results:
The documentation should discuss the limit. The Qpid HA replication limit should be configurable.

Additional info:

Comment 2 Alan Conway 2014-03-24 13:57:50 UTC
Yes, we do have a limit; see also bug 889552. Removing the limit is not trivial, so I think we should document it for the next release and perhaps keep a BZ open to fix it later.

Comment 3 Justin Ross 2014-03-24 14:11:10 UTC
Changed component for doc work.

Comment 5 Alan Conway 2014-04-28 15:43:10 UTC
I noticed one thing that needs updating:

"Transactional changes to queue state are not replicated atomically. If the primary crashes during a transaction, it is possible that the backup could contain only part of the changes introduced by a transaction."

That is no longer true for local TX transactions. Local transactions are replicated correctly. We don't yet support distributed DTX transactions in an HA cluster.

Comment 6 Frantisek Reznicek 2014-06-30 13:41:06 UTC
The replication limit is now documented correctly.


I'd still like the transactional behavior highlighted by Alan in comment 5 to be corrected as part of this defect.

"Transactional changes to queue state are not replicated atomically. If the primary crashes during a transaction, it is possible that the backup could contain only part of the changes introduced by a transaction."

->

"Local transactional changes are replicated atomically. If the primary crashes during a local transaction, no data are lost. Distributed transactions are not yet supported by HA cluster."


-> ASSIGNED

Comment 8 Frantisek Reznicek 2014-07-08 11:15:00 UTC
Thanks for the doc change.

-> VERIFIED

