Bug 648927
| Summary: | Clustered broker crashes in assertion in cluster/ExpiryPolicy.cpp | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Alan Conway <aconway> | ||||
| Component: | qpid-cpp | Assignee: | Alan Conway <aconway> | ||||
| Status: | CLOSED ERRATA | QA Contact: | ppecka <ppecka> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 1.3 | CC: | freznice, iboverma, jneedle, tross | ||||
| Target Milestone: | 1.3.2-RC1 | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qpid-cpp-mrg-0.7.946106-26 | Doc Type: | Bug Fix | ||||
| Doc Text: | When a message with the time to live (TTL) value set was sent to multiple queues by a fanout or topic exchange before a new member joined the cluster, it could time out too early on the new member. This could put queues to an inconsistent state, causing a broker to terminate unexpectedly. With this update, the underlying source code has been adapted to manage message expiration in a cluster correctly, and this error no longer occurs. | Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-02-15 12:11:29 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 654872 | ||||||
| Attachments: | |||||||
Description
Alan Conway
2010-11-02 14:29:40 UTC
To reproduce, run the following command in a loop:
make check TESTS=run_cluster_tests "CLUSTER_TESTS=*.test_management -DDURATION=4"
I've seen the failure within 2-6 iterations.
Note that you must use a debug build. A release build with -DNDEBUG has assertions compiled out, so it will not show this problem. An RPM is a release build, so you won't see this issue with RPM-installed qpidd.
Created attachment 460882
Reliable reproducer script.
Attached script reproduces the problem reliably, every time.
The problem is to do with messages that are fanned-out to multiple queues.
The cluster update process does not recognize the same message on different queues and updates as if it were two distinct messages. The cluster expiry code expects a 1-1 correspondence between messages and expiry-ids, which are assigned per message, not per-queued-message.
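To illustrate the mismatch described above, here is a minimal standalone sketch. It is not the qpid-cpp code and not the change made in the fix; Message, QueuedUpdate, ExpiryRegistry, applyNaive and applyDeduplicated are all hypothetical names invented for this example. It assumes the update carries one record per queued copy, each tagged with the expiry-id of the underlying message.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a broker message (not a qpid-cpp type).
struct Message { std::string body; };

// Hypothetical update record: one per queued copy of a message.
struct QueuedUpdate {
    std::string queue;
    std::string body;
    std::uint64_t expiryId;
};

// Hypothetical registry enforcing a 1-1 mapping between messages and expiry-ids.
class ExpiryRegistry {
  public:
    // Returns false if adding the pair would break the 1-1 correspondence.
    bool add(const std::shared_ptr<Message>& m, std::uint64_t id) {
        auto byMsg = idOf_.find(m.get());
        if (byMsg != idOf_.end()) return byMsg->second == id;
        if (msgOf_.count(id) != 0) return false; // id already owned by another message
        idOf_[m.get()] = id;
        msgOf_[id] = m;
        return true;
    }
    // Returns the message registered under id, or an empty pointer.
    std::shared_ptr<Message> find(std::uint64_t id) const {
        auto it = msgOf_.find(id);
        if (it == msgOf_.end()) return std::shared_ptr<Message>();
        return it->second;
    }
    std::size_t size() const { return msgOf_.size(); }
  private:
    std::map<const Message*, std::uint64_t> idOf_;
    std::map<std::uint64_t, std::shared_ptr<Message>> msgOf_;
};

// Treats every queued copy as a distinct message, as in the failing case.
// Returns the number of updates that violated the 1-1 invariant.
std::size_t applyNaive(const std::vector<QueuedUpdate>& updates, ExpiryRegistry& reg) {
    std::size_t violations = 0;
    for (const auto& u : updates) {
        auto m = std::make_shared<Message>(Message{u.body});
        if (!reg.add(m, u.expiryId)) ++violations;
    }
    return violations;
}

// Reuses the message already registered under the same expiry-id, if any.
std::size_t applyDeduplicated(const std::vector<QueuedUpdate>& updates, ExpiryRegistry& reg) {
    std::size_t violations = 0;
    for (const auto& u : updates) {
        auto m = reg.find(u.expiryId);
        if (!m) m = std::make_shared<Message>(Message{u.body});
        if (!reg.add(m, u.expiryId)) ++violations;
    }
    return violations;
}
```

With one message fanned out to two queues under a single expiry-id, applyNaive reports one violation, which is where a debug build would hit an assertion in this sketch's terms. applyDeduplicated reports none and leaves a single registered message.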
Fixed on trunk r1036589
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
C: Bug in code managing message expiry in a cluster.
C: If a message with TTL (Time To Live) set
is sent to multiple queues by a fanout or topic exchange before a new member joins the cluster, it could be timed out too early on the new member. This could lead to queues becoming inconsistent causing a broker to exit with an invalid-argument error.
F: The bug was corrected.
R: Error no longer occurs.
This may be the same issue as Bug 654872
VERIFIED RHEL 5.6 i386 / x86_64:
packages used:
qpid-cpp-mrg-0.7.946106-27.el5.src.rpm
qpid-tools-0.7.946106-12.el5.src.rpm
openais-0.80.6-28.el5
--> VERIFIED
Technical note updated. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Diffed Contents:
@@ -1,5 +1 @@
-C: Bug in code managing message expiry in a cluster.
+When a message with the time to live (TTL) value set was sent to multiple queues by a fanout or topic exchange before a new member joined the cluster, it could time out too early on the new member. This could put queues to an inconsistent state, causing a broker to terminate unexpectedly. With this update, the underlying source code has been adapted to manage message expiration in a cluster correctly, and this error no longer occurs.
-C: If a message with TTL (Time To Live) set
-is sent to multiple queues by a fanout or topic exchange before a new member joins the cluster, it could be timed out too early on the new member. This could lead to queues becoming inconsistent causing a broker to exit with an invalid-argument error.
-F: The bug was corrected.
-R: Error no longer occurs.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2011-0217.html