| Summary: | cached exchange reference can cause cluster inconsistencies if exchange is deleted/recreated | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Gordon Sim <gsim> | |
| Component: | qpid-cpp | Assignee: | Alan Conway <aconway> | |
| Status: | CLOSED ERRATA | QA Contact: | Petr Matousek <pematous> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 1.3 | CC: | aconway, freznice, iboverma, pematous, tross | |
| Target Milestone: | 2.0 | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | qpid-cpp-mrg-0.10-4 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 703724 716260 | Environment: | |
| Last Closed: | 2011-06-23 15:43:13 UTC | Type: | --- | |
| Bug Depends On: | ||||
| Bug Blocks: | 703724, 716260 | |||

I think this is a standalone bug as well. Messages should not be routed by a deleted exchange. Invalidating the cache when the exchange is deleted would solve both the standalone and cluster issues.

Upstream JIRA: https://issues.apache.org/jira/browse/QPID-3215

Fixed on upstream trunk r1095144.

Committed to mrg_2.0.x branch:
http://mrg1.lab.bos.redhat.com/cgit/qpid.git/commit/?h=mrg_2.0.x&id=d56be5929faf81c1d0a44f903613df27d08d5835

Original commit was incomplete; fixed on mrg_2.0.x:
http://mrg1.lab.bos.redhat.com/cgit/qpid.git/commit/?h=mrg_2.0.x&id=b68cf9408fc95abc5e62590cb9c1dda6c6ba92fd

This issue has been fixed in qpid-cpp-mrg-0.10-4 for RHEL5, but the fix is not yet available in any RHEL6 package. The bug was cloned for RHEL6: please see bug 703724.

Verified on RHEL5.6
architectures: i386, x86_64
packages installed:
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-6.el5
qpid-cpp-client-0.10-4.el5
qpid-cpp-client-devel-0.10-4.el5
qpid-cpp-client-devel-docs-0.10-4.el5
qpid-cpp-client-ssl-0.10-4.el5
qpid-cpp-mrg-debuginfo-0.10-4.el5
qpid-cpp-server-0.10-4.el5
qpid-cpp-server-cluster-0.10-4.el5
qpid-cpp-server-devel-0.10-4.el5
qpid-cpp-server-ssl-0.10-4.el5
qpid-cpp-server-store-0.10-4.el5
qpid-cpp-server-xml-0.10-4.el5
qpid-java-client-0.10-4.el5
qpid-java-common-0.10-4.el5
qpid-java-example-0.10-4.el5
qpid-qmf-0.10-6.el5
qpid-qmf-devel-0.10-6.el5
qpid-tools-0.10-4.el5

-> VERIFIED

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html

Description of problem:
SemanticState::route() uses a simple cache variable to avoid looking up the exchange for every message. However, if the exchange in question is deleted, even if it is then recreated, this can cause inconsistencies in a cluster.

Version-Release number of selected component (if applicable):
1.3

How reproducible:
100% (quite a contrived example, though)

Steps to Reproduce:
1. Start one cluster node.
2. Create an exchange, a queue and a binding between them:
     qpid-config add exchange topic x
     qpid-config add queue q
     qpid-config bind x q k
3. Start a session and send a message to the exchange with the relevant key (leave the session running):
     qpid-send --content-stdin --address x/k
   then enter a few lines to send some messages.
4. Start a new cluster node.
5. Delete and recreate the exchange, this time adding a different binding:
     qpid-config del exchange x
     qpid-config add exchange topic x
     qpid-config add queue q2
     qpid-config bind x q2 k
6. Send some more messages on the session from step 3, with the same exchange and key (i.e. type in some more messages if using qpid-send as suggested). There is now an inconsistency: the second node has some messages in q2 and some (though fewer than the first node) in q, whereas on the first node all the messages are in q.
7. Browse q2 on the second node (assuming the second node is on port 5673):
     qpid-receive --address 'q2; {mode: browse}' --broker localhost:5673 --capacity 1

Actual results:
The first node shuts down with an inconsistency error.

Expected results:
No inconsistency; it should be possible to run the command in step 7 against q or q2 on either node and see the same results.

Additional info:
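
The stale-cache behaviour described above can be illustrated with a small standalone sketch. This is not the qpid-cpp code and not the committed fix: `Exchange`, `Registry`, `Session`, the `destroyed` flag and `routeAfterRecreate()` are all hypothetical names invented for illustration, and each exchange is simplified to a single bound queue. The sketch assumes one possible way of invalidating the cache, namely marking the exchange object as destroyed on deletion so that holders of a cached reference look it up again.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for broker types; NOT the real qpid-cpp classes.
struct Exchange {
    std::string boundQueue;   // simplification: one bound queue per exchange
    bool destroyed = false;   // set when the exchange is deleted
};

struct Registry {
    std::map<std::string, std::shared_ptr<Exchange>> exchanges;

    void declare(const std::string& name, const std::string& queue) {
        auto ex = std::make_shared<Exchange>();
        ex->boundQueue = queue;
        exchanges[name] = ex;
    }

    void destroy(const std::string& name) {
        auto it = exchanges.find(name);
        if (it == exchanges.end()) return;
        it->second->destroyed = true;  // lets holders of old references notice
        exchanges.erase(it);
    }

    std::shared_ptr<Exchange> find(const std::string& name) const {
        auto it = exchanges.find(name);
        if (it == exchanges.end()) return std::shared_ptr<Exchange>();
        return it->second;
    }
};

// A session that caches the last exchange it routed through.
struct Session {
    const Registry& registry;
    bool checkDestroyed;      // false: stale-cache behaviour; true: assumed fix
    std::string cachedName;
    std::shared_ptr<Exchange> cached;

    Session(const Registry& r, bool check) : registry(r), checkDestroyed(check) {}

    // Returns the queue the message lands in, or "" if no exchange is found.
    std::string route(const std::string& name) {
        bool stale = !cached || cachedName != name ||
                     (checkDestroyed && cached->destroyed);
        if (stale) {
            cached = registry.find(name);
            cachedName = name;
        }
        return cached ? cached->boundQueue : std::string();
    }
};

// Replays steps 2 to 6 of the report on a single broker and returns the
// queue that receives a message sent after the exchange was recreated.
std::string routeAfterRecreate(bool checkDestroyed) {
    Registry reg;
    reg.declare("x", "q");    // exchange x bound to q
    Session session(reg, checkDestroyed);
    session.route("x");       // first message populates the cache
    reg.destroy("x");         // delete the exchange ...
    reg.declare("x", "q2");   // ... and recreate it bound to q2
    return session.route("x");
}
```

In this sketch `routeAfterRecreate(false)` returns `"q"`, because the session keeps routing through the deleted exchange object, while `routeAfterRecreate(true)` returns `"q2"`. A session created after the exchange was recreated has no cached reference and would route to `q2` in either mode, which is analogous to the divergence between the first and second cluster nodes in the report.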