Bug 568863 - Dynamic federation tears links down incorrectly
Summary: Dynamic federation tears links down incorrectly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.3
Target Release: ---
Assignee: Ken Giusti
QA Contact: ppecka
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-02-26 19:00 UTC by Gordon Sim
Modified: 2010-10-14 16:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When a dynamic federation of exchanges was in use and multiple brokers shared the same binding key, deleting a bound queue on one of them caused the routes to other queues with the same binding to be removed as well. To avoid this, the broker now tracks the origin of each remote binding, so that deleting a single queue no longer affects other queues.
Clone Of:
Environment:
Last Closed: 2010-10-14 16:07:57 UTC
Target Upstream Version:
Embargoed:


Attachments
bash shell script to repo the failure case (2.39 KB, application/x-shellscript)
2010-04-06 20:32 UTC, Ken Giusti
bash shell script to repo the failure case (4.10 KB, application/x-shellscript)
2010-04-07 15:38 UTC, Ken Giusti


Links
System: Red Hat Product Errata
ID: RHSA-2010:0773
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3
Last Updated: 2010-10-14 15:56:44 UTC

Description Gordon Sim 2010-02-26 19:00:38 UTC
Description of problem:

In a 3-broker 'dynamic' federation of an exchange, links get torn down incorrectly.

Version-Release number of selected component (if applicable):

1.2 and qpid trunk as of r915773

How reproducible:

100%

Steps to Reproduce:
1. start three brokers and dynamically federate amq.direct in each direction between each pair of brokers
   (e.g. for brokers on port 5672, 5673 and 5674:
    
   qpid-route dynamic add localhost:5672 localhost:5673 amq.direct
   qpid-route dynamic add localhost:5672 localhost:5674 amq.direct
   qpid-route dynamic add localhost:5673 localhost:5672 amq.direct
   qpid-route dynamic add localhost:5673 localhost:5674 amq.direct
   qpid-route dynamic add localhost:5674 localhost:5672 amq.direct
   qpid-route dynamic add localhost:5674 localhost:5673 amq.direct
   )
2. create a queue on two of the brokers and bind each to amq.direct with the same binding key

   (e.g. qpid-config -a localhost:5673 add queue test-queue
   qpid-config -a localhost:5673 bind amq.direct test-queue abc

   qpid-config -a localhost:5674 add queue test-queue
   qpid-config -a localhost:5674 bind amq.direct test-queue abc)

3. send a message to amq.direct on the remaining broker (the first one, which has no test-queue), with routing key abc, and verify that each of these queues gets the message (each actually gets two copies, as there are two independent routes to it).

   ( e.g. echo "Message 1" | ./src/tests/sender --exchange amq.direct --routing-key abc
   ./src/tests/receiver --port 5673 --messages 2
   ./src/tests/receiver --port 5674 --messages 2)
   

4. now delete the queue on one of the brokers

   (e.g. qpid-config -a localhost:5673 del queue test-queue)

5. send another message to amq.direct on the first broker, with routing key abc and verify that the remaining queue sees this message

   (e.g. echo "Message 2" | ./src/tests/sender --exchange amq.direct --routing-key abc
   ./src/tests/receiver --port 5674)
  
Actual results:

No message is received

Expected results:

Message 2 should be received
Additional info:
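
The six qpid-route commands in step 1 follow a simple pattern: one dynamic route per ordered pair of brokers. A minimal sketch of that pattern is shown below. The function name mesh_routes and the DRY_RUN variable are hypothetical helpers introduced for illustration only; the sketch is not the attached reproducer script. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: emit one "qpid-route dynamic add" per ordered pair of brokers.
# mesh_routes and DRY_RUN are illustrative names, not part of qpid.
# Usage: mesh_routes <exchange> <broker>...
mesh_routes() {
    local exchange=$1; shift
    local dest src
    for dest in "$@"; do
        for src in "$@"; do
            # A broker is never federated with itself.
            [ "$dest" = "$src" ] && continue
            if [ "${DRY_RUN:-1}" = 1 ]; then
                # Default: print the command instead of running it.
                echo "qpid-route dynamic add $dest $src $exchange"
            else
                qpid-route dynamic add "$dest" "$src" "$exchange"
            fi
        done
    done
}

mesh_routes amq.direct localhost:5672 localhost:5673 localhost:5674
```

With DRY_RUN left at its default, this prints the same six commands as step 1, in the same order. Setting DRY_RUN=0 runs them and assumes qpid-route is on the PATH.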

Comment 1 Ken Giusti 2010-04-06 20:32:12 UTC
Created attachment 404783
bash shell script to repo the failure case

It should be run from the directory containing qpid/.

Comment 3 Ken Giusti 2010-04-07 15:38:50 UTC
Created attachment 405001
bash shell script to repo the failure case

The setup-3.sh bash script reproduces this problem, but it runs for several successful iterations before the problem is hit.

On a successful run, after the queue is unbound, the bindings look good:

Broker 46890
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_eea3f547-c09b-45e8-9bdc-5718e8e7d655
    bind [abc] => bridge_queue_1_f96e5868-800a-4be3-b0ed-e135d3f2ddbb
    bind [reply-localhost.localdomain.13737.1] => reply-localhost.localdomain.13737.1
Broker 36113
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_e9cef399-358c-4e15-b09c-c61ce547db43
    bind [abc] => bridge_queue_1_eea3f547-c09b-45e8-9bdc-5718e8e7d655
    bind [reply-localhost.localdomain.13752.1] => reply-localhost.localdomain.13752.1
Broker 41500
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_e9cef399-358c-4e15-b09c-c61ce547db43
    bind [abc] => bridge_queue_1_f96e5868-800a-4be3-b0ed-e135d3f2ddbb
    bind [reply-localhost.localdomain.13767.1] => reply-localhost.localdomain.13767.1
    bind [abc] => test-queue


However, in the failure case, the bindings are incorrect - some bindings for "abc" are missing:

Broker 60733
Exchange 'amq.direct' (direct)
    bind [reply-localhost.localdomain.14216.1] => reply-localhost.localdomain.14216.1
Broker 33302
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_3084df55-675e-4559-bf3d-7902696c476f
    bind [reply-localhost.localdomain.14231.1] => reply-localhost.localdomain.14231.1
Broker 53716
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_704becfd-36de-42e7-b0c0-a76059d4cbb4
    bind [abc] => bridge_queue_1_902711b2-588c-46c6-875c-3dd04bc8c1a2
    bind [reply-localhost.localdomain.14246.1] => reply-localhost.localdomain.14246.1
    bind [abc] => test-queue
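
Comparing listings like the two above by eye is error-prone, so a small filter can help. The sketch below counts, per broker, how many bridge-queue bindings exist for a given key. It assumes the listing has exactly the shape shown above (a "Broker <port>" line followed by "bind [key] => queue" lines); count_bindings is a hypothetical helper name, and the sample input is the failure-case listing from this comment.

```shell
#!/usr/bin/env bash
# Sketch: count bridge-queue bindings for one key per broker.
# Reads a listing on stdin; the key defaults to "abc".
count_bindings() {
    awk -v key="${1:-abc}" '
        # A "Broker <port>" line starts a new section.
        $1 == "Broker" { broker = $2; order[++n] = broker; count[broker] = 0 }
        # Count only bindings that point at a federation bridge queue.
        $1 == "bind" && $2 == ("[" key "]") && $4 ~ /^bridge_queue_/ { count[broker]++ }
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}

count_bindings abc <<'EOF'
Broker 60733
Exchange 'amq.direct' (direct)
    bind [reply-localhost.localdomain.14216.1] => reply-localhost.localdomain.14216.1
Broker 33302
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_3084df55-675e-4559-bf3d-7902696c476f
    bind [reply-localhost.localdomain.14231.1] => reply-localhost.localdomain.14231.1
Broker 53716
Exchange 'amq.direct' (direct)
    bind [abc] => bridge_queue_1_704becfd-36de-42e7-b0c0-a76059d4cbb4
    bind [abc] => bridge_queue_1_902711b2-588c-46c6-875c-3dd04bc8c1a2
    bind [reply-localhost.localdomain.14246.1] => reply-localhost.localdomain.14246.1
    bind [abc] => test-queue
EOF
# prints:
# 60733 0
# 33302 1
# 53716 2
```

Fed the successful listing instead, the same filter reports 2 for every broker, which makes the missing bindings in the failure case easy to spot.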

Comment 4 Ken Giusti 2010-04-08 18:10:46 UTC
Fix checked into upstream:

http://svn.apache.org/viewvc?view=revision&revision=932032

Upstream JIRA: https://issues.apache.org/jira/browse/QPID-2487

-K

Comment 5 ppecka 2010-05-26 14:42:06 UTC
verified on RHEL 4.8 / 5.5 - i386 / x86_64

#rpm -qa | grep qpid
qpid-cpp-server-store-0.7.946106-1.el5
rh-tests-distribution-MRG-Messaging-qpid_common-1.6-27
qpid-cpp-server-0.7.946106-1.el5
python-qpid-0.7.946106-1.el5
qpid-tools-0.7.946106-4.el5
qpid-java-client-0.7.946106-3.el5
qpid-cpp-server-devel-0.7.946106-1.el5
qpid-cpp-client-ssl-0.7.946106-1.el5
qpid-cpp-client-0.7.946106-1.el5
qpid-cpp-server-ssl-0.7.946106-1.el5
qpid-cpp-client-devel-docs-0.7.946106-1.el5
qpid-cpp-client-devel-0.7.946106-1.el5
qpid-java-common-0.7.946106-3.el5
qpid-cpp-server-cluster-0.7.946106-1.el5
qpid-cpp-server-xml-0.7.946106-1.el5


--> VERIFIED

Comment 6 Ken Giusti 2010-10-05 15:33:41 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
* Cause:  Dynamic federation of exchanges, using the same binding key on different brokers, then deleting one of the bound queues on one of the brokers.
* Consequence:  The routes to the other brokers are removed, however they should not be.
* Fix:  The broker now tracks the origin of each remote binding.
* Result: Deleting a queue will not cause routes to other queues using the same binding to be deleted.
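
The idea behind the fix can be illustrated with a toy model. This is only a sketch of the general technique (remembering which origin asked for each binding key and dropping the route only when no origin remains); it is not the qpid-cpp implementation, and the names origins, remote_bind, remote_unbind and route_active are hypothetical. It requires bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Toy model only: NOT the qpid-cpp implementation.
# origins[key] holds a space-separated list of origins that bound that key.
declare -A origins

remote_bind() {    # remote_bind <key> <origin>
    local key=$1 origin=$2
    case " ${origins[$key]:-} " in
        *" $origin "*) ;;   # this origin is already recorded for the key
        *) origins[$key]="${origins[$key]:+${origins[$key]} }$origin" ;;
    esac
}

remote_unbind() {  # remote_unbind <key> <origin>
    local key=$1 origin=$2 o kept=""
    # Rebuild the list without the origin that went away.
    for o in ${origins[$key]:-}; do
        [ "$o" = "$origin" ] || kept="${kept:+$kept }$o"
    done
    if [ -n "$kept" ]; then
        origins[$key]=$kept
    else
        unset "origins[$key]"   # last origin gone: the route may be removed
    fi
}

route_active() {   # succeeds while at least one origin still binds the key
    [ -n "${origins[$1]:-}" ]
}

remote_bind abc broker-5673
remote_bind abc broker-5674
remote_unbind abc broker-5673
route_active abc && echo "route for abc kept"
remote_unbind abc broker-5674
route_active abc || echo "route for abc removed"
```

Without the per-origin list, the first unbind would be indistinguishable from the last one, which is the behaviour described under "Consequence" above.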

Comment 7 Jaromir Hradilek 2010-10-06 13:03:02 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-* Cause:  Dynamic federation of exchanges, using the same binding key on different brokers, then deleting one of the bound queues on one of the brokers.
+When a dynamic federation of exchanges was in use and multiple brokers shared the same binding key, deleting a bound queue on one of them caused the routes to other queues with the same binding to be removed as well. To avoid this, the broker now tracks the origin of each remote binding, so that deleting a single queue no longer affects other queues.
-* Consequence:  The routes to the other brokers are removed, however they should not be.
-* Fix:  The broker now tracks the origin of each remote binding.
-* Result: Deleting a queue will not cause routes to other queues using the same binding to be deleted.

Comment 9 errata-xmlrpc 2010-10-14 16:07:57 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

