Bug 462461

Summary: Clustering broker fail-over must replicate federation links
Product: Red Hat Enterprise MRG
Reporter: William Henry <whenry>
Component: qpid-cpp
Assignee: mick <mgoulish>
Status: CLOSED ERRATA
QA Contact: Frantisek Reznicek <freznice>
Severity: high
Docs Contact:
Priority: medium
Version: 1.1
CC: esammons, freznice, gsim, jonathan.robie, tross
Target Milestone: 1.3
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When a cluster was used as a single node in a federation, shutting down one of its brokers may have caused the transmission on a federated link to fail. This error has been fixed, federated links are now properly replicated to other members of the cluster, and the shutdown of one of the brokers in a cluster no longer affects the transmissions.
Last Closed: 2010-10-14 16:14:20 UTC

Description William Henry 2008-09-16 14:33:16 UTC
Description of problem:

If a broker in a cluster has federation links then:

1. These links must be replicated, in an inactive state, to the other brokers in the cluster in case of broker failure.
2. If the broker fails, the failover broker must re-establish the federation links so that the cluster maintains a similar state.
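
The two requirements above can be illustrated with a small toy model. This is only a sketch of the intended behaviour and not the qpid-cpp implementation: the `Broker` and `Cluster` classes, the link name, and the rule that the first surviving member takes over are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    name: str
    # link name -> True if this broker currently drives the link
    links: dict = field(default_factory=dict)


class Cluster:
    """Hypothetical model of a cluster that replicates federation links."""

    def __init__(self, names):
        self.brokers = {n: Broker(n) for n in names}

    def add_link(self, owner, link):
        # Requirement 1: every member records the link,
        # but only the owner holds it in the active state.
        for broker in self.brokers.values():
            broker.links[link] = (broker.name == owner)

    def fail(self, name):
        # Requirement 2: a surviving member re-establishes
        # the links that the failed broker was driving.
        failed = self.brokers.pop(name)
        if not self.brokers:
            return
        successor = next(iter(self.brokers.values()))
        for link, was_active in failed.links.items():
            if was_active:
                successor.links[link] = True

    def active_owner(self, link):
        for broker in self.brokers.values():
            if broker.links.get(link):
                return broker.name
        return None


cluster = Cluster(["broker-a", "broker-b"])
cluster.add_link("broker-a", "link-to-remote")
cluster.fail("broker-a")
print(cluster.active_owner("link-to-remote"))  # prints: broker-b
```

Because the inactive copy already exists on every member, the failover broker only has to activate it; no link configuration is lost when the owner dies.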

Version-Release number of selected component (if applicable):
1.1

Comment 1 Gordon Sim 2009-02-17 12:48:13 UTC
Mick, I believe this is now implemented but we need an automated test. Could I ask you to add one to the 'check-long' set?

1. start two clusters each of two nodes
2. create a federation bridge between them
3. have a sender send messages to one cluster that will be routed by the bridge to the other cluster where a receiver checks them
4. kill a node in either cluster and verify that this does not halt the flow of messages through the bridge
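
The shape of such a test can be sketched as a pure simulation. This does not start real brokers or call any qpid tool; `ToyCluster`, `ToyBridge` and `run_scenario` are hypothetical names, and the sketch only kills a node in the sending cluster, whereas the real test should cover either side.

```python
class ToyCluster:
    """Hypothetical stand-in for a cluster of broker nodes."""

    def __init__(self, name, size=2):
        self.name = name
        self.nodes = [f"{name}-node{i}" for i in range(size)]
        self.queue = []  # messages delivered to this cluster

    def kill_one(self):
        self.nodes.pop(0)

    def alive(self):
        return bool(self.nodes)


class ToyBridge:
    """Hypothetical federation bridge hosted by one node of the source cluster."""

    def __init__(self, src, dst, replicated=True):
        self.src, self.dst = src, dst
        self.replicated = replicated
        self.carrier = src.nodes[0]  # node currently hosting the link

    def forward(self, message):
        if self.carrier not in self.src.nodes:
            # The hosting node died: only a replicated link can move
            # to a surviving member of the source cluster.
            if not (self.replicated and self.src.alive()):
                return False
            self.carrier = self.src.nodes[0]
        self.dst.queue.append(message)
        return True


def run_scenario(replicated, total=10, kill_at=5):
    src, dst = ToyCluster("src"), ToyCluster("dst")  # step 1
    bridge = ToyBridge(src, dst, replicated)         # step 2
    for seq in range(total):                         # step 3
        if seq == kill_at:
            src.kill_one()                           # step 4
        bridge.forward(seq)
    return dst.queue


print(run_scenario(replicated=True))   # all ten messages arrive
print(run_scenario(replicated=False))  # flow stops after the kill
```

The receiver-side check in the real test would correspond to comparing the returned list against the full sequence of sent messages.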

Comment 2 Gordon Sim 2009-02-17 12:50:50 UTC
*** Bug 470087 has been marked as a duplicate of this bug. ***

Comment 6 mick 2010-10-05 19:09:54 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:        
Federation links were not replicated from one clustered broker to another.


Consequence:  
If a cluster was used as one node in a federation, the death of one of the clustered brokers could cause the federated link to fail.


Fix:          
Replicate federation links to other members of the cluster.


Result:
Killing clustered brokers and adding new ones does not halt transmission on a federated link to a cluster.

Comment 7 Jaromir Hradilek 2010-10-06 15:22:54 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,14 +1 @@
-Cause:        
+When a cluster was used as a single node in a federation, shutting down one of its brokers may have caused the transmission on a federated link to fail. This error has been fixed, federated links are now properly replicated to other members of the cluster, and the shut down of one of the brokers in a cluster no longer affects the transmissions.
-Federation links were not replicated from one clustered broker to another.
-
-
-Consequence:  
-If a cluster was used as one node in a federation, the death of one of the clustered brokers could cause the federated link to fail.
-
-
-Fix:          
-Replicate federation links to other members of the cluster.
-
-
-Result:
-Killing clustered brokers and adding new ones does not halt transmission on a federated link to a cluster.

Comment 8 Douglas Silas 2010-10-11 09:41:46 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-When a cluster was used as a single node in a federation, shutting down one of its brokers may have caused the transmission on a federated link to fail. This error has been fixed, federated links are now properly replicated to other members of the cluster, and the shut down of one of the brokers in a cluster no longer affects the transmissions.
+When a cluster was used as a single node in a federation, shutting down one of its brokers may have caused the transmission on a federated link to fail. This error has been fixed, federated links are now properly replicated to other members of the cluster, and the shutdown of one of the brokers in a cluster no longer affects the transmissions.

Comment 9 Frantisek Reznicek 2010-10-11 14:30:00 UTC
The issue has been fixed; tested on RHEL 5.5 i386 / x86_64 with the following packages:
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-*0.7.946106-17.el5
qpid-cpp-*-0.7.946106-17.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-client-0.7.946106-10.el5
qpid-java-common-0.7.946106-10.el5
qpid-tools-0.7.946106-11.el5
ruby-qmf-0.7.946106-17.el5
ruby-qpid-0.7.946106-2.el5

-> VERIFIED

Comment 11 errata-xmlrpc 2010-10-14 16:14:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html