Bug 441698

Summary: Feature: Support async queue replication
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: beta
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: urgent
Reporter: Gordon Sim <gsim>
Assignee: Gordon Sim <gsim>
QA Contact: Kim van der Riet <kim.vdriet>
CC: freznice, iboverma
Target Milestone: 1.1.1
Doc Type: Bug Fix
Last Closed: 2009-04-21 16:16:22 UTC
Bug Blocks: 478882
Attachments:
* Notes on setting up and using the async replication feature (flags: none)

Description Gordon Sim 2008-04-09 15:25:04 UTC
For disaster recovery there is a desire to have asynchronous replication of
changes to particular queues from one data centre to a passive replica at
another site.

On failure some updates could be lost due to the async nature of the
replication; this is acceptable. The switch to the passive backup would require
manual intervention. The backup system should be available within seconds of
being made active; it could take a further hour or so to recover all the
messages for very large queues.

Comment 1 Sergei Vorobiev 2008-06-19 14:41:03 UTC
Asynchronous queue replication from one clustered "primary" broker to another
clustered "secondary" broker (asynchronous site to site replication). For
disaster recovery purposes there is a need to have asynchronous sequential
replication of changes (enqueue and de-queue of messages) made to queues on
primary broker in one data center to a secondary broker at another data center.
Although asynchronous, the updates' sequence should be precisely the same on
secondary broker queues as it was on the primary broker queues in the primary
site. Transactional changes to the primary broker queues should be applied to
secondary broker's queues in the same transactional fashion. Secondary broker
queues are not used (no messages are sent to or read from queues) by any AMQP
clients until the replica is manually activated in case of a DR. In case of a
failure all updates done on primary broker that have not been copied to the
secondary broker could be lost due to the asynchronous nature of the
replication. Switching the secondary broker to become active in case of DR is
manual, in which case it could take some time to recover all the messages for
very large queues.
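
The ordering requirement above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the qpid implementation: the class names PrimaryQueue and ReplicaQueue, the in-memory event log, and the "limit" parameter used to model replication lag are all hypothetical, and transactional grouping is not modelled.

```python
from collections import deque


class PrimaryQueue:
    """Toy primary queue that records each change as an ordered event.

    Hypothetical model only; it does not reflect qpid internals.
    """

    def __init__(self):
        self.messages = deque()
        self.events = []  # ordered log of ("enqueue" | "dequeue", message)

    def enqueue(self, msg):
        self.messages.append(msg)
        self.events.append(("enqueue", msg))

    def dequeue(self):
        msg = self.messages.popleft()
        self.events.append(("dequeue", msg))
        return msg


class ReplicaQueue:
    """Toy passive replica that replays the primary's events in order."""

    def __init__(self):
        self.messages = deque()
        self.applied = 0  # number of events applied so far

    def apply(self, events, limit=None):
        """Apply not-yet-seen events in their original order.

        `limit` caps how many events are applied in this call, which
        models the replica lagging behind the primary.
        """
        pending = events[self.applied:]
        if limit is not None:
            pending = pending[:limit]
        for kind, msg in pending:
            if kind == "enqueue":
                self.messages.append(msg)
            else:
                self.messages.remove(msg)
            self.applied += 1


primary = PrimaryQueue()
for m in ["a", "b", "c"]:
    primary.enqueue(m)
primary.dequeue()

replica = ReplicaQueue()
replica.apply(primary.events, limit=2)  # replica lags by two events
lagging_state = list(replica.messages)  # what a failover now would see
replica.apply(primary.events)           # replica catches up
```

If the primary failed while the replica was still lagging, the two unapplied events would be lost, which is the accepted trade-off described above. Once the replica catches up, its queue contents match the primary's.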

Comment 2 Gordon Sim 2009-01-26 12:19:33 UTC
Created attachment 329978
Notes on setting up and using the async replication feature

This text has also been added to the upstream wiki:

http://cwiki.apache.org/confluence/display/qpid/queue+state+replication

Comment 3 Gordon Sim 2009-01-26 12:23:51 UTC
See the attachment above for details on this feature. There are also two test scenarios checked in to the qpid svn:

* src/tests/replication_test

Simple low-volume verification of the basic functionality of replication: after configuration, messages are sent and acknowledged, then the state of the backup queues is validated. Both enqueue-and-dequeue and enqueue-only modes are tested. This is run as part of 'make check'.

* src/tests/reliable_replication_test

This tests reliability in the face of bridge/link failure. It only deals with the enqueue-only mode, as the other functionality is covered by the test above. During replication of a large volume of messages, the link is periodically destroyed and then re-created; at the end, the test ensures that all expected messages were replicated with no duplicates.
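
The final check in that test (every expected message arrived, none arrived twice) can be sketched as follows. This is not the test script from the qpid tree; the function name check_replication and the use of plain message identifiers are assumptions made for illustration.

```python
from collections import Counter


def check_replication(expected, received):
    """Compare expected message ids against those found on the backup.

    Returns (missing, duplicated): ids that never arrived, and ids
    that arrived more than once. Hypothetical helper, not qpid code.
    """
    counts = Counter(received)
    missing = [m for m in expected if counts[m] == 0]
    duplicated = sorted(m for m, n in counts.items() if n > 1)
    return missing, duplicated


# A clean run reports nothing missing and nothing duplicated.
clean = check_replication(["m1", "m2", "m3"], ["m1", "m2", "m3"])

# A faulty run: m2 was lost and m1 was delivered twice.
faulty = check_replication(["m1", "m2", "m3"], ["m1", "m1", "m3"])
```

A replication run passes only when both returned lists are empty.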

Comment 4 Gordon Sim 2009-01-26 14:45:41 UTC
Should have also added that src/tests/reliable_replication_test is automated as part of 'make check-long' (you need to be in src/tests to run that).

Comment 6 Frantisek Reznicek 2009-03-02 15:14:10 UTC
The feature has been implemented and the above-mentioned tests are passing on RHEL 4.7 / 5.3, i386 / x86_64, with packages:
qpidd-0.4.744917-1.el5, rhm-0.4.3116-3.el5, python-qpid-0.4.743856-1.el5


->VERIFIED

Comment 8 errata-xmlrpc 2009-04-21 16:16:22 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0434.html