441698 – Feature: Support async queue replication

Summary: Feature: Support async queue replication

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	qpid-cpp
Sub Component:
Version:	beta
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	low
Target Milestone:	1.1.1
Target Release:	---
Assignee:	Gordon Sim
QA Contact:	Kim van der Riet
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	478882

Reported:	2008-04-09 15:25 UTC by Gordon Sim
Modified:	2009-04-21 16:16 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-04-21 16:16:22 UTC
Target Upstream Version:
Embargoed:

Attachments
Notes on setting up and using the async replication feature (8.69 KB, text/plain) 2009-01-26 12:19 UTC, Gordon Sim	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2009:0434	0	normal	SHIPPED_LIVE	Red Hat Enterprise MRG Messaging and Grid Version 1.1.1	2009-04-21 16:15:50 UTC

Description Gordon Sim 2008-04-09 15:25:04 UTC

For disaster recovery there is a desire to have asynchronous replication of
changes to particular queues from one data centre to a passive replica at
another site.

On failure some updates could be lost due to the async nature of the
replication; this is acceptable. The switch to the passive backup would require
manual intervention. The backup system should be available within seconds of
being made active; it could take a further hour or so to recover all the
messages for very large queues.

Comment 1 Sergei Vorobiev 2008-06-19 14:41:03 UTC

Asynchronous queue replication from one clustered "primary" broker to another
clustered "secondary" broker (asynchronous site-to-site replication). For
disaster recovery (DR) purposes there is a need for asynchronous, sequential
replication of changes (enqueue and dequeue of messages) made to queues on the
primary broker in one data center to a secondary broker at another data center.
Although the replication is asynchronous, updates should be applied to the
secondary broker's queues in precisely the same sequence as on the primary
broker's queues at the primary site. Transactional changes to the primary
broker's queues should be applied to the secondary broker's queues in the same
transactional fashion. The secondary broker's queues are not used by any AMQP
clients (no messages are sent to or read from them) until the replica is
manually activated in a DR event. If the primary fails, any updates made on
the primary broker that have not yet been copied to the secondary broker could
be lost due to the asynchronous nature of the replication. Switching the
secondary broker to active in a DR event is manual, and it could then take some
time to recover all the messages for very large queues.
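
The ordering and loss semantics described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the qpid implementation: `BackupQueue`, `replicate`, and the `(kind, msg_id, body)` event tuples are hypothetical names invented for illustration. It assumes the primary emits an ordered log of enqueue/dequeue events and that only a prefix of that log has reached the secondary when the primary fails.

```python
from collections import OrderedDict


class BackupQueue:
    """Hypothetical in-memory stand-in for a replica queue (not a qpid API)."""

    def __init__(self):
        # Insertion order models the queue order on the secondary.
        self.messages = OrderedDict()

    def apply(self, event):
        """Apply a single replication event in the order it was received."""
        kind, msg_id, body = event
        if kind == "enqueue":
            self.messages[msg_id] = body
        elif kind == "dequeue":
            self.messages.pop(msg_id, None)
        else:
            raise ValueError("unknown event kind: %r" % (kind,))

    def apply_transaction(self, events):
        """Apply a group of events all-or-nothing, mirroring a transaction."""
        staged = BackupQueue()
        staged.messages = OrderedDict(self.messages)
        for event in events:
            staged.apply(event)
        # Only reached if every event in the group applied cleanly.
        self.messages = staged.messages


def replicate(event_log, backup, shipped):
    """Apply only the first `shipped` events; the rest model updates that
    were lost because the primary failed before they were copied."""
    for event in event_log[:shipped]:
        backup.apply(event)
    return backup


log = [
    ("enqueue", 1, "a"),
    ("enqueue", 2, "b"),
    ("dequeue", 1, None),
    ("enqueue", 3, "c"),
]
# The primary fails after three of the four events have been shipped.
backup = replicate(log, BackupQueue(), shipped=3)
print(list(backup.messages.items()))  # [(2, 'b')]
```

In this model the fourth event never reaches the secondary, so message 3 is missing from the backup, which is the kind of loss the requirement accepts.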

Comment 2 Gordon Sim 2009-01-26 12:19:33 UTC

Created attachment 329978
Notes on setting up and using the async replication feature

This text has also been added to the upstream wiki:

http://cwiki.apache.org/confluence/display/qpid/queue+state+replication

Comment 3 Gordon Sim 2009-01-26 12:23:51 UTC

See the attachment above for details on this feature. There are also two test scenarios checked in to the qpid svn:

* src/tests/replication_test

Simple low-volume verification of the basic functionality of replication: after configuration, messages are sent and acknowledged, then the state of the backup queues is validated. Both enqueue-and-dequeue and enqueue-only modes are tested. This is run as part of 'make check'.

* src/tests/reliable_replication_test

This tests reliability in the face of bridge/link failure. It only deals with the enqueue-only mode, as the other functionality is covered by the test above. During replication of a large volume of messages, the link is periodically destroyed and re-created; at the end, the test ensures that all expected messages were replicated with no duplicates.
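
The final check described for the reliability test (every expected message present, none duplicated) could be expressed roughly as follows. This is an illustrative sketch only, not the checked-in test script: `check_replication` and the use of integer message ids are assumptions made for the example.

```python
from collections import Counter


def check_replication(expected_ids, replicated_ids):
    """Compare the ids sent to the primary with the ids found on the backup.

    Hypothetical helper: reports ids that never arrived, ids that arrived
    more than once, and ids that were never sent.
    """
    expected = set(expected_ids)
    counts = Counter(replicated_ids)
    return {
        "missing": sorted(expected - set(counts)),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        "unexpected": sorted(set(counts) - expected),
    }


# Message 3 was lost and message 2 was delivered twice.
report = check_replication([1, 2, 3, 4], [1, 2, 2, 4])
print(report)  # {'missing': [3], 'duplicates': [2], 'unexpected': []}
```

A run passes in this model only when all three lists are empty.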

Comment 4 Gordon Sim 2009-01-26 14:45:41 UTC

I should also have added that src/tests/reliable_replication_test is automated as part of 'make check-long' (you need to be in src/tests for that).

Comment 6 Frantisek Reznicek 2009-03-02 15:14:10 UTC

The feature has been implemented and the above-mentioned tests pass on RHEL 4.7 / 5.3 i386 / x86_64 with packages:
qpidd-0.4.744917-1.el5, rhm-0.4.3116-3.el5, python-qpid-0.4.743856-1.el5


->VERIFIED

Comment 8 errata-xmlrpc 2009-04-21 16:16:22 UTC

An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0434.html
