Logically the following scenario can exist:

1. Start a cluster with more than one node.
2. Publish durable messages (to a durable queue) to one node in the cluster.
3. Confirm they are on all nodes.
4. Kill one of the nodes.
5. (Optional) Publish some more messages.
6. Rejoin the cluster with the failed node. (This will fail.)

Reason: the joining node is synced from the running cluster, but also tries to recover from the store. What needs to happen is:

a.) The first node in a cluster to start needs to recover from the store.
b.) All joining nodes need to sync data, as they do today, but ignore any store they may have (the bug -- they don't ignore their store if they have one).
This can be worked around by identifying the node to start first, and removing the stores from the other nodes before restart.
In broker.cpp, the following must not be called for joining nodes:

    if (store.get() != 0) {
        RecoveryManagerImpl recoverer(queues, exchanges, links, dtxManager, conf.stagingThreshold);
        store->recover(recoverer);
    }
In revision 740793, Cluster sets the recovery flag on Broker for the first member in the cluster. Recovery from the local store is disabled when the recovery flag is not set.
Need store test case, tbd kim
Changing priority to high; set target milestone to 1.1.2.
*** Bug 486991 has been marked as a duplicate of this bug. ***
The error described in Bug 486991 (marked as a dup of this one) is the result of BDB errors when trying to set up the mandatory broker exchanges when they have already been restored. This happens on all cluster nodes which are not the first in the cluster and are restored from the persistence store. The work-around up until now has been to delete the store directory from all the nodes (or all the nodes except the first to be restarted) when there are messages to be recovered.

A fix now modifies the startup sequence of the store so that when a node is not the first in a cluster to restart and has been restored, the restored data is discarded and the store files are "pushed down" into a bak folder (in case the order of cluster recovery was incorrect, so the store from the other nodes can still be restored); the node is then restarted without recovery.

QA: This bug is easy to reproduce:
1. Start a multi-node cluster.
2. Shut down any node in the cluster.
3. Restart that node. The broker start will fail with the message "Exchange already exists: amq.direct (MessageStoreImpl.cpp:488)".
4. If all nodes are shut down, then all nodes after the first will fail with this error.

The built-in store python test test_Cluster_04_SingleClusterRemoveRestoreNodes tests this scenario.

qpid r. 773004
store r. 3368
Reproduced on RHEL5.3 i386. Related packages (mrg-devel repo):
qpidd-cluster-0.5.752581-5.el5
qpidd-0.5.752581-5.el5
openais-0.80.3-22.el5_3.4
Waiting for new packages to verify.
Backported qpid r.773004 onto git mrg_1.1.x branch: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commitdiff;h=441c88204cb0135564669d7b004d62a1bc03828a
Verified on qpidd-0.5.752581-28.el5, both i386 and x86_64.
Included in store backport for 1.2.
I forgot to mention rhm-0.5.3206-14.el5
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)
Modified the release note to the following: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)
+Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
*** Bug 539287 has been marked as a duplicate of this bug. ***
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,8 @@
-Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
+Messaging bug fix
+
+C: When a node in a cluster failed, and was then brought back up, it was attempting to sync with both the store, and the running cluster
+C: The node that was attempting to rejoin the running cluster failed
+F: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster.
+R: Rejoining a running cluster now operates as expected.
+
+When a node in a cluster failed, and was then brought back up, it was attempting to restore using information from both the store, and the running master node. This resulted in the node that was attempting to rejoin failing. This has been corrected, so that only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster. Rejoining a running cluster now operates as expected.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-1633.html