Bug 483807 - resolve join state for store recover in cluster for joining nodes
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All Linux
Priority: high  Severity: high
Target Milestone: 1.2
Target Release: ---
Assigned To: Kim van der Riet
QA Contact: Jan Sarenik
Duplicates: 486991 539287
Depends On:
Blocks: 527551
Reported: 2009-02-03 13:04 EST by Carl Trieloff
Modified: 2010-10-23 03:28 EDT
CC: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Messaging bug fix
C: When a node in a cluster failed and was then brought back up, it attempted to sync with both the store and the running cluster.
C: The node that was attempting to rejoin the running cluster failed.
F: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster.
R: Rejoining a running cluster now operates as expected.

When a node in a cluster failed and was then brought back up, it attempted to restore using information from both the store and the running master node. This resulted in the failure of the node that was attempting to rejoin. This has been corrected so that only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster. Rejoining a running cluster now operates as expected.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-12-03 04:17:43 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Carl Trieloff 2009-02-03 13:04:13 EST
Logically the following scenario can exist:

1. start a cluster with more than one node
2. publish durable messages (to a durable queue) to one node in the cluster
3. confirm they are on all nodes
4. kill one of the nodes
5. (optional) publish some more messages
6. rejoin the cluster with the failed node (this will fail)

Reason: the joining node will be synced from the running cluster, but will also try to recover from its local store.

What needs to happen is:

a.) The first node in a cluster to start needs to recover the store.
b.) All joining nodes need to sync data, as they do today, but ignore any store they may have (the bug: they don't ignore their store if they have one).
Comment 1 Carl Trieloff 2009-02-03 13:05:08 EST
This can be worked around by identifying the node to start first, and removing the stores from the other nodes before restart.
Comment 2 Carl Trieloff 2009-02-03 13:19:02 EST
In broker.cpp, this block:
    if (store.get() != 0) {
        // Runs unconditionally on startup, so a joining node recovers from
        // its local store as well as syncing from the running cluster.
        RecoveryManagerImpl recoverer(queues, exchanges, links, dtxManager,
                                      conf.stagingThreshold);
        store->recover(recoverer);
    }

must not be called for joining nodes.
Comment 3 Alan Conway 2009-02-04 12:05:16 EST
In revision 740793

Cluster sets recovery flag on Broker for first member in cluster.
Disable recovery from local store if the recovery flag is not set.
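
The shape of that change, as a minimal self-contained sketch (the names here are illustrative, not the actual qpid-cpp Broker/Cluster types):

    #include <iostream>

    // Sketch only: the cluster layer sets a recovery flag for the first
    // member, and store recovery is skipped whenever the flag is unset.
    struct Broker {
        bool hasStore;     // a persistence store is configured
        bool doRecovery;   // set by the cluster for the first member only

        void start() const {
            if (hasStore && doRecovery)
                std::cout << "first member: recovering from local store\n";
            else if (hasStore)
                std::cout << "joining member: skipping store recovery, syncing from cluster\n";
        }
    };

    int main() {
        Broker first{true, true};    // first node started in the cluster
        Broker joiner{true, false};  // node joining the running cluster
        first.start();
        joiner.start();
    }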
Comment 4 Carl Trieloff 2009-02-04 12:29:02 EST
Need a store test case; TBD Kim.
Comment 5 Kim van der Riet 2009-02-04 12:42:35 EST
Changing priority to high; set target milestone to 1.1.2.
Comment 6 Kim van der Riet 2009-05-08 09:52:33 EDT
*** Bug 486991 has been marked as a duplicate of this bug. ***
Comment 7 Kim van der Riet 2009-05-08 10:28:29 EDT
The error described in Bug 486991 (marked as a dup of this one) is the result of BDB errors when trying to set up mandatory broker exchanges that have already been restored. This happens on all cluster nodes that are not the first in the cluster and are restored from the persistence store.

The work-around up until now has been to delete the store directory from all the nodes (or all the nodes except the first to be restarted) when there are messages to be recovered.

A fix now modifies the startup sequence of the store: when a node is not the first in a cluster to restart and has been restored, the restored data is discarded and the store files are "pushed down" into a bak folder (so that, if the order of cluster recovery was incorrect, the pushed-down store can still be restored); the node is then restarted without recovery.
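
A hedged sketch of the push-down step, assuming std::filesystem and an illustrative store path (the actual store code predates C++17 and manages its journal files itself):

    #include <filesystem>
    #include <vector>

    namespace fs = std::filesystem;

    // Sketch only: move the existing store contents into a "bak"
    // subdirectory so the data is preserved (in case the cluster was
    // recovered in the wrong order) but is not used for recovery.
    void pushDownStore(const fs::path& storeDir) {
        const fs::path bakDir = storeDir / "bak";
        fs::create_directories(bakDir);
        // Collect entries first; renaming while iterating is unsafe.
        std::vector<fs::path> entries;
        for (const auto& e : fs::directory_iterator(storeDir))
            if (e.path() != bakDir)
                entries.push_back(e.path());
        for (const auto& p : entries)
            fs::rename(p, bakDir / p.filename());
    }

    int main() {
        pushDownStore("/var/lib/qpidd");  // illustrative store location
    }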

QA: This bug is easy to reproduce:
1. Start a multi-node cluster.
2. Shut down any node in the cluster.
3. Restart that node. The broker start will fail with an "Exchange already exists: amq.direct (MessageStoreImpl.cpp:488)" message.
4. If all nodes are shut down and then restarted, all nodes after the first will fail with this error.

The built-in store Python test test_Cluster_04_SingleClusterRemoveRestoreNodes covers this scenario.

qpid r. 773004
store r. 3368
Comment 8 Jan Sarenik 2009-05-12 05:05:28 EDT
Reproduced on RHEL5.3 i386.

Related packages (mrg-devel repo):
 qpidd-cluster-0.5.752581-5.el5
 qpidd-0.5.752581-5.el5
 openais-0.80.3-22.el5_3.4

Waiting for new packages to verify.
Comment 10 Kim van der Riet 2009-10-05 14:35:19 EDT
Backported qpid r.773004 onto git mrg_1.1.x branch: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commitdiff;h=441c88204cb0135564669d7b004d62a1bc03828a
Comment 11 Jan Sarenik 2009-10-08 09:02:25 EDT
Verified on qpidd-0.5.752581-28.el5, both i386 and x86_64.
Comment 12 Kim van der Riet 2009-10-08 09:48:43 EDT
Included in store backport for 1.2.
Comment 13 Jan Sarenik 2009-10-09 04:31:06 EDT
I forgot to mention rhm-0.5.3206-14.el5
Comment 14 Irina Boverman 2009-10-28 13:35:05 EDT
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)
Comment 15 Kim van der Riet 2009-10-29 15:31:15 EDT
Modified the release note to the following:

Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
Comment 16 Kim van der Riet 2009-10-29 15:31:15 EDT
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)
+Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
Comment 17 Gordon Sim 2009-11-19 14:59:27 EST
*** Bug 539287 has been marked as a duplicate of this bug. ***
Comment 18 Lana Brindley 2009-11-23 01:50:01 EST
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,8 @@
-Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
+Messaging bug fix
+
+C: When a node in a cluster failed, and was then brought back up, it was attempting to sync with both the store, and the running cluster
+C: The node that was attempting to rejoin the running cluster failed
+F: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster.
+R: Rejoining a running cluster now operates as expected.
+
+When a node in a cluster failed, and was then brought back up, it was attempting to restore using information from both the store, and the running master node. This resulted in the node that was attempting to rejoin failing. This has been corrected, so that only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster. Rejoining a running cluster now operates as expected.
Comment 20 errata-xmlrpc 2009-12-03 04:17:43 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1633.html
