Bug 483807 - resolve join state for store recover in cluster for joining nodes
Summary: resolve join state for store recover in cluster for joining nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.2
Assignee: Kim van der Riet
QA Contact: Jan Sarenik
URL:
Whiteboard:
Duplicates: 486991 539287
Depends On:
Blocks: 527551
 
Reported: 2009-02-03 18:04 UTC by Carl Trieloff
Modified: 2018-10-27 14:57 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Messaging bug fix
C: When a node in a cluster failed and was then brought back up, it attempted to sync with both the store and the running cluster.
C: The node attempting to rejoin the running cluster failed.
F: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster.
R: Rejoining a running cluster now operates as expected.

When a node in a cluster failed and was then brought back up, it attempted to restore using information from both the store and the running master node. This resulted in the failure of the node attempting to rejoin. This has been corrected so that only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster. Rejoining a running cluster now operates as expected.
Clone Of:
Environment:
Last Closed: 2009-12-03 09:17:43 UTC
Target Upstream Version:
Embargoed:




Links:
System ID:    Red Hat Product Errata RHEA-2009:1633
Priority:     normal
Status:       SHIPPED_LIVE
Summary:      Red Hat Enterprise MRG Messaging and Grid Version 1.2
Last Updated: 2009-12-03 09:15:33 UTC

Description Carl Trieloff 2009-02-03 18:04:13 UTC
Logically the following scenario can exist:

1. start a cluster, more than one node
2. publish durable messages (to durable queue) to one node in the cluster
3. confirm, they are on all nodes
4. kill one of the nodes
5. (optional) publish some more messages
6. rejoin the cluster with the failed node (this will fail)

Reason: the rejoining node will be synced from the running cluster, but will also try to recover from its own store.

What needs to happen is:

a.) The first node in a cluster to start needs to recover the store.
b.) All joining nodes need to sync data, as they do today, but ignore any store they may have (the bug: they don't ignore their store if they have one).

Comment 1 Carl Trieloff 2009-02-03 18:05:08 UTC
This can be worked around by identifying the node to start first, and removing the stores from the other nodes before restart.

Comment 2 Carl Trieloff 2009-02-03 18:19:02 UTC
In broker.cpp:

    // Recover queues, exchanges, links and dtx records from the local store.
    if (store.get() != 0) {
        RecoveryManagerImpl recoverer(queues, exchanges, links, dtxManager,
                                      conf.stagingThreshold);
        store->recover(recoverer);
    }

This block must not be called for joining nodes.
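A minimal sketch of such a guard, assuming a boolean flag on the Broker (the name recoverFromStore below is illustrative, not the actual qpid-cpp identifier):

    // Hypothetical: only the designated first cluster member recovers
    // from its local store; all other joiners skip recovery entirely.
    if (store.get() != 0 && recoverFromStore) {
        RecoveryManagerImpl recoverer(queues, exchanges, links, dtxManager,
                                      conf.stagingThreshold);
        store->recover(recoverer);
    }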

Comment 3 Alan Conway 2009-02-04 17:05:16 UTC
In revision 740793

Cluster sets recovery flag on Broker for first member in cluster.
Disable recovery from local store if the recovery flag is not set.
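A rough sketch of what the cluster side of this could look like (the class and method names here are assumptions based on the commit message, not verified against r740793):

    // Hypothetical: when the broker joins a cluster, store recovery stays
    // enabled only for the first member; every later joiner has it
    // switched off before recovery would run.
    void Cluster::initialize(broker::Broker& broker) {
        broker.setRecovery(isFirstMember());
    }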

Comment 4 Carl Trieloff 2009-02-04 17:29:02 UTC
Need store test case, tbd kim

Comment 5 Kim van der Riet 2009-02-04 17:42:35 UTC
Changing priority to high; set target milestone to 1.1.2.

Comment 6 Kim van der Riet 2009-05-08 13:52:33 UTC
*** Bug 486991 has been marked as a duplicate of this bug. ***

Comment 7 Kim van der Riet 2009-05-08 14:28:29 UTC
The error described in Bug 486991 (marked as a dup of this one) is the result of BDB errors when trying to set up mandatory broker exchanges when they have already been restored. This happens on all cluster nodes which are not the first in the cluster and are restored from the persistence store.

The work-around up until now has been to delete the store directory from all the nodes (or all the nodes except the first to be restarted) when there are messages to be recovered.

A fix now modifies the startup sequence of the store: when a node is not the first in a cluster to restart and has been restored, the restored data is discarded and the store files are "pushed down" into a bak folder (in case the order of cluster recovery was incorrect and the store contents are still needed); the node is then restarted without recovery.
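For illustration, a sketch of the "push down" idea using std::filesystem (C++17); the actual store implementation predates this and manages its journal files differently, and pushDownStore is a hypothetical helper name:

    #include <filesystem>
    namespace fs = std::filesystem;

    // Move an existing store directory aside into a "bak" folder so the
    // joining node starts clean, but the old data remains recoverable.
    void pushDownStore(const fs::path& storeDir) {
        if (!fs::exists(storeDir)) return;
        fs::path bakDir = storeDir.parent_path() / "bak";
        fs::create_directories(bakDir);
        // A real implementation would pick a unique target name to avoid
        // clobbering an earlier backup.
        fs::rename(storeDir, bakDir / storeDir.filename());
        fs::create_directories(storeDir);  // fresh, empty store directory
    }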

QA: This bug is easy to reproduce:
1. Start a multi-node cluster.
2. Shut down any node in the cluster.
3. Restart that node. The broker start will fail with the message "Exchange already exists: amq.direct (MessageStoreImpl.cpp:488)".
4. If all nodes are shut down and then restarted, all nodes after the first will fail with this error.

Built-in store python test test_Cluster_04_SingleClusterRemoveRestoreNodes tests this scenario.

qpid r. 773004
store r. 3368

Comment 8 Jan Sarenik 2009-05-12 09:05:28 UTC
Reproduced on RHEL5.3 i386.

Related packages (mrg-devel repo):
 qpidd-cluster-0.5.752581-5.el5
 qpidd-0.5.752581-5.el5
 openais-0.80.3-22.el5_3.4

Waiting for new packages to verify.

Comment 10 Kim van der Riet 2009-10-05 18:35:19 UTC
Backported qpid r.773004 onto git mrg_1.1.x branch: http://git.et.redhat.com/git/qpid.git/?p=qpid.git;a=commitdiff;h=441c88204cb0135564669d7b004d62a1bc03828a

Comment 11 Jan Sarenik 2009-10-08 13:02:25 UTC
Verified on qpidd-0.5.752581-28.el5, both i386 and x86_64.

Comment 12 Kim van der Riet 2009-10-08 13:48:43 UTC
Included in store backport for 1.2.

Comment 13 Jan Sarenik 2009-10-09 08:31:06 UTC
I forgot to mention rhm-0.5.3206-14.el5

Comment 14 Irina Boverman 2009-10-28 17:35:05 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)

Comment 15 Kim van der Riet 2009-10-29 19:31:15 UTC
Modified the release note to the following:

Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)

Comment 16 Kim van der Riet 2009-10-29 19:31:15 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Cluster joining nodes now recover correctly by preserving (instead of replicating) any stored data they already had prior to rejoining (483807)
+Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)

Comment 17 Gordon Sim 2009-11-19 19:59:27 UTC
*** Bug 539287 has been marked as a duplicate of this bug. ***

Comment 18 Lana Brindley 2009-11-23 06:50:01 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,8 @@
-Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data (the store files will be pushed down into a bak directory) and will instead synchronize with the master node in the cluster. (483807)
+Messaging bug fix
+
+C: When a node in a cluster failed and was then brought back up, it attempted to sync with both the store and the running cluster.
+C: The node attempting to rejoin the running cluster failed.
+F: Only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster.
+R: Rejoining a running cluster now operates as expected.
+
+When a node in a cluster failed and was then brought back up, it attempted to restore using information from both the store and the running master node. This resulted in the failure of the node attempting to rejoin. This has been corrected so that only the first node started in a cluster will restore from the store. All subsequent nodes added to the cluster will discard the store data and will synchronize with the master node in the cluster. Rejoining a running cluster now operates as expected.

Comment 20 errata-xmlrpc 2009-12-03 09:17:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1633.html

