Bug 752560

Summary: Correct documentation regarding recovery from dirty stores
Product: Red Hat Enterprise MRG
Reporter: Justin Ross <jross>
Component: Messaging_Programming_Reference
Assignee: Cheryn Tan <chetan>
Status: CLOSED CURRENTRELEASE
QA Contact: Leonid Zhaldybin <lzhaldyb>
Severity: high
Docs Contact:
Priority: high
Version: 2.0
CC: esammons, iboverma, lbrindle, lzhaldyb, mgoulish, rlandman, spurrier
Target Milestone: 2.1.2   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-05-02 05:31:31 UTC
Type: ---

Description Justin Ross 2011-11-09 20:52:01 UTC
Mick will expand on this.

Comment 1 mick 2011-11-14 15:37:10 UTC
Usage problems seem significant enough with the current persistent-
cluster restart strategy that I am suggesting a new approach.  It is outlined 
in a comment in BZ 752942 "Automate persistent cluster restart."

I am marking this BZ as being blocked by that one.  After the new approach is
implemented, the doc should be re-written to describe what it does, what system
administrators should expect, and what, if anything, they will still need to do
in some cases.

Comment 3 mick 2011-11-30 08:14:25 UTC
After testing, here is my Suggestion for New and Improved Instructions for the Doco.

----- snip ----- snip ----- snip ----- snip ----- snip ----- snip -----


If the cluster has previously had a total failure and there are no
clean stores then the brokers will fail to start with the log message
"Daemon startup failed: Cannot recover, no clean store."

If this happens, you can restart the cluster by marking one of the
brokers' data directories as 'clean'.  If you can see from time-stamps
that one of the data directories is more recent than the others, choose 
that one to mark as clean.  But most likely, if all stores are dirty, 
the brokers all died simultaneously.  In that case, choose one broker's
data directory arbitrarily.

To mark a data directory as clean, look in it for one or more
subdirectories of the form:

     _cluster.bak.<nnnn>

These subdirectories will exist if previous restarts were attempted.
If no such subdirectory exists, use the qpid-cluster-store command now.
( See below. )

If one or more _cluster.bak.<nnnn> subdirectories do exist, note the one
with the highest 4-digit number.  It is the newest.  Use it in this
sequence of commands:

     cd <data-dir>
     mv rhm rhm.bak
     cp -a _cluster.bak.<nnnn>/rhm .

Now you are ready to mark this store as clean:

     qpid-cluster-store -c <data-dir>

Now you can restart the cluster, and all brokers' stores will be
initialized from the one that you marked as clean.


----- snip ----- snip ----- snip ----- snip ----- snip ----- snip -----
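
The backup-selection and copy steps in the instructions above could be scripted roughly as follows. This is only a sketch, not part of qpid: the function names newest_cluster_backup and restore_newest_backup are hypothetical, and it assumes the <nnnn> suffix is zero-padded to a fixed width so that the shell's sorted glob order matches numeric order.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not qpid commands.

# Print the name of the highest-numbered _cluster.bak.<nnnn> subdirectory
# of the data directory given as $1. Returns non-zero if there is none.
# Assumes a fixed-width, zero-padded suffix, so the last glob match is the newest.
newest_cluster_backup() {
    newest=
    for d in "$1"/_cluster.bak.*; do
        [ -d "$d" ] || continue   # also skips the literal pattern when nothing matches
        newest=$d
    done
    [ -n "$newest" ] && basename "$newest"
}

# Replace <data-dir>/rhm with the copy held in the newest backup, keeping
# the old contents as rhm.bak. Does nothing if no backup subdirectory exists.
restore_newest_backup() {
    data_dir=$1
    bak=$(newest_cluster_backup "$data_dir") || return 0
    (
        cd "$data_dir" || exit 1
        mv rhm rhm.bak || exit 1
        cp -a "$bak/rhm" .
    )
}
```

After running restore_newest_backup <data-dir> on the chosen broker, the store would still need to be marked clean with qpid-cluster-store -c <data-dir>, as described above. The subshell keeps the cd from affecting the caller's working directory.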

Comment 5 Cheryn Tan 2011-12-19 06:19:50 UTC
Fixed in Section 8.5.4 - Starting a persistent cluster with no clean store

documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/2/html/Messaging_User_Guide/sect-Messaging_User_Guide-Persistence_in_High_Availability_Message_Clusters-Starting_a_persistent_cluster_with_no_clean_store.html

Comment 7 mick 2012-02-09 13:25:25 UTC
Removed blocker 752942 - I believe that the current cluster impl will not be changed significantly before it is replaced by the new impl, so that should no longer block this BZ.

Comment 8 Leonid Zhaldybin 2012-02-09 15:54:49 UTC
Starting a persistent cluster with no clean store is properly described in the documentation, moving it to VERIFIED.

-> VERIFIED

Comment 9 Cheryn Tan 2012-05-02 05:31:31 UTC
Documents have been published on http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/index.html as part of the MRG-M 2.1.2 update.