Bug 544092

Summary:	message store should not delete backups when qpidd starts
Product:	Red Hat Enterprise MRG	Reporter:	Alan Conway <aconway>
Component:	qpid-cpp	Assignee:	Kim van der Riet <kim.vdriet>
Status:	CLOSED ERRATA	QA Contact:	ppecka <ppecka>
Severity:	high	Docs Contact:
Priority:	high
Version:	1.2	CC:	ppecka
Target Milestone:	1.3
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	When a broker first started up, it checked to see if it was the first node in the cluster. If it was, it used the backup store. If it was not the first node, it saved the store directory, created a new, clean store, and then proceeded to fill it through cluster synchronization. However, if that same node started as a node other than the first one twice in a row, then the original backup store was overwritten by the newer store being saved, thus destroying the original. This was due to only one copy of the backup store being kept. With this update, backup store directories are sequentially numbered and several of them can exist at once, with the result that an original backup store is no longer deleted automatically. Note that with this update, the cluster administrator must manually remove old and unneeded backup stores.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2010-10-14 16:08:20 UTC	Type:	---

Description Alan Conway 2009-12-03 21:56:03 UTC

Description of problem:

The cluster can instruct the message store to move the current store to a backup location and start a new one. This is used by the cluster during recovery. 

If the cluster is started a second time in this mode, the original backup is destroyed.

Backups should not be destroyed automatically, only by an administrator. The store should create numbered backups if multiple backups are required.

Comment 1 Kim van der Riet 2009-12-04 11:58:37 UTC

Backups are now in serialized directories. Manual deletion is required to prevent excessive build-up of backup directories.

Fixed r.3735
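
For readers who want a concrete picture of "serialized directories", the following is a minimal sketch only. It is not the code from r.3735, and the `<store>.bak.NNNN` naming, `next_backup_path` and `rotate_store` are hypothetical names chosen for illustration. It shows the idea described above: the current store is moved aside to the next unused number, and an existing backup is never overwritten.

```python
import os
import re

# Hypothetical suffix for illustration: <store>.bak.0001, <store>.bak.0002, ...
# The real store may name its backup directories differently.
BACKUP_SUFFIX_RE = re.compile(r"\.bak\.(\d+)")


def next_backup_path(store_dir):
    """Return the first unused numbered backup path next to store_dir."""
    store_dir = os.path.abspath(store_dir)
    parent = os.path.dirname(store_dir)
    base = os.path.basename(store_dir)
    highest = 0
    for name in os.listdir(parent):
        if name.startswith(base):
            match = BACKUP_SUFFIX_RE.fullmatch(name[len(base):])
            if match:
                highest = max(highest, int(match.group(1)))
    return os.path.join(parent, "%s.bak.%04d" % (base, highest + 1))


def rotate_store(store_dir):
    """Move the current store to a new numbered backup and create an empty store.

    Existing backups are left untouched; removing them is a manual step.
    """
    backup = next_backup_path(store_dir)
    os.rename(store_dir, backup)
    os.mkdir(store_dir)
    return backup
```

Calling `rotate_store` twice on the same store leaves two separate backups on disk, which is the behaviour requested in the description.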

Comment 3 ppecka 2010-09-21 15:36:20 UTC

VERIFIED on RHEL 5.5, both i386 / x86_64

# rpm -qa | grep qpid | sort -u
python-qpid-0.7.946106-14.el5
qpid-cpp-client-0.7.946106-15.el5
qpid-cpp-client-devel-0.7.946106-15.el5
qpid-cpp-client-devel-docs-0.7.946106-15.el5
qpid-cpp-client-ssl-0.7.946106-15.el5
qpid-cpp-mrg-debuginfo-0.7.946106-15.el5
qpid-cpp-server-0.7.946106-15.el5
qpid-cpp-server-cluster-0.7.946106-15.el5
qpid-cpp-server-devel-0.7.946106-15.el5
qpid-cpp-server-ssl-0.7.946106-15.el5
qpid-cpp-server-store-0.7.946106-15.el5
qpid-cpp-server-xml-0.7.946106-15.el5
qpid-java-client-0.7.946106-9.el5
qpid-java-common-0.7.946106-9.el5
qpid-tools-0.7.946106-10.el5

--> VERIFIED

Comment 4 Kim van der Riet 2010-10-05 15:44:44 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: Restarting a broker more than once as a member of a cluster, but not as its first node, deletes the backup of the original store that was saved during the first restart.

Consequence: The ability to use this store as the first node is eliminated when the store is deleted. This can be significant when there is some doubt as to which node to start first, and several retries may be required.

Fix: The store now saves the store from each successive cluster start in a serialized directory.

Result: Every restart as the second or subsequent node creates a new serialized directory containing the previous store. No store is erased. However, there is the potential for a build-up of old store directories and steps must be taken to ensure that the disk does not become full.
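
Because old store directories now accumulate, an administrator may want a helper that lists which backups are candidates for removal. The sketch below is an assumption-laden illustration, not a tool shipped with the product: the `<store>.bak.NNNN` naming, `old_backups` and `prune` are hypothetical. It defaults to a dry run so that nothing is deleted without an explicit decision.

```python
import os
import re
import shutil

# Hypothetical naming for illustration: <store>.bak.<sequence number>
BACKUP_NAME_RE = re.compile(r"^(?P<base>.+)\.bak\.(?P<seq>\d+)$")


def old_backups(parent_dir, store_name, keep=3):
    """Return numbered backups of store_name, oldest first, minus the newest `keep`."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    found = []
    for name in os.listdir(parent_dir):
        path = os.path.join(parent_dir, name)
        match = BACKUP_NAME_RE.match(name)
        if match and match.group("base") == store_name and os.path.isdir(path):
            found.append((int(match.group("seq")), path))
    found.sort()
    cutoff = max(len(found) - keep, 0)
    return [path for _, path in found[:cutoff]]


def prune(parent_dir, store_name, keep=3, dry_run=True):
    """List removable backups; delete them only when dry_run is False."""
    candidates = old_backups(parent_dir, store_name, keep)
    if not dry_run:
        for path in candidates:
            shutil.rmtree(path)
    return candidates
```

Running `prune` with the default `dry_run=True` only reports the candidates, which keeps the final decision with the administrator, as the note above requires.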

Comment 5 Jaromir Hradilek 2010-10-06 16:08:56 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,7 +1 @@
-Cause: Restarting a broker as a member of a cluster, but not as the first node more than once deletes the backup of the original store that was saved from the first restart.
+Upon a subsequent restart of a broker that was a member of a cluster but was not started as its first node, the older backup of the message storage was deleted, making it impossible to use it as the first node in the future. This was especially inconvenient when it was unclear which node to start first, and several retries were required. With this update, the message stores are now saved in serialized directories, so that no backup is ever deleted automatically. Note that because of this, appropriate steps must be taken to prevent reaching the disk capacity.
-
-Consequence: The ability to use this store as the first node is eliminated when the store is deleted. This can be significant when there is some doubt as to which node to start first, and several retries may be required.
-
-Fix: The store now saves the store from each successive cluster start in a serialized directory.
-
-Result: Every restart as the second or subsequent node creates a new serialized directory containing the previous store. No store is erased. However, there is the potential for a build-up of old store directories and steps must be taken to ensure that the disk does not become full.

Comment 6 Douglas Silas 2010-10-06 16:22:17 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Upon a subsequent restart of a broker that was a member of a cluster but was not started as its first node, the older backup of the message storage was deleted, making it impossible to use it as the first node in the future. This was especially inconvenient when it was unclear which node to start first, and several retries were required. With this update, the message stores are now saved in serialized directories, so that no backup is ever deleted automatically. Note that because of this, appropriate steps must be taken to prevent reaching the disk capacity.
+* When a broker first started up, it checked to see if it was the first node in the cluster. If it was, it used the backup store. If it was not the first node, it saved the store directory, created a new, clean store, and then proceeded to fill it through cluster synchronization. However, if that same node was the first node to start in a cluster twice in a row, then the original backup store was overwritten by the newer store being synced, thus destroying the original. This was due to only one copy of the backup store being kept for all nodes in the cluster. With this update, backup store directories are sequentially numbered and can be multiple in number, with the result that it is no longer possible to delete an original backup store. Note that with this update, the cluster administrator must manually remove old and unneeded backup stores.

Comment 7 Douglas Silas 2010-10-06 16:22:32 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-* When a broker first started up, it checked to see if it was the first node in the cluster. If it was, it used the backup store. If it was not the first node, it saved the store directory, created a new, clean store, and then proceeded to fill it through cluster synchronization. However, if that same node was the first node to start in a cluster twice in a row, then the original backup store was overwritten by the newer store being synced, thus destroying the original. This was due to only one copy of the backup store being kept for all nodes in the cluster. With this update, backup store directories are sequentially numbered and can be multiple in number, with the result that it is no longer possible to delete an original backup store. Note that with this update, the cluster administrator must manually remove old and unneeded backup stores.
+When a broker first started up, it checked to see if it was the first node in the cluster. If it was, it used the backup store. If it was not the first node, it saved the store directory, created a new, clean store, and then proceeded to fill it through cluster synchronization. However, if that same node was the first node to start in a cluster twice in a row, then the original backup store was overwritten by the newer store being synced, thus destroying the original. This was due to only one copy of the backup store being kept for all nodes in the cluster. With this update, backup store directories are sequentially numbered and can be multiple in number, with the result that it is no longer possible to delete an original backup store. Note that with this update, the cluster administrator must manually remove old and unneeded backup stores.

Comment 9 errata-xmlrpc 2010-10-14 16:08:20 UTC

An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html