Under some conditions, it is possible for a store to fill up without triggering an enqueue capacity full condition. A full store journal is a fatal condition and results in the store closing. Data is not lost when this occurs and remains in the store journal files, but it is currently not possible to restart the broker and consume the messages through it. Currently, recovery of a full journal results in a condition identical to the one that caused the shutdown in the first place: a full journal which cannot be written to without the risk of data loss.
There are two issues to be considered:
1. Whether the store can be prevented from ever filling up (i.e. a hard guarantee) through limits and restrictions; and/or
2. If such a condition should occur, whether the store can remedy the situation during recovery and allow the messages to be consumed.
The addition of auto-expand to the store will minimize, but not preclude, the occurrence of this condition, primarily because it can be defeated (i.e. turned off, since it is an option), and because the store will place limits on how much a store can ultimately expand by this means.
A Python tool called resize was written to analyze and resize the journal. The original journal is moved into a backup directory, and a new journal is created. The remaining records in the old journal are then transferred to the new journal. Note that this procedure cannot be carried out on a running broker, only on the store of a stopped broker. The broker is then restarted on the new store. This strategy, while not ideal, does provide a path to data recovery for a journal which has stopped because it became full. There are other broker-side strategies which could also be developed, such as tracking the amount of space needed to fully dequeue all existing records and using this value as a dynamic threshold for enqueue threshold exceptions. These will be left for a later time and version, however. Last update to resize tool: r.3735
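The dynamic-threshold idea mentioned above could look roughly like the sketch below. It is a toy model and not the store's code: the class ReservingJournal, the exception name, the fixed-size dequeue record and the absence of any space reclamation are all assumptions made for illustration. The point it shows is that if every enqueue keeps back enough room for the dequeue records of all live messages, the journal can always be drained.

```python
from dataclasses import dataclass

# Assumed cost of one dequeue record, in arbitrary space units (hypothetical).
DEQUEUE_RECORD_SIZE = 1


class EnqueueThresholdExceeded(Exception):
    """Raised instead of letting the journal fill up completely."""


@dataclass
class ReservingJournal:
    capacity: int   # total journal space, arbitrary units
    used: int = 0   # space consumed by records written so far
    live: int = 0   # messages enqueued but not yet dequeued

    def reserve(self) -> int:
        # Space needed to write a dequeue record for every live message.
        return self.live * DEQUEUE_RECORD_SIZE

    def enqueue(self, size: int) -> None:
        # Admit the message only if, afterwards, every live message
        # (including this one) could still be dequeued.
        needed = size + self.reserve() + DEQUEUE_RECORD_SIZE
        if self.used + needed > self.capacity:
            raise EnqueueThresholdExceeded(
                f"need {needed} units, only {self.capacity - self.used} free"
            )
        self.used += size
        self.live += 1

    def dequeue(self) -> None:
        if self.live == 0:
            raise ValueError("nothing to dequeue")
        # A dequeue is itself a write; the reserve guarantees it fits.
        self.used += DEQUEUE_RECORD_SIZE
        self.live -= 1
```

In this model the threshold moves with the number of live messages, so a journal holding many small messages refuses new enqueues earlier than one holding a few large ones.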
The resize tool is included and functional. Tested on RHEL 4.8 / 5.5, i386 / x86_64, on packages:
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-0.7.946106-17.el5
qmf-devel-0.7.946106-17.el5
qpid-cpp-*-0.7.946106-17.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-*-0.7.946106-10.el5
qpid-tools-0.7.946106-11.el5
ruby-qmf-0.7.946106-17.el5
ruby-qpid-0.7.946106-2.el5
-> VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: Under some limited conditions in which the store file size is too small, the store can be filled such that no recovery is possible.
Consequence: While messages are not lost per se, the store cannot deliver the messages because dequeueing them requires a write operation, and this is not possible when the store is full.
Fix: A tool which allows the store to be resized off-line (i.e. when the broker is not running) has been written.
Result: The store can now be increased in size, which in turn allows messages on the store to be dequeued after recovery.
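The cause, consequence and fix described in this note can be pictured with a small in-memory model. This is only a sketch under stated assumptions: ToyJournal, JournalFull and resize_offline are hypothetical names, space is counted in arbitrary units, and nothing here reflects the actual journal file format or the real resize tool's interface. It shows that a dequeue needs a write, that a full journal therefore cannot be drained, and that copying the still-enqueued records into a larger journal makes dequeueing possible again.

```python
from dataclasses import dataclass, field

# Assumed cost of one dequeue record, in arbitrary space units (hypothetical).
DEQUEUE_RECORD_SIZE = 1


class JournalFull(Exception):
    """Raised when a write does not fit in the journal."""


@dataclass
class ToyJournal:
    capacity: int
    used: int = 0
    live: dict = field(default_factory=dict)  # message id -> record size

    def _write(self, size: int) -> None:
        if self.used + size > self.capacity:
            raise JournalFull("no room for a record of size %d" % size)
        self.used += size

    def enqueue(self, msg_id: str, size: int) -> None:
        self._write(size)
        self.live[msg_id] = size

    def dequeue(self, msg_id: str) -> None:
        if msg_id not in self.live:
            raise KeyError(msg_id)
        # Dequeueing is itself a write: a dequeue record must be appended.
        self._write(DEQUEUE_RECORD_SIZE)
        del self.live[msg_id]


def resize_offline(old: ToyJournal, new_capacity: int) -> ToyJournal:
    """Create a larger journal holding only the still-enqueued records.

    The old journal is left untouched, standing in for the backup copy.
    """
    new = ToyJournal(capacity=new_capacity)
    for msg_id, size in old.live.items():
        new.enqueue(msg_id, size)
    return new
```

For example, a ToyJournal with capacity 4 that holds two records of size 2 raises JournalFull on any dequeue, while the journal returned by resize_offline(journal, new_capacity=8) accepts both dequeues.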
Technical note updated.
Diffed Contents:
@@ -1,7 +1 @@
-Cause: Under some limited conditions in which the store file size is too small, the store can be filled such that no recovery is possible.
+Since a write operation is required when processing a message queue, reaching the maximum storage capacity rendered any recovery impossible. To target this issue, this update introduces a utility that allows the storage to be resized, so that the messages can be recovered and delivered as expected. Note that the broker must be stopped in order to run this tool.
-
-Consequence: While messages are not lost per se, the store cannot deliver the messages because dequeueing them requires a write operation, and this is not possible when the store is full.
-
-Fix: A tool which allows the store to be resized off-line (ie when the broker is not running) has been written.
-
-Result: The store can now be increased in size, which in turn allows messages on the store to be dequeued after recovery.
Technical note updated.
Diffed Contents:
@@ -1 +1 @@
-Since a write operation is required when processing a message queue, reaching the maximum storage capacity rendered any recovery impossible. To target this issue, this update introduces a utility that allows the storage to be resized, so that the messages can be recovered and delivered as expected. Note that the broker must be stopped in order to run this tool.
+Because a write operation is required when processing a message queue, reaching the maximum storage capacity rendered any recovery impossible. To target this issue, this update introduces a "resize" utility that allows storage to be resized so that the messages can be recovered and delivered as expected. Note that the broker must be stopped in order to run this tool.
An advisory has been issued which should help resolve the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0773.html