Red Hat Bugzilla – Bug 509573
Journal loses capacity with use
Last modified: 2009-10-06 12:17:51 EDT
Description of problem:
The capacity of a durable queue seems to shrink with use.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. create durable queue
2. send large number of durable messages
3. consume them all
4. repeat steps 2 and 3 in a loop
E.g. with default journal options:
qpid-config add queue test-queue --durable
then in a loop:
for i in `seq 1 35000`; do echo "Message$i"; done | sender --durable true --send-eos 1
receiver --port 5674 > /dev/null
Eventually an "Enqueue capacity threshold exceeded" exception is thrown. However, the capacity was large enough for the first few iterations. Note that in my example the queue's byte depth never reached 500 KB, which should be well within the journal's capacity.
Capacity stays the same over life of queue.
A message stored on the journal consumes more than just the capacity of the message itself. The storage budget is as follows:
Enqueue header: 32 bytes
Message header: 90 bytes (for this test, can vary)
Message content: 8-12 bytes ("Message1" - "Message35000" in this test)
Enqueue tail: 12 bytes
Total: 142-146 bytes
Since each message is stored in storage blocks of 128 bytes (known as data blocks or dblks), each message footprint in the journal for this test is 2 dblks = 256 bytes. For 35,000 messages in the test above, this consumes 8.96MB ~ 72% of a 12.5MB journal.
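The dblk arithmetic above can be sketched as follows (a minimal model using the header/tail sizes quoted in this bug; actual header sizes can vary by message and version):

```python
DBLK = 128            # journal data-block size in bytes
ENQ_HEADER = 32       # enqueue record header (figure from this test)
MSG_HEADER = 90       # message header (figure from this test)
ENQ_TAIL = 12         # enqueue record tail

def footprint(content_bytes):
    """Bytes one enqueued message consumes in the journal,
    rounded up to a whole number of dblks."""
    record = ENQ_HEADER + MSG_HEADER + content_bytes + ENQ_TAIL
    dblks = -(-record // DBLK)          # ceiling division
    return dblks * DBLK

# "Message1" (8 bytes) through "Message35000" (12 bytes) all take 2 dblks:
assert footprint(8) == 256 and footprint(12) == 256

total = 35000 * footprint(12)           # 8,960,000 bytes = 8.96 MB
print(total, total / (12.5 * 10**6))    # fraction of a 12.5 MB journal
```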
Clearly for small messages this per-message overhead is significant and can result in low storage efficiency. To keep a message within 1 dblk, the message header and content together must be 84 or fewer bytes.
The variability in the result (i.e. initial runs fitting while later runs do not) results from the algorithm used to check for space:
If the position of the current message plus 20% of the total journal capacity points into a journal file that does NOT contain any enqueued records (i.e. one that is safe to overwrite), then the enqueue proceeds; otherwise the enqueue capacity exception is thrown. Clearly this method can make the effective threshold vary as a percentage of total capacity, because the position of the enqueues relative to the file boundaries plays a role. The worst case occurs when the test lands on the first byte of a file in which only the last record is still enqueued; the best case occurs when the test lands on the last byte of a file, with the first still-enqueued record in the next file.
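A simplified model of that check illustrates the boundary effect (this is an assumption-laden sketch, not the actual qpid journal code): the journal is treated as a ring of num_files files, and an enqueue is refused if the file lying 20% of total capacity ahead of the write position still holds a live record.

```python
def can_enqueue(write_pos, files_with_live_records, num_files, file_size):
    """Model of the 20%-ahead space check described in this bug."""
    total = num_files * file_size
    probe = (write_pos + total // 5) % total   # point 20% of capacity ahead
    probe_file = probe // file_size            # which file that point falls in
    return probe_file not in files_with_live_records

# 4 files of 3 MB each, with a live record only in file 1. The same live
# set gives different answers depending on where in file 0 we are writing:
fs, nf = 3_000_000, 4
print(can_enqueue(0, {1}, nf, fs))          # probe lands in file 0 -> allowed
print(can_enqueue(2_999_999, {1}, nf, fs))  # probe lands in file 1 -> refused
```

This is why runs that fit one iteration can fail the next even at the same byte depth: the outcome depends on which file the probe point happens to land in.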
This variability is made worse when the number of journal files is small (e.g. 4 files), and is minimized by using a larger number of files. We may need to clarify this behaviour in our documentation, which does not currently address this level of detail. For small messages, we may also need to provide different guidelines for sizing the journal.
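One possible shape for such a sizing guideline, combining the dblk footprint with the 20% headroom requirement (the helper name and the exact margin treatment are illustrative assumptions, not an official qpid recommendation):

```python
DBLK = 128
OVERHEAD = 32 + 90 + 12   # enqueue header + message header + tail (this test)

def journal_bytes_needed(peak_messages, content_bytes):
    """Rough journal size so the 20%-headroom check can always pass
    for a given peak backlog of small messages."""
    record = OVERHEAD + content_bytes
    dblks = -(-record // DBLK)          # ceiling division
    data = peak_messages * dblks * DBLK
    # The space check needs ~20% of the journal free, so the journal
    # must hold data / 0.8; use integer arithmetic to avoid rounding.
    return data * 5 // 4

print(journal_bytes_needed(35000, 12))  # ~11.2 MB for this test's backlog
```

By this estimate, the 12.5 MB default journal leaves little margin for the 35,000-message backlog in this test, consistent with the intermittent failures reported.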
Setting to CLOSED/NOTABUG. However, please reinstate if there is still an issue here I have not seen.