Bug 509573

Summary: Journal loses capacity with use
Product: Red Hat Enterprise MRG
Reporter: Gordon Sim <gsim>
Component: qpid-cpp
Assignee: messaging-bugs <messaging-bugs>
Status: CLOSED NOTABUG
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: high
Priority: high
Version: 1.0
CC: kim.vdriet
Target Milestone: 1.3
Hardware: All
OS: Linux
Last Closed: 2009-07-06 12:53:51 UTC

Description Gordon Sim 2009-07-03 17:14:46 UTC
Description of problem:

The capacity of a durable queue seems to shrink with use. 

Version-Release number of selected component (if applicable):

qpidd-0.5.752581-22.el5
rhm-0.5.3206-5.el5

How reproducible:

Readily

Steps to Reproduce:
1. create durable queue
2. send large number of durable messages
3. consume them all
4. repeat steps 2. and 3. in a loop

E.g. with default journal options:

qpid-config add queue test-queue --durable

then in a loop:

for i in `seq 1 35000`; do echo "Message$i"; done | sender --durable true --send-eos 1
receiver --port 5674 > /dev/null

  
Actual results:

Eventually an "Enqueue capacity threshold exceeded" exception is thrown. However, the capacity was large enough for the first few iterations. Note that in my example the byte depth was never as much as 500k, which should be well within the capacity.

Expected results:

Capacity stays the same over the life of the queue.

Additional info:

Comment 1 Kim van der Riet 2009-07-06 12:53:51 UTC
A message stored on the journal consumes more than just the capacity of the message itself. The storage budget is as follows:

For enqueues:
Enqueue header:  32 bytes
Message header:  90 bytes (for this test, can vary)
Message content:  8-12 bytes ("Message1" - "Message35000" in this test)
Enqueue tail:    12 bytes
-------------------------
Total:          142-146 bytes

Since messages are stored in 128-byte storage blocks (known as data blocks or dblks), the footprint of each message in the journal for this test is 2 dblks = 256 bytes. For the 35,000 messages in the test above, this consumes 8.96 MB, roughly 72% of a 12.5 MB journal.

Clearly for small messages this per-record overhead is significant and can result in low storage efficiency. To keep a message within 1 dblk, the message header and content together must be 84 bytes or fewer.
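
A minimal sketch of this arithmetic (Python, purely illustrative; the 32/90/12-byte record sizes and the 128-byte dblk are the figures quoted above, everything else is made up for the example):

DBLK_BYTES = 128          # journal data block size
ENQ_HEADER_BYTES = 32     # enqueue record header
MSG_HEADER_BYTES = 90     # message header seen in this test (can vary)
ENQ_TAIL_BYTES = 12       # enqueue record tail

def footprint_dblks(content_bytes, msg_header_bytes=MSG_HEADER_BYTES):
    """Number of whole 128-byte dblks one enqueued message occupies."""
    total = ENQ_HEADER_BYTES + msg_header_bytes + content_bytes + ENQ_TAIL_BYTES
    return -(-total // DBLK_BYTES)   # ceiling division

# "Message1" .. "Message35000": 8-12 bytes of content -> 142-146 bytes -> 2 dblks
assert footprint_dblks(len("Message1")) == 2
assert footprint_dblks(len("Message35000")) == 2

# 35,000 such messages: 35,000 * 2 dblks * 128 bytes = 8.96 MB, ~72% of 12.5 MB
used = 35_000 * footprint_dblks(12) * DBLK_BYTES
print(used / 1e6, "MB =", round(100 * used / 12.5e6), "% of the 12.5 MB journal")

# Single-dblk budget for message header + content: 128 - 32 - 12 = 84 bytes
print("single-dblk budget:", DBLK_BYTES - ENQ_HEADER_BYTES - ENQ_TAIL_BYTES, "bytes")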

The variability in the results (i.e. initial runs fitting while later runs do not) stems from the algorithm used to check for space:

If the position of the current message plus 20% of the total journal capacity points to a journal file that does NOT contain any enqueued records (i.e. one that is safe to overwrite), then the enqueue proceeds; otherwise the enqueue capacity exception is returned. Clearly this method can produce variations in the threshold, expressed as a percentage of total capacity, because the position of the enqueues relative to the file boundaries plays a role. The worst case occurs when this test points to the first byte of a file in which only the last record is still enqueued, while the best case occurs when it points to the last byte of a file and the next file contains the first still-enqueued record.
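
As an illustration only, here is a toy model of that check (the ring-of-files layout, the 4 x 1.5 MB geometry and the record positions below are assumptions for the example, not the real journal code):

def enqueue_allowed(write_pos, files_with_live_records, num_files, file_bytes):
    """Toy check: allow the enqueue only if the write position advanced by
    20% of total capacity lands in a file holding no still-enqueued records."""
    total = num_files * file_bytes
    probe = (write_pos + total // 5) % total   # current position + 20% of capacity
    probe_file = probe // file_bytes
    return probe_file not in files_with_live_records

# With 4 files of 1.5 MB (6 MB total), 20% of capacity is 1.2 MB.  The same
# live-record layout passes or fails depending only on the write position:
print(enqueue_allowed(0,       {1}, num_files=4, file_bytes=1_500_000))  # True:  probe lands in file 0
print(enqueue_allowed(400_000, {1}, num_files=4, file_bytes=1_500_000))  # False: probe lands in file 1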

This variability is made worse when the number of files is small (e.g. 4 files), and is minimized when using a larger number of files. We may need to clarify this behaviour in our documentation, which does not currently address this kind of detail. For small messages, we may also need to provide different guidelines for sizing the journal.
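
As a rough illustration of what such a guideline might look like, a hypothetical sizing sketch based on the budget above (the helper name and the treatment of the 20% check as simple headroom are assumptions, not an official formula):

import math

def journal_bytes_needed(max_live_msgs, content_bytes, msg_header_bytes=90,
                         headroom=0.20):
    """Rough journal size needed to keep max_live_msgs messages enqueued at once."""
    record = 32 + msg_header_bytes + content_bytes + 12    # enqueue record bytes
    record = math.ceil(record / 128) * 128                 # round up to whole dblks
    return math.ceil(max_live_msgs * record / (1.0 - headroom))

# 35,000 live 12-byte messages need roughly 11.2 MB of journal -- close to the
# whole 12.5 MB default, even though the payload itself is under 0.5 MB.
print(journal_bytes_needed(35_000, 12) / 1e6, "MB")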

Setting to CLOSED/NOTABUG. However, please reopen if there is still an issue here that I have not seen.