Bug 698721 - durable messages transmission - performance issue (1.3.2 -> 2.0 candidate)
Summary: durable messages transmission - performance issue (1.3.2 -> 2.0 candidate)
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Target Milestone: 2.0
: ---
Assignee: Ken Giusti
QA Contact: Petra Svobodová
Depends On:
TreeView+ depends on / blocked
Reported: 2011-04-21 15:29 UTC by ppecka
Modified: 2011-09-26 14:35 UTC (History)
8 users (show)

Fixed In Version: qpid-cpp-mrg-0.10-7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-06-23 15:44:27 UTC

Attachments (Terms of Use)
modified python spout example (3.92 KB, application/octet-stream)
2011-04-21 15:29 UTC, ppecka
no flags Details
Patch adding durable and sync options to python spout (1.15 KB, patch)
2011-04-26 13:15 UTC, Kim van der Riet
no flags Details | Diff
Modified spout client for measuring performace (4.12 KB, application/octet-stream)
2011-05-27 13:59 UTC, Petra Svobodová
no flags Details

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2011:0890 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging 2.0 Release 2011-06-23 15:42:41 UTC
Red Hat Bugzilla 728586 None None None Never

Internal Links: 728586

Description ppecka 2011-04-21 15:29:46 UTC
Created attachment 493897 [details]
modified python spout example

Description of problem:
Modified python spout example, which sends durable messages reveals performance issue compared to mrg 1.3 version. On version 2.0 the send ratio is approximately 1 durable msg per second. which is more than 100times slower than on 1.3 with very same spout client.

Version-Release number of selected component (if applicable):
# rpm -qa | grep -P '(qpid|qmf)' | sort -u

How reproducible:

Steps to Reproduce:
1. MRG version 1.3
time ./spout_durable -c 10000 'my-queue; {create: always, delete:receiver, node:{durable:True}}'

real	0m48.786s
user	0m13.378s
sys	0m3.164s

2. MRG version 2.0

 time ./spout_durable -c 100 'my-queue; {create: always, delete:receiver, node:{durable:True}}'

real	1m40.004s
user	0m0.290s
sys	0m0.030s

Comment 2 Kim van der Riet 2011-04-25 20:12:02 UTC
I have confirmed by testing against trunk that this slow-down can indeed be observed and I have tracked the checkin to r.1072356 (the store will use r.4440). As this checkin has to do with producer flow control, I am guessing that the test will need to be modified in some way to accommodate a change in client behaviour as a result of this change.

Assigning to Ken Giusti for a closer look.

I doubt this would rise to the level of a blocker; Ken can confirm.

Comment 3 Kim van der Riet 2011-04-26 13:13:38 UTC
Further testing shows that if the store is loaded and the message is marked durable, then the transmission is slow. It appears that checkin r.1072356 enforces synchronous transmission. The store, not seeing ongoing messages relies on the 1-second flush timer to flush the message, and only then is the message acknowledged and the next message sent.

Changing the send options to be asynchronous

snd.send(msg, sync=False)

solves the problem, and all messages are sent without waiting for the store to write the message to disk.

Comment 4 Kim van der Riet 2011-04-26 13:15:44 UTC
Created attachment 494900 [details]
Patch adding durable and sync options to python spout

This patch makes the durability and synchronous changes needed. Running the tests using the -d and not using the -s option will fix this problem.

Comment 5 Ken Giusti 2011-04-26 14:25:25 UTC
Kim is correct - this behaviour is "by design".  The bugfix that makes this change is:


This bugfix is in 2.0.   Prior to 2.0, the broker would not guarantee that the messages were flushed to store before it sent the ack.  This bugfix forces the broker to delay the ack until the message has been flushed to store.

As Kim pointed out, the modified spout code uses the default "sync" behavior when it calls send() - this behavior requires the send to block until the message has been acked.   Since the example is only sending 1 message at a time, the "write cache" used by the journal is not filling up, and thus it waits for the 1 second timer to expire before it is written (and the ack is then sent).  This results in 1 message xfer per second.

The example code can be sped up as Kim pointed out by setting the sync=False flag in the send() call.  Once all messages are sent, the example can call "snd.sync()" to wait for all messages to be written to disk.  This improves the performance greatly.

In the end, this is a tradeoff between performance and certainty of getting the messages to disk.  The application has to choose how many messages it will allow to be outstanding and possibly lost should the broker crash, vs how fast it needs to transmit messages.

Comment 6 Gordon Sim 2011-05-09 09:31:42 UTC
"Prior to 2.0, the broker would not guarantee that the
messages were flushed to store before it sent the ack." -  I don't believe that is correct... The completions sent to the sending client were held up previously pending completions on all enqueues.

Is it possible that we have introduced a regression here where the broker no longer flushes the store directly (which it used to do in this case)?

Comment 7 Ken Giusti 2011-05-09 14:18:06 UTC
It's possible, but I don't recall specifically removing a flush from the broker's receive path.  Did the broker previously flush to disk each time a durable message was enqueued?   My understanding of the store is that this would introduce poor disk write performance - as opposed to letting a disk buffer's worth of messages get scheduled for writing to store as a single unit.

I'll have to take a closer look at a 1.3 broker's behavior against the testcase.

Comment 8 Gordon Sim 2011-05-09 14:25:48 UTC
The broker used to flush when a sync came in and there were some messages pending enqueue completion  (i.e. not for every durable message, but when needed).

This was done in the old IncompleteMessageList class, when handling a sync request. 

I'm re-opening for now until we are sure there is no change here. The JMS client will often use the sync-per-durable-message pattern which is slow enough as it is without further introducing delays waiting for the autoflush timeout.

Comment 9 Ken Giusti 2011-05-09 14:37:21 UTC
Understood and I agree - the sync handling has changed quite a bit, most likely a regression.  I'm on it now.

Comment 10 Ken Giusti 2011-05-09 21:11:52 UTC
Regression - see upstream jira:


Comment 11 Justin Ross 2011-05-09 21:26:31 UTC
This means that users of the python api in synchronous mode (the default) would not have the option of issuing explicit sync calls to avoid waiting on the store's timer.  If that's correct, this is definitely 2.0 blocker material.

Comment 12 Ken Giusti 2011-05-11 13:04:19 UTC
Fix provided upstream:


pending blocker approval....

Comment 13 Ken Giusti 2011-05-12 14:17:18 UTC
Merged to mrg_2.0.x branch:


Testing notes for QA:

To exercise this fix's codepath, I suggest running a long duration test that:

1) enables persistence
2) enables flow control, and occasionally triggers it
3) Runs several producer clients that send durable messages with the sync parameter to the send() method explicitly set to false, and occasionally calls session.sync() at random intervals.  Pseudocode:

   While (some long duration) do
       For (some random batch size) do
           msg = create durable msg
           sender.send(msg, sync=False)

Also, verify no appreciable performance degradation with respect to MRG1.3 when sending transient (non-durable) message traffic (original problem report).



Comment 14 Petra Svobodová 2011-05-27 13:59:29 UTC
Created attachment 501309 [details]
Modified spout client for measuring performace

Comment 15 Petra Svobodová 2011-05-27 14:00:30 UTC
Overview of measured duration of sending messages by using the attached client "spout_durable_mod" according to the requirement:

Durable messages

100 messages:
Rhel5-i	(2.0)	1.413 s
Rhel5-i (1.3.2)	1.540 s
Rhel5-i (1.3)	1.880 s
Rhel5-x	(2.0)	1.520 s
Rhel5-x (1.3.2)	2.404 s
Rhel6-i	(2.0)	0.934 s
Rhel6-x	(2.0)	0.855 s

1000 messages:
Rhel5-i	(2.0)	3.296 s
Rhel5-i (1.3.2)	3.818 s
Rhel5-i (1.3)	3.887 s
Rhel5-x	(2.0)	3.279 s
Rhel5-x (1.3.2)	3.946 s
Rhel6-i	(2.0)	2.679 s
Rhel6-x	(2.0)	2.489 s

10000 messages:
Rhel5-i	(2.0)	20.456 s
Rhel5-i (1.3.2)	22.418 s
Rhel5-i (1.3)	23.821 s
Rhel5-x	(2.0)	21.003 s
Rhel5-x (1.3.2)	20.832 s
Rhel6-i	(2.0)	20.324 s
Rhel6-x	(2.0)	20.448 s

Non-durable messages

100 messages:
Rhel5-i	(2.0)	0.239 s
Rhel5-i (1.3.2)	0.239 s
Rhel5-i (1.3)	0.239 s
Rhel5-x	(2.0)	0.285 s
Rhel5-x (1.3.2)	0.281 s
Rhel6-i	(2.0)	0.428 s
Rhel6-x	(2.0)	0.407 s

1000 messages:
Rhel5-i	(2.0)	1.639 s
Rhel5-i (1.3.2)	1.818 s
Rhel5-i (1.3)	1.833 s
Rhel5-x	(2.0)	1.806 s
Rhel5-x (1.3.2)	1.745 s
Rhel6-i	(2.0)	1.504 s
Rhel6-x	(2.0)	1.693 s

10000 messages:
Rhel5-i	(2.0)	16.268 s
Rhel5-i (1.3.2)	16.203 s
Rhel5-i (1.3)	16.053 s
Rhel5-x	(2.0)	16.582 s
Rhel5-x (1.3.2)	16.403 s
Rhel6-i	(2.0)	16.350 s
Rhel6-x	(2.0)	15.000 s

There is no significant performance degradation (using modified client behaviour) in the mrg 2.0 version.


Comment 16 errata-xmlrpc 2011-06-23 15:44:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.


Note You need to log in before you can comment on or make changes to this bug.