Bug 698721

Summary: durable messages transmission - performance issue (1.3.2 -> 2.0 candidate)
Product: Red Hat Enterprise MRG Reporter: ppecka <ppecka>
Component: qpid-cppAssignee: Ken Giusti <kgiusti>
Status: CLOSED ERRATA QA Contact: Petra Svobodov√° <psvobodo>
Severity: high Docs Contact:
Priority: urgent    
Version: DevelopmentCC: freznice, gsim, iboverma, jneedle, jross, kim.vdriet, lzhaldyb, tross
Target Milestone: 2.0Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qpid-cpp-mrg-0.10-7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-23 15:44:27 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description Flags
modified python spout example
none
Patch adding durable and sync options to python spout
none
Modified spout client for measuring performace none

Description ppecka 2011-04-21 15:29:46 UTC
Created attachment 493897 [details]
modified python spout example

Description of problem:
Modified python spout example, which sends durable messages reveals performance issue compared to mrg 1.3 version. On version 2.0 the send ratio is approximately 1 durable msg per second. which is more than 100times slower than on 1.3 with very same spout client.


Version-Release number of selected component (if applicable):
# rpm -qa | grep -P '(qpid|qmf)' | sort -u
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-2.el5
qpidc-debuginfo-0.5.752581-3.el5
qpid-cpp-client-0.10-3.el5
qpid-cpp-client-devel-0.10-3.el5
qpid-cpp-client-devel-docs-0.10-3.el5
qpid-cpp-client-ssl-0.10-3.el5
qpid-cpp-server-0.10-3.el5
qpid-cpp-server-cluster-0.10-3.el5
qpid-cpp-server-devel-0.10-3.el5
qpid-cpp-server-ssl-0.10-3.el5
qpid-cpp-server-store-0.10-3.el5
qpid-cpp-server-xml-0.10-3.el5
qpid-java-client-0.10-2.el5
qpid-java-common-0.10-2.el5
qpid-java-example-0.10-2.el5
qpid-qmf-0.10-2.el5
qpid-qmf-devel-0.10-2.el5
qpid-tools-0.10-3.el5


How reproducible:
100%

Steps to Reproduce:
1. MRG version 1.3
time ./spout_durable -c 10000 'my-queue; {create: always, delete:receiver, node:{durable:True}}'

real	0m48.786s
user	0m13.378s
sys	0m3.164s


2. MRG version 2.0

 time ./spout_durable -c 100 'my-queue; {create: always, delete:receiver, node:{durable:True}}'

real	1m40.004s
user	0m0.290s
sys	0m0.030s

Comment 2 Kim van der Riet 2011-04-25 20:12:02 UTC
I have confirmed by testing against trunk that this slow-down can indeed be observed and I have tracked the checkin to r.1072356 (the store will use r.4440). As this checkin has to do with producer flow control, I am guessing that the test will need to be modified in some way to accommodate a change in client behaviour as a result of this change.

Assigning to Ken Giusti for a closer look.

I doubt this would rise to the level of a blocker; Ken can confirm.

Comment 3 Kim van der Riet 2011-04-26 13:13:38 UTC
Further testing shows that if the store is loaded and the message is marked durable, then the transmission is slow. It appears that checkin r.1072356 enforces synchronous transmission. The store, not seeing ongoing messages relies on the 1-second flush timer to flush the message, and only then is the message acknowledged and the next message sent.

Changing the send options to be asynchronous

snd.send(msg, sync=False)

solves the problem, and all messages are sent without waiting for the store to write the message to disk.

Comment 4 Kim van der Riet 2011-04-26 13:15:44 UTC
Created attachment 494900 [details]
Patch adding durable and sync options to python spout

This patch makes the durability and synchronous changes needed. Running the tests using the -d and not using the -s option will fix this problem.

Comment 5 Ken Giusti 2011-04-26 14:25:25 UTC
Kim is correct - this behaviour is "by design".  The bugfix that makes this change is:

https://issues.apache.org/jira/browse/QPID-2921

This bugfix is in 2.0.   Prior to 2.0, the broker would not guarantee that the messages were flushed to store before it sent the ack.  This bugfix forces the broker to delay the ack until the message has been flushed to store.

As Kim pointed out, the modified spout code uses the default "sync" behavior when it calls send() - this behavior requires the send to block until the message has been acked.   Since the example is only sending 1 message at a time, the "write cache" used by the journal is not filling up, and thus it waits for the 1 second timer to expire before it is written (and the ack is then sent).  This results in 1 message xfer per second.

The example code can be sped up as Kim pointed out by setting the sync=False flag in the send() call.  Once all messages are sent, the example can call "snd.sync()" to wait for all messages to be written to disk.  This improves the performance greatly.

In the end, this is a tradeoff between performance and certainty of getting the messages to disk.  The application has to choose how many messages it will allow to be outstanding and possibly lost should the broker crash, vs how fast it needs to transmit messages.

Comment 6 Gordon Sim 2011-05-09 09:31:42 UTC
"Prior to 2.0, the broker would not guarantee that the
messages were flushed to store before it sent the ack." -  I don't believe that is correct... The completions sent to the sending client were held up previously pending completions on all enqueues.

Is it possible that we have introduced a regression here where the broker no longer flushes the store directly (which it used to do in this case)?

Comment 7 Ken Giusti 2011-05-09 14:18:06 UTC
It's possible, but I don't recall specifically removing a flush from the broker's receive path.  Did the broker previously flush to disk each time a durable message was enqueued?   My understanding of the store is that this would introduce poor disk write performance - as opposed to letting a disk buffer's worth of messages get scheduled for writing to store as a single unit.

I'll have to take a closer look at a 1.3 broker's behavior against the testcase.

Comment 8 Gordon Sim 2011-05-09 14:25:48 UTC
The broker used to flush when a sync came in and there were some messages pending enqueue completion  (i.e. not for every durable message, but when needed).

This was done in the old IncompleteMessageList class, when handling a sync request. 

I'm re-opening for now until we are sure there is no change here. The JMS client will often use the sync-per-durable-message pattern which is slow enough as it is without further introducing delays waiting for the autoflush timeout.

Comment 9 Ken Giusti 2011-05-09 14:37:21 UTC
Understood and I agree - the sync handling has changed quite a bit, most likely a regression.  I'm on it now.

Comment 10 Ken Giusti 2011-05-09 21:11:52 UTC
Regression - see upstream jira:

https://issues.apache.org/jira/browse/QPID-3252

Comment 11 Justin Ross 2011-05-09 21:26:31 UTC
This means that users of the python api in synchronous mode (the default) would not have the option of issuing explicit sync calls to avoid waiting on the store's timer.  If that's correct, this is definitely 2.0 blocker material.

Comment 12 Ken Giusti 2011-05-11 13:04:19 UTC
Fix provided upstream:

http://svn.apache.org/viewvc?view=revision&revision=1101864

pending blocker approval....

Comment 13 Ken Giusti 2011-05-12 14:17:18 UTC
Merged to mrg_2.0.x branch:

http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=692bf60596cd78ad5409857ae307c170c389853d

Testing notes for QA:

To exercise this fix's codepath, I suggest running a long duration test that:

1) enables persistence
2) enables flow control, and occasionally triggers it
3) Runs several producer clients that send durable messages with the sync parameter to the send() method explicitly set to false, and occasionally calls session.sync() at random intervals.  Pseudocode:

   While (some long duration) do
       For (some random batch size) do
           msg = create durable msg
           sender.send(msg, sync=False)
       End-For
       session.sync()
   End-While

Also, verify no appreciable performance degradation with respect to MRG1.3 when sending transient (non-durable) message traffic (original problem report).

thanks,

-K

Comment 14 Petra Svobodov√° 2011-05-27 13:59:29 UTC
Created attachment 501309 [details]
Modified spout client for measuring performace

Comment 15 Petra Svobodov√° 2011-05-27 14:00:30 UTC
Overview of measured duration of sending messages by using the attached client "spout_durable_mod" according to the requirement:

Durable messages
----------------

100 messages:
Rhel5-i	(2.0)	1.413 s
Rhel5-i (1.3.2)	1.540 s
Rhel5-i (1.3)	1.880 s
Rhel5-x	(2.0)	1.520 s
Rhel5-x (1.3.2)	2.404 s
Rhel6-i	(2.0)	0.934 s
Rhel6-x	(2.0)	0.855 s

1000 messages:
Rhel5-i	(2.0)	3.296 s
Rhel5-i (1.3.2)	3.818 s
Rhel5-i (1.3)	3.887 s
Rhel5-x	(2.0)	3.279 s
Rhel5-x (1.3.2)	3.946 s
Rhel6-i	(2.0)	2.679 s
Rhel6-x	(2.0)	2.489 s

10000 messages:
Rhel5-i	(2.0)	20.456 s
Rhel5-i (1.3.2)	22.418 s
Rhel5-i (1.3)	23.821 s
Rhel5-x	(2.0)	21.003 s
Rhel5-x (1.3.2)	20.832 s
Rhel6-i	(2.0)	20.324 s
Rhel6-x	(2.0)	20.448 s


Non-durable messages
--------------------

100 messages:
Rhel5-i	(2.0)	0.239 s
Rhel5-i (1.3.2)	0.239 s
Rhel5-i (1.3)	0.239 s
Rhel5-x	(2.0)	0.285 s
Rhel5-x (1.3.2)	0.281 s
Rhel6-i	(2.0)	0.428 s
Rhel6-x	(2.0)	0.407 s

1000 messages:
Rhel5-i	(2.0)	1.639 s
Rhel5-i (1.3.2)	1.818 s
Rhel5-i (1.3)	1.833 s
Rhel5-x	(2.0)	1.806 s
Rhel5-x (1.3.2)	1.745 s
Rhel6-i	(2.0)	1.504 s
Rhel6-x	(2.0)	1.693 s

10000 messages:
Rhel5-i	(2.0)	16.268 s
Rhel5-i (1.3.2)	16.203 s
Rhel5-i (1.3)	16.053 s
Rhel5-x	(2.0)	16.582 s
Rhel5-x (1.3.2)	16.403 s
Rhel6-i	(2.0)	16.350 s
Rhel6-x	(2.0)	15.000 s


There is no significant performance degradation (using modified client behaviour) in the mrg 2.0 version.

--> VERIFIED

Comment 16 errata-xmlrpc 2011-06-23 15:44:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html