Description of problem:

When an application using the C++ client invokes the session.sync() method on an AMQP 0-10 connection, the client sends the execution.sync AMQP primitive to the broker so it gets aligned with the broker as soon as possible. The AMQP 1.0 client invokes nothing comparable, so the client redundantly waits until the broker decides on its own to send the disposition and flow performatives.

Particular use case:
- send durable messages to a durable queue and from time to time (e.g. after N messages) invoke session.sync() to get synced
- message throughput is approx. N msgs/sec
- this is caused by the store module flushing its work every second (though the only Duration I found relevant is 500ms)

Version-Release number of selected component (if applicable):
qpid-cpp 0.22-37

How reproducible:
100%

Steps to Reproduce:
1. Apply a trivial patch to qpid-send to invoke session.sync() every N messages (the --group-size option is misused just for the test):

Index: src/tests/qpid-send.cpp
===================================================================
--- src/tests/qpid-send.cpp	(revision 1587518)
+++ src/tests/qpid-send.cpp	(working copy)
@@ -413,6 +413,9 @@
             sender.send(msg);
             reporter.message(msg);

+            if (sent % opts.groupSize == 0)
+                session.sync();
+
             if (opts.tx && (sent % opts.tx == 0)) {
                 if (opts.rollbackFrequency &&
                     (++txCount % opts.rollbackFrequency == 0))

2. qpid-config add queue Durable --durable
3. qpid-send -a Durable -m 100 --connection-options {protocol:amqp1.0} --group-key=KEY --group-size=1 --report-total --durable=yes
4. qpid-send -a Durable -m 100 --connection-options {protocol:amqp1.0} --group-key=KEY --group-size=10 --report-total --durable=yes
5. qpid-send -a Durable -m 100 --connection-options {protocol:amqp1.0} --group-key=KEY --group-size=100 --report-total --durable=yes

Actual results:
Throughput (tp/s) differs extremely, roughly:
group-size=1   -> tp/s 1
group-size=10  -> tp/s 10
group-size=100 -> tp/s 100

When using AMQP 0-10, the difference isn't so huge and tp/s is in general much, much higher (e.g. 2025 for group-size=10).

Expected results:
AMQP 1.0 provides results similar to 0-10.

Additional info:
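For reference, the publish pattern under test is essentially the following. This is a minimal standalone sketch using the standard qpid::messaging API, not the actual qpid-send patch; the broker URL, queue name, payload and sync interval N are placeholders:

    // Send durable messages and call session.sync() every N messages,
    // mirroring what the patched qpid-send does via --group-size.
    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Session.h>
    #include <qpid/messaging/Sender.h>
    #include <qpid/messaging/Message.h>

    int main() {
        using namespace qpid::messaging;
        const int N = 10;  // sync interval under test (placeholder)
        Connection connection("localhost:5672", "{protocol: amqp1.0}");
        connection.open();
        Session session = connection.createSession();
        Sender sender = session.createSender("Durable");
        for (int sent = 1; sent <= 100; ++sent) {
            Message msg("payload");
            msg.setDurable(true);
            sender.send(msg);
            if (sent % N == 0)
                session.sync();  // blocks until the broker has settled
        }
        connection.close();
        return 0;
    }

With AMQP 0-10, session.sync() triggers an execution.sync to the broker; with AMQP 1.0 no such hint is sent, so each sync() stalls until the store's flush timer fires.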
> - this is caused by the store module flushing its work every second
>   (though the only Duration I found relevant is 500ms)

I confirm MessageStoreImpl::defJournalFlushTimeout = 500ms is the relevant Duration here.
Gordon,

Can you take a quick look at this issue? In testing the new JMS client, Jakub is seeing some slow performance for durable messages (see comment #3).

Questions:
- is this a client or broker issue?
- is this a linear store issue?
- is there something we can do about this?
(In reply to Mike Cressman from comment #4)
> Gordon,
> Can you take a quick look at this issue? In testing the new JMS client,
> Jakub is seeing some slow performance for durable messages (see comment #3).
> Questions:
> - is this a client or broker issue?
> - is this a linear store issue?
> - is there something we can do about this?

Both linearstore and legacystore hold off flushing until a page fills. It is only when the write is actually on disk that the store notifies the broker, allowing it to then acknowledge the message. If a client sends only one message at a time, waiting for acknowledgement of that message before sending the next, there is a delay while the store waits for a page to fill that never will. Eventually it times out and flushes anyway. So it is a broker/store issue for a particular pattern of client usage. Unfortunately, in JMS 1.1 that is the only safe pattern.

In the 0-10 protocol the client could set a flag to indicate that a sync was expected, and on receiving this hint the broker flushed the store explicitly. On the 1.0 path we don't have the same mechanism and so never explicitly flush at present. We could flush for every write. That would improve performance for the sync-durable-publish case, but would reduce it for the async-durable-publish case.

In 1.0 there is a 'batchable' flag that, if true, could be used to indicate that more messages were coming and so prevent flushing. Unfortunately, at present that is not exposed by proton and so can't currently be relied upon.

We could try to flush only when there are no more (immediately available) messages to process on a connection. The impact on the more asynchronous case would need to be evaluated as well, though.
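To make the design space concrete, here is an entirely hypothetical sketch contrasting the 0-10 sync hint with the proposed use of the 1.0 'batchable' flag; none of these types or functions exist in the qpid broker, they are placeholders only:

    // Hypothetical broker-side flush decisions (placeholder names).
    struct Store {
        void enqueue() { /* buffer the write into the current journal page */ }
        void flush()   { /* force the partially filled page to disk now */ }
    };

    // 0-10 path: the protocol carries an explicit sync hint from the client.
    void onPublish010(Store& store, bool syncExpected) {
        store.enqueue();
        if (syncExpected)
            store.flush();  // the client said it will wait; flush immediately
    }

    // 1.0 path, one option discussed here: treat batchable=false as a flush
    // hint (note this inverts the spec default, an objection raised below).
    void onTransfer10(Store& store, bool batchable) {
        store.enqueue();
        if (!batchable)
            store.flush();
    }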
> On the 1.0 path we don't have the same mechanism and so never explicitly
> flush at present. We could flush for every write. That would improve
> performance for the sync-durable-publish case, but would reduce it for
> the async-durable-publish case.
>
> In 1.0 there is a 'batchable' flag that, if true, could be used to
> indicate that more messages were coming and so prevent flushing.
> Unfortunately, at present that is not exposed by proton and so can't
> currently be relied upon.

Can't this be implemented such that batchable=false (presence of the flag but set to false) would force a flush? Then both the C++ and Java AMQP 1.0 clients could manage the flush (i.e. the C++ client executing session.sync() would set it).

> We could try to flush only when there are no more (immediately available)
> messages to process on a connection. The impact on the more asynchronous
> case would need to be evaluated as well, though.

This sounds like a compromise - how difficult would it be to implement either option?
(In reply to Pavel Moravec from comment #6)
> > On the 1.0 path we don't have the same mechanism and so never explicitly
> > flush at present. We could flush for every write. That would improve
> > performance for the sync-durable-publish case, but would reduce it for
> > the async-durable-publish case.
> >
> > In 1.0 there is a 'batchable' flag that, if true, could be used to
> > indicate that more messages were coming and so prevent flushing.
> > Unfortunately, at present that is not exposed by proton and so can't
> > currently be relied upon.
>
> Can't this be implemented such that batchable=false (presence of the flag
> but set to false) would force a flush? Then both the C++ and Java AMQP 1.0
> clients could manage the flush (i.e. the C++ client executing
> session.sync() would set it).

That would mean the behaviour with batchable unspecified was the same as where batchable=true, yet the spec defines the default as false, which would be a little odd.

> > We could try to flush only when there are no more (immediately available)
> > messages to process on a connection. The impact on the more asynchronous
> > case would need to be evaluated as well, though.
>
> This sounds like a compromise - how difficult would it be to implement
> either option?

I don't believe proton even exposes batchable at present, so that option would require changes to proton-j and proton-c as well as to the clients themselves, and of course to the broker. I don't think it would be hugely difficult (but, as above, I personally don't much like that solution).

Flushing when there are no more immediately available messages would be a broker-only change. Using a thread-local variable to track the queues involved, these could then be flushed once the available incoming bytes have been processed. This would need some performance testing to ensure it doesn't have an unexpected impact on the fully-async case.

Note that I think there is now a store option by which the time the store takes to flush itself can be configured. This can be reduced significantly from the default, and that should reduce the time spent waiting.
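A broker-only change along those lines might look roughly like the following. This is purely a sketch of the idea as described (thread-local tracking, flush once input is drained); all names are placeholders, not actual qpid broker types:

    // Hypothetical sketch of the "flush when idle" approach: remember which
    // queues were touched while draining a connection's available input,
    // then flush them once no more bytes are immediately available.
    #include <set>

    struct Queue {
        void flushStore() { /* ask the store to flush this queue's journal */ }
    };

    // One pending-flush set per worker thread, so no locking is needed.
    thread_local std::set<Queue*> pendingFlush;

    void onDurableEnqueue(Queue* q) {
        pendingFlush.insert(q);  // remember the queue; don't flush yet
    }

    void onConnectionInputDrained() {
        // Called once the immediately available incoming bytes have been
        // processed: only now force partially filled pages to disk.
        for (Queue* q : pendingFlush)
            q->flushStore();
        pendingFlush.clear();
    }

A fully-async publisher that keeps the input buffer busy would rarely hit the flush, preserving its throughput, while a sync-per-N publisher would get a flush as soon as its last message is processed, which is the performance testing concern mentioned above.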
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.