Bug 1330539

Summary: client does not update credit for expired messages
Product: Red Hat Enterprise MRG Reporter: Barbora Vassova <bvassova>
Component: qpid-cppAssignee: Gordon Sim <gsim>
Status: CLOSED ERRATA QA Contact: Messaging QE <messaging-qe-bugs>
Severity: low Docs Contact:
Priority: high    
Version: 3.2CC: gsim, jross, mcressma, pmoravec, tkratky, zkraus
Target Milestone: 3.2.2Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qpid-cpp-0.34-17 Doc Type: Bug Fix
Doc Text:
Cause: Messages that expire on the client side are not handled by the usual code path which includes the moving of the credit window. Consequence: If there are a large number of messages that expire on the client side, the flow of messages may stall as the window is never moved forward. Fix: The credit window is moved even for expired messages. Result: The flow of messages does not stall.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-10-11 07:36:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Barbora Vassova 2016-04-26 12:22:29 UTC
Description of problem:
Send a batch of messages with very low ttl to the broker and it will pass as many as it can to the connected consumer and discard the expired ones. When consumer sends the acknowledgements and they expire on a way to the broker, broker still keeps the un-confirmed messages in the queue (while waiting for the acknowledgements) although their ttl has already expired either and they should be discarded. In order to clear the queue you have to either wait for periodical purge or disconnect/reconnect the consumer. 

Version-Release number of selected component (if applicable):
qpid-cpp-server-0.34-6.el6.x86_64
qpid-cpp-client-0.34-6.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. create queue testqueue
2. create consumer "qpid-receive -a testqueue -b localhost:5672 -f | nl"
3. qpid-send -a testqueue -b localhost:5672 -m 100 --ttl 6 --content-string hello

Actual results:
queue-depth value in qpid-stat -q testqueue should be 0 

Expected results:
queue-depth value in qpid-stat -q testqueue is not 0 (it's usually equal to the amount of messages received by the consumer)

Additional info:

Comment 1 Gordon Sim 2016-04-26 14:33:02 UTC
I don't understand exactly what you mean by "consumer sends the acknowledgements and they expire on a way to the broker". Acknowledgements don't expire. If a client acknowledges a message the message will be discarded regardless of ttl.

"In order to clear the queue you have to either wait for periodical purge or disconnect/reconnect the consumer. " - actually, periodical purge will not purge expired message that are acquired. Arguably *that* is a bug, though I would argue a minor one as the messages would be discarded if the consumer acknowledges them or is disconnected.

Comment 2 Pavel Moravec 2016-04-27 12:47:45 UTC
Better scenario describing the problem (I *think* broker does not update prefetch/capacity for a consumer that just acknowledged a message that timeouted while it was acquired):

1) Have consumer (artificially slowly reacting but to every message):
qpid-receive -a testqueue -b localhost:5672 -f --capacity=5 --ack-frequency=1 | nl

2) Send huge amount of messages with tiny TTL, followed by messages _without_ TTL:
qpid-send -a testqueue -b localhost:5672 -m 100000 --ttl 6 --content-string ttl; qpid-send -a testqueue -m10 --content-string=NOttl

3) Check what messages the consumer receives - it receives some ttl messages but _no_ message without TTL.

4) qpid-stat -q testqueue shows the queue has thousands of messages.
5) qpid-stat -u shows sessUnacked=0 for the consumer (so the consumer acknowledged all received messages)

Once queue-purge-interval cleans the queue, the queue content shrinks, having just the "NOttl" messages that are _not_ offered to the consumer.


So, something prevents broker to traverse queue (in Queue::getNextMessage method?) for messages to offer them to the consumer - either expired ones (that should be purged instead of sending to consumer) or the "NOttl" ones.

Comment 3 Pavel Moravec 2016-04-27 13:17:04 UTC
Playing more on this:

- problem is also in upstream
- uncertain if the bug isnt on _client_, since taking tcpdump, latest AMQP command/frame is client sending message.accept, but _no_ session.flush or message.set-flow-mode or either the crucial message.flow

That sounds the client does acknowledge the message, but does not update flow/prefetch/capacity ?

Comment 4 Gordon Sim 2016-04-27 21:03:01 UTC
The problem is indeed on the client side and is a result of the change for QPID-5828. The client is detecting the expired message, acking then discarding it without passing it on to the application. Unfortunately it is also not going through the code the adjusts the credit. I'll figure out how best to fix this.

Comment 5 Gordon Sim 2016-04-28 20:13:39 UTC
Fixed upstream: https://svn.apache.org/r1741491 (test also added https://svn.apache.org/r1741505).

Comment 6 Pavel Moravec 2016-04-29 15:28:51 UTC
FYI it is a regression between qpid 0.18 and 0.34 (havent tested releases between), since:

- qpid-receive using 0.18 library talking to 0.18 broker: no issue
- qpid-receive using 0.18 library talking to 0.34 broker: no issue
- qpid-receive using 0.34 library talking to 0.18 broker: reported issue
- qpid-receive using 0.34 library talking to 0.34 broker: reported issue

Python client not affected.

Comment 13 Pavel Moravec 2016-05-23 11:53:20 UTC
One regression found: closing a consumer with unacked expired messages raises segfault with backtrace:

#0  qpid::client::amqp0_10::IncomingMessages::process (this=0x627870, handler=0x0, duration=...)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/IncomingMessages.cpp:287
#1  0x00007ffff7649e4c in qpid::client::amqp0_10::IncomingMessages::releasePending (this=0x627870, destination="q")
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/IncomingMessages.cpp:252
#2  0x00007ffff7652ce1 in qpid::client::amqp0_10::SessionImpl::releasePending (this=0x627820, name="q")
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/SessionImpl.cpp:573
#3  0x00007ffff764e82a in qpid::client::amqp0_10::ReceiverImpl::closeImpl (this=0x628880)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/ReceiverImpl.cpp:231
#4  0x00007ffff764f9b4 in operator() (this=0x627820, f=...)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/ReceiverImpl.h:125
#5  qpid::client::amqp0_10::SessionImpl::execute<qpid::client::amqp0_10::ReceiverImpl::Close> (this=0x627820, f=...)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/SessionImpl.h:102
#6  0x00007ffff764e954 in execute<qpid::client::amqp0_10::ReceiverImpl::Close> (this=<value optimized out>)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/ReceiverImpl.h:140
#7  qpid::client::amqp0_10::ReceiverImpl::close (this=<value optimized out>)
    at /usr/src/debug/qpid-cpp-0.34/src/qpid/client/amqp0_10/ReceiverImpl.cpp:79
#8  0x0000000000405e91 in main ()


Reproducer:

qpid-send -a q -m 1000 --ttl=30 --content-size=10; ./drain_no_ack

Where drain_no_ack is trivially setting capacity to 10k, consuming message from a receiver that is closed after that:

        Connection connection(options.url, options.connectionOptions);
        connection.open();
        Session session = connection.createSession();
        Receiver receiver = session.createReceiver("q");
        receiver.setCapacity(10000);
        receiver.fetch(message, 10*Duration::SECOND);
        std::cout << "Message(properties=" << message.getProperties() << " ";
        if (message.getTtl().getMilliseconds()) std::cout << "TTL: " << message.getTtl().getMilliseconds();
        std::cout << std::endl;
        receiver.close();

(one might need to tune --ttl=30 per his/her VM used - simply let some message be sent to the client before it expires, but the others to expire in client)

Comment 14 Gordon Sim 2016-05-23 13:21:52 UTC
Sorry folks! My mistake. I've added a test case to cover this and a fix upstream: https://svn.apache.org/r1745195

Comment 17 Gordon Sim 2016-07-20 11:55:28 UTC
From the git repo it appears that tag qpid-cpp-0.34-13 on 0.34-mrg includes the second fix.

Comment 18 Gordon Sim 2016-07-20 11:56:10 UTC
Mike, can you confirm?

Comment 19 Mike Cressman 2016-07-20 13:27:58 UTC
Correct, the complete fix is included in qpid-cpp-0.34-13 or later.

Comment 34 errata-xmlrpc 2016-10-11 07:36:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2049.html