Bug 698783 - fetch(timeout=0) does not behave as expected on receiver with capacity=1
Summary: fetch(timeout=0) does not behave as expected on receiver with capacity=1
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: python-qpid
Version: Development
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: 2.1.2
: ---
Assignee: Gordon Sim
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-04-21 18:43 UTC by ppecka
Modified: 2012-12-07 17:46 UTC (History)
5 users (show)

Fixed In Version: python-qpid-0.14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
reproducer runme.sh plus log of latest try (3.14 KB, application/x-compressed)
2011-04-21 18:43 UTC, ppecka
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Apache JIRA QPID-3232 0 None None None Never

Description ppecka 2011-04-21 18:43:23 UTC
Created attachment 493965 [details]
reproducer runme.sh plus log of latest try

Description of problem:
This applies for RHEL5-U5 x86_64 2.6.18-194.el5, with slightly modified drain example(session receiver capacity=1), only one message out of 20 being read, then queue is deleted. 
Even with address {delete:receiver} you'll fetch one message per drain run.
Whenever capacity size is raised above 1 OR all reconnect settings are omited, all messages are being received, and after that (if set) queue is destroyed.


Version-Release number of selected component (if applicable):
redhat-release-5Server-5.5.0.2
# rpm -qa | grep qpid
qpid-cpp-client-0.10-3.el5
qpid-cpp-client-ssl-0.10-3.el5
qpid-cpp-server-xml-0.10-3.el5
qpid-cpp-client-devel-docs-0.10-3.el5
qpid-qmf-0.10-2.el5
qpid-cpp-client-devel-0.10-3.el5
python-qpid-0.10-1.el5
qpid-cpp-server-store-0.10-3.el5
qpid-qmf-devel-0.10-2.el5
python-qpid-qmf-0.10-2.el5
qpid-java-common-0.10-2.el5
qpid-java-client-0.10-2.el5
qpid-tools-0.10-3.el5
qpidc-debuginfo-0.5.752581-3.el5
qpid-cpp-server-ssl-0.10-3.el5
qpid-cpp-server-devel-0.10-3.el5
qpid-java-example-0.10-2.el5
qpid-cpp-server-0.10-3.el5
qpid-cpp-server-cluster-0.10-3.el5

How reproducible:
100% on x86_64
5% on i386

Steps to Reproduce:
1. echo '
log-enable=info+ 
log-to-file=/tmp/qpidd.log 
log-to-stdout=no 
log-to-stderr=no 
auth=no 
' > /etc/qpidd.conf

2. ./spout_credit -c 20 'my-queue; {create: always, node:{durable:True}}'
(on current package set this will send slowly 20 messages)

Message(durable=True, properties={'spout-id': '29813a28-b5dd-9e4a-87e8-28aec88a6f9e:0'})
...
Message(durable=True, properties={'spout-id': '29813a28-b5dd-9e4a-87e8-28aec88a6f9e:19'})

3. qpid-stat -q | grep my-queue
   my-queue Y 20 20 0 0 0 0 0 1


4.1. /drain_credit -r -i 1 -l 10 'my-queue; { create: always}'

4.2. ./drain_credit -r -i 1 -l 10 'my-queue; {delete:receiver, create: always}'

(this can be output either for 4.1 && 4.2)
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'29813a28-b5dd-9e4a-87e8-28aec88a6f9e:0'})


5. qpid-stat -q | grep my-queue
(obviously shows nothing)







Actual results:
Non-empty queue is being deleted or messages are not being printed to stdout by drain example

Expected results:
Only after receiving all messages queue can be deleted

Additional info:
spout example was also modified

Comment 1 ppecka 2011-04-21 18:53:41 UTC
attachment is bzip2 file not gzip

Comment 3 Gordon Sim 2011-04-26 10:03:23 UTC
I suspect this may just require clarifying the intended behaviour of the 'delete:receiver' option.

That option will delete the queue whenever the receiver closes. In the modified drain, the receiver will close when the timeout expires and/or when the specified number of messages have been fetched. 

In the example above the timeout and count are not set. The timeout will default to 0. This means that if there is a delay when there are no messages on the queue (because the other receiver fetched the first one, and spout has yet to send another) then the second drain will exit and delete the queue.

Try setting a much larger timeout (e.g. the time for spout to produce all its messages).

Comment 4 ppecka 2011-04-27 14:46:38 UTC
(In reply to comment #3)
> I suspect this may just require clarifying the intended behaviour of the
> 'delete:receiver' option.
> 
> That option will delete the queue whenever the receiver closes. In the modified
> drain, the receiver will close when the timeout expires and/or when the
> specified number of messages have been fetched. 
> 
> In the example above the timeout and count are not set. The timeout will
> default to 0. This means that if there is a delay when there are no messages on
> the queue (because the other receiver fetched the first one, and spout has yet
> to send another) then the second drain will exit and delete the queue.

all reproducer steps should be subsequent, thus drain is reading from queue when no one else is reading/consuming messages


this behaviour is observed on x86_64 arch, on i386 we are getting all the messages before queue is being deleted

Comment 5 Gordon Sim 2011-04-28 12:03:22 UTC
The issue is that timeout=0 does not work as expected when the capacity=1. See https://issues.apache.org/jira/browse/QPID-3232. As a workaround you can set the timeout to 1.

There is a race condition where the credit that allows the second message to be delivered is issued after the fetch has flushed for that message which probably accounts for the variability you see.

Comment 6 Gordon Sim 2011-04-28 12:05:12 UTC
In my view this is not a blocker; it is not a regression and there is a reasonable workaround. We should also change the description or open another BZ to track the actual issue.

Comment 7 ppecka 2011-04-28 14:21:24 UTC
ok, changing description

Comment 8 Gordon Sim 2011-04-28 16:40:49 UTC
Fixed upstream: http://svn.apache.org/viewvc?rev=1097539&view=rev

Comment 9 Petr Matousek 2012-04-27 17:27:38 UTC
This issue has been fixed
(All 20 messages are drained with the reproducer before the queue is deleted.)

Verified on rhel5, rhel6 (x86_64,i386)

packages installed:
python-qpid-0.14-6.el5
python-qpid-0.14-7.el6_2

-> VERIFIED


Note You need to log in before you can comment on or make changes to this bug.