Bug 698783

Summary: fetch(timeout=0) does not behave as expected on receiver with capacity=1
Product: Red Hat Enterprise MRG
Component: python-qpid
Version: Development
Target Milestone: 2.1.2
Hardware: Unspecified
OS: Unspecified
Reporter: ppecka <ppecka>
Assignee: Gordon Sim <gsim>
QA Contact: MRG Quality Engineering <mrgqe-bugs>
CC: freznice, gsim, iboverma, jross, pematous
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Fixed In Version: python-qpid-0.14
Doc Type: Bug Fix
Attachments:
reproducer runme.sh plus log of latest try

Description ppecka 2011-04-21 18:43:23 UTC
Created attachment 493965 [details]
reproducer runme.sh plus log of latest try

Description of problem:
This applies to RHEL5-U5 x86_64 (2.6.18-194.el5). With a slightly modified drain example (session receiver capacity=1), only one message out of 20 is read, and then the queue is deleted.
Even with the address option {delete: receiver}, only one message is fetched per drain run.
Whenever the capacity is raised above 1 OR all reconnect settings are omitted, all messages are received, and only after that (if configured) is the queue destroyed.
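
For reference, a minimal sketch of the modified receiver loop, assuming the qpid.messaging API from python-qpid; the broker URL, reconnect options and address string mirror the reproducer commands below, but the actual drain_credit script in the attachment may differ:

from qpid.messaging import Connection, Empty

conn = Connection("localhost:5672", reconnect=True,
                  reconnect_interval=1, reconnect_limit=10)
conn.open()
try:
    ssn = conn.session()
    rcv = ssn.receiver("my-queue; {delete: receiver, create: always}")
    rcv.capacity = 1                    # a prefetch window of 1 triggers the problem
    while True:
        try:
            msg = rcv.fetch(timeout=0)  # drain's default timeout: give up on the first empty poll
        except Empty:
            break
        print(msg)
        ssn.acknowledge()
finally:
    conn.close()                        # delete:receiver -> queue is deleted when the receiver closes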


Version-Release number of selected component (if applicable):
redhat-release-5Server-5.5.0.2
# rpm -qa | grep qpid
qpid-cpp-client-0.10-3.el5
qpid-cpp-client-ssl-0.10-3.el5
qpid-cpp-server-xml-0.10-3.el5
qpid-cpp-client-devel-docs-0.10-3.el5
qpid-qmf-0.10-2.el5
qpid-cpp-client-devel-0.10-3.el5
python-qpid-0.10-1.el5
qpid-cpp-server-store-0.10-3.el5
qpid-qmf-devel-0.10-2.el5
python-qpid-qmf-0.10-2.el5
qpid-java-common-0.10-2.el5
qpid-java-client-0.10-2.el5
qpid-tools-0.10-3.el5
qpidc-debuginfo-0.5.752581-3.el5
qpid-cpp-server-ssl-0.10-3.el5
qpid-cpp-server-devel-0.10-3.el5
qpid-java-example-0.10-2.el5
qpid-cpp-server-0.10-3.el5
qpid-cpp-server-cluster-0.10-3.el5

How reproducible:
100% on x86_64
5% on i386

Steps to Reproduce:
1. echo '
log-enable=info+ 
log-to-file=/tmp/qpidd.log 
log-to-stdout=no 
log-to-stderr=no 
auth=no 
' > /etc/qpidd.conf

2. ./spout_credit -c 20 'my-queue; {create: always, node:{durable:True}}'
(on the current package set this slowly sends 20 messages; see the spout sketch after this list)

Message(durable=True, properties={'spout-id': '29813a28-b5dd-9e4a-87e8-28aec88a6f9e:0'})
...
Message(durable=True, properties={'spout-id': '29813a28-b5dd-9e4a-87e8-28aec88a6f9e:19'})

3. qpid-stat -q | grep my-queue
   my-queue Y 20 20 0 0 0 0 0 1


4.1. ./drain_credit -r -i 1 -l 10 'my-queue; {create: always}'

4.2. ./drain_credit -r -i 1 -l 10 'my-queue; {delete:receiver, create: always}'

(this can be the output of either 4.1 or 4.2)
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'29813a28-b5dd-9e4a-87e8-28aec88a6f9e:0'})


5. qpid-stat -q | grep my-queue
(obviously shows nothing)
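
As referenced in step 2, the spout side can be sketched under the same assumptions: a qpid.messaging sender that pauses between messages, so the queue can run empty between fetches (the actual spout_credit script may differ; the spout-id value is only illustrative):

import time
from qpid.messaging import Connection, Message

conn = Connection("localhost:5672")     # broker URL is an assumption
conn.open()
try:
    ssn = conn.session()
    snd = ssn.sender("my-queue; {create: always, node: {durable: True}}")
    for i in range(20):
        snd.send(Message(durable=True,
                         properties={"spout-id": "example:%d" % i}))
        time.sleep(1)                   # send slowly, one message per second
finally:
    conn.close()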

Actual results:
A non-empty queue is deleted, or messages are not printed to stdout by the drain example.

Expected results:
The queue should be deleted only after all messages have been received.

Additional info:
The spout example was also modified.

Comment 1 ppecka 2011-04-21 18:53:41 UTC
The attachment is a bzip2 file, not gzip.

Comment 3 Gordon Sim 2011-04-26 10:03:23 UTC
I suspect this may just require clarifying the intended behaviour of the 'delete:receiver' option.

That option will delete the queue whenever the receiver closes. In the modified drain, the receiver will close when the timeout expires and/or when the specified number of messages have been fetched. 

In the example above the timeout and count are not set. The timeout will default to 0. This means that if there is a delay when there are no messages on the queue (because the other receiver fetched the first one, and spout has yet to send another) then the second drain will exit and delete the queue.

Try setting a much larger timeout (e.g. the time for spout to produce all its messages).
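
In terms of the receiver loop sketched in the description, that means something like the following; the 60-second value is only illustrative, pick something that covers spout's whole run:

while True:
    try:
        msg = rcv.fetch(timeout=60)     # wait for the next message instead of giving up immediately
    except Empty:
        break
    print(msg)
    ssn.acknowledge()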

Comment 4 ppecka 2011-04-27 14:46:38 UTC
(In reply to comment #3)
> I suspect this may just require clarifying the intended behaviour of the
> 'delete:receiver' option.
> 
> That option will delete the queue whenever the receiver closes. In the modified
> drain, the receiver will close when the timeout expires and/or when the
> specified number of messages have been fetched. 
> 
> In the example above the timeout and count are not set. The timeout will
> default to 0. This means that if there is a delay when there are no messages on
> the queue (because the other receiver fetched the first one, and spout has yet
> to send another) then the second drain will exit and delete the queue.

All reproducer steps are run one after another, so drain is reading from the queue while no one else is reading/consuming messages.


This behaviour is observed on the x86_64 arch; on i386 we get all the messages before the queue is deleted.

Comment 5 Gordon Sim 2011-04-28 12:03:22 UTC
The issue is that timeout=0 does not work as expected when capacity=1. See https://issues.apache.org/jira/browse/QPID-3232. As a workaround you can set the timeout to 1.

There is a race condition where the credit that allows the second message to be delivered is issued after the fetch has flushed for that message, which probably accounts for the variability you see.
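
In terms of the sketch above, the workaround is just a non-zero fetch timeout, e.g.:

msg = rcv.fetch(timeout=1)   # one second is enough to avoid the timeout=0 / capacity=1 race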

Comment 6 Gordon Sim 2011-04-28 12:05:12 UTC
In my view this is not a blocker; it is not a regression and there is a reasonable workaround. We should also change the description or open another BZ to track the actual issue.

Comment 7 ppecka 2011-04-28 14:21:24 UTC
ok, changing description

Comment 8 Gordon Sim 2011-04-28 16:40:49 UTC
Fixed upstream: http://svn.apache.org/viewvc?rev=1097539&view=rev

Comment 9 Petr Matousek 2012-04-27 17:27:38 UTC
This issue has been fixed: all 20 messages are drained with the reproducer before the queue is deleted.

Verified on rhel5 and rhel6 (x86_64, i386).

packages installed:
python-qpid-0.14-6.el5
python-qpid-0.14-7.el6_2

-> VERIFIED