Bug 640618

Summary: qpid python client - message redelivered flag is permanently set incorrect (True) when receiving client is restarted
Product: Red Hat Enterprise MRG Reporter: Frantisek Reznicek <freznice>
Component: python-qpidAssignee: Rafael H. Schloming <rafaels>
Status: CLOSED NOTABUG QA Contact: ppecka <ppecka>
Severity: medium Docs Contact:
Priority: urgent    
Version: DevelopmentCC: esammons, gsim, iboverma, jneedle, ppecka, tross
Target Milestone: 2.0   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-04-27 09:46:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 508144    
Bug Blocks:    
Attachments:
Description Flags
Testing clients
none
comment #c8 attach none

Description Frantisek Reznicek 2010-10-06 13:34:22 UTC
Description of problem:

There is observed that is python client which consumes persistant messages (step 3.) is stopped (kill -2, step 4.) and restarted (step 5.) then all messages which are consumed during first client run has redelivered flag set to False AND all messages received by the second run of python client have redelivered flag marked True.

The expected behavior is that just first batch (one or few) of messages received by the re-runned python receiver (step 5.) should have redelivered flag marked True, the others should have redelivered flag set False.
  
Observed on RHEL 4.8 / 5.5 i386 / x86_64.
There was experiment which proved that qpid java client behaves correctly (just few of the messages are marked as redelivered).


Version-Release number of selected component (if applicable):
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-0.7.946106-17.el5
qmf-devel-0.7.946106-17.el5
qpid-cpp-client-0.7.946106-17.el5
qpid-cpp-client-devel-0.7.946106-17.el5
qpid-cpp-client-devel-docs-0.7.946106-17.el5
qpid-cpp-client-rdma-0.7.946106-17.el5
qpid-cpp-client-ssl-0.7.946106-17.el5
qpid-cpp-mrg-debuginfo-0.7.946106-17.el5
qpid-cpp-server-0.7.946106-17.el5
qpid-cpp-server-cluster-0.7.946106-17.el5
qpid-cpp-server-devel-0.7.946106-17.el5
qpid-cpp-server-rdma-0.7.946106-17.el5
qpid-cpp-server-ssl-0.7.946106-17.el5
qpid-cpp-server-store-0.7.946106-17.el5
qpid-cpp-server-xml-0.7.946106-17.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-client-0.7.946106-10.el5
qpid-java-common-0.7.946106-10.el5
qpid-tools-0.7.946106-11.el5
ruby-qmf-0.7.946106-17.el5
ruby-qpid-0.7.946106-2.el5


How reproducible:
100%

Steps to Reproduce:
1. start broker with persistance
2. python putmsg2.py localhost 5672 10000
3. python getmsg2.py
4. ^C
5. python getmsg2.py
  
Actual results:
  Redelivered flag for messages at step 3. is set to False.
  Redelivered flag for messages at step 5. is set to True.


Expected results:
  Redelivered flag for messages at step 3. is set to False.
  Redelivered flag for messages at step 5. should be set True just for first few messages (which might not get acknowledged) and the rest should have flag set to False.


Additional info:

Comment 1 Frantisek Reznicek 2010-10-06 13:38:59 UTC
Created attachment 451897 [details]
Testing clients

Comment 4 Gordon Sim 2011-01-04 17:40:47 UTC
See also bug 653923.

Comment 5 Gordon Sim 2011-01-04 18:03:12 UTC
See also bug 667172.

Comment 6 Rafael H. Schloming 2011-03-16 20:03:55 UTC
I don't think this is a bug. The old API disables credit entirely when you call queue.start(), so there is no prefetch limit in place, and given that the C++ broker is going to be quite a bit faster at writing messages than the python client will be at reading them, it's not at all surprising that way more than the first few messages are marked as redelivered.

Unless this can be reproduced with a credit limit in place, or if there are log entries that prove that the client is reporting redelivered=True for something that was sent redelivered=False, I would consider this expected behavior. Do you have anything else that might indicate a fault in the client?

Comment 7 Frantisek Reznicek 2011-03-23 13:41:28 UTC
The given explanation seems correct to me.

I'm going to retest together with bug 653923 and 667172 with focus on client credit to confirm or refute the issue.

Comment 8 ppecka 2011-04-21 19:10:00 UTC
I'm attaching modified reproducer, which shows that after qpidd is restarted, all messages are marked "redelivered=True". This is done with new API client. with current package set (the same as bz698783 ). I'm attaching reproducer's with modified drain/spout examples. Reproducer logs asynchronously, so i tried to neat it a bit below. This issue seems very similar to bz653923.
spout_credit sends durable messages
drain_credit reads messages with capacity set to some low number, and there is also sleep added to be able to restart broker in mean time of msg consumption.



 ./spout_credit -c 30 'my-queue; {create: always, node:{durable:True}}'
Message(durable=True, properties={'spout-id': 'c63616de-8a03-654c-ba1e-9b4d5b10b424:0'})
...
Message(durable=True, properties={'spout-id': 'c63616de-8a03-654c-ba1e-9b4d5b10b424:29'})


+ qpid-stat -q
  my-queue                              Y                     30    30      0       0      0        0         0     1



+ sleep 2

+ ./drain_credit -r -i 1 -l 10 'my-queue; {delete:receiver, create: always}'

+ service qpidd restart
Stopping Qpid AMQP daemon: 2011-04-21 14:48:02,709 WARNING recoverable 



error[attempt 0]: connection aborted
2011-04-21 14:48:02,709 WARNING trying: localhost:5672
2011-04-21 14:48:02,709 WARNING reconnect succeeded: localhost:5672
2011-04-21 14:48:02,713 WARNING recoverable error[attempt 0]: (104, 'Connection reset by peer')
2011-04-21 14:48:02,714 WARNING trying: localhost:5672
2011-04-21 14:48:02,714 WARNING recoverable error[attempt 1]: (111, 'Connection refused')
2011-04-21 14:48:02,714 WARNING sleeping 1.0 seconds
[60G[[0;32m  OK  [0;39m]



Starting Qpid AMQP daemon: [60G[[0;32m  OK  [0;39m]



+ qpid-stat -q
  my-queue                              Y                     25    27      2       0      0        0         1     1


Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:0'})
False
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:1'})
False
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:2'})
False
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:3'})
False
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:4'})
False
Message(durable=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:5'})
False
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:3'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:4'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:5'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:6'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:7'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:8'})
True

...


Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:28'})
True
Message(durable=True, redelivered=True, properties={'x-amqp-0-10.routing-key': u'my-queue', u'spout-id': u'c63616de-8a03-654c-ba1e-9b4d5b10b424:29'})
True

Comment 9 ppecka 2011-04-21 19:11:53 UTC
Created attachment 493967 [details]
comment #c8 attach

attachment related to https://bugzilla.redhat.com/show_bug.cgi?id=640618#c8

Comment 10 ppecka 2011-04-21 19:18:59 UTC
(In reply to comment #6)
> I don't think this is a bug. The old API disables credit entirely when you call
> queue.start(), so there is no prefetch limit in place, 
can we clarify "disables credit" - does it mean that credit will not be considered during session when set to some value ( and thus is somehow ignored) OR it means credit is set to unlimited?
 
> Unless this can be reproduced with a credit limit in place, or if there are log
> entries that prove that the client is reporting redelivered=True for something
> that was sent redelivered=False, I would consider this expected behavior. Do
> you have anything else that might indicate a fault in the client?
So to clarify above, can you check comment #c8? mainly the bottom part where drain takes place

Comment 11 Gordon Sim 2011-04-26 09:44:48 UTC
From comment #8: 'after qpidd is restarted, all messages are marked "redelivered=True"'

Correct, the broker marks all durable messages as redelivered on restart. This is because it does not durably record delivery attempts. The setting of the redelivered flag is a hint, not a guarantee. It is usually better to suggest that the application check whether it has seen a message that may never have been redelivered rather than fail to hint about a message that has been redelivered.

Though this is an area that will be improved on, I believe it is orthogonal to the python client's handling of the flag.

Comment 12 ppecka 2011-04-27 09:46:07 UTC
ok, so this is probably NOTABUG

Comment 13 ppecka 2011-04-27 13:44:30 UTC
queried to be documented
https://bugzilla.redhat.com/show_bug.cgi?id=700107