Bug 812950

Summary: Duplicate delivery after consumer failover in HA configuration.
Product: Red Hat Enterprise MRG
Reporter: Andrew Replogle <areplogl>
Component: python-qpid
Assignee: messaging-bugs <messaging-bugs>
Status: NEW
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: high
Priority: medium
Version: 2.0
CC: gsim, jross
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Type: Bug
Attachments:
Test client that pauses before acknowledging, allowing time to kill the connected broker (flags: none)

Description Andrew Replogle 2012-04-16 16:24:00 UTC
Created attachment 577764
Test client that pauses before acknowledging, allowing time to kill the connected broker.

Description of problem:
With connection.reconnect=True, if a consumer fetches a message from a broker in an HA cluster and that broker fails before the message is acknowledged, the broker the client fails over to does not remove the message from the queue when the acknowledgement is subsequently sent.


Version-Release number of selected component (if applicable):
qpid-cpp-server-cluster-0.12-6.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Connect to an HA cluster (using connection.reconnect = True) and fetch a message, but do not acknowledge it.
2. Kill the broker you are connected to.
3. Acknowledge the message after the connection/session has failed over.
4. Examine the queue: the message is still on the queue even though the acknowledge completed without error.
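
The steps above can be sketched roughly as follows. This is not the attached test client, only an approximation of what such a client might look like with the python-qpid messaging API. The function name `reproduce`, its parameters, and the pause length are hypothetical; the broker URLs and queue name must be supplied by the caller.

```python
import time


def reproduce(broker_urls, queue, pause_seconds=30):
    """Fetch one message, pause so the broker can be killed, then acknowledge.

    broker_urls, queue and pause_seconds are illustrative parameters,
    not values taken from the bug report.
    """
    # Imported lazily so this file loads even where python-qpid is not installed.
    from qpid.messaging import Connection

    conn = Connection(broker_urls[0], reconnect=True,
                      reconnect_urls=list(broker_urls))
    conn.open()
    try:
        session = conn.session()
        receiver = session.receiver(queue)

        # Step 1: fetch a message but do not acknowledge it yet.
        msg = receiver.fetch(timeout=10)
        print("fetched: %r" % (msg.content,))

        # Step 2: kill the connected broker by hand during this pause.
        time.sleep(pause_seconds)

        # Step 3: acknowledge after the client has failed over.
        session.acknowledge(msg)
        print("acknowledge returned without error")
    finally:
        conn.close()
```

Step 4 is then a manual check of the queue depth on the surviving broker, for example with a queue statistics tool.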
  
Actual results:
No error is raised. The message is delivered to the client, but it is not removed from the queue.

Expected results:
After session.acknowledge returns, the message is removed from the queue.

Additional info:

Comment 1 Gordon Sim 2012-04-16 16:51:30 UTC
This is really a messaging API issue. The call to acknowledge will not fail after reconnecting, even though the set of messages it actually refers to may have been changed by the failover. How best to signal this fact to the application needs some further thought and debate.
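
To illustrate the point about the acknowledged set changing, here is a toy model in plain Python. It is not how qpid tracks deliveries; `ToyBroker` and its methods are invented for illustration. The only assumption it encodes is that queue contents are replicated across the cluster while the record of what was delivered to a session lives on the broker that made the delivery.

```python
class ToyBroker(object):
    """Hypothetical cluster member: replicated queue, local delivery records."""

    def __init__(self, queue):
        self.queue = list(queue)   # queue contents (replicated in this model)
        self.unacked = {}          # delivery id -> message, local to this broker
        self.next_id = 0

    def deliver(self):
        """Hand the head of the queue to a consumer without removing it."""
        delivery_id = self.next_id
        self.next_id += 1
        self.unacked[delivery_id] = self.queue[0]
        return delivery_id, self.queue[0]

    def acknowledge(self, delivery_id):
        """Dequeue the message if this broker knows the delivery.

        An unknown delivery id is silently ignored, which models the
        'acknowledge succeeds without error' behaviour in the report.
        """
        msg = self.unacked.pop(delivery_id, None)
        if msg is not None:
            self.queue.remove(msg)


primary = ToyBroker(["m1"])
backup = ToyBroker(["m1"])

delivery_id, msg = primary.deliver()   # consumer fetches m1 from the primary
# The primary is killed here; the client reconnects to the backup.
backup.acknowledge(delivery_id)        # no error, but nothing is dequeued
print(backup.queue)                    # ['m1']
```

In this model the next fetch from the backup returns `m1` again, which is the duplicate delivery named in the summary. Any fix at the API level would need `acknowledge` to report, rather than swallow, the case where the delivery is unknown to the new broker.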