Bug 873347

Summary: HA channel errors when adding/removing replicated queues
Product: Red Hat Enterprise MRG
Reporter: Jason Dillaman <jdillama>
Component: qpid-cpp
Assignee: Chuck Rolke <crolke>
Status: CLOSED CURRENTRELEASE
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: unspecified
Docs Contact:
Priority: high
Version: Development
CC: aconway, esammons, jross, lzhaldyb, mcressma
Target Milestone: 2.3
Keywords: OtherQA
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qpid-cpp-0.18-7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-19 16:38:55 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 698367    

Description Jason Dillaman 2012-11-05 15:38:40 UTC
Description of problem:
As replicated queues are added to and removed from an HA broker, a bridge is opened and closed for each queue.  Closing a bridge immediately returns its channel number to the pool of available channels.  As a result, when the old bridge (channel X) is closed it sends a "detach" command to the broker, and a new bridge can be assigned the same channel X and send an "attach" command before the broker has replied.  The broker eventually responds with a "detached" command that was meant for the original bridge on channel X.  Unfortunately, the new bridge on channel X handles this "detached" command and flags itself as detached.  This process can then repeat for several cycles until it breaks out of the detach/attach/detached race.
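
The race described above can be modelled outside the broker. The following is a minimal Python sketch, not qpid-cpp code and not the upstream fix: the class names, the simulate() helper and the "deferred reuse" pool are all hypothetical, introduced only to show why immediate reuse of a channel number lets a late "detached" hit the wrong bridge, and how holding the number back until "detached" arrives would avoid that.

```python
class ImmediateReusePool:
    """Hypothetical pool that hands a released channel straight back out."""

    def __init__(self, size):
        self.free = list(range(size))

    def acquire(self):
        return self.free.pop(0)

    def release(self, channel):
        # The released channel goes to the front, so it is reused next.
        self.free.insert(0, channel)

    def on_detached(self, channel):
        pass  # nothing to do: the number was already recycled


class DeferredReusePool(ImmediateReusePool):
    """Hypothetical mitigation: keep a channel out of the pool until the
    broker's 'detached' for it has been seen."""

    def __init__(self, size):
        super().__init__(size)
        self.draining = set()

    def release(self, channel):
        self.draining.add(channel)

    def on_detached(self, channel):
        if channel in self.draining:
            self.draining.discard(channel)
            self.free.append(channel)


def simulate(pool):
    """Close one bridge, open another, then deliver the late 'detached'."""
    bridges = {}  # channel number -> name of the bridge using it

    old = pool.acquire()
    bridges[old] = "bridge-A"

    # Bridge A closes: it sends 'detach' and releases its channel.
    del bridges[old]
    pool.release(old)

    # Bridge B opens before the broker has answered.
    new = pool.acquire()
    bridges[new] = "bridge-B"

    # The broker's 'detached' for bridge A arrives on channel `old`;
    # whichever bridge currently owns that channel is flagged as detached.
    victim = bridges.pop(old, None)
    pool.on_detached(old)
    return old, new, victim
```

With ImmediateReusePool the new bridge receives the old bridge's channel and is the one detached; with DeferredReusePool the new bridge gets a different channel and the late "detached" affects nobody.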

In addition to the detach/attach/detached race, the immediate re-use of channel numbers appears to create other issues like the following:

Nov  2 11:26:40 itcm31 qpidd[12122]: 2012-11-02 11:26:40 [Protocol] error Execution exception: invalid-argument: anonymous.qpid.bridge_session_qpid.replicator-Queue1.b64c23e6-cb01-4297-8935-c12b40804ae2_84209514-2e58-4fb9-8d37-7c2440f5f144: confirmed < (2+0) but only sent < (0+0) (qpid/SessionState.cpp:154)
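
One plausible reading of this error, offered as an assumption rather than a confirmed diagnosis, is that a confirmation meant for the old session arrives on the reused channel and is checked against a new session that has not sent anything yet. The Python sketch below is a toy model of such a check; SenderState and its methods are hypothetical names and do not correspond to the code in qpid/SessionState.cpp.

```python
class SenderState:
    """Toy model of the sending side of a session (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.sent = 0  # number of commands sent so far on this session

    def send(self):
        self.sent += 1

    def confirmed(self, upto):
        # The peer must not confirm more commands than were ever sent.
        if upto > self.sent:
            raise ValueError(
                f"{self.name}: confirmed < ({upto}+0) "
                f"but only sent < ({self.sent}+0)"
            )


# The old session sent two commands before its bridge was closed.
old = SenderState("old-session")
old.send()
old.send()

# A new session starts on the same channel number with nothing sent yet.
new = SenderState("new-session")

# A confirmation intended for the old session reaches the new one.
stale_confirmation = 2
try:
    new.confirmed(stale_confirmation)
    error = None
except ValueError as exc:
    error = str(exc)
```

In this model the stale confirmation is valid for the old session but fails the check on the new one, which mirrors the shape of the logged message.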

Version-Release number of selected component (if applicable):
Qpid 0.18-6

How reproducible:
Frequently

Steps to Reproduce:
1. Rapidly create and delete replicated queues
  
Actual results:
Detach/attach/detached loops and other channel errors are seen in the logs.  The errors stop HA replication, which in turn stops message flow through the broker.

Expected results:
Channel errors are not encountered

Additional info:

Comment 1 Chuck Rolke 2012-11-05 20:47:12 UTC
Fixed upstream at r1405946.