Bug 873347 - HA channel errors when adding/removing replicated queues
Summary: HA channel errors when adding/removing replicated queues
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: 2.3
Target Release: ---
Assignee: Chuck Rolke
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks: 698367
Reported: 2012-11-05 15:38 UTC by Jason Dillaman
Modified: 2013-03-19 16:38 UTC (History)

Fixed In Version: qpid-cpp-0.18-7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-19 16:38:55 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Apache JIRA QPID-4421 0 None None None 2012-11-05 20:52:49 UTC

Description Jason Dillaman 2012-11-05 15:38:40 UTC
Description of problem:
As you add and remove replicated queues within an HA broker, a bridge is opened and closed for each queue.  Closing a bridge immediately returns the associated channel number to the pool of available channels.  As a result, when the old bridge (channel X) is closed it sends a "detach" command to the broker, and a new bridge that is assigned the same channel X then sends an "attach" command.  The broker eventually responds with a "detached" command that was meant for the original bridge on channel X.  Unfortunately, the new bridge on channel X handles this "detached" command and is flagged as detached.  This process can then repeat for several cycles until it breaks out of the detach/attach/detached race.
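
The race described above can be illustrated with a small, self-contained Python model. This is only a sketch of the behaviour as reported, not the qpid-cpp code and not the upstream fix: the names ChannelPool, Bridge, Link, defer_release and run are hypothetical and exist only for this illustration. With defer_release=False the channel goes back into the pool as soon as the bridge is closed, so the late "detached" lands on the new bridge. With defer_release=True the channel is held back until "detached" arrives, which is one possible way to avoid the collision.

```python
from collections import deque


class ChannelPool:
    """Hypothetical pool of channel numbers; a released number is reused first."""

    def __init__(self, size):
        self.free = deque(range(size))

    def acquire(self):
        return self.free.popleft()

    def release(self, channel):
        # Put the number at the front so the next acquire() hands it out again.
        self.free.appendleft(channel)


class Bridge:
    """Hypothetical stand-in for a queue replication bridge."""

    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        self.attached = True


class Link:
    """Hypothetical owner of the bridges that share one connection."""

    def __init__(self, pool, defer_release):
        self.pool = pool
        self.defer_release = defer_release
        self.bridges = {}     # channel -> bridge currently using it
        self.closing = set()  # channels still waiting for "detached"

    def open_bridge(self, name):
        channel = self.pool.acquire()
        bridge = Bridge(name, channel)
        self.bridges[channel] = bridge
        return bridge

    def close_bridge(self, bridge):
        # Models sending "detach" to the broker for this channel.
        del self.bridges[bridge.channel]
        if self.defer_release:
            self.closing.add(bridge.channel)
        else:
            self.pool.release(bridge.channel)

    def on_detached(self, channel):
        # Models the broker's "detached" response arriving on a channel.
        if channel in self.closing:
            self.closing.discard(channel)
            self.pool.release(channel)
            return
        bridge = self.bridges.get(channel)
        if bridge is not None:
            bridge.attached = False


def run(defer_release):
    link = Link(ChannelPool(4), defer_release)
    old = link.open_bridge("Queue1")
    link.close_bridge(old)
    new = link.open_bridge("Queue2")
    link.on_detached(old.channel)  # late response meant for the old bridge
    return old.channel, new.channel, new.attached


print(run(defer_release=False))  # new bridge shares the channel and is detached
print(run(defer_release=True))   # new bridge gets a different channel
```

In the first run both bridges end up on the same channel number and the new bridge is marked detached by a response it never asked for. In the second run the old channel stays reserved until the broker has acknowledged the detach.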

In addition to the detach/attach/detached race, the immediate re-use of channel numbers appears to create other issues like the following:

Nov  2 11:26:40 itcm31 qpidd[12122]: 2012-11-02 11:26:40 [Protocol] error Execution exception: invalid-argument: anonymous.qpid.bridge_session_qpid.replicator-Queue1.b64c23e6-cb01-4297-8935-c12b40804ae2_84209514-2e58-4fb9-8d37-7c2440f5f144: confirmed < (2+0) but only sent < (0+0) (qpid/SessionState.cpp:154)

Version-Release number of selected component (if applicable):
Qpid 0.18-6

How reproducible:
Frequently

Steps to Reproduce:
1. Rapidly create and delete replicated queues
  
Actual results:
Detach/attach/detached loops and other channel errors are seen in the logs.  This issue stops HA replication, which in turn stops message flow through the broker.

Expected results:
Channel errors are not encountered

Additional info:

Comment 1 Chuck Rolke 2012-11-05 20:47:12 UTC
Fixed upstream at r1405946.

