Bug 696655

Summary: cached exchange reference can cause cluster inconsistencies if exchange is deleted/recreated
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: 1.3
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Gordon Sim <gsim>
Assignee: Alan Conway <aconway>
QA Contact: Petr Matousek <pematous>
CC: aconway, freznice, iboverma, pematous, tross
Target Milestone: 2.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qpid-cpp-mrg-0.10-4
Doc Type: Bug Fix
Last Closed: 2011-06-23 15:43:13 UTC
Bug Blocks: 703724, 716260

Description Gordon Sim 2011-04-14 14:33:15 UTC
Description of problem:

SemanticState::route() uses a simple cache variable to avoid looking up the exchange for every message. However, if the exchange in question is deleted, the cached reference still points at the old exchange, even if an exchange with the same name is then recreated. This can cause inconsistencies in a cluster.

Version-Release number of selected component (if applicable):

1.3

How reproducible:

100% (Quite a contrived example though)

Steps to Reproduce:
1. start one cluster node
2. create an exchange, a queue and a binding between them

  qpid-config add exchange topic x
  qpid-config add queue q
  qpid-config bind x q k

3. start a session and send a message to the exchange with the relevant key (leave session running)

  qpid-send --content-stdin --address x/k

then enter a few lines to send some messages

4. start a new cluster node
5. delete and recreate the exchange, this time adding a different binding

  qpid-config del exchange x
  qpid-config add exchange topic x
  qpid-config add queue q2
  qpid-config bind x q2 k  

6. send some more messages on the session from 3. with same exchange and key (i.e. type in some more messages if using qpid-send as suggested)

  There is now an inconsistency: the second node has some messages in q2 and some (though fewer than the first node) in q, whereas on the first node all the messages are in q.

7. qpid-receive --address 'q2; {mode: browse}' --broker localhost:5673 --capacity 1 (assuming second node is 5673)
  
Actual results:

First node shuts down with an inconsistency error.

Expected results:

No inconsistency, should be able to run the command in 7 against q or q2 on either node and see the same results.

Additional info:

Comment 1 Alan Conway 2011-04-14 14:50:22 UTC
I think this is a standalone bug as well. Messages should not be routed by a deleted exchange. Invalidating the cache when the exchange is deleted would solve both standalone and cluster issues.
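
A minimal sketch of the invalidation idea described above, for illustration only. The `Exchange`, `ExchangeRegistry` and `Session` types and the `destroyed` flag are hypothetical stand-ins, not the qpid-cpp classes and not necessarily how the upstream fix was implemented. The point it shows: a cached exchange pointer is only reused if the name matches and the exchange has not been deleted in the meantime.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a broker exchange (not qpid::broker::Exchange).
struct Exchange {
    std::string name;
    std::vector<std::string> boundQueues;  // queues that receive routed messages
    bool destroyed = false;                // set when the exchange is deleted
};

// Hypothetical registry mapping exchange names to live exchange objects.
class ExchangeRegistry {
    std::map<std::string, std::shared_ptr<Exchange>> exchanges;
public:
    std::shared_ptr<Exchange> declare(const std::string& name) {
        auto ex = std::make_shared<Exchange>();
        ex->name = name;
        exchanges[name] = ex;
        return ex;
    }

    void destroy(const std::string& name) {
        auto it = exchanges.find(name);
        if (it == exchanges.end()) return;
        it->second->destroyed = true;  // lets holders of cached pointers notice
        exchanges.erase(it);
    }

    std::shared_ptr<Exchange> find(const std::string& name) const {
        auto it = exchanges.find(name);
        if (it == exchanges.end()) return nullptr;
        return it->second;
    }
};

// Hypothetical per-session router with a one-entry exchange cache.
class Session {
    ExchangeRegistry& registry;
    std::shared_ptr<Exchange> cached;
public:
    explicit Session(ExchangeRegistry& r) : registry(r) {}

    // Returns the names of the queues the message would be delivered to.
    std::vector<std::string> route(const std::string& exchangeName) {
        // Re-resolve when the cache is empty, names a different exchange,
        // or points at an exchange that has since been deleted.
        if (!cached || cached->name != exchangeName || cached->destroyed)
            cached = registry.find(exchangeName);
        if (!cached) return std::vector<std::string>();
        return cached->boundQueues;
    }
};
```

Without the `cached->destroyed` check, a session that routed to x before the delete would keep delivering to the old exchange's bindings (q) after x is recreated with a binding to q2, which is the behaviour in the steps to reproduce.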

Comment 2 Alan Conway 2011-04-19 14:56:56 UTC
Upstream JIRA https://issues.apache.org/jira/browse/QPID-3215

Comment 3 Alan Conway 2011-04-19 17:46:42 UTC
Fixed on upstream trunk r1095144

Comment 5 Alan Conway 2011-04-19 20:47:46 UTC
The original commit was incomplete; the complete fix is on mrg_2.0.x:

http://mrg1.lab.bos.redhat.com/cgit/qpid.git/commit/?h=mrg_2.0.x&id=b68cf9408fc95abc5e62590cb9c1dda6c6ba92fd

Comment 7 Petr Matousek 2011-05-11 08:34:37 UTC
This issue has been fixed in qpid-cpp-mrg-0.10-4 for RHEL5, but the fix is not yet
available in any RHEL6 package.

The bug was cloned for RHEL6: please see bug 703724

Verified on RHEL5.6 architectures: i386, x86_64

packages installed:
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-6.el5
qpid-cpp-client-0.10-4.el5
qpid-cpp-client-devel-0.10-4.el5
qpid-cpp-client-devel-docs-0.10-4.el5
qpid-cpp-client-ssl-0.10-4.el5
qpid-cpp-mrg-debuginfo-0.10-4.el5
qpid-cpp-server-0.10-4.el5
qpid-cpp-server-cluster-0.10-4.el5
qpid-cpp-server-devel-0.10-4.el5
qpid-cpp-server-ssl-0.10-4.el5
qpid-cpp-server-store-0.10-4.el5
qpid-cpp-server-xml-0.10-4.el5
qpid-java-client-0.10-4.el5
qpid-java-common-0.10-4.el5
qpid-java-example-0.10-4.el5
qpid-qmf-0.10-6.el5
qpid-qmf-devel-0.10-6.el5
qpid-tools-0.10-4.el5

-> VERIFIED

Comment 8 errata-xmlrpc 2011-06-23 15:43:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html