Description of problem:
If a dynamic bridge's session has been detached while attempting to propagate a binding event, the broker will delete the bridge. Normally, a detached bridge session will be automatically recovered during the maintenance periodic if possible. Needless to say, auto-deleting the bridge upon a session error prevents this normal recovery path from occurring. This event can occur in a production system during broker startup/federation and also during source broker recovery since there is a potential race condition between creation of the source exchange and the creation of the dynamic bridge on the destination broker.

Log Message:
Sep 30 19:41:40 localhost qpidd[10497]: 2012-09-30 19:41:40 [Broker] error Cannot propagate binding for dynamic bridge as session has been detached, deleting dynamic bridge

Version-Release number of selected component (if applicable):
Qpid 0.18

How reproducible:
100%

Steps to Reproduce:
1. Create a dynamic bridge between two brokers. Destination broker should have a valid destination exchange but the source broker should be missing the source exchange.
2. Create a new binding on the destination exchange.

Actual results:
Bridge is deleted because the session was previously detached due to the missing exchange.

Expected results:
After the source exchange is created, session error is recovered during the maintenance periodic and the binding event properly propagates.

Additional info:
Created attachment 619918: Quick patch to prevent the bridge from being destroyed
What does one do to have the session appear detached? With these commands the brokers retry until the source exchange is created and then things proceed normally.

# src broker: localhost:5801
# dst broker: localhost:5803
#
# Create exchange in dst broker
#
qpid-config -b localhost:5803 add exchange topic fed.topic
#
# create dynamic bridge
#
qpid-route dynamic add localhost:5803 localhost:5801 fed.topic
#
# create dst queue as bind target
#
qpid-config -b localhost:5803 add queue fed.topic.queue
#
# create binding on dest exchange
#
qpid-config -b localhost:5803 bind fed.topic fed.topic.queue
It is possible that the session has not yet been invalidated at the moment you create the binding (it might be in the process of recovering via the link maintenance interval, which I believe you can increase). I recommend that you keep sending binding events until you see the log message above.
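One way to keep sending binding events is a small retry loop. The following is only a sketch under assumptions: the function name `bind_until_logged`, the retry count, the delay, and the idea of polling a log file are illustrative and not part of the report; the broker address, exchange, and queue names are the ones used in the commands earlier in this thread. The log file location depends on how qpidd logging is configured on your system and must be passed in.

```shell
#!/bin/sh
# Sketch only: function name, retry count and delay are hypothetical choices.
# QPID_CONFIG can be overridden if qpid-config is not on PATH.
QPID_CONFIG=${QPID_CONFIG:-qpid-config}

# Text of the broker error quoted in the original description.
MSG='Cannot propagate binding for dynamic bridge as session has been detached'

# Bind and unbind on the destination exchange until MSG shows up in the
# given log file, or give up after max-tries attempts.
# usage: bind_until_logged <log-file> [max-tries] [delay-seconds]
bind_until_logged() {
    _log=$1
    _tries=${2:-50}
    _delay=${3:-1}
    _i=0
    while [ "$_i" -lt "$_tries" ]; do
        # Each bind/unbind is a binding event that the dynamic bridge
        # tries to propagate to the source broker.
        $QPID_CONFIG -b localhost:5803 bind fed.topic fed.topic.queue
        $QPID_CONFIG -b localhost:5803 unbind fed.topic fed.topic.queue
        if grep -qF "$MSG" "$_log"; then
            return 0    # the error was logged
        fi
        _i=$((_i + 1))
        sleep "$_delay"
    done
    return 1            # never saw the message
}
```

A call might look like `bind_until_logged /var/log/messages 100 1`, where the log path is an assumption based on the syslog-style line in the description.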
Proposed patch committed upstream QPID-4378, r1399837. Checked by Ted Ross. I never reproduced the bug through normal session errors but simply binding and unbinding caused an issue that this patch corrects.
The patched code prints a warning message when the links are fine and a bridge is unbound. This is wrong: no warning is required, and the else clause of the original patch should do nothing. The correction is included in QPID-4378, r1400736.
Tested on RHEL5.9 and RHEL6.3, both i386 and x86_64. The broker does not delete a dynamic bridge if its session is detached.

Packages used for testing:

RHEL5.9
python-qpid-0.18-4.el5
python-qpid-qmf-0.18-9.el5
qpid-cpp-client-0.18-10.el5
qpid-cpp-client-devel-0.18-10.el5
qpid-cpp-client-ssl-0.18-10.el55
qpid-cpp-server-0.18-10.el5
qpid-cpp-server-cluster-0.18-10.el5
qpid-cpp-server-devel-0.18-10.el5
qpid-cpp-server-ha-0.18-10.el5
qpid-cpp-server-ssl-0.18-10.el5
qpid-cpp-server-store-0.18-10.el5
qpid-cpp-server-xml-0.18-10.el5
qpid-java-client-0.18-5.el5
qpid-java-common-0.18-5.el5
qpid-java-example-0.18-5.el5
qpid-qmf-0.18-9.el5
qpid-qmf-devel-0.18-9.el5
qpid-tools-0.18-7.el5

RHEL6.3
python-qpid-0.18-4.el6
python-qpid-qmf-0.18-10.el6_3
qpid-cpp-client-0.18-10.el6_3
qpid-cpp-client-devel-0.18-10.el6_3
qpid-cpp-client-ssl-0.18-10.el6_3
qpid-cpp-server-0.18-10.el6_3
qpid-cpp-server-cluster-0.18-10.el6_3
qpid-cpp-server-devel-0.18-10.el6_3
qpid-cpp-server-ha-0.18-10.el6_3
qpid-cpp-server-ssl-0.18-10.el6_3
qpid-cpp-server-store-0.18-10.el6_3
qpid-cpp-server-xml-0.18-10.el6_3
qpid-java-client-0.18-5.el6
qpid-java-common-0.18-5.el6
qpid-java-example-0.18-5.el6
qpid-qmf-0.18-10.el6_3
qpid-qmf-devel-0.18-10.el6_3
qpid-tools-0.18-7.el6_3.noarch

-> VERIFIED
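For anyone repeating the verification, a check along these lines could confirm that the bridge is still listed after the binding events. It is a sketch, not the procedure used for the test above: the helper name `bridge_present` is hypothetical, and it assumes that `qpid-route route list` prints one line per route that includes the exchange name.

```shell
#!/bin/sh
# Sketch only: helper name and matching strategy are assumptions.
# QPID_ROUTE can be overridden if qpid-route is not on PATH.
QPID_ROUTE=${QPID_ROUTE:-qpid-route}

# Succeeds if the route listing of the destination broker mentions the
# exchange. This is a loose substring match, so an exchange whose name
# merely contains the given name would also match.
# usage: bridge_present <dest-broker> <exchange>
bridge_present() {
    $QPID_ROUTE route list "$1" | grep -qF "$2"
}
```

With the brokers from the reproduction commands, `bridge_present localhost:5803 fed.topic` would be expected to succeed on a fixed broker and fail once the bridge has been deleted.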
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0561.html