Description of problem:
Broker crashes when using dynamic federation links if the link dies concurrently with the propagation of a bind or unbind.

Version-Release number of selected component (if applicable):
Since 1.1

How reproducible:
Relatively easily

Steps to Reproduce:
1. Set up dynamic federation for a particular exchange,
   e.g. start brokers on 5672 and 5673, then:
   qpid-config add exchange topic federated.topic
   qpid-config -a localhost:5673 add exchange topic federated.topic
   qpid-route dynamic add localhost:5672 localhost:5673 federated.topic
   qpid-route dynamic add localhost:5673 localhost:5672 federated.topic
   qpid-config add queue test-queue
2. Bind/unbind continually on one broker,
   e.g. run the following in a loop:
   qpid-config bind federated.topic test-queue binding
   qpid-config unbind federated.topic test-queue binding
3. Stop and restart the other broker,
   e.g. stop and restart the broker on 5673.

Actual results:
Eventually the broker on 5672 crashes (segfaults or aborts).

Expected results:
No crashing

Additional info:
Another way to reproduce is to set up federation between clusters, have receivers on one cluster and senders on the other, and bounce the nodes in turn on the receivers' cluster.
The root of the problem is that the binding information for dynamic routes is propagated by sending commands over a bridge on another connection's IO thread. Bridge::propagateBinding() needs to record the details being propagated and request processing on the IO thread for the link to which it belongs.
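The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the broker's actual code: SketchLink, SketchBridge, requestIoProcessing(), processPending() and close() are hypothetical names invented for this example, and the sketch assumes the bridge outlives any work queued on its link.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a federation link (not the real qpid class).
class SketchLink {
public:
    // May be called from any thread: queue work for the link's own IO thread.
    // Returns false and drops the task if the link has already been closed.
    bool requestIoProcessing(std::function<void()> task) {
        std::lock_guard<std::mutex> guard(lock);
        if (closed) return false;
        pending.push_back(std::move(task));
        return true;
    }

    // Called only from the link's own IO thread: run everything queued so far.
    void processPending() {
        std::vector<std::function<void()> > batch;
        {
            std::lock_guard<std::mutex> guard(lock);
            batch.swap(pending);
        }
        for (std::size_t i = 0; i < batch.size(); ++i) batch[i]();
    }

    // Called when the link dies: discard queued work and refuse new work.
    void close() {
        std::lock_guard<std::mutex> guard(lock);
        closed = true;
        pending.clear();
    }

private:
    std::mutex lock;
    bool closed = false;
    std::vector<std::function<void()> > pending;
};

// Hypothetical stand-in for a bridge; 'sent' stands in for commands
// actually written to the peer broker.
class SketchBridge {
public:
    explicit SketchBridge(SketchLink& l) : link(l) {}

    // Safe to call from another connection's IO thread: the arguments are
    // copied into the task and nothing is sent until the link's thread runs it.
    bool propagateBinding(const std::string& exchange, const std::string& key) {
        return link.requestIoProcessing(
            [this, exchange, key] { sendBind(exchange, key); });
    }

    std::vector<std::string> sent;

private:
    void sendBind(const std::string& exchange, const std::string& key) {
        sent.push_back(exchange + "/" + key);
    }

    SketchLink& link;
};
```

With this shape, the thread that handles the bind only records the request; the command is issued later by whichever thread calls processPending() for that link, and requests made after close() are simply dropped.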
More information...

If you repeat the same test but use "amq.topic" instead of "federated.topic", the crash does not occur. It appears that the destination broker, when propagating a binding to a source broker that doesn't have the named exchange, receives an exception that leaves the destination broker in a bad state. Subsequent attempts to propagate bindings result in the broker crash.

Here's another reproducer that illustrates this effect in a more focused way:

#!/bin/sh
b1=localhost:5672
b2=localhost:5673
echo "Creating fed.topic exchange on brokers..."
qpid-config -a $b1 add exchange topic fed.topic
qpid-config -a $b2 add exchange topic fed.topic
echo "Creating bi-directional dynamic routes..."
qpid-route dynamic add $b1 $b2 fed.topic
qpid-route dynamic add $b2 $b1 fed.topic
echo "Create queue..."
qpid-config -a $b1 add queue test-queue
echo "Please stop and restart $b2"
sleep 10
echo "Create binding..."
qpid-config -a $b1 bind fed.topic test-queue test-key
echo "Create a second binding..."
qpid-config -a $b1 bind fed.topic test-queue test-key2
echo "Create a third binding..."
qpid-config -a $b1 bind fed.topic test-queue test-key3
*** Bug 509970 has been marked as a duplicate of this bug. ***
Similar crash observed from run of failover_soak (see https://bugzilla.redhat.com/show_bug.cgi?id=509970).
#0 0x009a9424 in __kernel_vsyscall ()
#1 0x0034c781 in raise () from /lib/libc.so.6
#2 0x0034e04a in abort () from /lib/libc.so.6
#3 0x07d2d44f in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#4 0x07d2b385 in ?? () from /usr/lib/libstdc++.so.6
#5 0x07d2b3c2 in std::terminate() () from /usr/lib/libstdc++.so.6
#6 0x07d2c075 in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6
#7 0x00b543f0 in qpid::framing::Proxy::send (this=0x83fc044, b=@0xbf8360ec) at ../../src/qpid/framing/Proxy.cpp:37
#8 0x00ab0f19 in qpid::framing::AMQP_ServerProxy::Exchange::bind (this=0x83fc044, queue="bridge_queue_1_0cdf920a-1cae-422e-954c-8fd0a37929e5", exchange="federated.topic", bindingKey="binding", arguments=@0xbf8361a0) at qpid/framing/AMQP_ServerProxy.cpp:328
#9 0x0064a974 in qpid::broker::Bridge::ioThreadPropagateBinding (this=0xb5502788, queue="bridge_queue_1_0cdf920a-1cae-422e-954c-8fd0a37929e5", exchange="federated.topic", key="binding", args={values = std::map with 3 elements = {...}}) at ../../src/qpid/broker/Bridge.cpp:316
#10 0x0064f13a in boost::_mfi::mf4<void, qpid::broker::Bridge, std::string const&, std::string const&, std::string const&, qpid::framing::FieldTable>::operator() (a4=<value optimized out>, a3=<value optimized out>, p=<value optimized out>, this=<value optimized out>, a2=<value optimized out>, a1=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:494
#11 operator()<boost::_mfi::mf4<void, qpid::broker::Bridge, const std::string&, const std::string&, const std::string&, qpid::framing::FieldTable>, boost::_bi::list0> (a4=<value optimized out>, a3=<value optimized out>, p=<value optimized out>, this=<value optimized out>, a2=<value optimized out>, a1=<value optimized out>) at /usr/include/boost/bind.hpp:504
#12 boost::_bi::bind_t<void, boost::_mfi::mf4<void, qpid::broker::Bridge, std::string const&, std::string const&, std::string const&, qpid::framing::FieldTable>, boost::_bi::list5<boost::_bi::value<qpid::broker::Bridge*>, boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<qpid::framing::FieldTable> > >::operator() (a4=<value optimized out>, a3=<value optimized out>, p=<value optimized out>, this=<value optimized out>, a2=<value optimized out>, a1=<value optimized out>) at /usr/include/boost/bind/bind_template.hpp:20
#13 boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf4<void, qpid::broker::Bridge, std::string const&, std::string const&, std::string const&, qpid::framing::FieldTable>, boost::_bi::list5<boost::_bi::value<qpid::broker::Bridge*>, boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<qpid::framing::FieldTable> > >, void>::invoke (a4=<value optimized out>, a3=<value optimized out>, p=<value optimized out>, this=<value optimized out>, a2=<value optimized out>, a1=<value optimized out>) at /usr/include/boost/function/function_template.hpp:152
#14 0x0066f38c in boost::function0<void>::operator() (this=0xbf836290) at /usr/include/boost/function/function_template.hpp:989
#15 0x0066b924 in qpid::broker::Connection::doOutput (this=0xb554bc18) at ../../src/qpid/broker/Connection.cpp:276
#16 0x002815b9 in qpid::cluster::OutputInterceptor::deliverDoOutput (this=0x83bcc00, limit=2048) at ../../src/qpid/cluster/OutputInterceptor.cpp:86
#17 0x00256f27 in qpid::cluster::Connection::deliverDoOutput (this=0x83bcbc8, limit=2048) at ../../src/qpid/cluster/Connection.cpp:242
#18 0x00aee44d in qpid::framing::ClusterConnectionDeliverDoOutputBody::invoke<qpid::framing::AMQP_AllOperations::ClusterConnectionHandler> (invocable=<value optimized out>, this=<value optimized out>) at ./qpid/framing/ClusterConnectionDeliverDoOutputBody.h:63
#19 qpid::framing::AMQP_AllOperations::ClusterConnectionHandler::Invoker::visit (invocable=<value optimized out>, this=<value optimized out>) at qpid/framing/AllInvoker.cpp:1100
#20 0x00afa3ab in qpid::framing::ClusterConnectionDeliverDoOutputBody::accept (this=0x83fb3d8, v=@0xbf8363ac) at ./qpid/framing/ClusterConnectionDeliverDoOutputBody.h:67
#21 0x00262193 in qpid::framing::invoke<qpid::cluster::Connection> (target=@0x83bcbc8, body=@0x83fb3d8) at ../../src/qpid/framing/Invoker.h:80
#22 0x0025c7fc in qpid::cluster::Connection::deliveredFrame (this=0x83bcbc8, f=@0xbf836890) at ../../src/qpid/cluster/Connection.cpp:250
#23 0x0022fdfa in qpid::cluster::Cluster::processFrame (this=0x83b51b8, e=@0xbf836890, l=@0xbf8368d8) at ../../src/qpid/cluster/Cluster.cpp:522
#24 0x0023bc08 in qpid::cluster::Cluster::deliveredFrame (this=0x83b51b8, efConst=@0x83bd2a0) at ../../src/qpid/cluster/Cluster.cpp:506
#25 0x0023e674 in boost::_mfi::mf1<void, qpid::cluster::Cluster, qpid::cluster::EventFrame const&>::operator() (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:162
#26 operator()<boost::_mfi::mf1<void, qpid::cluster::Cluster, const qpid::cluster::EventFrame&>, boost::_bi::list1<const qpid::cluster::EventFrame&> > (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind.hpp:292
#27 operator()<qpid::cluster::EventFrame> (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/bind_template.hpp:47
#28 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::cluster::Cluster, qpid::cluster::EventFrame const&>, boost::_bi::list2<boost::_bi::value<qpid::cluster::Cluster*>, boost::arg<1> > >, void, qpid::cluster::EventFrame const&>::invoke (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/function/function_template.hpp:152
#29 0x00242773 in boost::function1<void, qpid::cluster::EventFrame const&>::operator() (this=0x83b5580, a0=@0x83bd2a0) at /usr/include/boost/function/function_template.hpp:989
#30 0x00247a1a in qpid::cluster::PollableQueue<qpid::cluster::EventFrame>::handleBatch (this=0x83b54e8, values=std::vector of length 1, capacity 8 = {...}) at ../../src/qpid/cluster/PollableQueue.h:59
#31 0x0023e97a in boost::_mfi::mf1<__gnu_cxx::__normal_iterator<qpid::cluster::EventFrame const*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, qpid::cluster::PollableQueue<qpid::cluster::EventFrame>, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > const&>::operator() (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:162
#32 operator()<__gnu_cxx::__normal_iterator<const qpid::cluster::EventFrame*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, boost::_mfi::mf1<__gnu_cxx::__normal_iterator<const qpid::cluster::EventFrame*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, qpid::cluster::PollableQueue<qpid::cluster::EventFrame>, const std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> >&>, boost::_bi::list1<const std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> >&> > (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind.hpp:282
#33 operator()<std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > > (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/bind_template.hpp:47
#34 boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<__gnu_cxx::__normal_iterator<qpid::cluster::EventFrame const*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, boost::_mfi::mf1<__gnu_cxx::__normal_iterator<qpid::cluster::EventFrame const*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, qpid::cluster::PollableQueue<qpid::cluster::EventFrame>, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > const&>, boost::_bi::list2<boost::_bi::value<qpid::cluster::PollableQueue<qpid::cluster::EventFrame>*>, boost::arg<1> > >, __gnu_cxx::__normal_iterator<qpid::cluster::EventFrame const*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > const&>::invoke (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/function/function_template.hpp:131
#35 0x0024550a in boost::function1<__gnu_cxx::__normal_iterator<qpid::cluster::EventFrame const*, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > >, std::vector<qpid::cluster::EventFrame, std::allocator<qpid::cluster::EventFrame> > const&>::operator() (this=0x83b5530, a0=std::vector of length 1, capacity 8 = {...}) at /usr/include/boost/function/function_template.hpp:989
#36 0x00246c08 in qpid::sys::PollableQueue<qpid::cluster::EventFrame>::process (this=0x83b54e8) at ../../src/qpid/sys/PollableQueue.h:151
#37 0x00247619 in qpid::sys::PollableQueue<qpid::cluster::EventFrame>::dispatch (this=0x83b54e8, cond=@0x83b5540) at ../../src/qpid/sys/PollableQueue.h:137
#38 0x0023e9c4 in boost::_mfi::mf1<void, qpid::sys::PollableQueue<qpid::cluster::EventFrame>, qpid::sys::PollableCondition&>::operator() (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:162
#39 operator()<boost::_mfi::mf1<void, qpid::sys::PollableQueue<qpid::cluster::EventFrame>, qpid::sys::PollableCondition&>, boost::_bi::list1<qpid::sys:
The test case in the description still reproduces the problem easily. As pointed out in comment #2, the current cause of the crash is the bridge using its session after an exception has occurred.

Another example of a similar crash is to create a queue route, then delete the source queue, then remove the original queue route. E.g. for two brokers on localhost using ports 5672 and 5673:

qpid-config add queue test-queue
qpid-route queue add localhost:5673 localhost:5672 amq.fanout test-queue
qpid-config -a localhost:5673 add queue test-queue
qpid-config -a localhost:5673 bind amq.fanout test-queue
echo msg | sender --send-eos 1
receiver --port 5673
qpid-config del queue test-queue --force
qpid-route queue del localhost:5673 localhost:5672 amq.fanout test-queue
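To illustrate the failure mode named in this comment (a session being used again after an exception), here is a minimal sketch of one defensive pattern. It is an assumption for discussion only and does not claim to be what the eventual fix does: SketchSession, isAttached() and tryPropagate() are hypothetical names, and the "not-found" error is simulated by checking a local set of exchange names.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical session: an error reported by the peer detaches it for good.
class SketchSession {
public:
    explicit SketchSession(std::set<std::string> exchanges)
        : known(std::move(exchanges)) {}

    bool isAttached() const { return attached; }

    // Simulates sending a bind to the peer broker.
    void bind(const std::string& exchange, const std::string& key) {
        if (!attached) throw std::logic_error("session used after detach");
        if (known.count(exchange) == 0) {
            attached = false;  // the peer raised an exception; session is dead
            throw std::runtime_error("not-found: " + exchange);
        }
        (void)key;  // the key is irrelevant to this simulation
        ++bindings;
    }

    int bindings = 0;  // number of binds that were accepted

private:
    std::set<std::string> known;
    bool attached = true;
};

// Returns true if the binding was propagated, false if it was skipped
// (session already detached) or rejected by the peer.
bool tryPropagate(SketchSession& session,
                  const std::string& exchange,
                  const std::string& key) {
    if (!session.isAttached()) return false;  // never touch a dead session
    try {
        session.bind(exchange, key);
        return true;
    } catch (const std::runtime_error&) {
        return false;  // the session is now detached; later calls are skipped
    }
}
```

In this sketch the second and third bindings from the earlier reproducer would be skipped rather than sent on the failed session.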
(Just for reference, the issue mentioned in comment #1 was addressed in r790698, and with that in place the cluster-based reproducer in the description's 'Additional info' section no longer fails.)
Fixed on trunk (r952942) and in release branch (http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=d6ead34fe2802092c0dd6490df11a6cc763506c1).
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, attempting to propagate binding information over a dynamic link that was concurrently destroyed may have caused the broker to terminate unexpectedly. This update ensures that dynamic bridges are not propagated over destroyed links, and the broker no longer crashes.
The issues have been fixed:
 - description issue A - proved by failover_soak (bug 509970) and by semi-automated qpid_stress_test
 - comment 8 issue B - reproduced on qpid-cpp-*-0.7.939184-1.el5 using above repro.

Both verified on RHEL 4.8 / 5.5 i386 / x86_64 on packages:
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-*0.7.946106-17.el5
qpid-cpp-*-0.7.946106-17.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-client-0.7.946106-10.el5
qpid-java-common-0.7.946106-10.el5
qpid-tools-0.7.946106-11.el5
ruby-qmf-0.7.946106-17.el5
ruby-qpid-0.7.946106-2.el5

-> VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html