Bug 611543 - Assertion when raising a link established event on clustered broker
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: beta
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Assignee: Alan Conway
QA Contact: Jeff Needle
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-05 14:17 UTC by Gordon Sim
Modified: 2010-10-20 13:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-10-20 11:30:23 UTC
Target Upstream Version:
Embargoed:


Description Gordon Sim 2010-07-05 14:17:22 UTC
Description of problem:

When a clustered broker creates an outgoing link, an event is raised on the IO thread of that link's connection. Raising the event puts messages on subscriber queues outside of the cluster context, which could lead to inconsistencies between cluster members, so it currently fails the cluster-safe assertion (in debug builds).
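
For readers unfamiliar with this kind of check, the following is a minimal sketch of how a per-thread "cluster safe" guard can work. It is not the code in qpid/sys/ClusterSafe.cpp. The names assertClusterSafe and Queue::push come from the backtrace in this report and the log text comes from Comment 2; everything else (the sketch namespace, the clustered flag, ClusterContextScope, the depth counter) is hypothetical and exists only for illustration.

```cpp
#include <cstdlib>
#include <iostream>

namespace sketch {

// Hypothetical flag: true once the broker has joined a cluster.
// A standalone broker never fails the check.
bool clustered = false;

// Hypothetical per-thread flag: true only while this thread is
// applying work that every cluster member sees in the same order.
thread_local bool inClusterContext = false;

bool isClusterSafe() { return !clustered || inClusterContext; }

// RAII guard that marks the current thread as being in cluster
// context for the lifetime of the scope, then restores the old value.
class ClusterContextScope {
  public:
    ClusterContextScope() : saved(inClusterContext) { inClusterContext = true; }
    ~ClusterContextScope() { inClusterContext = saved; }
    ClusterContextScope(const ClusterContextScope&) = delete;
    ClusterContextScope& operator=(const ClusterContextScope&) = delete;
  private:
    bool saved;
};

// Abort if replicated state is about to be modified from a thread
// that is not in cluster context.
void assertClusterSafe() {
    if (!isClusterSafe()) {
        std::cerr << "Modified cluster state outside of cluster context" << std::endl;
        std::abort();
    }
}

// Stand-in for a broker queue: every push is checked first.
struct Queue {
    int depth = 0;
    void push() {
        assertClusterSafe();
        ++depth;
    }
};

} // namespace sketch
```

In this sketch, a push made inside a ClusterContextScope succeeds, while a push made on any other thread of a clustered broker aborts. That matches the shape of the backtrace in this report, where Queue::push is reached from AsynchIOHandler::idle on an IO thread.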

Version-Release number of selected component (if applicable):

Qpid trunk, r959461

How reproducible:

100%

Steps to Reproduce:
1. Start two distinct clusters (one node each; running both on the same machine is fine).
2. Run qpid-tool against the first one.
3. Run qpid-route to create a link, e.g. qpid-route route add <addr1> <addr2> amq.fanout "", where addr1 is the address of a node in the first cluster (the one qpid-tool is connected to) and addr2 is the address of a node in the other cluster.

Actual results:

Core was generated by `/home/gordon/projects/qpid-svn-trunk/cpp/build/src/.libs/lt-qpidd --auth no --l'.
Program terminated with signal 6, Aborted.
#0  0x00122424 in __kernel_vsyscall ()
Missing separate debuginfos, use: debuginfo-install boost-1.37.0-7.fc11.i586 corosynclib-1.2.0-1.fc11.i586 cyrus-sasl-lib-2.1.22-22.fc11.i586 e2fsprogs-libs-1.41.4-12.fc11.i586 glibc-2.10.2-1.i686 libgcc-4.4.1-2.fc11.i586 libstdc++-4.4.1-2.fc11.i586 nss-softokn-freebl-3.12.4-3.fc11.i586
(gdb) bt
#0  0x00122424 in __kernel_vsyscall ()
#1  0x0034c781 in raise () from /lib/libc.so.6
#2  0x0034e04a in abort () from /lib/libc.so.6
#3  0x00bb480a in qpid::sys::assertClusterSafe () at ../../src/qpid/sys/ClusterSafe.cpp:42
#4  0x009023a9 in qpid::broker::Queue::push (this=0x86608e8, msg=@0xb54fdfac, isRecovery=false) at ../../src/qpid/broker/Queue.cpp:590
#5  0x00904640 in qpid::broker::Queue::deliver (this=0x86608e8, msg={p_ = 0xb49027d8}) at ../../src/qpid/broker/Queue.cpp:159
#6  0x008b4cfc in qpid::broker::DeliverableMessage::deliverTo (this=0xb54fe318, queue=@0x865e148) at ../../src/qpid/broker/DeliverableMessage.cpp:31
#7  0x008cb6f1 in qpid::broker::Exchange::doRoute (this=0x861b778, msg=@0xb54fe318, b={px = 0xb49010c8, pn = {pi_ = 0xb49029b0}})
    at ../../src/qpid/broker/Exchange.cpp:91
#8  0x00952254 in qpid::broker::TopicExchange::route (this=0x861b73c, msg=@0xb54fe318, 
    routingKey="console.event.1.0.org.apache.qpid.broker.brokerLinkUp") at ../../src/qpid/broker/TopicExchange.cpp:321
#9  0x00985cf7 in qpid::broker::ManagementTopicExchange::route (this=0x861b730, msg=@0xb54fe318, 
    routingKey="console.event.1.0.org.apache.qpid.broker.brokerLinkUp", args=0x0) at ../../src/qpid/management/ManagementTopicExchange.cpp:53
#10 0x009647d2 in qpid::management::ManagementAgent::sendBufferLH (this=0xb6e9e008, buf=@0xb54fe68c, length=84, exchange=
        {px = 0x861b778, pn = {pi_ = 0x861b8b8}}, routingKey="console.event.1.0.org.apache.qpid.broker.brokerLinkUp")
    at ../../src/qpid/management/ManagementAgent.cpp:497
#11 0x009793e2 in qpid::management::ManagementAgent::raiseEvent (this=0xb6e9e008, event=@0xb54fe91c, 
    severity=qpid::management::ManagementAgent::SEV_DEFAULT) at ../../src/qpid/management/ManagementAgent.cpp:351
#12 0x008e01e3 in qpid::broker::Link::established (this=0x86aa328) at ../../src/qpid/broker/Link.cpp:134
#13 0x008e7ad2 in qpid::broker::LinkRegistry::notifyConnection (this=0x8616544, key="localhost:5673", c=0xb4902410)
    at ../../src/qpid/broker/LinkRegistry.cpp:269
#14 0x008a92ab in qpid::broker::Connection::Connection(struct qpid::sys::ConnectionOutputHandler *, struct qpid::broker::Broker &, const std::string &, const qpid::sys::SecuritySettings &, bool, uint64_t, bool, bool) (this=0xb4902410, out_=0xb49017c8, broker_=@0x8616348, mgmtId_="localhost:5673", 
    external=@0xb4901828, isLink_=<value optimized out>, objectId_=0, shadow_=<value optimized out>, delayManagement=true)
    at ../../src/qpid/broker/Connection.cpp:103
#15 0x00d24aca in qpid::cluster::Connection::ConnectionCtor::construct (this=<value optimized out>) at ../../src/qpid/cluster/Connection.h:218
#16 qpid::cluster::Connection::init (this=<value optimized out>) at ../../src/qpid/cluster/Connection.cpp:121
#17 0x00d284c2 in qpid::cluster::Connection::Connection(struct qpid::cluster::Cluster &, struct qpid::sys::ConnectionOutputHandler &, const std::string &, qpid::cluster::MemberId, bool, bool, const qpid::sys::SecuritySettings &) (this=0xb4901790, c=@0x8617e48, out=@0xb4903728, 
    mgmtId="localhost:5673", member={<std::pair<unsigned int, unsigned int>> = {first = 33597632, second = 26448}, <No data fields>}, 
    isCatchUp=false, isLink=true, external=@0xb54fee48) at ../../src/qpid/cluster/Connection.cpp:111
#18 0x00d2efe3 in qpid::cluster::ConnectionCodec::ConnectionCodec(const qpid::framing::ProtocolVersion &, struct qpid::sys::OutputControl &, const std::string &, struct qpid::cluster::Cluster &, bool, bool, const qpid::sys::SecuritySettings &) (this=0xb4903720, v=@0xb54fed0e, out=@0xb4901268, 
    logId="localhost:5673", cluster=@0x8617e48, catchUp=<value optimized out>, isLink=true, external=@0xb54fee48)
    at ../../src/qpid/cluster/ConnectionCodec.cpp:59
#19 0x00d2f1b7 in qpid::cluster::ConnectionCodec::Factory::create (this=0x861ab00, out=@0xb4901268, logId="localhost:5673", external=@0xb54fee48)
    at ../../src/qpid/cluster/ConnectionCodec.cpp:52
#20 0x00d52f51 in qpid::cluster::SecureConnectionFactory::create (this=0x861ab78, out=@0xb4901268, id="localhost:5673", external=@0xb54fee48)
    at ../../src/qpid/cluster/SecureConnectionFactory.cpp:61
#21 0x00bb27f6 in qpid::sys::AsynchIOHandler::idle (this=0xb4901268) at ../../src/qpid/sys/AsynchIOHandler.cpp:203
#22 0x009880c4 in boost::_mfi::mf1<void, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&>::operator() (a1=<value optimized out>, 
    p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:162
#23 operator()<boost::_mfi::mf1<void, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&>, boost::_bi::list1<qpid::sys::AsynchIO&> > (
    a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind.hpp:292
#24 operator()<qpid::sys::AsynchIO> (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>)
    at /usr/include/boost/bind/bind_template.hpp:32
#25 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&>, boost::_bi::list2<boost::_bi::value<qpid::sys::AsynchIOHandler*>, boost::arg<1> > >, void, qpid::sys::AsynchIO&>::invoke (
    a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/function/function_template.hpp:152
#26 0x00adc6a3 in boost::function1<void, qpid::sys::AsynchIO&>::operator() (this=0xb49016f0, a0=@0xb49015f0)
    at /usr/include/boost/function/function_template.hpp:989
#27 0x00ada9db in qpid::sys::posix::AsynchIO::writeable (this=0xb49015f0, h=@0xb49015f4) at ../../src/qpid/sys/posix/AsynchIO.cpp:542
#28 0x00adb924 in boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>::operator() (a1=<value optimized out>, 
    p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:162
#29 operator()<boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>, boost::_bi::list1<qpid::sys::DispatchHandle&> > (
    a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/bind.hpp:292
#30 operator()<qpid::sys::DispatchHandle> (a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>)
    at /usr/include/boost/bind/bind_template.hpp:32
#31 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>, boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>, boost::arg<1> > >, void, qpid::sys::DispatchHandle&>::invoke (
    a1=<value optimized out>, p=<value optimized out>, this=<value optimized out>) at /usr/include/boost/function/function_template.hpp:152
#32 0x00bba0d3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator() (this=0xb490160c, a0=@0xb49015f4)
    at /usr/include/boost/function/function_template.hpp:989
#33 0x00bb6f4e in qpid::sys::DispatchHandle::processEvent (this=0xb49015f4, type=qpid::sys::Poller::WRITABLE)
    at ../../src/qpid/sys/DispatchHandle.cpp:285
#34 0x00ae931b in qpid::sys::Poller::Event::process (this=<value optimized out>) at ../../src/qpid/sys/Poller.h:125
#35 qpid::sys::Poller::run (this=<value optimized out>) at ../../src/qpid/sys/epoll/EpollPoller.cpp:519
#36 0x00bba574 in qpid::sys::Dispatcher::run (this=0xbfd7001c) at ../../src/qpid/sys/Dispatcher.cpp:37
#37 0x00adf001 in qpid::sys::(anonymous namespace)::runRunnable (p=0xbfd7001c) at ../../src/qpid/sys/posix/Thread.cpp:35
#38 0x004c98f5 in start_thread () from /lib/libpthread.so.0
#39 0x003fefce in clone () from /lib/libc.so.6
(gdb) 


Expected results:

No assertion

Additional info:

Comment 1 Alan Conway 2010-07-05 20:33:38 UTC
Fixed on trunk in r960681 and on the 1.3.x release branch:
http://mrg1.lab.bos.redhat.com/git/?p=qpid.git;a=commitdiff;h=ba83c5fd4c4cccae42240c70473d8d37fd8d3fcb

Comment 2 Gordon Sim 2010-08-27 09:34:24 UTC
Reproduced on qpid-cpp-server-0.7.946106-6.el5:

1) /usr/sbin/qpidd --auth no --data-dir cluster-a-grs --cluster-name cluster-a-grs --port 5673
2) (in another window) /usr/sbin/qpidd --auth no --data-dir cluster-b-grs --cluster-name cluster-b-grs --port 5674
3) (in a third window) qpid-tool localhost:5673
4) (in fourth window) qpid-route route add localhost:5673 localhost:5674 amq.fanout ""

Step 4 causes the broker started in step 1 to crash:

2010-08-27 05:27:42 critical Modified cluster state outside of cluster context
Aborted (core dumped)

Core was generated by `/usr/sbin/qpidd --auth no --data-dir cluster-a-grs --cluster-name cluster-a-grs'.
Program terminated with signal 6, Aborted.
[New process 29946]
[New process 29947]
[New process 29945]
[New process 29944]
[New process 29943]
[New process 29941]
[New process 29940]
#0  0x00000038f9c30265 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00000038f9c30265 in raise () from /lib64/libc.so.6
#1  0x00000038f9c31d10 in abort () from /lib64/libc.so.6
#2  0x00002b11ca8213d4 in qpid::sys::assertClusterSafe ()
    at qpid/sys/ClusterSafe.cpp:42
#3  0x00002b11ca327220 in qpid::broker::Queue::push (this=0x74f4, msg=@0x74fa, 
    isRecovery=6) at qpid/broker/Queue.cpp:590
#4  0x00002b11ca32910c in qpid::broker::Queue::deliver (this=0x1bdd6c90, msg=
      {p_ = 0x4456a5b0}) at qpid/broker/Queue.cpp:159
#5  0x00002b11ca2cd922 in qpid::broker::DeliverableMessage::deliverTo (
    this=0x4456ad70, queue=@0x1bddbac0)
    at qpid/broker/DeliverableMessage.cpp:31
#6  0x00002b11ca2e59e5 in qpid::broker::Exchange::doRoute (this=0x1bd8cff8, 
    msg=@0x4456ad70, b={px = 0x4456a960, pn = {pi_ = 0x0}})
    at qpid/broker/Exchange.cpp:91
#7  0x00002b11ca37effc in qpid::broker::TopicExchange::route (this=0x1bd8cff8, 
    msg=@0x4456ad70, routingKey=@0x4456b660)
    at qpid/broker/TopicExchange.cpp:321
#8  0x00002b11ca38f546 in qpid::management::ManagementAgent::sendBufferLH (
    this=0x2aaaaaaab010, buf=@0x1be20d10, length=84, exchange=
        {px = 0x4456b610, pn = {pi_ = 0x1be20d10}}, routingKey=
        {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x4456b660 "���\033"}}) at qpid/management/ManagementAgent.cpp:497
#9  0x00002b11ca39a92c in qpid::management::ManagementAgent::raiseEvent (
    this=0x2aaaaaaab010, event=@0x4456bf10, severity=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:351
#10 0x00002b11ca2ffe7f in qpid::broker::Link::established (this=0x1be23cb0)
    at qpid/broker/Link.cpp:134
#11 0x00002b11ca304ee5 in qpid::broker::LinkRegistry::notifyConnection (
    this=0x1bd692d8, key=<value optimized out>, c=0x1be23860)
    at qpid/broker/LinkRegistry.cpp:269
#12 0x00002b11ca2c2279 in Connection (this=0x1be23860, 
    out_=<value optimized out>, broker_=@0x1bd68f70, 
    mgmtId_=<value optimized out>, external=<value optimized out>, 
    isLink_=<value optimized out>, objectId_=0, shadow_=false, 
    delayManagement=true) at qpid/broker/Connection.cpp:102
#13 0x00002b11cad4f50d in qpid::cluster::Connection::init (this=0x1be23f40)
    at qpid/cluster/Connection.h:218
#14 0x00002b11cad5429b in Connection (this=0x1be23f40, c=@0x1bd87090, 
    out=<value optimized out>, mgmtId=@0x1be21798, 
    member=<value optimized out>, isCatchUp=false, isLink=true, 
    external=@0x4456c8c0) at qpid/cluster/Connection.cpp:111
#15 0x00002b11cad5b0f2 in ConnectionCodec (this=0x1be235f0, v=@0x4456c6e0, 
    out=<value optimized out>, logId=@0x1be21798, cluster=@0x1bd87090, 
    catchUp=false, isLink=true, external=@0x4456c8c0)
    at qpid/cluster/ConnectionCodec.cpp:59
---Type <return> to continue, or q <return> to quit---

Comment 3 Jiri Kolar 2010-08-27 10:50:29 UTC
This appears on 0.7.946106-6 and is fixed on 0.7.946106-12.
Validated on RHEL 5.5, i386 and x86_64.

packages:

# rpm -qa | grep -E '(qpid|openais|rhm)' | sort -u
openais-0.80.6-16.el5_5.7
openais-devel-0.80.6-16.el5_5.7
python-qpid-0.7.946106-12.el5
qpid-cpp-client-0.7.946106-12.el5
qpid-cpp-client-devel-0.7.946106-12.el5
qpid-cpp-client-devel-docs-0.7.946106-12.el5
qpid-cpp-client-ssl-0.7.946106-12.el5
qpid-cpp-mrg-debuginfo-0.7.946106-11.el5
qpid-cpp-server-0.7.946106-12.el5
qpid-cpp-server-cluster-0.7.946106-12.el5
qpid-cpp-server-devel-0.7.946106-12.el5
qpid-cpp-server-ssl-0.7.946106-12.el5
qpid-cpp-server-store-0.7.946106-12.el5
qpid-cpp-server-xml-0.7.946106-12.el5
qpid-java-client-0.7.946106-7.el5
qpid-java-common-0.7.946106-7.el5
qpid-tools-0.7.946106-8.el5
rhm-docs-0.7.946106-5.el5
rh-tests-distribution-MRG-Messaging-qpid_common-1.6-53

->VERIFIED

