Bug 1085053 - [AMQP1.0] Broker crash on remove domain request after unsuccessful interconnect attempt (to unreachable broker)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 3.0
Target Release: ---
Assignee: Gordon Sim
QA Contact: Valiantsina Hubeika
URL:
Whiteboard:
Depends On:
Blocks: 1010399
Reported: 2014-04-07 16:59 UTC by Petr Matousek
Modified: 2014-09-24 15:11 UTC (History)

Fixed In Version: qpid-cpp-0.22-38
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-24 15:11:00 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Apache JIRA QPID-5704 0 None None None Never
Red Hat Product Errata RHEA-2014:1296 0 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging 3.0 Release 2014-09-24 19:00:06 UTC

Description Petr Matousek 2014-04-07 16:59:21 UTC
Description of problem:

When a link is declared for a domain and the associated domain is then removed, the broker crashes.

Version-Release number of selected component (if applicable):
qpid-cpp-*-0.22-36

How reproducible:
100%

Steps to Reproduce:
1. Start two brokers, A and B, specifying --domain BrokerA and --domain BrokerB respectively.
2. qpid-config --broker <brokerB_url> add domain BrokerA --argument "url=<brokerA_url>"
3. qpid-config --broker <brokerB_url> add outgoing link --argument "source=amq.topic" --argument "domain=BrokerA" --argument "target=amq.topic"
4. qpid-config del domain BrokerA -b <brokerB_url>
5. Wait several seconds.
6. Broker B crashes.

Actual results:
The broker crashes when a domain that is still in use is removed.

Expected results:
No broker crash

Additional info:

Note: Broker logs may be provided on demand

stack trace: 
Core was generated by `qpidd --domain BrokerB'.
Program terminated with signal 11, Segmentation fault.
#0  0x0083bba3 in _M_rep (this=0x964fc08, __str=...) at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/bits/basic_string.h:286
286	      { return &((reinterpret_cast<_Rep*> (_M_data()))[-1]); }
(gdb) info thread
  3 Thread 0xb770f960 (LWP 21567)  0x00a84416 in __kernel_vsyscall ()
  2 Thread 0xb7703b70 (LWP 21568)  0x00a84416 in __kernel_vsyscall ()
* 1 Thread 0xb6acab70 (LWP 21569)  0x0083bba3 in _M_rep (this=0x964fc08, __str=Traceback (most recent call last):
  File "/usr/lib/../share/gdb/python/libstdcxx/v6/printers.py", line 556, in to_string
    header = ptr.cast(reptype) - 1
RuntimeError: Cannot access memory at address 0x0
) at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/bits/basic_string.h:286
(gdb) thread apply all bt

Thread 3 (Thread 0xb770f960 (LWP 21567)):
#0  0x00a84416 in __kernel_vsyscall ()
#1  0x0043d5e6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
#2  0x02d19bec in qpid::sys::Poller::wait (this=0x962cb70, timeout=...) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/epoll/EpollPoller.cpp:566
#3  0x02d1a3c3 in qpid::sys::Poller::run (this=0x962cb70) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/epoll/EpollPoller.cpp:518
#4  0x02d76985 in qpid::sys::Dispatcher::run (this=0xbfb7c700) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/Dispatcher.cpp:37
#5  0x03033946 in qpid::broker::Broker::run (this=0x962cbe8) at /usr/src/debug/qpid-0.22/cpp/src/qpid/broker/Broker.cpp:450
#6  0x0804e109 in qpid::broker::QpiddBroker::execute (this=0xbfb7cb4d, options=0x9626cc8) at /usr/src/debug/qpid-0.22/cpp/src/posix/QpiddBroker.cpp:206
#7  0x080539a4 in qpid::broker::run_broker (argc=3, argv=0xbfb7cc34, hidden=false) at /usr/src/debug/qpid-0.22/cpp/src/qpidd.cpp:108
#8  0x0804d9c4 in main (argc=3, argv=0xbfb7cc34) at /usr/src/debug/qpid-0.22/cpp/src/posix/QpiddBroker.cpp:215

Thread 2 (Thread 0xb7703b70 (LWP 21568)):
#0  0x00a84416 in __kernel_vsyscall ()
#1  0x004fd794 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_cond_timedwait.S:253
#2  0x02d7f2bf in wait (this=0x962d0f0) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/Condition.h:69
#3  0x02d0dff2 in qpid::sys::(anonymous namespace)::runRunnable (p=0x962d0f0) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/Thread.cpp:35
#4  0x004f9b39 in start_thread (arg=0xb7703b70) at pthread_create.c:301
#5  0x0043cd6e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:133

Thread 1 (Thread 0xb6acab70 (LWP 21569)):
#0  0x0083bba3 in _M_rep (this=0x964fc08, __str=Traceback (most recent call last):
  File "/usr/lib/../share/gdb/python/libstdcxx/v6/printers.py", line 556, in to_string
    header = ptr.cast(reptype) - 1
RuntimeError: Cannot access memory at address 0x0
) at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/bits/basic_string.h:286
#1  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign (this=0x964fc08, __str=Traceback (most recent call last):
  File "/usr/lib/../share/gdb/python/libstdcxx/v6/printers.py", line 556, in to_string
    header = ptr.cast(reptype) - 1
RuntimeError: Cannot access memory at address 0x0
) at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:246
#2  0x00ac1867 in operator= (this=0x964fbb8) at /usr/include/c++/4.4.7/bits/basic_string.h:511
#3  0x00ac1cda in qpid::broker::amqp::InterconnectFactory::failed (this=0x964fbb8, text="Connection timed out") at /usr/src/debug/qpid-0.22/cpp/src/qpid/broker/amqp/Domain.cpp:209
#4  0x00ac520b in operator() (function_obj_ptr=..., a0=110, a1="Connection timed out") at /usr/include/boost/bind/mem_fn_template.hpp:274
#5  operator()<boost::_mfi::mf2<void, qpid::broker::amqp::InterconnectFactory, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::list2<int&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&> > (function_obj_ptr=..., a0=110, a1="Connection timed out") at /usr/include/boost/bind/bind.hpp:385
#6  operator()<int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > (function_obj_ptr=..., a0=110, a1="Connection timed out")
    at /usr/include/boost/bind/bind_template.hpp:61
#7  boost::detail::function::void_function_obj_invoker2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, qpid::broker::amqp::InterconnectFactory, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::list3<boost::_bi::value<qpid::broker::amqp::InterconnectFactory*>, boost::arg<1>, boost::arg<2> > >, void, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::invoke (function_obj_ptr=..., a0=110, a1="Connection timed out") at /usr/include/boost/function/function_template.hpp:153
#8  0x0317711e in operator() (s=..., ec=110, emsg="Connection timed out", failedCb=...) at /usr/include/boost/function/function_template.hpp:1013
#9  0x0317ab17 in operator()<void (*)(const qpid::sys::Socket&, int, const std::string&, boost::function2<void, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >), boost::_bi::list3<const qpid::sys::Socket&, int&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&> > (function_obj_ptr=..., a0=..., a1=110, a2="Connection timed out")
    at /usr/include/boost/bind/bind.hpp:450
#10 operator()<const qpid::sys::Socket, int, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > > (function_obj_ptr=..., a0=..., a1=110, a2="Connection timed out")
    at /usr/include/boost/bind/bind_template.hpp:116
#11 boost::detail::function::void_function_obj_invoker3<boost::_bi::bind_t<void, void (*)(qpid::sys::Socket const&, int, std::string const&, boost::function2<void, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >), boost::_bi::list4<boost::arg<1>, boost::arg<2>, boost::arg<3>, boost::_bi::value<boost::function2<void, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > >, void, qpid::sys::Socket const&, int, std::string const&>::invoke (function_obj_ptr=..., a0=..., a1=110, a2="Connection timed out")
    at /usr/include/boost/function/function_template.hpp:153
#12 0x02cf7772 in boost::function3<void, qpid::sys::Socket const&, int, std::string const&>::operator() (this=0x9650018, a0=..., a1=110, a2="Connection timed out")
    at /usr/include/boost/function/function_template.hpp:1013
#13 0x02cf5462 in qpid::sys::posix::AsynchConnector::connComplete (this=0x964ff58, h=...) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/AsynchIO.cpp:237
#14 0x02cf5c95 in operator() (function_obj_ptr=..., a0=...) at /usr/include/boost/bind/mem_fn_template.hpp:162
#15 operator()<boost::_mfi::mf1<void, qpid::sys::posix::AsynchConnector, qpid::sys::DispatchHandle&>, boost::_bi::list1<qpid::sys::DispatchHandle&> > (function_obj_ptr=..., a0=...)
    at /usr/include/boost/bind/bind.hpp:306
#16 operator()<qpid::sys::DispatchHandle> (function_obj_ptr=..., a0=...) at /usr/include/boost/bind/bind_template.hpp:32
#17 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::sys::posix::AsynchConnector, qpid::sys::DispatchHandle&>, boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchConnector*>, boost::arg<1> > >, void, qpid::sys::DispatchHandle&>::invoke (function_obj_ptr=..., a0=...) at /usr/include/boost/function/function_template.hpp:153
#18 0x02d79937 in boost::function1<void, qpid::sys::DispatchHandle&>::operator() (this=0x964ff84, a0=...) at /usr/include/boost/function/function_template.hpp:1013
#19 0x02d789bd in qpid::sys::DispatchHandle::processEvent (this=0x964ff5c, type=qpid::sys::Poller::DISCONNECTED) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/DispatchHandle.cpp:291
#20 0x02d1a3a3 in process (this=0x962cb70) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/Poller.h:131
#21 qpid::sys::Poller::run (this=0x962cb70) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/epoll/EpollPoller.cpp:522
#22 0x02d76985 in qpid::sys::Dispatcher::run (this=0xbfb7c700) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/Dispatcher.cpp:37
#23 0x02d0dff2 in qpid::sys::(anonymous namespace)::runRunnable (p=0xbfb7c700) at /usr/src/debug/qpid-0.22/cpp/src/qpid/sys/posix/Thread.cpp:35
#24 0x004f9b39 in start_thread (arg=0xb6acab70) at pthread_create.c:301
#25 0x0043cd6e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:133
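
Note on the trace: frame #7 shows the failure callback bound to a raw InterconnectFactory*, and frame #3 crashes inside InterconnectFactory::failed while assigning to a string member. One plausible reading, not confirmed anywhere in this report, is that the pending connect callback outlived the object it points to once the domain was deleted. The sketch below illustrates that general pattern and one common guard (a weak_ptr check) using only the standard library. FakeFactory, guardedCallback and simulate are hypothetical names for illustration; this is not the qpid code and not the upstream fix.

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for an object that owns a pending connect attempt.
// None of these names come from the qpid sources.
struct FakeFactory {
    std::string lastError;
    int failures;
    FakeFactory() : failures(0) {}
    void failed(int /*code*/, const std::string& text) {
        lastError = text;  // member write: undefined behaviour if *this is gone
        ++failures;
    }
};

typedef std::function<void(int, const std::string&)> FailedCallback;

// Build a callback that touches the factory only if it is still alive.
// Binding a raw pointer here instead would leave a dangling callback.
FailedCallback guardedCallback(const std::shared_ptr<FakeFactory>& factory) {
    std::weak_ptr<FakeFactory> weak(factory);
    return [weak](int code, const std::string& text) {
        if (std::shared_ptr<FakeFactory> alive = weak.lock()) {
            alive->failed(code, text);
        }
        // Otherwise the owner was removed first, so the notification is dropped.
    };
}

// Simulates: start a connect, optionally remove the owner, then let the
// timeout fire. Returns true if the notification reached a live factory.
bool simulate(bool removeBeforeTimeout) {
    std::shared_ptr<FakeFactory> factory = std::make_shared<FakeFactory>();
    std::weak_ptr<FakeFactory> observer(factory);
    FailedCallback pending = guardedCallback(factory);
    if (removeBeforeTimeout) {
        factory.reset();  // models "del domain" arriving before the timeout
    }
    pending(110, "Connection timed out");  // the late failure notification
    std::shared_ptr<FakeFactory> alive = observer.lock();
    return alive && alive->failures == 1;
}
```

Calling simulate(false) delivers the failure to the live object, while simulate(true) drops it safely instead of writing through a dangling pointer. Whether the broker should use this guard, cancel the pending connect, or keep the object alive until the callback fires is a design choice this sketch does not settle.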

Comment 2 Petr Matousek 2014-04-07 21:20:17 UTC
The issue appears only when both of the following hold:
1. The host specified in the domain url (in step 2) is down.
   Note: an iptables DROP policy for amqp traffic triggers the crash as well, but there is no crash if a REJECT policy is used.
2. The delete request is issued before the broker reports the connection timeout (60 seconds by default):
info Inter-broker connection failed (tcp:<brokerA_url>): Connection timed out
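
For reference, the backtrace in the description reports ec=110 with the text "Connection timed out"; on Linux x86 that value corresponds to ETIMEDOUT. A likely explanation for the DROP/REJECT difference, stated here as an assumption rather than something verified in this report, is that a DROP rule leaves the connect pending until the timeout, whereas a REJECT rule typically makes it fail promptly with a different error (commonly ECONNREFUSED), leaving little or no window for the delete request. The helper names below are hypothetical and only show how the two error codes can be told apart.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Returns the text the C library associates with an errno value.
// The exact wording depends on the C library in use.
std::string describe(int code) {
    return std::string(std::strerror(code));
}

// True when the error is a connect timeout, the case seen in the backtrace.
bool isConnectTimeout(int code) {
    return code == ETIMEDOUT;
}
```

With these helpers, isConnectTimeout(ETIMEDOUT) is true and isConnectTimeout(ECONNREFUSED) is false, which matches the observation that only the slow, timed-out path reproduces the crash.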

Note: the core files also show different crash locations from run to run, for example:

Core was generated by `qpidd --domain BrokerB'.
Program terminated with signal 11, Segmentation fault.
#0  0x001f7c05 in __exchange_and_add (this=0xb60486a0, __str=...)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/ext/atomicity.h:46
46	  { return __sync_fetch_and_add(__mem, __val); }

Core was generated by `qpidd --domain BrokerB --log-enable=debug+'.
Program terminated with signal 11, Segmentation fault.
#0  operator<< <char, std::char_traits<char>, std::allocator<char> > (os=..., a=...) at /usr/include/c++/4.4.7/bits/basic_string.h:2503
2503	      return __ostream_insert(__os, __str.data(), __str.size());

Core was generated by `qpidd --domain BrokerB --log-enable=debug+'.
Program terminated with signal 11, Segmentation fault.
#0  size (this=0xb600209c, __c=58 ':', __pos=0)
    at /usr/src/debug/gcc-4.4.7-20120601/obj-i686-redhat-linux/i686-redhat-linux/libstdc++-v3/include/bits/basic_string.h:629
629	      { return _M_rep()->_M_length; }

Comment 3 Petr Matousek 2014-04-07 21:32:28 UTC
I would also expect the client to get an exception when creating the domain link (if not already when defining the domain).

Comment 8 Gordon Sim 2014-04-16 19:33:31 UTC
Fixed upstream: https://svn.apache.org/r1587998

Comment 10 Valiantsina Hubeika 2014-07-21 17:25:03 UTC
Verified on:
qpid-cpp-0.22-43.el6.x86_64
qpid-cpp-0.22-43.el6.i686

Comment 11 errata-xmlrpc 2014-09-24 15:11:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1296.html

