Bug 876720 - HA backup broker crashes shortly after promotion to primary
Summary: HA backup broker crashes shortly after promotion to primary
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: 2.3
Target Release: ---
Assignee: mick
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks: 698367
 
Reported: 2012-11-14 19:29 UTC by Jason Dillaman
Modified: 2013-03-19 16:41 UTC (History)

Fixed In Version: qpid-cpp-0.18-11
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments
Quick patch to prevent this crash (3.04 KB, patch)
2012-11-19 18:53 UTC, Jason Dillaman
Updated: Quick patch to prevent this crash (3.08 KB, patch)
2012-11-19 18:57 UTC, Jason Dillaman

Description Jason Dillaman 2012-11-14 19:29:58 UTC
Description of problem:
The BrokerReplicator registers itself as a connection listener but never removes itself before it is destroyed. As a result, a disconnected HA link can invoke the "disconnected" method on the already destroyed BrokerReplicator.

Version-Release number of selected component (if applicable):
Qpid-0.18-9

How reproducible:
Randomly - race condition 

Steps to Reproduce:
1. Start a primary and backup broker
2. Kill the primary and promote the backup
  
Actual results:
The newly promoted broker may crash if the memory of the freed BrokerReplicator object has already been reclaimed when the callback runs.

Expected results:
The newly promoted broker does not crash

Additional info:
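The lifetime problem described above can be modelled outside the broker. The following is only a minimal sketch, not the attached patch and not the actual qpid code: Observer, Observers, Replicator and notifications_after_teardown are hypothetical stand-ins, and std::shared_ptr is used in place of boost::shared_ptr to keep the example self-contained. It assumes the registry shares ownership of a small proxy observer that holds a plain reference to its owner, so the proxy can outlive the owner unless the owner deregisters it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a broker-wide connection observer interface.
struct Observer {
    virtual ~Observer() {}
    virtual void closed() = 0;
};

// Hypothetical registry: shares ownership of observers and notifies them.
class Observers {
public:
    void add(const std::shared_ptr<Observer>& o) { list.push_back(o); }
    void remove(const std::shared_ptr<Observer>& o) {
        list.erase(std::remove(list.begin(), list.end(), o), list.end());
    }
    void closed() {
        // Notify a copy so an observer may deregister during the callback.
        std::vector<std::shared_ptr<Observer> > copy(list);
        for (std::size_t i = 0; i < copy.size(); ++i) copy[i]->closed();
    }
    std::size_t size() const { return list.size(); }
private:
    std::vector<std::shared_ptr<Observer> > list;
};

// Hypothetical replicator. Its proxy holds a raw reference back to it,
// so the proxy must not stay registered after the replicator is gone.
class Replicator {
public:
    Replicator(Observers& obs, int& counter)
        : observers(obs), disconnects(counter),
          proxy(std::make_shared<Proxy>(*this)) {
        observers.add(proxy);
    }
    // Deregistering here is the step whose absence leaves a dangling
    // reference in the registry.
    ~Replicator() { observers.remove(proxy); }
    void disconnected() { ++disconnects; }
private:
    struct Proxy : Observer {
        explicit Proxy(Replicator& r) : owner(r) {}
        void closed() { owner.disconnected(); }
        Replicator& owner;
    };
    Observers& observers;
    int& disconnects;
    std::shared_ptr<Proxy> proxy;
};

// Counts callbacks delivered when a connection closes once while the
// replicator is alive and once after it has been destroyed.
int notifications_after_teardown() {
    Observers obs;
    int count = 0;
    {
        Replicator r(obs, count);
        obs.closed();          // delivered: replicator is alive
    }
    obs.closed();              // not delivered: proxy was deregistered
    return count;
}
```

Without the remove() call in the destructor, the second obs.closed() would reach a proxy whose owner no longer exists, which is undefined behaviour and may or may not crash depending on whether the memory has been reused.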

Comment 1 Jason Dillaman 2012-11-14 19:33:44 UTC
#1  operator<< <std::string> (this=0xe4dbf0, name="Queue1", userId="@QPID", connectionId=Traceback (most recent call last):
  File "/usr/lib64/../share/gdb/python/libstdcxx/v6/printers.py", line 558, in to_string
    return self.val['_M_dataplus']['_M_p'].lazy_string (length = len)
RuntimeError: Cannot access memory at address 0x9
, check=<value optimized out>) at ../include/qpid/Msg.h:63
#2  qpid::broker::Broker::deleteQueue (this=0xe4dbf0, name="Queue1", userId="@QPID", connectionId=Traceback (most recent call last):
  File "/usr/lib64/../share/gdb/python/libstdcxx/v6/printers.py", line 558, in to_string
    return self.val['_M_dataplus']['_M_p'].lazy_string (length = len)
RuntimeError: Cannot access memory at address 0x9
, check=<value optimized out>) at qpid/broker/Broker.cpp:1149
#3  0x00007fca90cbe978 in qpid::ha::BrokerReplicator::deleteQueue (this=0xe58350, name="Queue1", purge=<value optimized out>) at qpid/ha/BrokerReplicator.cpp:760
#4  0x00007fca90cbf6f1 in qpid::ha::BrokerReplicator::autoDeleteCheck (this=0xe58350, ex=<value optimized out>) at qpid/ha/BrokerReplicator.cpp:847
#5  0x00007fca90ccdde3 in operator() (__first=<value optimized out>, __last=..., __f=...) at /usr/include/boost/bind/mem_fn_template.hpp:162
#6  operator()<boost::_mfi::mf1<void, qpid::ha::BrokerReplicator, boost::shared_ptr<qpid::broker::Exchange> >, boost::_bi::list1<boost::shared_ptr<qpid::broker::Exchange>&> > (
    __first=<value optimized out>, __last=..., __f=...) at /usr/include/boost/bind/bind.hpp:306
#7  operator()<boost::shared_ptr<qpid::broker::Exchange> > (__first=<value optimized out>, __last=..., __f=...) at /usr/include/boost/bind/bind_template.hpp:32
#8  std::for_each<__gnu_cxx::__normal_iterator<boost::shared_ptr<qpid::broker::Exchange>*, std::vector<boost::shared_ptr<qpid::broker::Exchange>, std::allocator<boost::shared_ptr<qpid::broker::Exchange> > > >, boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::ha::BrokerReplicator, boost::shared_ptr<qpid::broker::Exchange> >, boost::_bi::list2<boost::_bi::value<qpid::ha::BrokerReplicator*>, boost::arg<1> > > > (__first=<value optimized out>, __last=..., __f=...) at /usr/include/c++/4.4.6/bits/stl_algo.h:4200
#9  0x00007fca90cc8884 in qpid::ha::BrokerReplicator::disconnected (this=0xe58350) at qpid/ha/BrokerReplicator.cpp:861
#10 0x00007fca9179927c in call<boost::shared_ptr<qpid::broker::ConnectionObserver>, qpid::broker::Connection> (this=<value optimized out>, c=...) at /usr/include/boost/bind/mem_fn_template.hpp:153
#11 operator()<boost::shared_ptr<qpid::broker::ConnectionObserver> > (this=<value optimized out>, c=...) at /usr/include/boost/bind/mem_fn_template.hpp:167
#12 operator()<boost::_mfi::mf1<void, qpid::broker::ConnectionObserver, qpid::broker::Connection&>, boost::_bi::list1<boost::shared_ptr<qpid::broker::ConnectionObserver>&> > (
    this=<value optimized out>, c=...) at /usr/include/boost/bind/bind.hpp:306
#13 operator()<boost::shared_ptr<qpid::broker::ConnectionObserver> > (this=<value optimized out>, c=...) at /usr/include/boost/bind/bind_template.hpp:32
#14 for_each<__gnu_cxx::__normal_iterator<boost::shared_ptr<qpid::broker::ConnectionObserver>*, std::vector<boost::shared_ptr<qpid::broker::ConnectionObserver>, std::allocator<boost::shared_ptr<qpid::broker::ConnectionObserver> > > >, boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::broker::ConnectionObserver, qpid::broker::Connection&>, boost::_bi::list2<boost::arg<1>, boost::reference_wrapper<qpid::broker::Connection> > > > (this=<value optimized out>, c=...) at /usr/include/c++/4.4.6/bits/stl_algo.h:4200
#15 each<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::broker::ConnectionObserver, qpid::broker::Connection&>, boost::_bi::list2<boost::arg<1>, boost::reference_wrapper<qpid::broker::Connection> > > > (this=<value optimized out>, c=...) at qpid/broker/Observers.h:63
#16 qpid::broker::ConnectionObservers::closed (this=<value optimized out>, c=...) at qpid/broker/ConnectionObservers.h:49
#17 0x00007fca917a5dca in qpid::broker::Connection::~Connection (this=0x7fca6805e5c0, __in_chrg=<value optimized out>) at qpid/broker/Connection.cpp:150
#18 0x00007fca917a6af9 in qpid::broker::Connection::~Connection (this=0x7fca6805e5c0, __in_chrg=<value optimized out>) at qpid/broker/Connection.cpp:160
#19 0x00007fca91773d31 in ~auto_ptr (this=0x7fca68048750, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>) at /usr/include/c++/4.4.6/backward/auto_ptr.h:168
#20 qpid::amqp_0_10::Connection::~Connection (this=0x7fca68048750, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>) at qpid/amqp_0_10/Connection.h:45
#21 0x00007fca91774109 in qpid::amqp_0_10::Connection::~Connection (this=0x7fca68048750, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>) at qpid/amqp_0_10/Connection.h:45
#22 0x00007fca91846ae4 in ~auto_ptr (this=0x7fca6805dd80, __in_chrg=<value optimized out>) at /usr/include/c++/4.4.6/backward/auto_ptr.h:168
#23 ~SecureConnection (this=0x7fca6805dd80, __in_chrg=<value optimized out>) at qpid/broker/SecureConnection.h:42
#24 qpid::broker::SecureConnection::~SecureConnection (this=0x7fca6805dd80, __in_chrg=<value optimized out>) at qpid/broker/SecureConnection.h:42
#25 0x00007fca91360343 in qpid::sys::AsynchIOHandler::~AsynchIOHandler (this=0xe53380, __in_chrg=<value optimized out>) at qpid/sys/AsynchIOHandler.cpp:81
#26 0x00007fca91360489 in qpid::sys::AsynchIOHandler::~AsynchIOHandler (this=0xe53380, __in_chrg=<value optimized out>) at qpid/sys/AsynchIOHandler.cpp:82
#27 0x00007fca91361b10 in qpid::sys::AsynchIOHandler::closedSocket (this=0xe53380, s=...) at qpid/sys/AsynchIOHandler.cpp:226
#28 0x00007fca91280c96 in operator() (this=0xe5bc00, h=<value optimized out>) at /usr/include/boost/function/function_template.hpp:1013
#29 qpid::sys::posix::AsynchIO::close (this=0xe5bc00, h=<value optimized out>) at qpid/sys/posix/AsynchIO.cpp:616
#30 0x00007fca912828df in qpid::sys::posix::AsynchIO::writeable (this=0xe5bc00, h=...) at qpid/sys/posix/AsynchIO.cpp:575
#31 0x00007fca91366da3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>, a0=<value optimized out>)
    at /usr/include/boost/function/function_template.hpp:1013
#32 0x00007fca91363c6e in qpid::sys::DispatchHandle::processEvent (this=0xe5bc08, type=qpid::sys::Poller::WRITABLE) at qpid/sys/DispatchHandle.cpp:287
#33 0x00007fca9128ea4d in process (this=0xe36c80) at qpid/sys/Poller.h:131
#34 qpid::sys::Poller::run (this=0xe36c80) at qpid/sys/epoll/EpollPoller.cpp:524
#35 0x00007fca9128624a in qpid::sys::(anonymous namespace)::runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#36 0x00000034d9e077f1 in start_thread () from /lib64/libpthread.so.0
#37 0x00000034d96e592d in clone () from /lib64/libc.so.6

Comment 2 Jason Dillaman 2012-11-19 18:53:35 UTC
Created attachment 647948
Quick patch to prevent this crash

Comment 3 Jason Dillaman 2012-11-19 18:57:10 UTC
Created attachment 647949
Updated: Quick patch to prevent this crash

Comment 5 Alan Conway 2012-11-27 21:24:12 UTC
Reviewed and approved the fix.

Comment 7 Eric Sammons 2013-01-09 20:03:38 UTC
Ran our full suite of tests against the following RPMs: qpid-cpp-client-0.18-11, qpid-cpp-server-0.18-11, qpid-cpp-server-ha-0.18-11. --Jason Dillaman

