Bug 680228 - [store] Deadlock in BDB database del() function
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: 2.0
Target Release: ---
Assignee: Kim van der Riet
QA Contact: Petr Matousek
Reported: 2011-02-24 18:25 UTC by Kim van der Riet
Modified: 2011-06-23 15:43 UTC

Fixed In Version: qpid-cpp-mrg-0.9.1079953
Doc Type: Bug Fix
Last Closed: 2011-06-23 15:43:45 UTC


Attachments
Test script used to reproduce this bug (1.70 KB, application/x-sh), 2011-02-24 18:25 UTC, Kim van der Riet
qpid-perftest debug info (Comment 6) (10.47 KB, application/octet-stream), 2011-05-25 08:43 UTC, Petr Matousek


Links
System: Red Hat Product Errata
ID: RHEA-2011:0890
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise MRG Messaging 2.0 Release
Last Updated: 2011-06-23 15:42:41 UTC

Description Kim van der Riet 2011-02-24 18:25:56 UTC
Created attachment 480830
Test script used to reproduce this bug

Running the attached topic scalability test causes the broker to hang when a large number of subscribers (>100) is present in durable mode. The test usually hangs in the 300-subscriber durable section.

Running on mrg42 on RHEL-6.0 from an in-tree build of r1072197/r4440 (which is before the flow-control checkin), the broker is started with:

./qpidd -m no --auth no --max-connections 3000 --load-module /home/kpvdr/mrg/store/lib/.libs/msgstore.so --store-dir /tmp --log-enable info+

When the hang occurs, pstack shows that threads 3, 5 and 7 are all inside Db::del(DbTxn*, Dbt*, unsigned int) at the same time, a condition which is explicitly forbidden:

http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/program_mt.html

pstack output:

Thread 10 (Thread 0x7f279c7e7710 (LWP 29058)):
#0  0x0000003aae80b7a9 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f279dd166d9 in qpid::sys::Timer::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7f279bc60710 (LWP 29059)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#4  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7f279b25f710 (LWP 29060)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#4  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f279a85e710 (LWP 29061)):
#0  0x0000003aae80b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f279ca2992b in __db_pthread_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#2  0x00007f279ca2955d in __db_tas_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#3  0x00007f279caaa71b in __lock_get_internal () from /usr/lib64/libdb_cxx-4.7.so
#4  0x00007f279caaabb1 in __lock_get () from /usr/lib64/libdb_cxx-4.7.so
#5  0x00007f279cae3ac9 in __db_lget () from /usr/lib64/libdb_cxx-4.7.so
#6  0x00007f279ca30691 in __bam_relink () from /usr/lib64/libdb_cxx-4.7.so
#7  0x00007f279ca30f24 in __bam_dpages () from /usr/lib64/libdb_cxx-4.7.so
#8  0x00007f279ca2fb92 in ?? () from /usr/lib64/libdb_cxx-4.7.so
#9  0x00007f279ca3030b in ?? () from /usr/lib64/libdb_cxx-4.7.so
#10 0x00007f279cad4eb1 in __dbc_close () from /usr/lib64/libdb_cxx-4.7.so
#11 0x00007f279cac9e04 in __db_del () from /usr/lib64/libdb_cxx-4.7.so
#12 0x00007f279cae08a8 in __db_del_pp () from /usr/lib64/libdb_cxx-4.7.so
#13 0x00007f279ca1ef07 in Db::del(DbTxn*, Dbt*, unsigned int) () from /usr/lib64/libdb_cxx-4.7.so
#14 0x00007f279cdc8916 in mrg::msgstore::MessageStoreImpl::destroy(boost::shared_ptr<Db>, qpid::broker::Persistable const&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#15 0x00007f279cde18a9 in mrg::msgstore::MessageStoreImpl::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#16 0x00007f279e14bf8d in qpid::broker::MessageStoreModule::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#17 0x00007f279e15c945 in qpid::broker::Queue::destroyed() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#18 0x00007f279e15cce1 in tryAutoDeleteImpl(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#19 0x00007f279e15ceec in qpid::broker::Queue::tryAutoDelete(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#20 0x00007f279e18a46f in qpid::broker::SessionAdapter::QueueHandlerImpl::destroyExclusiveQueues() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#21 0x00007f279e18c983 in qpid::broker::SessionAdapter::QueueHandlerImpl::~QueueHandlerImpl() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#22 0x00007f279e19b284 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#23 0x00007f279e19b6f9 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#24 0x00007f279e19560a in qpid::broker::SessionHandler::handleDetach() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#25 0x00007f279dce1b8a in qpid::amqp_0_10::SessionHandler::detach(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#26 0x00007f279dcaf1d3 in qpid::framing::AMQP_AllOperations::SessionHandler::Invoker::visit(qpid::framing::SessionDetachBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#27 0x00007f279dcdff5c in qpid::amqp_0_10::SessionHandler::invoke(qpid::framing::AMQMethodBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#28 0x00007f279dce05cf in qpid::amqp_0_10::SessionHandler::handleIn(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#29 0x00007f279e101642 in qpid::broker::Connection::received(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#30 0x00007f279e0db44d in qpid::amqp_0_10::Connection::decode(char const*, unsigned long) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#31 0x00007f279dd0dbca in qpid::sys::AsynchIOHandler::readbuff(qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#32 0x00007f279dc50df2 in qpid::sys::posix::AsynchIO::readable(qpid::sys::DispatchHandle&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#33 0x00007f279dd12103 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()(qpid::sys::DispatchHandle&) const () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#34 0x00007f279dd11251 in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#35 0x00007f279dc5c742 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#36 0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#37 0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#38 0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f2789e5d710 (LWP 29062)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#4  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f2799e5d710 (LWP 29063)):
#0  0x0000003aae80b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f279ca2992b in __db_pthread_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#2  0x00007f279ca2955d in __db_tas_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#3  0x00007f279caaa71b in __lock_get_internal () from /usr/lib64/libdb_cxx-4.7.so
#4  0x00007f279caaabb1 in __lock_get () from /usr/lib64/libdb_cxx-4.7.so
#5  0x00007f279cae3ac9 in __db_lget () from /usr/lib64/libdb_cxx-4.7.so
#6  0x00007f279ca3d586 in __bam_get_root () from /usr/lib64/libdb_cxx-4.7.so
#7  0x00007f279ca3d964 in __bam_search () from /usr/lib64/libdb_cxx-4.7.so
#8  0x00007f279ca2c7c6 in ?? () from /usr/lib64/libdb_cxx-4.7.so
#9  0x00007f279ca2cf77 in ?? () from /usr/lib64/libdb_cxx-4.7.so
#10 0x00007f279cad591e in __dbc_get () from /usr/lib64/libdb_cxx-4.7.so
#11 0x00007f279cac9f4f in __db_del () from /usr/lib64/libdb_cxx-4.7.so
#12 0x00007f279cae08a8 in __db_del_pp () from /usr/lib64/libdb_cxx-4.7.so
#13 0x00007f279ca1ef07 in Db::del(DbTxn*, Dbt*, unsigned int) () from /usr/lib64/libdb_cxx-4.7.so
#14 0x00007f279cdc8916 in mrg::msgstore::MessageStoreImpl::destroy(boost::shared_ptr<Db>, qpid::broker::Persistable const&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#15 0x00007f279cde18a9 in mrg::msgstore::MessageStoreImpl::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#16 0x00007f279e14bf8d in qpid::broker::MessageStoreModule::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#17 0x00007f279e15c945 in qpid::broker::Queue::destroyed() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#18 0x00007f279e15cce1 in tryAutoDeleteImpl(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#19 0x00007f279e15ceec in qpid::broker::Queue::tryAutoDelete(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#20 0x00007f279e18a46f in qpid::broker::SessionAdapter::QueueHandlerImpl::destroyExclusiveQueues() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#21 0x00007f279e18c983 in qpid::broker::SessionAdapter::QueueHandlerImpl::~QueueHandlerImpl() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#22 0x00007f279e19b284 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#23 0x00007f279e19b6f9 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#24 0x00007f279e19560a in qpid::broker::SessionHandler::handleDetach() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#25 0x00007f279dce1b8a in qpid::amqp_0_10::SessionHandler::detach(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#26 0x00007f279dcaf1d3 in qpid::framing::AMQP_AllOperations::SessionHandler::Invoker::visit(qpid::framing::SessionDetachBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#27 0x00007f279dcdff5c in qpid::amqp_0_10::SessionHandler::invoke(qpid::framing::AMQMethodBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#28 0x00007f279dce05cf in qpid::amqp_0_10::SessionHandler::handleIn(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#29 0x00007f279e101642 in qpid::broker::Connection::received(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#30 0x00007f279e0db44d in qpid::amqp_0_10::Connection::decode(char const*, unsigned long) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#31 0x00007f279dd0dbca in qpid::sys::AsynchIOHandler::readbuff(qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#32 0x00007f279dc50df2 in qpid::sys::posix::AsynchIO::readable(qpid::sys::DispatchHandle&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#33 0x00007f279dd12103 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()(qpid::sys::DispatchHandle&) const () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#34 0x00007f279dd11251 in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#35 0x00007f279dc5c742 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#36 0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#37 0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#38 0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f279945c710 (LWP 29064)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#4  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f2798a5b710 (LWP 29065)):
#0  0x0000003aae80b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f279ca2992b in __db_pthread_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#2  0x00007f279ca2955d in __db_tas_mutex_lock () from /usr/lib64/libdb_cxx-4.7.so
#3  0x00007f279caaa71b in __lock_get_internal () from /usr/lib64/libdb_cxx-4.7.so
#4  0x00007f279caaabb1 in __lock_get () from /usr/lib64/libdb_cxx-4.7.so
#5  0x00007f279cae3ac9 in __db_lget () from /usr/lib64/libdb_cxx-4.7.so
#6  0x00007f279ca3d586 in __bam_get_root () from /usr/lib64/libdb_cxx-4.7.so
#7  0x00007f279ca3d964 in __bam_search () from /usr/lib64/libdb_cxx-4.7.so
#8  0x00007f279ca2fb25 in ?? () from /usr/lib64/libdb_cxx-4.7.so
#9  0x00007f279ca3030b in ?? () from /usr/lib64/libdb_cxx-4.7.so
#10 0x00007f279cad4eb1 in __dbc_close () from /usr/lib64/libdb_cxx-4.7.so
#11 0x00007f279cac9e04 in __db_del () from /usr/lib64/libdb_cxx-4.7.so
#12 0x00007f279cae08a8 in __db_del_pp () from /usr/lib64/libdb_cxx-4.7.so
#13 0x00007f279ca1ef07 in Db::del(DbTxn*, Dbt*, unsigned int) () from /usr/lib64/libdb_cxx-4.7.so
#14 0x00007f279cdc8916 in mrg::msgstore::MessageStoreImpl::destroy(boost::shared_ptr<Db>, qpid::broker::Persistable const&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#15 0x00007f279cde18a9 in mrg::msgstore::MessageStoreImpl::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/store/lib/.libs/msgstore.so
#16 0x00007f279e14bf8d in qpid::broker::MessageStoreModule::destroy(qpid::broker::PersistableQueue&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#17 0x00007f279e15c945 in qpid::broker::Queue::destroyed() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#18 0x00007f279e15cce1 in tryAutoDeleteImpl(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#19 0x00007f279e15ceec in qpid::broker::Queue::tryAutoDelete(qpid::broker::Broker&, boost::shared_ptr<qpid::broker::Queue>) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#20 0x00007f279e18a46f in qpid::broker::SessionAdapter::QueueHandlerImpl::destroyExclusiveQueues() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#21 0x00007f279e18c983 in qpid::broker::SessionAdapter::QueueHandlerImpl::~QueueHandlerImpl() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#22 0x00007f279e19b284 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#23 0x00007f279e19b6f9 in qpid::broker::SessionState::~SessionState() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#24 0x00007f279e19560a in qpid::broker::SessionHandler::handleDetach() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#25 0x00007f279dce1b8a in qpid::amqp_0_10::SessionHandler::detach(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#26 0x00007f279dcaf1d3 in qpid::framing::AMQP_AllOperations::SessionHandler::Invoker::visit(qpid::framing::SessionDetachBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#27 0x00007f279dcdff5c in qpid::amqp_0_10::SessionHandler::invoke(qpid::framing::AMQMethodBody const&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#28 0x00007f279dce05cf in qpid::amqp_0_10::SessionHandler::handleIn(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#29 0x00007f279e101642 in qpid::broker::Connection::received(qpid::framing::AMQFrame&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#30 0x00007f279e0db44d in qpid::amqp_0_10::Connection::decode(char const*, unsigned long) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#31 0x00007f279dd0dbca in qpid::sys::AsynchIOHandler::readbuff(qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#32 0x00007f279dc50df2 in qpid::sys::posix::AsynchIO::readable(qpid::sys::DispatchHandle&) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#33 0x00007f279dd12103 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()(qpid::sys::DispatchHandle&) const () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#34 0x00007f279dd11251 in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#35 0x00007f279dc5c742 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#36 0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#37 0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#38 0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f2793fff710 (LWP 29066)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279dc545da in qpid::sys::(anonymous namespace)::runRunnable(void*) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#4  0x0000003aae8077e1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003aae4e153d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f279d0717a0 (LWP 29043)):
#0  0x0000003aae4e1b33 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f279dc5beec in qpid::sys::Poller::wait(qpid::sys::Duration) () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#2  0x00007f279dc5c751 in qpid::sys::Poller::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidcommon.so.2
#3  0x00007f279e0ef742 in qpid::broker::Broker::run() () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.2
#4  0x000000000040d83d in QpiddBroker::execute(QpiddOptions*) ()
#5  0x000000000040a1f2 in main ()

Comment 1 Kim van der Riet 2011-02-25 16:45:21 UTC
Review of the code shows that the Db::del() function should be free-threaded, and not have an issue with multiple threads. However, adding a lock into the MessageStoreImpl::destroy() function has definitely solved this bug, and the attached script, which reliably failed on the 300 subscriber test, now runs all the 300, 1000 and 3000 tests without a problem.
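
The change itself is not shown in this report. As a rough illustration of the pattern described above (every delete serialized behind a single lock), here is a minimal Python sketch. The store is written in C++; `FakeDb`, `StoreSketch` and `destroy_concurrently` are hypothetical names, and the dict only stands in for the BDB database handle.

```python
import threading


class FakeDb:
    """Hypothetical stand-in for the BDB handle; not the real Db class."""

    def __init__(self, keys):
        self._records = {key: b"" for key in keys}

    def delete(self, key):
        # Returns True if the key existed and was removed.
        return self._records.pop(key, None) is not None

    def __len__(self):
        return len(self._records)


class StoreSketch:
    """Serializes deletes, as Comment 1 describes for destroy()."""

    def __init__(self, db):
        self._db = db
        self._destroy_lock = threading.Lock()

    def destroy(self, key):
        # Only one thread at a time may be inside the delete call.
        with self._destroy_lock:
            return self._db.delete(key)


def destroy_concurrently(n_threads, per_thread):
    """Delete n_threads * per_thread records from worker threads.

    Returns the number of records left in the fake database.
    """
    keys = [f"queue-{t}-{i}" for t in range(n_threads) for i in range(per_thread)]
    db = FakeDb(keys)
    store = StoreSketch(db)

    def worker(t):
        for i in range(per_thread):
            store.destroy(f"queue-{t}-{i}")

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return len(db)
```

A single coarse lock gives up some concurrency on queue deletion in exchange for never having two threads inside the delete call at once. The report does not say whether a finer-grained lock was considered.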

Comment 2 Kim van der Riet 2011-03-17 17:09:55 UTC
Fixed in r4444.

Comment 3 Petr Matousek 2011-04-26 14:49:29 UTC
I got different results on different RHEL versions when testing the issue with the attached reproducer on my VMs.

RHEL5:
perf-topic.sh stops in the 1-subscriber durable section.
The broker closed the connection with an "Enqueue capacity threshold exceeded" error.

2011-04-26 13:27:57 error Unexpected exception: Enqueue capacity threshold exceeded on queue "anonymous.016d17cc-3581-4162-ad17-ccf07fe6e351". (JournalImpl.cpp:587)
2011-04-26 13:27:57 error Connection 127.0.0.1:5672-127.0.0.1:46408 closed by error: Enqueue capacity threshold exceeded on queue "anonymous.016d17cc-3581-4162-ad17-ccf07fe6e351". (JournalImpl.cpp:587)(501)

RHEL6:
perf-topic.sh stops in the 300-subscriber durable section.
The broker closed the connection with a "Too many open files" exception.

2011-04-26 12:40:35 warning Broker closed connection: 501, Queue anonymous.a3243978-5ccf-4ad8-ba90-73eeaea402ea: create() failed: jexception 0x0400 fcntl::clean_file() threw JERR_FCNTL_OPENWR: Unable to open file for write. (open() failed: errno=24 (Too many open files)) (MessageStoreImpl.cpp:533)
SubscribeThread exception: framing-error: Queue anonymous.a3243978-5ccf-4ad8-ba90-73eeaea402ea: create() failed: jexception 0x0400 fcntl::clean_file() threw JERR_FCNTL_OPENWR: Unable to open file for write. (open() failed: errno=24 (Too many open files)) (MessageStoreImpl.cpp:533)


-> handing over to freznice for further testing on real hardware

Comment 4 Kim van der Riet 2011-04-26 15:17:01 UTC
Enqueue threshold exceptions are well understood. They relate to the cumulative number of messages in the store and to the ordering of message consumption, i.e. whether messages are consumed in the order in which they were received. If one OS enqueues at a greater rate than another relative to the consume rate, then this exception might occur on that OS and not the other. Make the store larger to accommodate the slower OS.

"Too many open files" is also well known and typically occurs when there is a large number of persistent queues and/or --num-jfiles is set to a high number. Each journal file holds one file handle for the life of the queue on that broker. By default, each user may hold no more than 1024 file handles open at one time. To raise this limit, set a higher value in /etc/security/limits.conf:

userid  -  nofile 2048

See the man page for limits.conf for further details. If you are making this change for an installed broker, then userid would be "qpidd".
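
To confirm which limit a process actually runs under after editing limits.conf, one option is to query it from a process started as that user. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper names `nofile_limits` and `limit_allows` are illustrative and not part of qpid.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_allows(required_handles, soft_limit):
    """Return True if soft_limit permits required_handles open files."""
    # RLIM_INFINITY means no limit is enforced.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return required_handles <= soft_limit


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile limit: {soft}, hard nofile limit: {hard}")
```

The soft limit is the one that triggers errno=24; the hard limit is the ceiling up to which an unprivileged process may raise its own soft limit.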

If you are running many durable tests on a single broker instance, make sure that the queues of previous tests are deleted, which releases the file handles associated with those tests.

Your test is a topic test, which creates one queue (and hence one journal) per subscription. Since each journal has 8 files, this would soon consume all 1024 available file handles. I have successfully tested up to 10,000 subscribers to a topic, but the file handle limit needs to be raised to 64k, and the AIO handle limit fs.aio-max-nr needs to be raised via sysctl. Also make sure there is enough disk space for the total journal footprint.
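
The arithmetic behind this can be sketched as follows. It assumes one journal per durable subscriber and 8 files per journal, as stated above; the function names and the `other_handles` parameter (sockets, logs and anything else the broker keeps open) are hypothetical, so the results are rough lower bounds rather than exact requirements.

```python
def journal_handles(subscribers, files_per_journal=8):
    """File handles held by journals alone for this many durable subscribers."""
    return subscribers * files_per_journal


def max_subscribers(nofile_limit, files_per_journal=8, other_handles=0):
    """Largest durable subscriber count whose journals fit under nofile_limit.

    other_handles is an assumed allowance for handles not used by journals.
    """
    available = nofile_limit - other_handles
    return max(0, available // files_per_journal)
```

With the default limit of 1024, `max_subscribers(1024)` gives 128, while `journal_handles(300)` gives 2400. That is consistent with the reproducer passing its smaller durable sections and reporting "Too many open files" in the 300-subscriber section.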

Comment 12 Petr Matousek 2011-05-26 09:47:38 UTC
This issue has been fixed.

Verified on RHEL5.6, RHEL6.1 architectures: i386, x86_64

Successfully tested up to 10000 transient subscribers (and up to 3000 durable subscribers, limited by disk space) with the attached reproducer on RHEL5.6 and RHEL6.1 x86_64.

On RHEL5.6 and RHEL6.1 i386, testing succeeded only up to 100 transient/durable subscribers, because there were insufficient resources to create another thread.

Since this issue was found on an x86_64 system and no hang occurred during repeated testing -> moving to verified.

packages installed:
python-qpid-0.10-1.el5.noarch
python-qpid-qmf-0.10-8.el5.x86_64
qpid-cpp-client-0.10-7.el5.x86_64
qpid-cpp-client-devel-0.10-7.el5.x86_64
qpid-cpp-client-devel-docs-0.10-7.el5.x86_64
qpid-cpp-client-ssl-0.10-7.el5.x86_64
qpid-cpp-mrg-debuginfo-0.10-6.el5.x86_64
qpid-cpp-server-0.10-7.el5.x86_64
qpid-cpp-server-cluster-0.10-7.el5.x86_64
qpid-cpp-server-devel-0.10-7.el5.x86_64
qpid-cpp-server-ssl-0.10-7.el5.x86_64
qpid-cpp-server-store-0.10-7.el5.x86_64
qpid-cpp-server-xml-0.10-7.el5.x86_64
qpid-java-client-0.10-6.el5.noarch
qpid-java-common-0.10-6.el5.noarch
qpid-java-example-0.10-6.el5.noarch
qpid-qmf-0.10-8.el5.x86_64
qpid-qmf-debuginfo-0.10-6.el5.x86_64
qpid-qmf-devel-0.10-8.el5.x86_64
qpid-tools-0.10-5.el5.noarch

-> VERIFIED

Comment 13 errata-xmlrpc 2011-06-23 15:43:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html

