Description of problem:
qpidd segfaults when perftest is run in fanout mode with multiple subscribers and durable messages.

Version-Release number of selected component (if applicable):
qpid-cpp-client-0.14-5.el5
qpid-cpp-client-devel-0.14-5.el5
qpid-cpp-client-devel-docs-0.14-5.el5
qpid-cpp-client-rdma-0.14-5.el5
qpid-cpp-client-ssl-0.14-5.el5
qpid-cpp-server-0.14-5.el5
qpid-cpp-server-cluster-0.14-5.el5
qpid-cpp-server-devel-0.14-5.el5
qpid-cpp-server-rdma-0.14-5.el5
qpid-cpp-server-ssl-0.14-5.el5
qpid-cpp-server-store-0.14-5.el5
qpid-cpp-server-xml-0.14-5.el5
qpid-java-client-0.14-1.el5
qpid-java-common-0.14-1.el5
qpid-java-example-0.14-1.el5
qpid-qmf-0.14-2.el5
qpid-qmf-devel-0.14-2.el5
qpid-tests-0.14-1.el5
qpid-tools-0.14-1.el5

How reproducible:

Steps to Reproduce:
1. DD=$(mktemp -d);ulimit -c unlimited;
2. /usr/sbin/qpidd --auth no --daemon --port 5672 \
   --log-enable debug+ --log-to-file ${DD}/qpidd.log \
   --data-dir ${DD} --num-jfile 16 --jfile-size-pgs 128;
3. /usr/bin/perftest --port 5672 --mode fanout \
   --count 25000 --size 256 --durable yes --nsubs 4
4. ps xfa | grep qpidd
5. check corefile

Actual results:
[13:59:37] Core file: /root/.qpidd/core.30948 generated by /usr/sbin/qpidd ----------------------3/4-
-rw------- 1 root root 73338880 Feb 12 13:56 /root/.qpidd/core.30948
/root/.qpidd/core.30948: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'qpidd'
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5_7.1)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
[New Thread 30958]
[New Thread 30961]
[New Thread 30960]
[New Thread 30959]
[New Thread 30949]
[New Thread 30948]
warning: .dynamic section for "/usr/lib/openais/libcpg.so.2" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
warning: .dynamic section for "/usr/lib/libnssutil3.so" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
warning: .dynamic section for "/usr/lib/libplc4.so" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
warning: .dynamic section for "/usr/lib/libplds4.so" is not at the expected address
warning: difference appears to be caused by prelink, adjusting expectations
[Thread debugging using libthread_db enabled]
Core was generated by `/usr/sbin/qpidd --auth no --daemon --port 0 --log-enable info+ --log-to-file qp'.
Program terminated with signal 11, Segmentation fault.
#0  0x006b18a8 in qpid::framing::DeliveryProperties::bodySize() const () from /usr/lib/libqpidcommon.so.5
(gdb)
eax            0x4cf6f      315247
ecx            0x1001       4097
edx            0x3          3
ebx            0x80005c     8388700
esp            0xb7411050   0xb7411050
ebp            0xb7411058   0xb7411058
esi            0x1001       4097
edi            0xb3b12d9c   -1280234084
eip            0x6b18a8     0x6b18a8 <qpid::framing::DeliveryProperties::bodySize() const+72>
eflags         0x10202      [ IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0xc040007b   -1069547397
fs             0x0          0
gs             0x33         51
(gdb)
Using memory regions provided by the target.
There are no memory regions defined.
(gdb)
32   AT_SYSINFO           Special system info/entry points        0xe0d400
33   AT_SYSINFO_EHDR      System-supplied DSO's ELF header        0xe0d000
16   AT_HWCAP             Machine-dependent CPU capability hints  0xbfebfbff
6    AT_PAGESZ            System page size                        4096
17   AT_CLKTCK            Frequency of times()                    100
3    AT_PHDR              Program headers for program             0x8048034
4    AT_PHENT             Size of program header entry            32
5    AT_PHNUM             Number of program headers               8
7    AT_BASE              Base address of interpreter             0x0
8    AT_FLAGS             Flags                                   0x0
9    AT_ENTRY             Entry point of program                  0x804c2b0
11   AT_UID               Real user ID                            0
12   AT_EUID              Effective user ID                       0
13   AT_GID               Real group ID                           0
14   AT_EGID              Effective group ID                      0
23   AT_SECURE            Boolean, was exec setuid-like?          0
15   AT_PLATFORM          String identifying platform             0xbffc4ecb "i686"
0    AT_NULL              End of vector                           0x0
(gdb)
Stack level 0, frame at 0xb7411060:
 eip = 0x6b18a8 in qpid::framing::DeliveryProperties::bodySize() const; saved eip 0x6b190d
 called by frame at 0xb7411070
 Arglist at 0xb7411058, args:
 Locals at 0xb7411058, Previous frame's sp is 0xb7411060
 Saved registers:
  ebp at 0xb7411058, esi at 0xb7411050, edi at 0xb7411054, eip at 0xb741105c
(gdb)
From        To          Syms Read   Shared Object Library
0x00af3120  0x00d02b94  Yes (*)     /usr/lib/libqpidbroker.so.5
0x00658c00  0x00780574  Yes (*)     /usr/lib/libqpidcommon.so.5
0x0050fc10  0x0051e574  Yes (*)     /usr/lib/libqpidtypes.so.1
0x004df940  0x004ff054  Yes (*)     /usr/lib/libboost_program_options.so.2
0x00529900  0x005320a4  Yes (*)     /usr/lib/libboost_filesystem.so.2
0x00a55f90  0x00a57bb4  Yes (*)     /lib/libuuid.so.1
0x00485a70  0x00486aa4  Yes (*)     /lib/libdl.so.2
0x0054a880  0x0054ec44  Yes (*)     /lib/librt.so.1
0x008fc190  0x0090c774  Yes (*)     /usr/lib/libsasl2.so.2
0x0084bc50  0x008c7174  Yes (*)     /usr/lib/libstdc++.so.6
0x004aa410  0x004c5594  Yes (*)     /lib/libm.so.6
0x00570660  0x00577f34  Yes (*)     /lib/libgcc_s.so.1
0x0033fc80  0x0043b2e0  Yes (*)     /lib/libc.so.6
0x0030b7f0  0x00320fff  Yes (*)     /lib/ld-linux.so.2
0x00490210  0x0049bac4  Yes (*)     /lib/libpthread.so.0
0x005560e0  0x00561074  Yes (*)     /lib/libresolv.so.2
0x009da700  0x009e1334  Yes (*)     /lib/libcrypt.so.1
0x00117610  0x001205b4  Yes (*)     /usr/lib/qpid/daemon/rdma.so
0x0012d430  0x00140224  Yes (*)     /usr/lib/librdmawrap.so.5
0x001484e0  0x0014f614  Yes (*)     /usr/lib/libibverbs.so.1
0x00154020  0x00156ec4  Yes (*)     /usr/lib/librdmacm.so.1
0x00160760  0x0016f234  Yes (*)     /usr/lib/qpid/daemon/xml.so
0x00f390a0  0x010f6534  Yes (*)     /usr/lib/libxerces-c.so.28
0x06a09c00  0x06bbf1d4  Yes (*)     /usr/lib/libxqilla.so.3
0x0017eb30  0x001a1ce4  Yes (*)     /usr/lib/qpid/daemon/acl.so
0x001af000  0x001b3bc4  Yes (*)     /usr/lib/qpid/daemon/replication_exchange.so
0x001dfb70  0x0028f694  Yes (*)     /usr/lib/qpid/daemon/msgstore.so
0x016d6240  0x0179e764  Yes (*)     /usr/lib/libdb_cxx-4.3.so
0x002b6390  0x002b66d0  Yes (*)     /usr/lib/libaio.so.1
0x002bd120  0x002c3944  Yes (*)     /usr/lib/qpid/daemon/replicating_listener.so
0x060210e0  0x060a7a24  Yes (*)     /usr/lib/qpid/daemon/cluster.so
0x002c7ed0  0x002ca224  Yes (*)     /usr/lib/openais/libcpg.so.2
0x00a27d30  0x00a2a8b4  Yes (*)     /usr/lib/libcman.so.2
0x07d2d2d0  0x07dae1f4  Yes (*)     /usr/lib/libqpidclient.so.5
0x002d58a0  0x002e1094  Yes (*)     /usr/lib/qpid/daemon/ssl.so
0x0091c780  0x00937104  Yes (*)     /usr/lib/libsslcommon.so.5
0x0568e290  0x05772bb4  Yes (*)     /usr/lib/libnss3.so
0x00943510  0x00969b94  Yes (*)     /usr/lib/libssl3.so
0x0097b7d0  0x0099e374  Yes (*)     /usr/lib/libnspr4.so
0x002eac30  0x002f6374  Yes (*)     /usr/lib/libnssutil3.so
0x002ffdf0  0x003016f4  Yes (*)     /usr/lib/libplc4.so
0x00303a60  0x00304a24  Yes (*)     /usr/lib/libplds4.so
0x005375c0  0x00542814  Yes (*)     /lib/libz.so.1
0x009b06b0  0x009b4cd4  Yes (*)     /usr/lib/qpid/daemon/watchdog.so
(*): Shared library is missing debugging information.
(gdb)
  6 Thread 0xb7f40960 (LWP 30948)  0x00e0d410 in __kernel_vsyscall ()
  5 Thread 30949  0x00e0d410 in __kernel_vsyscall ()
  4 Thread 30959  0x00e0d410 in __kernel_vsyscall ()
  3 Thread 30960  0x00e0d410 in __kernel_vsyscall ()
  2 Thread 30961  0x00e0d410 in __kernel_vsyscall ()
* 1 Thread 0xb7413b90 (LWP 30958)  0x006b18a8 in qpid::framing::DeliveryProperties::bodySize() const () from /usr/lib/libqpidcommon.so.5

Thread 6 (Thread 0xb7f40960 (LWP 30948)):
#0  0x00e0d410 in __kernel_vsyscall ()
#1  0x003fcae6 in epoll_wait () from /lib/libc.so.6
#2  0x00679e4a in qpid::sys::Poller::wait(qpid::sys::Duration) () from /usr/lib/libqpidcommon.so.5
#3  0x0067aa73 in qpid::sys::Poller::run() () from /usr/lib/libqpidcommon.so.5
#4  0x0076fc44 in qpid::sys::Dispatcher::run() () from /usr/lib/libqpidcommon.so.5
#5  0x00b8d85d in qpid::broker::Broker::run() () from /usr/lib/libqpidbroker.so.5
#6  0x08050f5c in ?? ()
#7  0x00bb6d20 in qpid::broker::Daemon::fork() () from /usr/lib/libqpidbroker.so.5
#8  0x0804e3e6 in ?? ()
#9  0x0804c956 in std::ios_base::Init::~Init() ()
#10 0x0804deca in ?? ()
#11 0x0033fe9c in __libc_start_main () from /lib/libc.so.6
#12 0x0804c2d1 in std::ios_base::Init::~Init() ()

Thread 5 (Thread 30949):
#0  0x00e0d410 in __kernel_vsyscall ()
#1  0x00495ef2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00408b84 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0x00771c43 in qpid::sys::Timer::run() () from /usr/lib/libqpidcommon.so.5
#4  0x00670f01 in ?? () from /usr/lib/libqpidcommon.so.5
#5  0x00491832 in start_thread () from /lib/libpthread.so.0
#6  0x003fc46e in clone () from /lib/libc.so.6

Thread 4 (Thread 30959):
#0  0x00e0d410 in __kernel_vsyscall ()
#1  0x003fcae6 in epoll_wait () from /lib/libc.so.6
#2  0x00679e4a in qpid::sys::Poller::wait(qpid::sys::Duration) () from /usr/lib/libqpidcommon.so.5
#3  0x0067aa73 in qpid::sys::Poller::run() () from /usr/lib/libqpidcommon.so.5
#4  0x0076fc44 in qpid::sys::Dispatcher::run() () from /usr/lib/libqpidcommon.so.5
#5  0x00670f01 in ?? () from /usr/lib/libqpidcommon.so.5
#6  0x00491832 in start_thread () from /lib/libpthread.so.0
#7  0x003fc46e in clone () from /lib/libc.so.6

Thread 3 (Thread 30960):
#0  0x00e0d410 in __kernel_vsyscall ()
#1  0x003fcae6 in epoll_wait () from /lib/libc.so.6
#2  0x00679e4a in qpid::sys::Poller::wait(qpid::sys::Duration) () from /usr/lib/libqpidcommon.so.5
#3  0x0067aa73 in qpid::sys::Poller::run() () from /usr/lib/libqpidcommon.so.5
#4  0x0076fc44 in qpid::sys::Dispatcher::run() () from /usr/lib/libqpidcommon.so.5
#5  0x00670f01 in ?? () from /usr/lib/libqpidcommon.so.5
#6  0x00491832 in start_thread () from /lib/libpthread.so.0
#7  0x003fc46e in clone () from /lib/libc.so.6

Thread 2 (Thread 30961):
#0  0x00e0d410 in __kernel_vsyscall ()
#1  0x003fcae6 in epoll_wait () from /lib/libc.so.6
#2  0x00679e4a in qpid::sys::Poller::wait(qpid::sys::Duration) () from /usr/lib/libqpidcommon.so.5
#3  0x0067aa73 in qpid::sys::Poller::run() () from /usr/lib/libqpidcommon.so.5
#4  0x0076fc44 in qpid::sys::Dispatcher::run() () from /usr/lib/libqpidcommon.so.5
#5  0x00670f01 in ?? () from /usr/lib/libqpidcommon.so.5
#6  0x00491832 in start_thread () from /lib/libpthread.so.0
#7  0x003fc46e in clone () from /lib/libc.so.6

Thread 1 (Thread 0xb7413b90 (LWP 30958)):
#0  0x006b18a8 in qpid::framing::DeliveryProperties::bodySize() const () from /usr/lib/libqpidcommon.so.5
#1  0x006b190d in qpid::framing::DeliveryProperties::encodedSize() const () from /usr/lib/libqpidcommon.so.5
#2  0x00730f17 in qpid::framing::AMQHeaderBody::encodedSize() const () from /usr/lib/libqpidcommon.so.5
#3  0x0072f5df in qpid::framing::AMQFrame::encodedSize() const () from /usr/lib/libqpidcommon.so.5
#4  0x00c07bee in qpid::broker::Message::encodedHeaderSize() const () from /usr/lib/libqpidbroker.so.5
#5  0x001fd2c9 in mrg::msgstore::MessageStoreImpl::msgEncode(std::vector<char, std::allocator<char> >&, boost::intrusive_ptr<qpid::broker::PersistableMessage> const&) () from /usr/lib/qpid/daemon/msgstore.so
#6  0x00201cd4 in mrg::msgstore::MessageStoreImpl::store(qpid::broker::PersistableQueue const*, mrg::msgstore::TxnCtxt*, boost::intrusive_ptr<qpid::broker::PersistableMessage> const&, bool) () from /usr/lib/qpid/daemon/msgstore.so
#7  0x00206641 in mrg::msgstore::MessageStoreImpl::enqueue(qpid::broker::TransactionContext*, boost::intrusive_ptr<qpid::broker::PersistableMessage> const&, qpid::broker::PersistableQueue const&) () from /usr/lib/qpid/daemon/msgstore.so
#8  0x00c16d4d in qpid::broker::MessageStoreModule::enqueue(qpid::broker::TransactionContext*, boost::intrusive_ptr<qpid::broker::PersistableMessage> const&, qpid::broker::PersistableQueue const&) () from /usr/lib/libqpidbroker.so.5
#9  0x00c2d710 in qpid::broker::Queue::enqueue(qpid::broker::TransactionContext*, boost::intrusive_ptr<qpid::broker::Message>&, bool) () from /usr/lib/libqpidbroker.so.5
#10 0x00c2ed1c in qpid::broker::Queue::deliver(boost::intrusive_ptr<qpid::broker::Message>) () from /usr/lib/libqpidbroker.so.5
#11 0x00bb9a6e in qpid::broker::DeliverableMessage::deliverTo(boost::shared_ptr<qpid::broker::Queue> const&) () from /usr/lib/libqpidbroker.so.5
#12 0x00bd8b84 in qpid::broker::Exchange::doRoute(qpid::broker::Deliverable&, boost::shared_ptr<std::vector<boost::shared_ptr<qpid::broker::Exchange::Binding>, std::allocator<boost::shared_ptr<qpid::broker::Exchange::Binding> > > const>) () from /usr/lib/libqpidbroker.so.5
#13 0x00be76a3 in qpid::broker::FanOutExchange::route(qpid::broker::Deliverable&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, qpid::framing::FieldTable const*) () from /usr/lib/libqpidbroker.so.5
#14 0x00c701ce in qpid::broker::SemanticState::route(boost::intrusive_ptr<qpid::broker::Message>, qpid::broker::Deliverable&) () from /usr/lib/libqpidbroker.so.5
#15 0x00c70f5c in qpid::broker::SemanticState::handle(boost::intrusive_ptr<qpid::broker::Message>) () from /usr/lib/libqpidbroker.so.5
#16 0x00c97eaa in qpid::broker::SessionState::handleContent(qpid::framing::AMQFrame&, qpid::framing::SequenceNumber const&) () from /usr/lib/libqpidbroker.so.5
#17 0x00c988e3 in qpid::broker::SessionState::handleIn(qpid::framing::AMQFrame&) () from /usr/lib/libqpidbroker.so.5
#18 0x00c9a47b in qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface, &(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle(qpid::framing::AMQFrame&) () from /usr/lib/libqpidbroker.so.5
#19 0x0072c190 in qpid::amqp_0_10::SessionHandler::handleIn(qpid::framing::AMQFrame&) () from /usr/lib/libqpidcommon.so.5
#20 0x00c9a47b in qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface, &(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle(qpid::framing::AMQFrame&) () from /usr/lib/libqpidbroker.so.5
#21 0x00bb443a in qpid::broker::ConnectionHandler::handle(qpid::framing::AMQFrame&) () from /usr/lib/libqpidbroker.so.5
#22 0x00ba933a in qpid::broker::Connection::received(qpid::framing::AMQFrame&) () from /usr/lib/libqpidbroker.so.5
#23 0x00b73bef in qpid::amqp_0_10::Connection::decode(char const*, unsigned int) () from /usr/lib/libqpidbroker.so.5
#24 0x00c69a94 in qpid::broker::SecureConnection::decode(char const*, unsigned int) () from /usr/lib/libqpidbroker.so.5
#25 0x00767664 in qpid::sys::AsynchIOHandler::readbuff(qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) () from /usr/lib/libqpidcommon.so.5
#26 0x00d01284 in boost::detail::function::void_function_obj_invoker2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::sys::AsynchIOHandler*>, boost::arg<1>, boost::arg<2> > >, void, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke(boost::detail::function::any_pointer, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) () from /usr/lib/libqpidbroker.so.5
#27 0x0066e65b in boost::function2<void, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator()(qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*) const () from /usr/lib/libqpidcommon.so.5
#28 0x00667be9 in qpid::sys::posix::AsynchIO::readable(qpid::sys::DispatchHandle&) () from /usr/lib/libqpidcommon.so.5
#29 0x0066d23d in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>, boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>, boost::arg<1> > >, void, qpid::sys::DispatchHandle&>::invoke(boost::detail::function::any_pointer, qpid::sys::DispatchHandle&) () from /usr/lib/libqpidcommon.so.5
#30 0x0076ca04 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator()(qpid::sys::DispatchHandle&) const () from /usr/lib/libqpidcommon.so.5
#31 0x0076bfe1 in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /usr/lib/libqpidcommon.so.5
#32 0x0067aa91 in qpid::sys::Poller::run() () from /usr/lib/libqpidcommon.so.5
#33 0x0076fc44 in qpid::sys::Dispatcher::run() () from /usr/lib/libqpidcommon.so.5
#34 0x00670f01 in ?? () from /usr/lib/libqpidcommon.so.5
#35 0x00491832 in start_thread () from /lib/libpthread.so.0
#36 0x003fc46e in clone () from /lib/libc.so.6
(gdb) quit

Expected results:
Broker will not segfault

Additional info:
A couple of questions: 1) How reproducible is this symptom? Have you seen it more than once? 2) Is the broker really running with only six threads or are there threads not reported by gdb in the back-trace?
Petr, do we know if this is a regression from 0.10?
Mick, please do a short assessment.
(In reply to comment #2)
> A couple of questions:
>
> 1) How reproducible is this symptom? Have you seen it more than once?

I was able to reproduce it on every run with the given reproducer.

> 2) Is the broker really running with only six threads or are there threads not
> reported by gdb in the back-trace?

There was no change to worker-threads; I believe qpidd decided on the number of threads itself.
Ken, please take a look.
Gordon has discovered the root cause - this is due to a thread encoding the message's headers whilst another thread updates them (the ttl). See upstream jira - fix in progress: https://issues.apache.org/jira/browse/QPID-3877
Bugfix submitted upstream: http://svn.apache.org/viewvc?view=rev&rev=1296230
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: Sending persistent messages containing a time-to-live (TTL) header to a fanout exchange.
Consequence: The broker will occasionally crash.
Fix: A threading lock was added to the message header processing code.
Result: The broker serializes access to the TTL header, preventing the race condition that results in the crash.
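For readers unfamiliar with this class of bug, here is a minimal sketch of the locking pattern the technical note describes. It is not the upstream patch: `SharedHeaders`, its fields, and `countUnexpectedSizes` are hypothetical names invented for illustration, and the byte sizes are arbitrary. The only point is that the path computing the encoded size and the path updating the TTL take the same mutex, so neither can observe the other mid-update.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a message's header properties (not the qpid class).
class SharedHeaders {
public:
    // Writer path: e.g. a queue adjusting the TTL while enqueueing.
    void setTtl(std::uint64_t ttl) {
        std::lock_guard<std::mutex> guard(lock_);
        ttl_ = ttl;
        hasTtl_ = true;
    }

    // Reader path: e.g. the store sizing the headers before writing them.
    std::size_t encodedSize() const {
        std::lock_guard<std::mutex> guard(lock_);
        std::size_t size = sizeof(std::uint16_t);        // presence flags
        if (hasTtl_) size += sizeof(std::uint64_t);      // optional TTL field
        return size;
    }

private:
    mutable std::mutex lock_;   // serializes readers and writers of the fields below
    bool hasTtl_ = false;
    std::uint64_t ttl_ = 0;
};

// Several threads share one header object, as the queues bound to a fanout
// exchange share one message. Each thread sets the TTL and then sizes the
// headers; the return value counts sizes other than flags + TTL.
std::size_t countUnexpectedSizes(unsigned threads, unsigned iterations) {
    assert(threads > 0);
    SharedHeaders headers;
    std::vector<std::size_t> bad(threads, 0);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&headers, &bad, t, iterations] {
            for (unsigned i = 0; i < iterations; ++i) {
                headers.setTtl(i);
                const std::size_t expected =
                    sizeof(std::uint16_t) + sizeof(std::uint64_t);
                if (headers.encodedSize() != expected) ++bad[t];
            }
        });
    }
    for (auto& th : pool) th.join();
    std::size_t total = 0;
    for (std::size_t b : bad) total += b;
    return total;
}
```

Calling `countUnexpectedSizes(4, 25000)` mirrors the four subscribers and message count of the reproducer only loosely; with the lock in place every thread sees a consistent header. Removing the `lock_guard` lines would make the accesses a data race, which is undefined behaviour in C++ and the general kind of fault the technical note attributes the crash to.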
Retested. The behavior has changed: the crash described above no longer occurs at the original location but has moved elsewhere (most probably due to an incomplete fix). The new crash is tracked as bug 801310 and marked as a dependency of this defect (801310 should be resolved before this defect goes to VERIFIED).
Retested on RHEL 5.7/5.8/6.2 i/x with packages:
qpid-cpp-*0.14-12.el5 + qpid-qmf-*0.14-3.el5
qpid-cpp-*0.14-12.el6 + qpid-qmf-*0.14-5.el6

Issue is reliably fixed; no other crashes detected. Waiting for an installable set and a retest.
Retested on RHEL 5.7/5.8/6.2 i/x with packages:
qpid-cpp-*0.14-14.el5 + qpid-qmf-*0.14-4.el5
qpid-cpp-*0.14-12.el6 + qpid-qmf-*0.14-6.el6

Issue is reliably fixed; no other crashes detected.

-> VERIFIED