Description of problem: When the c++ qpid client does not call session.close() and connection.close(). This is cleanly bad code, but the question is whether this bad example can or cannot encounter this assertion: /usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): Test log files: https://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=7118434 Test and client source codes are: http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG_Messaging/qpid_test_fieldtable_encoding_issue_bz468026/ Bad code example: http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG_Messaging/qpid_test_fieldtable_encoding_issue_bz468026/header_test2.cpp?annotate=1.3 Version-Release number of selected component (if applicable): qpidd-0.4.750054-1.el5, qpidc-0.4.750054-1.el5 How reproducible: rarely (~2%) Steps to Reproduce: 1. run qpid_test_fieldtable_encoding_issue_bz468026 in a loop 2. seek the failures Actual results: c++ client occassionly aborts with qpid::sys::Mutex::~Mutex(): Assertion `0' failed. Expected results: Additional info: (test logfile fragment) [05:59:50] run(A) 4/5--------------------------[err_cnt:0]- [05:59:50] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+) [05:59:51] mrg_qpidd_wait_on_settle: qpidd started-up (dur:0sec) [05:59:51] .qpidd settled [05:59:51] .running core tests... (c++ client <--> qpidd <--> python client) header_test|c++:Header values sent ok (cpp->qpidd) header_test|py:Correct header values received (cpp->qpidd->python). header_test|c++2:All values received correctly 2009-mar-05 05:59:51 warning Session was not closed cleanly Device or resource busy header_test2: /usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): Assertion `0' failed. ./runtest.sh: line 306: 6858 Aborted (core dumped) ./${HEADER_TEST_B} [05:59:51] ..ERROR:core test failed! (ecode:00134) [05:59:51] .qpidd status check [05:59:52] mrg_qpidd_status: 1 instance[s] running (pids:6790 ,ports: 5672) [05:59:52] ..qpidd running fine on port 5672 (1 instance[s]) [05:59:52] .qpidd stop [05:59:52] mrg_qpidd_stop: following qpidd brokers are found to stop (6790 ) [05:59:57] .qpidd stopped ok [05:59:57] .remove qpidd parent lock (if needed)... [05:59:58] run(A) 5/5--------------------------[err_cnt:1]- ... [06:00:09] Core file: core.6858 generated by ./header_test2'. ------------------------- core.6858: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'header_test2' GNU gdb Red Hat Linux (6.5-37.el5rh) Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu"..../header_test2'.: No such file or directory. Core was generated by `./header_test2'. Program terminated with signal 6, Aborted. #0 0x0000003b34630155 in ?? () (gdb) #0 0x0000003b34630155 in ?? () #1 0x0000003b34631bf0 in ?? () #2 0x00002aaaac000068 in ?? () #3 0x00002aaaac000068 in ?? () #4 0x0000000000000000 in ?? ()
I have reproduced the scenario above on RHEL 5.3 x86_64. The code is still without any close() call. (So neither session not connection one was added) [09:32:42] run(A) 40/50--------------------------[err_cnt:0]- [09:32:42] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+) [09:32:43] mrg_qpidd_wait_on_settle: qpidd started-up (dur:0sec) [09:32:43] .qpidd settled [09:32:43] .running core tests... (c++ client <--> qpidd <--> python client) header_test|c++:Header values sent ok (cpp->qpidd) header_test|py:Correct header values received (cpp->qpidd->python). header_test|c++2:All values received correctly 2009-mar-06 09:32:44 warning Session was not closed cleanly Device or resource busy header_test2: /usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): Assertion `0' failed. ./runtest.sh: line 306: 21620 Aborted (core dumped) ./${HEADER_TEST_B} [09:32:44] ..ERROR:core test failed! (ecode:00134) [09:32:44] .qpidd status check [09:32:44] mrg_qpidd_status: 1 instance[s] running (pids:21573 ,ports: 5672) [09:32:44] ..qpidd running fine on port 5672 (1 instance[s]) [09:32:44] .qpidd stop [09:32:44] mrg_qpidd_stop: following qpidd brokers are found to stop (21573 ) [09:32:49] .qpidd stopped ok [09:32:49] .remove qpidd parent lock (if needed)... [09:32:50] run(A) 41/50--------------------------[err_cnt:1]- [09:32:51] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+) [1]+ Stopped ./rf.sh ... [root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0008]# gdb ./header_test2 core.21620 GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu"... Reading symbols from /lib64/libpthread.so.0...done. Loaded symbols for /lib64/libpthread.so.0 Reading symbols from /usr/lib64/libqpidclient.so.0...done. Loaded symbols for /usr/lib64/libqpidclient.so.0 Reading symbols from /usr/lib64/libstdc++.so.6...done. Loaded symbols for /usr/lib64/libstdc++.so.6 Reading symbols from /lib64/libm.so.6...done. Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libgcc_s.so.1...done. Loaded symbols for /lib64/libgcc_s.so.1 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /usr/lib64/libqpidcommon.so.0...done. Loaded symbols for /usr/lib64/libqpidcommon.so.0 Reading symbols from /usr/lib64/libboost_program_options.so.2...done. Loaded symbols for /usr/lib64/libboost_program_options.so.2 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib64/libuuid.so.1...done. Loaded symbols for /lib64/libuuid.so.1 Reading symbols from /usr/lib64/libboost_filesystem.so.2...done. Loaded symbols for /usr/lib64/libboost_filesystem.so.2 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/librt.so.1...done. Loaded symbols for /lib64/librt.so.1 Reading symbols from /usr/lib64/libsasl2.so.2...done. Loaded symbols for /usr/lib64/libsasl2.so.2 Reading symbols from /lib64/libresolv.so.2...done. Loaded symbols for /lib64/libresolv.so.2 Reading symbols from /lib64/libcrypt.so.1...done. Loaded symbols for /lib64/libcrypt.so.1 Reading symbols from /usr/lib64/qpid/client/rdmaconnector.so...done. Loaded symbols for /usr/lib64/qpid/client/rdmaconnector.so Reading symbols from /usr/lib64/librdmawrap.so.0...done. Loaded symbols for /usr/lib64/librdmawrap.so.0 Reading symbols from /usr/lib64/librdmacm.so.1...done. Loaded symbols for /usr/lib64/librdmacm.so.1 Reading symbols from /usr/lib64/libibverbs.so.1...done. Loaded symbols for /usr/lib64/libibverbs.so.1 Reading symbols from /usr/lib64/qpid/client/sslconnector.so...done. Loaded symbols for /usr/lib64/qpid/client/sslconnector.so Reading symbols from /usr/lib64/libsslcommon.so.0...done. Loaded symbols for /usr/lib64/libsslcommon.so.0 Reading symbols from /usr/lib64/libnss3.so...done. Loaded symbols for /usr/lib64/libnss3.so Reading symbols from /usr/lib64/libssl3.so...done. Loaded symbols for /usr/lib64/libssl3.so Reading symbols from /usr/lib64/libnspr4.so...done. Loaded symbols for /usr/lib64/libnspr4.so Reading symbols from /usr/lib64/libnssutil3.so...done. Loaded symbols for /usr/lib64/libnssutil3.so Reading symbols from /usr/lib64/libplc4.so...done. Loaded symbols for /usr/lib64/libplc4.so Reading symbols from /usr/lib64/libplds4.so...done. Loaded symbols for /usr/lib64/libplds4.so Reading symbols from /lib64/libnss_files.so.2...done. Loaded symbols for /lib64/libnss_files.so.2 Reading symbols from /usr/lib64/sasl2/libsasldb.so.2...done. Loaded symbols for /usr/lib64/sasl2/libsasldb.so.2 Reading symbols from /usr/lib64/sasl2/libanonymous.so.2...done. Loaded symbols for /usr/lib64/sasl2/libanonymous.so.2 Reading symbols from /usr/lib64/sasl2/liblogin.so.2...done. Loaded symbols for /usr/lib64/sasl2/liblogin.so.2 Reading symbols from /usr/lib64/sasl2/libplain.so.2...done. Loaded symbols for /usr/lib64/sasl2/libplain.so.2 Core was generated by `./header_test2'. Program terminated with signal 6, Aborted. [New process 21620] [New process 21621] #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 #1 0x0000003730e31cc0 in abort () from /lib64/libc.so.6 #2 0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6 #3 0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112 #4 0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0 #5 0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6 #6 0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0 #7 0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2 #8 0x0000000000000000 in ?? () (gdb)
Not able to reproduce so far, I'm very curious to know what the other threads were doing at the time of the crash. If you can reproduce it again in gdb do: thread apply all bt to get a bt of all the threads.
Please find below three thread backtraces... (tested on RHEL 5.3 i386 / x86_64) There is different behavior observed on i386 and x86_64, on i386 no issue seen no stderr message printed, but on x86_64 there is stderr message 2009-mar-10 06:59:24 warning Session was not closed cleanly and occasionly client core file dumped... [root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.688 GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu"... Reading symbols from /lib64/libpthread.so.0...done. ... Loaded symbols for /usr/lib64/sasl2/libplain.so.2 Core was generated by `./header_test2'. Program terminated with signal 6, Aborted. [New process 688] [New process 689] #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 (gdb) thread apply all bt Thread 2 (process 689): #0 0x0000003730a08c85 in check_match.8514 () from /lib64/ld-linux-x86-64.so.2 #1 0x0000003730a09044 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2 #2 0x0000003730a09242 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2 #3 0x0000003730a0cd35 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2 #4 0x0000003730a128b2 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2 #5 0x0000003665860dfc in std::_Rb_tree<unsigned short, std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> >, std::_Select1st<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > >, std::less<unsigned short>, std::allocator<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > > >::erase () from /usr/lib64/libqpidclient.so.0 #6 0x000000366585aac6 in qpid::client::ConnectionImpl::erase () from /usr/lib64/libqpidclient.so.0 #7 0x000000366589d85d in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0 #8 0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0 #9 0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0 #10 0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0 #11 0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0 #12 0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0 #13 0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0 #14 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #15 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0 #16 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #17 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0 #18 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0 #19 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0 #20 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0 #21 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0 #22 0x0000003730ed30ad in clone () from /lib64/libc.so.6 Thread 1 (process 688): #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 #1 0x0000003730e31cc0 in abort () from /lib64/libc.so.6 #2 0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6 #3 0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112 #4 0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0 #5 0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6 #6 0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0 ---Type <return> to continue, or q <return> to quit--- #7 0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2 #8 0x0000000000000000 in ?? () --------------- [root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.3233 GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu"... Reading symbols from /lib64/libpthread.so.0...done. ... Loaded symbols for /usr/lib64/sasl2/libplain.so.2 Core was generated by `./header_test2'. Program terminated with signal 6, Aborted. [New process 3233] [New process 3234] #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 (gdb) thread apply all bt Thread 2 (process 3234): #0 0x0000003730e758f0 in free () from /lib64/libc.so.6 #1 0x0000003665873d00 in qpid::sys::ExceptionHolder::Wrapper<qpid::ClosedException>::~Wrapper () from /usr/lib64/libqpidclient.so.0 #2 0x000000366589a8a9 in qpid::client::SessionImpl::handleClosed () from /usr/lib64/libqpidclient.so.0 #3 0x000000366589d487 in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0 #4 0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0 #5 0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0 #6 0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0 #7 0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0 #8 0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0 #9 0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0 #10 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #11 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0 #12 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #13 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0 #14 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0 #15 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0 #16 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0 #17 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0 #18 0x0000003730ed30ad in clone () from /lib64/libc.so.6 Thread 1 (process 3233): #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 #1 0x0000003730e31cc0 in abort () from /lib64/libc.so.6 #2 0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6 #3 0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112 #4 0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0 #5 0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6 #6 0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0 #7 0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2 #8 0x0000000000000000 in ?? () (gdb) --------------- [root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.5731 GNU gdb Fedora (6.8-27.el5) Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu"... Reading symbols from /lib64/libpthread.so.0...done. ... Loaded symbols for /usr/lib64/sasl2/libplain.so.2 Core was generated by `./header_test2'. Program terminated with signal 6, Aborted. [New process 5731] [New process 5732] #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 (gdb) thread apply all bt Thread 2 (process 5732): #0 0x0000003730a0cd8e in _dl_fixup () from /lib64/ld-linux-x86-64.so.2 #1 0x0000003730a128b2 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2 #2 0x0000003665860dfc in std::_Rb_tree<unsigned short, std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> >, std::_Select1st<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > >, std::less<unsigned short>, std::allocator<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > > >::erase () from /usr/lib64/libqpidclient.so.0 #3 0x000000366585aac6 in qpid::client::ConnectionImpl::erase () from /usr/lib64/libqpidclient.so.0 #4 0x000000366589d85d in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0 #5 0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0 #6 0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0 #7 0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0 #8 0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0 #9 0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0 #10 0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0 #11 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #12 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0 #13 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0 #14 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0 #15 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0 #16 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0 #17 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0 #18 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0 #19 0x0000003730ed30ad in clone () from /lib64/libc.so.6 Thread 1 (process 5731): #0 0x0000003730e30215 in raise () from /lib64/libc.so.6 #1 0x0000003730e31cc0 in abort () from /lib64/libc.so.6 #2 0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6 #3 0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112 #4 0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0 #5 0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6 #6 0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0 #7 0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2 #8 0x0000000000000000 in ?? () (gdb)
Here is what is happening: SessionImpl holds a shared_ptr to ConnectionImpl ConnectionImpl holds a map of weak_ptrs to SessionImpl The weak_ptrs are supposed to avoid a shared_ptr cycle that would cause leaks. However in the scope of ConnectionImpl::incoming ConnectionImpl takes a shared_ptr to the SessionImpl its working with creating a temporary cycle. If main() exits while this temporary cycle exists (as in the backtrace above) ConnectionImpl's destructor is not run, so it's Connector thread is not joined and is still running as the main thread runs global destructors. The core dump is caused when the rogue Connector thread tries to use the Logger concurrently with the Logger being destroyed.
Created attachment 348664 [details] test case to reproduce the bug This reproduces the bug on almost every run.
variation of case, created with perftest Thread 2 (process 28784): #0 0x0000003411600cfe in ?? () from /lib64/libdl.so.2 #1 0x00002b518ac234e0 in ?? () #2 0x0000000000000004 in ?? () #3 0x0000003411600e56 in __do_global_dtors_aux () from /lib64/libdl.so.2 #4 0x0000000000000000 in ?? () Thread 1 (process 29037): #0 0x0000003411230215 in raise () from /lib64/libc.so.6 #1 0x0000003411231cc0 in abort () from /lib64/libc.so.6 #2 0x0000003411229696 in __assert_fail () from /lib64/libc.so.6 #3 0x00002b518a927c1f in std::for_each<__gnu_cxx::__normal_iterator<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus**, std::vector<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus*, std::allocator<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus*> > >, qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::handleAdder> ( __first=<value optimized out>, __last={_M_current = 0x1c3ba5a8}, __f= {handle = {px = 0x41d5f1e0, pn = {pi_ = 0x1c3ba5a8}}}) at ../../../qpid-working/cpp/src/qpid/sys/posix/Mutex.h:116 #4 0x00002b518a927d0c in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::markForDeletion ( handle=0x1bf8e730) at ../../../qpid-working/cpp/src/qpid/sys/DeletionManager.h:130 #5 0x00002b518a9241fe in ~Poller (this=0x1bf90480) at ../../../qpid-working/cpp/src/qpid/sys/epoll/EpollPoller.cpp:181 #6 0x00002b518a5314a8 in ~TCPConnector (this=0x1c1a2b90) at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145 #7 0x00002b518a527fa4 in ~ConnectionImpl (this=0x1c1a2d70) at /usr/include/boost/checked_delete.hpp:34 #8 0x00002b518a562541 in ~SessionImpl (this=0x1c1a7850) at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145 #9 0x00002b518a55b290 in ~SessionBase_0_10 (this=<value optimized out>) at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145 #10 0x00000000004107a0 in ~Dispatcher (this=0x41d5f5d0) at ../../src/gen/qpid/client/no_keyword/Session_0_10.h:47 #11 0x000000000041606c in PublishThread::run (this=0x1c1a1540) at ../../../../qpid-working/cpp/src/tests/../qpid/client/SubscriptionManager.h:97 #12 0x00002b518a91d65a in runRunnable (p=0x7070) at ../../../qpid-working/cpp/src/qpid/sys/posix/Thread.cpp:35 #13 0x0000003411e06367 in start_thread () from /lib64/libpthread.so.0 #14 0x00000034112d30ad in clone () from /lib64/libc.so.6
and, both can be gotten with close() being called. Thread 2 (process 21886): #0 0x0000003411233600 in __cxa_finalize () from /lib64/libc.so.6 #1 0x0000003411600e56 in __do_global_dtors_aux () from /lib64/libdl.so.2 #2 0x0000000000000000 in ?? () Thread 1 (process 22033): #0 0x0000003411230215 in raise () from /lib64/libc.so.6 #1 0x0000003411231cc0 in abort () from /lib64/libc.so.6 #2 0x0000003411229696 in __assert_fail () from /lib64/libc.so.6 #3 0x00002b9f23708e01 in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::markForDeletion ( handle=0x119750e0) at ../../../qpid-working/cpp/src/qpid/sys/posix/Mutex.h:116 #4 0x00002b9f236f6c77 in ~AsynchIO (this=0x11974920) at ../../../qpid-working/cpp/src/qpid/sys/posix/AsynchIO.cpp:335 #5 0x00002b9f237515bb in qpid::sys::DispatchHandle::processEvent (this=0x11974928, type=qpid::sys::Poller::INTERRUPTED) at ../../../qpid-working/cpp/src/qpid/sys/DispatchHandle.cpp:505 #6 0x00002b9f23707668 in qpid::sys::Poller::run (this=0x119753c0) at ../../../qpid-working/cpp/src/qpid/sys/Poller.h:122 #7 0x00002b9f2331357b in qpid::client::TCPConnector::run (this=0x11b89b90) at ../../../qpid-working/cpp/src/qpid/client/Connector.cpp:396 #8 0x00002b9f236fe65a in runRunnable (p=0x557e) at ../../../qpid-working/cpp/src/qpid/sys/posix/Thread.cpp:35 #9 0x0000003411e06367 in start_thread () from /lib64/libpthread.so.0 #10 0x00000034112d30ad in clone () from /lib64/libc.so.6
Comment #6 is looks like a bug with perftest, in that if main has a connect exception, we don't wait for the join. Fix perftest for #6.
Created attachment 375184 [details] Reproducer Previous dirtyexit reproducer was incorrect (deleting connection while threads still running) This one shows the bug about every 100 runs. Run as follows: while ./dirtyreceiver; do true; done And at the same time run the normal sender client to feed messages: seq 1000000 | sender Bug is still present in r885870
Created attachment 402347 [details] Reproducer source code. Attached source code for dirtyreceiver reproducer, previously attahced a binary by mistake. Bug is still present and easily reproducible on r926867
should be fixed in qpid-cpp-client-0.7.935473-1.el5
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Cause: The c++ qpid client does not call session.close() and connection.close() Consequence: About 2% of the time, the client triggers an assertion and aborts. Fix: Changes to shared_ptr and weak_ptr strategy in SessionImpl and ConnectionImpl. Result: no more assertions observed.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. Diffed Contents: @@ -1,11 +1 @@ -Cause: +Previously, the C++ qpid client did not call the 'session.close()' and 'connection.close()' methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the 'SessionImpl' and 'ConnectionImpl' methods, thus, assertions no longer occur.- The c++ qpid client does not call session.close() and connection.close() - - Consequence: - About 2% of the time, the client triggers an assertion and aborts. - - Fix: - Changes to shared_ptr and weak_ptr strategy in SessionImpl and ConnectionImpl. - - Result: - no more assertions observed.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. Diffed Contents: @@ -1 +1 @@ -Previously, the C++ qpid client did not call the 'session.close()' and 'connection.close()' methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the 'SessionImpl' and 'ConnectionImpl' methods, thus, assertions no longer occur.+Previously, the C++ qpid client did not call the session.close() and connection.close() methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the "SessionImpl' and 'ConnectionImpl' methods such that the assertion is no longer triggered.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0773.html