Bug 488942 - c++ client aborts when session and connection not closed
Summary: c++ client aborts when session and connection not closed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
medium
medium
Target Milestone: 1.3
: ---
Assignee: Gordon Sim
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-03-06 11:19 UTC by Frantisek Reznicek
Modified: 2015-11-16 00:06 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the C++ qpid client did not call the session.close() and connection.close() methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the "SessionImpl' and 'ConnectionImpl' methods such that the assertion is no longer triggered.
Clone Of:
Environment:
Last Closed: 2010-10-14 16:11:43 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
test case to reproduce the bug (2.54 KB, text/x-c++src)
2009-06-19 13:29 UTC, Alan Conway
no flags Details
Reproducer (50.38 KB, application/octet-stream)
2009-12-01 20:27 UTC, Alan Conway
no flags Details
Reproducer source code. (2.00 KB, text/x-c++src)
2010-03-24 15:56 UTC, Alan Conway
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0773 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 2010-10-14 15:56:44 UTC

Description Frantisek Reznicek 2009-03-06 11:19:47 UTC
Description of problem:

When the c++ qpid client does not call session.close() and connection.close().
This is cleanly bad code, but the question is whether this bad example can or cannot encounter this assertion:

/usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): 

Test log files:
  https://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=7118434
Test and client source codes are:
  http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG_Messaging/qpid_test_fieldtable_encoding_issue_bz468026/

Bad code example:
  http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG_Messaging/qpid_test_fieldtable_encoding_issue_bz468026/header_test2.cpp?annotate=1.3

Version-Release number of selected component (if applicable):
qpidd-0.4.750054-1.el5, qpidc-0.4.750054-1.el5


How reproducible:
rarely (~2%)

Steps to Reproduce:
1. run qpid_test_fieldtable_encoding_issue_bz468026 in a loop
2. seek the failures
  
Actual results:
c++ client occassionly aborts with qpid::sys::Mutex::~Mutex(): Assertion `0' failed.

Expected results:


Additional info: (test logfile fragment)

[05:59:50] run(A) 4/5--------------------------[err_cnt:0]-
[05:59:50] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+)
[05:59:51] mrg_qpidd_wait_on_settle: qpidd started-up (dur:0sec)
[05:59:51] .qpidd settled
[05:59:51] .running core tests... (c++ client <--> qpidd <--> python client)
header_test|c++:Header values sent ok (cpp->qpidd)
header_test|py:Correct header values received (cpp->qpidd->python).
header_test|c++2:All values received correctly
2009-mar-05 05:59:51 warning Session was not closed cleanly
Device or resource busy
header_test2: /usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): Assertion `0' failed.
./runtest.sh: line 306:  6858 Aborted                 (core dumped) ./${HEADER_TEST_B}
[05:59:51] ..ERROR:core test failed! (ecode:00134)
[05:59:51] .qpidd status check
[05:59:52] mrg_qpidd_status: 1 instance[s] running (pids:6790 ,ports: 5672)
[05:59:52] ..qpidd running fine on port  5672 (1 instance[s])
[05:59:52] .qpidd stop
[05:59:52] mrg_qpidd_stop: following qpidd brokers are found to stop (6790 )
[05:59:57] .qpidd stopped ok
[05:59:57] .remove qpidd parent lock (if needed)...
[05:59:58] run(A) 5/5--------------------------[err_cnt:1]-
...

[06:00:09] Core file: core.6858 generated by ./header_test2'. -------------------------
core.6858: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'header_test2'
  GNU gdb Red Hat Linux (6.5-37.el5rh)
  Copyright (C) 2006 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu"..../header_test2'.: No such file or directory.
  Core was generated by `./header_test2'.
  Program terminated with signal 6, Aborted.
  #0  0x0000003b34630155 in ?? ()
  (gdb) #0  0x0000003b34630155 in ?? ()
  #1  0x0000003b34631bf0 in ?? ()
  #2  0x00002aaaac000068 in ?? ()
  #3  0x00002aaaac000068 in ?? ()
  #4  0x0000000000000000 in ?? ()

Comment 1 Frantisek Reznicek 2009-03-06 14:38:34 UTC
I have reproduced the scenario above on RHEL 5.3 x86_64.

The code is still without any close() call. (So neither session not connection one was added)

[09:32:42] run(A) 40/50--------------------------[err_cnt:0]-
[09:32:42] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+)
[09:32:43] mrg_qpidd_wait_on_settle: qpidd started-up (dur:0sec)
[09:32:43] .qpidd settled
[09:32:43] .running core tests... (c++ client <--> qpidd <--> python client)
header_test|c++:Header values sent ok (cpp->qpidd)
header_test|py:Correct header values received (cpp->qpidd->python).
header_test|c++2:All values received correctly
2009-mar-06 09:32:44 warning Session was not closed cleanly
Device or resource busy
header_test2: /usr/include/qpid/sys/posix/Mutex.h:112: qpid::sys::Mutex::~Mutex(): Assertion `0' failed.
./runtest.sh: line 306: 21620 Aborted                 (core dumped) ./${HEADER_TEST_B}
[09:32:44] ..ERROR:core test failed! (ecode:00134)
[09:32:44] .qpidd status check
[09:32:44] mrg_qpidd_status: 1 instance[s] running (pids:21573 ,ports: 5672)
[09:32:44] ..qpidd running fine on port  5672 (1 instance[s])
[09:32:44] .qpidd stop
[09:32:44] mrg_qpidd_stop: following qpidd brokers are found to stop (21573 )
[09:32:49] .qpidd stopped ok
[09:32:49] .remove qpidd parent lock (if needed)...
[09:32:50] run(A) 41/50--------------------------[err_cnt:1]-
[09:32:51] mrg_qpidd_start: qpidd launched normal bg way (port:5672,log:qpidd.transcript.log, params: --auth no --log-enable info+)

[1]+  Stopped                 ./rf.sh
...
[root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0008]# gdb ./header_test2 core.21620
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /usr/lib64/libqpidclient.so.0...done.
Loaded symbols for /usr/lib64/libqpidclient.so.0
Reading symbols from /usr/lib64/libstdc++.so.6...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /usr/lib64/libqpidcommon.so.0...done.
Loaded symbols for /usr/lib64/libqpidcommon.so.0
Reading symbols from /usr/lib64/libboost_program_options.so.2...done.
Loaded symbols for /usr/lib64/libboost_program_options.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libuuid.so.1...done.
Loaded symbols for /lib64/libuuid.so.1
Reading symbols from /usr/lib64/libboost_filesystem.so.2...done.
Loaded symbols for /usr/lib64/libboost_filesystem.so.2
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/librt.so.1...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /usr/lib64/libsasl2.so.2...done.
Loaded symbols for /usr/lib64/libsasl2.so.2
Reading symbols from /lib64/libresolv.so.2...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /usr/lib64/qpid/client/rdmaconnector.so...done.
Loaded symbols for /usr/lib64/qpid/client/rdmaconnector.so
Reading symbols from /usr/lib64/librdmawrap.so.0...done.
Loaded symbols for /usr/lib64/librdmawrap.so.0
Reading symbols from /usr/lib64/librdmacm.so.1...done.
Loaded symbols for /usr/lib64/librdmacm.so.1
Reading symbols from /usr/lib64/libibverbs.so.1...done.
Loaded symbols for /usr/lib64/libibverbs.so.1
Reading symbols from /usr/lib64/qpid/client/sslconnector.so...done.
Loaded symbols for /usr/lib64/qpid/client/sslconnector.so
Reading symbols from /usr/lib64/libsslcommon.so.0...done.
Loaded symbols for /usr/lib64/libsslcommon.so.0
Reading symbols from /usr/lib64/libnss3.so...done.
Loaded symbols for /usr/lib64/libnss3.so
Reading symbols from /usr/lib64/libssl3.so...done.
Loaded symbols for /usr/lib64/libssl3.so
Reading symbols from /usr/lib64/libnspr4.so...done.
Loaded symbols for /usr/lib64/libnspr4.so
Reading symbols from /usr/lib64/libnssutil3.so...done.
Loaded symbols for /usr/lib64/libnssutil3.so
Reading symbols from /usr/lib64/libplc4.so...done.
Loaded symbols for /usr/lib64/libplc4.so
Reading symbols from /usr/lib64/libplds4.so...done.
Loaded symbols for /usr/lib64/libplds4.so
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/lib64/sasl2/libsasldb.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libsasldb.so.2
Reading symbols from /usr/lib64/sasl2/libanonymous.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libanonymous.so.2
Reading symbols from /usr/lib64/sasl2/liblogin.so.2...done.
Loaded symbols for /usr/lib64/sasl2/liblogin.so.2
Reading symbols from /usr/lib64/sasl2/libplain.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libplain.so.2
Core was generated by `./header_test2'.
Program terminated with signal 6, Aborted.
[New process 21620]
[New process 21621]
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
#1  0x0000003730e31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6
#3  0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112
#4  0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0
#5  0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6
#6  0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0
#7  0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000000 in ?? ()
(gdb)

Comment 2 Alan Conway 2009-03-09 21:24:58 UTC
Not able to reproduce so far, I'm very curious to know what the other threads were doing at the time of the crash. If you can reproduce it again in gdb do:
 thread apply all bt
to get a bt of all the threads.

Comment 3 Frantisek Reznicek 2009-03-10 11:02:29 UTC
Please find below three thread backtraces... (tested on RHEL 5.3 i386 / x86_64)

There is different behavior observed on i386 and x86_64,
on i386 no issue seen no stderr message printed, but on x86_64
there is stderr message
2009-mar-10 06:59:24 warning Session was not closed cleanly
and occasionly client core file dumped...


[root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.688
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /lib64/libpthread.so.0...done.
...
Loaded symbols for /usr/lib64/sasl2/libplain.so.2
Core was generated by `./header_test2'.
Program terminated with signal 6, Aborted.
[New process 688]
[New process 689]
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 2 (process 689):
#0  0x0000003730a08c85 in check_match.8514 () from /lib64/ld-linux-x86-64.so.2
#1  0x0000003730a09044 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#2  0x0000003730a09242 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#3  0x0000003730a0cd35 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#4  0x0000003730a128b2 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#5  0x0000003665860dfc in std::_Rb_tree<unsigned short, std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> >, std::_Select1st<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > >, std::less<unsigned short>, std::allocator<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > > >::erase () from /usr/lib64/libqpidclient.so.0
#6  0x000000366585aac6 in qpid::client::ConnectionImpl::erase () from /usr/lib64/libqpidclient.so.0
#7  0x000000366589d85d in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0
#8  0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0
#9  0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0
#10 0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0
#11 0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0
#12 0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0
#13 0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0
#14 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0
#15 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0
#16 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() ()
   from /usr/lib64/libqpidcommon.so.0
#17 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#18 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#19 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0
#20 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0
#21 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0
#22 0x0000003730ed30ad in clone () from /lib64/libc.so.6

Thread 1 (process 688):
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
#1  0x0000003730e31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6
#3  0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112
#4  0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0
#5  0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6
#6  0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0
---Type <return> to continue, or q <return> to quit---
#7  0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000000 in ?? ()



---------------


[root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.3233
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /lib64/libpthread.so.0...done.
...
Loaded symbols for /usr/lib64/sasl2/libplain.so.2
Core was generated by `./header_test2'.
Program terminated with signal 6, Aborted.
[New process 3233]
[New process 3234]
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 2 (process 3234):
#0  0x0000003730e758f0 in free () from /lib64/libc.so.6
#1  0x0000003665873d00 in qpid::sys::ExceptionHolder::Wrapper<qpid::ClosedException>::~Wrapper () from /usr/lib64/libqpidclient.so.0
#2  0x000000366589a8a9 in qpid::client::SessionImpl::handleClosed () from /usr/lib64/libqpidclient.so.0
#3  0x000000366589d487 in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0
#4  0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0
#5  0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0
#6  0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0
#7  0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0
#8  0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0
#9  0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0
#10 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0
#11 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0
#12 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() ()
   from /usr/lib64/libqpidcommon.so.0
#13 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#14 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#15 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0
#16 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0
#17 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0
#18 0x0000003730ed30ad in clone () from /lib64/libc.so.6

Thread 1 (process 3233):
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
#1  0x0000003730e31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6
#3  0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112
#4  0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0
#5  0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6
#6  0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0
#7  0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000000 in ?? ()
(gdb)



---------------

[root@hp-xw4800-01 qpid_test_fieldtable_encoding_issue_bz468026.0006]# gdb ./header_test2 core.5731
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /lib64/libpthread.so.0...done.
...
Loaded symbols for /usr/lib64/sasl2/libplain.so.2
Core was generated by `./header_test2'.
Program terminated with signal 6, Aborted.
[New process 5731]
[New process 5732]
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 2 (process 5732):
#0  0x0000003730a0cd8e in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#1  0x0000003730a128b2 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#2  0x0000003665860dfc in std::_Rb_tree<unsigned short, std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> >, std::_Select1st<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > >, std::less<unsigned short>, std::allocator<std::pair<unsigned short const, boost::weak_ptr<qpid::client::SessionImpl> > > >::erase () from /usr/lib64/libqpidclient.so.0
#3  0x000000366585aac6 in qpid::client::ConnectionImpl::erase () from /usr/lib64/libqpidclient.so.0
#4  0x000000366589d85d in qpid::client::SessionImpl::~SessionImpl () from /usr/lib64/libqpidclient.so.0
#5  0x000000366585b755 in qpid::client::ConnectionImpl::incoming () from /usr/lib64/libqpidclient.so.0
#6  0x00000036658556aa in boost::function1<void, qpid::framing::AMQFrame&, std::allocator<void> >::operator() () from /usr/lib64/libqpidclient.so.0
#7  0x0000003665850f5f in qpid::client::ConnectionHandler::incoming () from /usr/lib64/libqpidclient.so.0
#8  0x0000003665864bfd in qpid::client::TCPConnector::decode () from /usr/lib64/libqpidclient.so.0
#9  0x0000003665864510 in qpid::client::TCPConnector::readbuff () from /usr/lib64/libqpidclient.so.0
#10 0x0000003665869a28 in boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool, boost::_mfi::mf2<bool, qpid::client::TCPConnector, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>, boost::_bi::list3<boost::_bi::value<qpid::client::TCPConnector*>, boost::arg<1>, boost::arg<2> > >, bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*>::invoke () from /usr/lib64/libqpidclient.so.0
#11 0x000000366536615a in boost::function2<bool, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() () from /usr/lib64/libqpidcommon.so.0
#12 0x000000366535ff33 in qpid::sys::posix::AsynchIO::readable () from /usr/lib64/libqpidcommon.so.0
#13 0x00000036653bfc1a in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() ()
   from /usr/lib64/libqpidcommon.so.0
#14 0x00000036653bf4cf in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#15 0x0000003665375183 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#16 0x000000366586605b in qpid::client::TCPConnector::run () from /usr/lib64/libqpidclient.so.0
#17 0x000000366536968a in ?? () from /usr/lib64/libqpidcommon.so.0
#18 0x0000003731606367 in start_thread () from /lib64/libpthread.so.0
#19 0x0000003730ed30ad in clone () from /lib64/libc.so.6

Thread 1 (process 5731):
#0  0x0000003730e30215 in raise () from /lib64/libc.so.6
#1  0x0000003730e31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003730e29696 in __assert_fail () from /lib64/libc.so.6
#3  0x00000000004146ef in ~Mutex (this=0x366563b8a8) at /usr/include/qpid/sys/posix/Mutex.h:112
#4  0x00000036653b2b33 in qpid::log::Logger::~Logger () from /usr/lib64/libqpidcommon.so.0
#5  0x0000003730e3363e in __cxa_finalize () from /lib64/libc.so.6
#6  0x00000036652ea196 in ?? () from /usr/lib64/libqpidcommon.so.0
#7  0x0000003730c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000000 in ?? ()
(gdb)

Comment 4 Alan Conway 2009-03-10 14:13:51 UTC
Here is what is happening:

SessionImpl holds a shared_ptr to ConnectionImpl
ConnectionImpl holds a map of weak_ptrs to SessionImpl
The weak_ptrs are supposed to avoid a shared_ptr cycle that would cause leaks.

However in the scope of ConnectionImpl::incoming ConnectionImpl takes a shared_ptr to the SessionImpl its working with creating a temporary cycle.

If main() exits while this temporary cycle exists (as in the backtrace above) ConnectionImpl's destructor is not run, so it's Connector thread is not joined and is still running as the main thread runs global destructors.

The core dump is caused when the rogue Connector thread tries to use the Logger 
concurrently with the Logger being destroyed.

Comment 5 Alan Conway 2009-06-19 13:29:50 UTC
Created attachment 348664 [details]
test case to reproduce the bug

This reproduces the bug on almost every run.

Comment 6 Carl Trieloff 2009-06-19 20:01:00 UTC

variation of case, created with perftest


Thread 2 (process 28784):
#0  0x0000003411600cfe in ?? () from /lib64/libdl.so.2
#1  0x00002b518ac234e0 in ?? ()
#2  0x0000000000000004 in ?? ()
#3  0x0000003411600e56 in __do_global_dtors_aux () from /lib64/libdl.so.2
#4  0x0000000000000000 in ?? ()

Thread 1 (process 29037):
#0  0x0000003411230215 in raise () from /lib64/libc.so.6
#1  0x0000003411231cc0 in abort () from /lib64/libc.so.6
#2  0x0000003411229696 in __assert_fail () from /lib64/libc.so.6
#3  0x00002b518a927c1f in std::for_each<__gnu_cxx::__normal_iterator<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus**, std::vector<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus*, std::allocator<qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::ThreadStatus*> > >, qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::AllThreadsStatuses::handleAdder> (
    __first=<value optimized out>, __last={_M_current = 0x1c3ba5a8}, __f=
          {handle = {px = 0x41d5f1e0, pn = {pi_ = 0x1c3ba5a8}}})
    at ../../../qpid-working/cpp/src/qpid/sys/posix/Mutex.h:116
#4  0x00002b518a927d0c in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::markForDeletion (
    handle=0x1bf8e730) at ../../../qpid-working/cpp/src/qpid/sys/DeletionManager.h:130
#5  0x00002b518a9241fe in ~Poller (this=0x1bf90480)
    at ../../../qpid-working/cpp/src/qpid/sys/epoll/EpollPoller.cpp:181
#6  0x00002b518a5314a8 in ~TCPConnector (this=0x1c1a2b90)
    at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#7  0x00002b518a527fa4 in ~ConnectionImpl (this=0x1c1a2d70) at /usr/include/boost/checked_delete.hpp:34
#8  0x00002b518a562541 in ~SessionImpl (this=0x1c1a7850)
    at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#9  0x00002b518a55b290 in ~SessionBase_0_10 (this=<value optimized out>)
    at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#10 0x00000000004107a0 in ~Dispatcher (this=0x41d5f5d0)
    at ../../src/gen/qpid/client/no_keyword/Session_0_10.h:47
#11 0x000000000041606c in PublishThread::run (this=0x1c1a1540)
    at ../../../../qpid-working/cpp/src/tests/../qpid/client/SubscriptionManager.h:97
#12 0x00002b518a91d65a in runRunnable (p=0x7070)
    at ../../../qpid-working/cpp/src/qpid/sys/posix/Thread.cpp:35
#13 0x0000003411e06367 in start_thread () from /lib64/libpthread.so.0
#14 0x00000034112d30ad in clone () from /lib64/libc.so.6

Comment 7 Carl Trieloff 2009-06-19 20:02:05 UTC

and, both can be gotten with close() being called.


Thread 2 (process 21886):
#0  0x0000003411233600 in __cxa_finalize () from /lib64/libc.so.6
#1  0x0000003411600e56 in __do_global_dtors_aux () from /lib64/libdl.so.2
#2  0x0000000000000000 in ?? ()

Thread 1 (process 22033):
#0  0x0000003411230215 in raise () from /lib64/libc.so.6
#1  0x0000003411231cc0 in abort () from /lib64/libc.so.6
#2  0x0000003411229696 in __assert_fail () from /lib64/libc.so.6
#3  0x00002b9f23708e01 in qpid::sys::DeletionManager<qpid::sys::PollerHandlePrivate>::markForDeletion (
    handle=0x119750e0) at ../../../qpid-working/cpp/src/qpid/sys/posix/Mutex.h:116
#4  0x00002b9f236f6c77 in ~AsynchIO (this=0x11974920)
    at ../../../qpid-working/cpp/src/qpid/sys/posix/AsynchIO.cpp:335
#5  0x00002b9f237515bb in qpid::sys::DispatchHandle::processEvent (this=0x11974928, 
    type=qpid::sys::Poller::INTERRUPTED) at ../../../qpid-working/cpp/src/qpid/sys/DispatchHandle.cpp:505
#6  0x00002b9f23707668 in qpid::sys::Poller::run (this=0x119753c0)
    at ../../../qpid-working/cpp/src/qpid/sys/Poller.h:122
#7  0x00002b9f2331357b in qpid::client::TCPConnector::run (this=0x11b89b90)
    at ../../../qpid-working/cpp/src/qpid/client/Connector.cpp:396
#8  0x00002b9f236fe65a in runRunnable (p=0x557e)
    at ../../../qpid-working/cpp/src/qpid/sys/posix/Thread.cpp:35
#9  0x0000003411e06367 in start_thread () from /lib64/libpthread.so.0
#10 0x00000034112d30ad in clone () from /lib64/libc.so.6

Comment 8 Carl Trieloff 2009-06-19 20:12:08 UTC
 
Comment #6 is looks like a bug with perftest, in that if main has a connect exception, we don't wait for the join. Fix perftest for #6.

Comment 10 Alan Conway 2009-12-01 20:27:56 UTC
Created attachment 375184 [details]
Reproducer

Previous dirtyexit reproducer was incorrect (deleting connection while threads still running) This one shows the bug about every 100 runs. Run as follows:

while ./dirtyreceiver; do true; done

And at the same time run the normal sender client to feed messages:

seq 1000000 | sender

Bug is still present in r885870

Comment 11 Alan Conway 2010-03-24 15:56:43 UTC
Created attachment 402347 [details]
Reproducer source code.

Attached source code for dirtyreceiver reproducer, previously attahced a binary by mistake. Bug is still present and easily reproducible on r926867

Comment 12 Gordon Sim 2010-04-28 15:43:11 UTC
should be fixed in qpid-cpp-client-0.7.935473-1.el5

Comment 14 mick 2010-10-07 20:05:31 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
    The c++ qpid client does not call session.close() and connection.close()

  Consequence:
    About 2% of the time, the client triggers an assertion and aborts.

  Fix:
    Changes to shared_ptr and weak_ptr strategy in SessionImpl and ConnectionImpl.

  Result:
    no more assertions observed.

Comment 15 Martin Prpič 2010-10-10 07:23:30 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,11 +1 @@
-Cause:
+Previously, the C++ qpid client did not call the 'session.close()' and 'connection.close()' methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the 'SessionImpl' and 'ConnectionImpl' methods, thus, assertions no longer occur.-    The c++ qpid client does not call session.close() and connection.close()
-
-  Consequence:
-    About 2% of the time, the client triggers an assertion and aborts.
-
-  Fix:
-    Changes to shared_ptr and weak_ptr strategy in SessionImpl and ConnectionImpl.
-
-  Result:
-    no more assertions observed.

Comment 16 Douglas Silas 2010-10-11 09:36:40 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Previously, the C++ qpid client did not call the 'session.close()' and 'connection.close()' methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the 'SessionImpl' and 'ConnectionImpl' methods, thus, assertions no longer occur.+Previously, the C++ qpid client did not call the session.close() and connection.close() methods. On rare occasions, this caused the client to trigger an assertion and abort. With this update, changes were made to the 'shared_ptr' and 'weak_ptr' cycles in the "SessionImpl' and 'ConnectionImpl' methods such that the assertion is no longer triggered.

Comment 18 errata-xmlrpc 2010-10-14 16:11:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html


Note You need to log in before you can comment on or make changes to this bug.