Description of problem: This is a hard-to-reproduce leak reported by valgrind. It shows up periodically in ptol, and I've also seen it occasionally in my own builds. The ConnectionImpl's thread is not being joined and as a result some memory allocated in pthread_create is leaked. Example:

==7518== 184 bytes in 1 blocks are possibly lost in loss record 9 of 17
==7518==    at 0x40046FF: calloc (vg_replace_malloc.c:279)
==7518==    by 0x405D49: _dl_allocate_tls (in /lib/ld-2.5.so)
==7518==    by 0x58DB92: pthread_create@@GLIBC_2.1 (in /lib/libpthread-2.5.so)
==7518==    by 0x58E217: pthread_create (in /lib/libpthread-2.5.so)
==7518==    by 0x443DCD8: qpid::sys::Thread::Thread(qpid::sys::Runnable*) (in /var/lib/ptolemy/sources/qpid/cpp/src/.libs/libqpidcommon.so.2.0.0)
==7518==    by 0x4080DD4: qpid::client::TCPConnector::init() (in /var/lib/ptolemy/sources/qpid/cpp/src/.libs/libqpidclient.so.2.0.0)
==7518==    by 0x407702A: qpid::client::ConnectionImpl::open() (in /var/lib/ptolemy/sources/qpid/cpp/src/.libs/libqpidclient.so.2.0.0)
==7518==    by 0x40694CF: qpid::client::Connection::open(qpid::client::ConnectionSettings const&) (in /var/lib/ptolemy/sources/qpid/cpp/src/.libs/libqpidclient.so.2.0.0)
==7518==    by 0x4069AD9: qpid::client::Connection::open(std::string const&, int, std::string const&, std::string const&, std::string const&, unsigned short) (in /var/lib/ptolemy/sources/qpid/cpp/src/.libs/libqpidclient.so.2.0.0)
==7518==    by 0x807E0FF: ClientT<LocalConnection, qpid::client::Session_0_10>::ClientT(unsigned short, std::string const&) (in /var/lib/ptolemy/sources/qpid/cpp/src/tests/.libs/lt-cluster_test)
==7518==    by 0x809907F: testCoincidentErrors() (in /var/lib/ptolemy/sources/qpid/cpp/src/tests/.libs/lt-cluster_test)
==7518==    by 0x807CB3B: boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke() (in /var/lib/ptolemy/sources/qpid/cpp/src/tests/.libs/lt-cluster_test)
==7518==    by 0x45D148C: (within /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45C1F34: boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45C22C5: boost::execution_monitor::execute(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D1598: boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45C5193: boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D7EF6: boost::unit_test::traverse_test_tree(boost::unit_test::test_case const&, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D869F: boost::unit_test::traverse_test_tree(unsigned long, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D84D7: boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D86D4: boost::unit_test::traverse_test_tree(unsigned long, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D84D7: boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D86D4: boost::unit_test::traverse_test_tree(unsigned long, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45C4168: boost::unit_test::framework::run(unsigned long, bool) (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
==7518==    by 0x45D1248: main (in /usr/lib/libboost_unit_test_framework.so.1.33.1)
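For illustration only, here is a minimal sketch of the general pattern behind this kind of report. A joinable POSIX thread keeps its per-thread storage (the _dl_allocate_tls frame in the trace) until it is joined or detached, so a wrapper that always joins in its destructor avoids leaving it behind. The names Task, JoinedThread, SetValue and runAndJoin are hypothetical and are not qpid's classes; this is not the fix that was applied in qpid.

```cpp
#include <pthread.h>
#include <stdexcept>

// Hypothetical stand-in for a Runnable-style interface (not qpid's class).
struct Task {
    virtual ~Task() {}
    virtual void run() = 0;
};

// Hypothetical RAII wrapper: the destructor always joins the thread, so the
// storage pthread_create() set up for it is released before the wrapper dies.
class JoinedThread {
public:
    explicit JoinedThread(Task& task) : joined_(false) {
        int rc = pthread_create(&id_, 0, &JoinedThread::entry, &task);
        if (rc != 0) throw std::runtime_error("pthread_create failed");
    }

    ~JoinedThread() { join(); }

    // Safe to call more than once; only the first call joins.
    void join() {
        if (!joined_) {
            pthread_join(id_, 0);
            joined_ = true;
        }
    }

private:
    // Thread entry point: runs the task on the new thread.
    static void* entry(void* arg) {
        static_cast<Task*>(arg)->run();
        return 0;
    }

    // Non-copyable: exactly one owner is responsible for the join.
    JoinedThread(const JoinedThread&);
    JoinedThread& operator=(const JoinedThread&);

    pthread_t id_;
    bool joined_;
};

// Example task that records that it ran.
struct SetValue : Task {
    int value;
    SetValue() : value(0) {}
    void run() { value = 42; }
};

// Starts a thread, lets the wrapper join it at scope exit, returns the result.
int runAndJoin() {
    SetValue task;
    {
        JoinedThread t(task);
    }  // ~JoinedThread() joins here
    return task.value;
}
```

Because pthread_join() also synchronizes with the end of the thread, reading task.value after the inner scope closes is safe in this sketch.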
This occurred twice today on ptol Server el5-64 in testCoincidentErrors. There may be something about that test that makes this more likely.
This is occurring regularly in ptol in testCoincidentErrors.
*** Bug 473298 has been marked as a duplicate of this bug. ***
*** Bug 528481 has been marked as a duplicate of this bug. ***
astitcher is working on a fix for this issue. In the meantime I have suppressed the error in valgrind so we don't get spammed.
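For reference, a suppression of this kind might look roughly like the entry below. This is a hypothetical sketch built from the frames in the trace above, not the suppression that was actually added to the qpid test setup. The entry name is made up, and the wildcarded last frame assumes the mangled constructor name contains "qpid", "sys" and "Thread" in that order.

```
{
   hypothetical_qpid_thread_tls_leak
   Memcheck:Leak
   fun:calloc
   fun:_dl_allocate_tls
   fun:pthread_create*
   fun:pthread_create*
   fun:*qpid*sys*Thread*
}
```

A file containing such an entry is passed to valgrind with --suppressions=<file>.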
This should now be fixed by the fix for upstream JIRA QPID-1879. That fix removed the thread creation/deletion from the connection path, so it can't leak anymore.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, the 'ConnectionImpl' method's thread was not being joined which resulted in memory leaks in the client library. With this update, memory leaks in the client library no longer occur.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. Diffed Contents:
@@ -1 +1 @@
-Previously, the 'ConnectionImpl' method's thread was not being joined which resulted in memory leaks in the client library. With this update, memory leaks in the client library no longer occur.
+Memory allocated in the pthread_create() function suffered a difficult-to-reproduce leak due to a problem with the 'ConnectionImpl' method's thread. This update plugs this memory leak.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0773.html