509212 – Cluster test testReconnectSameSessionName fails with "catch-up connection closed prematurely"

Keywords:
Status:	CLOSED DUPLICATE of bug 490855
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	qpid-cpp
Sub Component:
Version:	Development
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	1.3
Target Release:	---
Assignee:	Ted Ross
QA Contact:	MRG Quality Engineering
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2009-07-01 19:13 UTC by Kim van der Riet
Modified:	2010-04-26 19:02 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-04-26 19:02:06 UTC
Target Upstream Version:
Embargoed:

Description Kim van der Riet 2009-07-01 19:13:04 UTC

While running cluster_test in a loop with the store enabled (through the store's run_cluster_test script), I noticed that the following error occurs occasionally (roughly once in 50 runs):

fork1: 2009-07-01 14:50:58 critical 10.16.16.49:25327(UPDATEE) catch-up connection closed prematurely 10.16.16.49:25327-1(local,catchup)

and the test fails. Core file attached.

My trunk revision: 790164
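
For anyone trying to reproduce an intermittent failure like this, a loop of the kind described above can be sketched as follows. This is only a minimal illustration and not the store's run_cluster_test script: the function name run_repeatedly, its parameters, and the stand-in command in the usage section are assumptions made for the example. The only detail taken from this report is the log text used as the failure marker.

```python
import subprocess
import sys

# Log text from this report; a run whose output contains it counts as a failure.
MARKER = "catch-up connection closed prematurely"


def run_repeatedly(cmd, runs, marker=MARKER):
    """Run cmd `runs` times and return the 1-based indices of failing runs.

    A run is treated as failed if it exits non-zero or if `marker`
    appears anywhere in its combined stdout/stderr.
    """
    failures = []
    for i in range(1, runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        output = result.stdout.decode(errors="replace")
        if result.returncode != 0 or marker in output:
            failures.append(i)
    return failures


if __name__ == "__main__":
    # Hypothetical stand-in command; substitute the real test invocation.
    cmd = [sys.executable, "-c", "pass"]
    failed = run_repeatedly(cmd, 5)
    print("%d of 5 runs failed: %s" % (len(failed), failed))
```

Recording the index of each failing run makes it easier to match a failure to the core file it left behind, which matters here because cores can also appear on runs where the test itself passes.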

Backtrace:

#0  memcpy () at ../sysdeps/x86_64/memcpy.S:509
#1  0x00000030ad0a3075 in std::char_traits<char>::copy ()
    at /usr/src/debug/gcc-4.3.2-20081105/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/char_traits.h:274
#2  std::string::_M_copy (__n=<value optimized out>, __s=<value optimized out>, __d=<value optimized out>)
    at /usr/src/debug/gcc-4.3.2-20081105/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:344
#3  std::string::append (this=0x7ffff78a9830, __str=@0x7f6db7836b88)
    at /usr/src/debug/gcc-4.3.2-20081105/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:331
#4  0x00007f6db7590aa0 in operator+<char, std::char_traits<char>, std::allocator<char> > ()
    at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/basic_string.tcc:677
#5  qpid::management::ManagementAgent::raiseEvent (this=0x7f6db449a010, event=<value optimized out>, severity=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:211
#6  0x00007f6db74fdade in ~Connection (this=0x17b1808) at qpid/broker/Connection.cpp:127
#7  0x00007f6db68a858f in ~Connection (this=0x17b1710) at qpid/cluster/Connection.cpp:111
#8  0x00007f6db689419c in qpid::RefCounted::release () at ./qpid/RefCounted.h:42
#9  intrusive_ptr_release (p=<value optimized out>) at ./qpid/RefCounted.h:57
#10 ~intrusive_ptr () at /usr/include/boost/intrusive_ptr.hpp:83
#11 ~pair () at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_pair.h:73
#12 __gnu_cxx::new_allocator<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > >::destroy ()
    at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/ext/new_allocator.h:118
#13 std::_Rb_tree<qpid::cluster::ConnectionId, std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> >, std::_Select1st<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > >, std::less<qpid::cluster::ConnectionId>, std::allocator<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > > >::_M_destroy_node ()
    at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_tree.h:390
#14 std::_Rb_tree<qpid::cluster::ConnectionId, std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> >, std::_Select1st<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > >, std::less<qpid::cluster::ConnectionId>, std::allocator<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > > >::_M_erase (this=0x1760350, __x=0x176e730)
    at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_tree.h:943
#15 0x00007f6db688bdc9 in ~_Rb_tree () at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_tree.h:585
#16 ~map () at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_map.h:92
#17 ~Cluster (this=0x175fbd0) at qpid/cluster/Cluster.cpp:219
#18 0x00007f6db6889e25 in qpid::cluster::Cluster::brokerShutdown (this=0x175fbd0) at qpid/cluster/Cluster.cpp:573
#19 0x00007f6db711ff37 in boost::function0<void, std::allocator<void> >::operator() (this=<value optimized out>)
    at /usr/include/boost/function/function_template.hpp:692
#20 0x00007f6db711fa3a in for_each<__gnu_cxx::__normal_iterator<boost::function<void ()(), std::allocator<void> >*, std::vector<boost::function<void ()(), std::allocator<void> >, std::allocator<boost::function<void ()(), std::allocator<void> > > > >, void (*)(boost::function<void ()(), std::allocator<void> >)> ()
    at /usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_algo.h:3791
#21 qpid::Plugin::Target::finalize (this=0x175dfa8) at qpid/Plugin.cpp:45
#22 0x00007f6db74d28c4 in ~Broker (this=0x175dfa0) at qpid/broker/Broker.cpp:338
#23 0x000000309e636960 in __cxa_finalize (d=0x7f6db7825880) at cxa_finalize.c:56
#24 0x00007f6db749ba86 in __do_global_dtors_aux () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.0
#25 0x0000000000405a20 in std::basic_streambuf<char, std::char_traits<char> >::~basic_streambuf ()
#26 0x00007ffff78aa0e0 in ?? ()
#27 0x00007f6db759c281 in _fini () from /home/kpvdr/mrg/qpid/cpp/src/.libs/libqpidbroker.so.0
#28 0x0000000000000019 in ?? ()
#29 0x0000000001751230 in ?? ()
#30 0x0000000001752a50 in ?? ()
#31 0x0000000001754dc0 in ?? ()
#32 0x00007f6db7839778 in ?? ()
#33 0x00007f6db7402000 in ?? ()
#34 0x00007f6db7402508 in ?? ()
#35 0x00007f6db7402a08 in ?? ()
#36 0x00007f6db6b33000 in ?? ()
#37 0x00007f6db6b32000 in ?? ()
#38 0x00007f6db6b334c8 in ?? ()
#39 0x00007f6db6b33990 in ?? ()
#40 0x0000000001755370 in ?? ()
#41 0x00007f6db6b324d8 in ?? ()
#42 0x00007f6db6b329a8 in ?? ()
#43 0x00007f6db6b31000 in ?? ()
#44 0x00007f6db6b31990 in ?? ()
#45 0x00007f6db6b30128 in ?? ()
#46 0x00007f6db6b305f0 in ?? ()
#47 0x0000000001752110 in ?? ()
#48 0x00000000017525b0 in ?? ()
#49 0x00007f6db6b314c8 in ?? ()
#50 0x000000309d4204e8 in _rtld_local () from /lib64/ld-2.9.so
#51 0x0000000001755840 in ?? ()
#52 0x0000000000000000 in ?? ()

Comment 1 Kim van der Riet 2009-07-01 19:15:21 UTC

Core file too big, did not successfully attach.

Comment 2 Kim van der Riet 2009-07-01 19:56:35 UTC

Further testing has shown that the above trace may NOT be connected with this failure; I have managed to run the test several times without a test failure, but several core files have been left behind with the same trace.

Comment 3 Alan Conway 2009-07-01 21:55:34 UTC

The backtrace looks like a seg-fault in ManagementAgent, reassigning to tross. The cluster error reported is consistent with a cluster member crashing due to a seg fault.

Comment 4 Kim van der Riet 2009-07-02 18:45:39 UTC

This backtrace has been transferred to Bug 509436.

I have not yet isolated a core file for this bug, even after ~500 runs.

Comment 6 Ted Ross 2009-10-30 18:47:10 UTC

There was a cluster-shutdown bug (BZ490855) that looks like it might be causing this. BZ490855 was fixed upstream in revision 823258. Is this still occurring, or can it be marked as a duplicate?

-Ted

Comment 7 Ted Ross 2010-04-26 19:02:06 UTC


*** This bug has been marked as a duplicate of bug 490855 ***
