If the broker is stopped with ctrl-C and clustering is loaded, the broker can core. See bug 595438 for details.

When running this scenario using valgrind, it shows the following errors:

2010-05-27 08:59:48 notice Shut down
==3177== Invalid write of size 8
==3177==    at 0x54DDED5: qpid::management::ManagementObject::resourceDestroy() (ManagementObject.cpp:265)
==3177==    by 0x6C0F78D: mrg::msgstore::JournalImpl::~JournalImpl() (JournalImpl.cpp:129)
==3177==    by 0x6C50BE2: mrg::msgstore::TplJournalImpl::~TplJournalImpl() (JournalImpl.h:242)
==3177==    by 0x6C39448: void boost::checked_delete<mrg::msgstore::TplJournalImpl>(mrg::msgstore::TplJournalImpl*) (checked_delete.hpp:34)
==3177==    by 0x6C3B3E2: boost::detail::sp_counted_impl_p<mrg::msgstore::TplJournalImpl>::dispose() (sp_counted_impl.hpp:76)
==3177==    by 0x409599: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:145)
==3177==    by 0x4095C9: boost::detail::shared_count::~shared_count() (shared_count.hpp:159)
==3177==    by 0x6C37220: boost::shared_ptr<mrg::msgstore::TplJournalImpl>::~shared_ptr() (shared_ptr.hpp:106)
==3177==    by 0x6C2A044: mrg::msgstore::MessageStoreImpl::~MessageStoreImpl() (MessageStoreImpl.cpp:450)
==3177==    by 0x6C082C6: void boost::checked_delete<mrg::msgstore::MessageStoreImpl>(mrg::msgstore::MessageStoreImpl*) (checked_delete.hpp:34)
==3177==    by 0x6C08504: boost::detail::sp_counted_impl_p<mrg::msgstore::MessageStoreImpl>::dispose() (sp_counted_impl.hpp:76)
==3177==    by 0x409599: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:145)
==3177==  Address 0x5fc8060 is 16 bytes inside a block of size 296 free'd
==3177==    at 0x4A05743: operator delete(void*) (vg_replace_malloc.c:346)
==3177==    by 0x6CA3ABE: qmf::com::redhat::rhm::store::Journal::~Journal() (Journal.cpp:100)
==3177==    by 0x4F2C359: qpid::management::ManagementAgent::~ManagementAgent() (ManagementAgent.cpp:135)
==3177==    by 0x4E3278F: std::auto_ptr<qpid::management::ManagementAgent>::~auto_ptr() (memory:259)
==3177==    by 0x4E29B89: qpid::broker::Broker::~Broker() (Broker.cpp:364)
==3177==    by 0x4E259A6: qpid::RefCounted::released() const (RefCounted.h:48)
==3177==    by 0x40CD02: qpid::RefCounted::release() const (RefCounted.h:42)
==3177==    by 0x40CD1A: boost::intrusive_ptr_release(qpid::RefCounted const*) (RefCounted.h:57)
==3177==    by 0x40CD63: boost::intrusive_ptr<qpid::broker::Broker>::~intrusive_ptr() (intrusive_ptr.hpp:83)
==3177==    by 0x40AEF9: QpiddBroker::execute(QpiddOptions*) (QpiddBroker.cpp:176)
==3177==    by 0x40927C: main (qpidd.cpp:80)
==3177==
==3177== Invalid write of size 1
==3177==    at 0x54DDEDD: qpid::management::ManagementObject::resourceDestroy() (ManagementObject.cpp:266)
==3177==    by 0x6C0F78D: mrg::msgstore::JournalImpl::~JournalImpl() (JournalImpl.cpp:129)
==3177==    by 0x6C50BE2: mrg::msgstore::TplJournalImpl::~TplJournalImpl() (JournalImpl.h:242)
==3177==    by 0x6C39448: void boost::checked_delete<mrg::msgstore::TplJournalImpl>(mrg::msgstore::TplJournalImpl*) (checked_delete.hpp:34)
==3177==    by 0x6C3B3E2: boost::detail::sp_counted_impl_p<mrg::msgstore::TplJournalImpl>::dispose() (sp_counted_impl.hpp:76)
==3177==    by 0x409599: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:145)
==3177==    by 0x4095C9: boost::detail::shared_count::~shared_count() (shared_count.hpp:159)
==3177==    by 0x6C37220: boost::shared_ptr<mrg::msgstore::TplJournalImpl>::~shared_ptr() (shared_ptr.hpp:106)
==3177==    by 0x6C2A044: mrg::msgstore::MessageStoreImpl::~MessageStoreImpl() (MessageStoreImpl.cpp:450)
==3177==    by 0x6C082C6: void boost::checked_delete<mrg::msgstore::MessageStoreImpl>(mrg::msgstore::MessageStoreImpl*) (checked_delete.hpp:34)
==3177==    by 0x6C08504: boost::detail::sp_counted_impl_p<mrg::msgstore::MessageStoreImpl>::dispose() (sp_counted_impl.hpp:76)
==3177==    by 0x409599: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:145)
==3177==  Address 0x5fc80a2 is 82 bytes inside a block of size 296 free'd
==3177==    at 0x4A05743: operator delete(void*) (vg_replace_malloc.c:346)
==3177==    by 0x6CA3ABE: qmf::com::redhat::rhm::store::Journal::~Journal() (Journal.cpp:100)
==3177==    by 0x4F2C359: qpid::management::ManagementAgent::~ManagementAgent() (ManagementAgent.cpp:135)
==3177==    by 0x4E3278F: std::auto_ptr<qpid::management::ManagementAgent>::~auto_ptr() (memory:259)
==3177==    by 0x4E29B89: qpid::broker::Broker::~Broker() (Broker.cpp:364)
==3177==    by 0x4E259A6: qpid::RefCounted::released() const (RefCounted.h:48)
==3177==    by 0x40CD02: qpid::RefCounted::release() const (RefCounted.h:42)
==3177==    by 0x40CD1A: boost::intrusive_ptr_release(qpid::RefCounted const*) (RefCounted.h:57)
==3177==    by 0x40CD63: boost::intrusive_ptr<qpid::broker::Broker>::~intrusive_ptr() (intrusive_ptr.hpp:83)
==3177==    by 0x40AEF9: QpiddBroker::execute(QpiddOptions*) (QpiddBroker.cpp:176)
==3177==    by 0x40927C: main (qpidd.cpp:80)

The reason for the error is that the store calls _mgmtObject->resourceDestroy() in its destructor after the broker has already destroyed the management agent.

To reproduce on RHEL-5.5:
0. Create two data dirs: /tmp/c0 and /tmp/c1
1. Enable openais/clustering.
2. Start two brokers in two windows:
   window 1:
   ./qpidd --load-module .libs/cluster.so --load-module /home/kpvdr/store/lib/.libs/msgstore.so --cluster-name XXX --data-dir /tmp/c0 --auth no --port 0 --truncate yes --log-enable info+
   window 2:
   valgrind .libs/lt-qpidd --load-module .libs/cluster.so --load-module /home/kpvdr/store/lib/.libs/msgstore.so --cluster-name XXX --data-dir /tmp/c1 --auth no --port 0 --truncate yes --log-enable info+
3. kill the broker in window 2 using ctrl-c

It is possible that this is the cause of bug 595438.
I can confirm that if the cluster initialization fails and the broker is thus shut down, the same error as above results:

2010-05-27 09:42:09 critical Unexpected error: Cluster-ID mismatch. Stores belong to different clusters.
==3403== Invalid write of size 8
...
==3403== Invalid write of size 1
...
Fixed in store revision 3995:

Remove global shared_ptr to store in store plugin. The global shared_ptr delays destruction of the store until after the broker is deleted, causing core dumps when unregistering management objects.
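For readers unfamiliar with the ordering problem, here is a minimal sketch of it. This is not the actual qpid or store code: Agent, Store, Broker and simulate() are hypothetical stand-ins, and the order in which this toy Broker releases its members is an assumption made for illustration. Instead of writing to freed memory, the model only records whether the store's destructor runs before or after the agent is gone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model types; none of these are the real qpid classes.
using Events = std::vector<std::string>;

// Stands in for the management agent, which owns (and frees) the
// management objects.
struct Agent {
    Events& log;
    bool& alive;
    Agent(Events& l, bool& a) : log(l), alive(a) { alive = true; }
    ~Agent() { alive = false; log.push_back("agent destroyed"); }
};

// Stands in for the store, whose destructor unregisters its management
// object. Here we only record whether that call would still be safe.
struct Store {
    Events& log;
    const bool& agentAlive;
    Store(Events& l, const bool& a) : log(l), agentAlive(a) {}
    ~Store() {
        log.push_back(agentAlive
            ? "store destroyed: resourceDestroy ok"
            : "store destroyed: resourceDestroy on freed object");
    }
};

// Stands in for the broker. Assumption: it drops its store reference
// before tearing down the agent.
struct Broker {
    std::unique_ptr<Agent> agent;
    std::shared_ptr<Store> store;
    Broker(std::unique_ptr<Agent> a, std::shared_ptr<Store> s)
        : agent(std::move(a)), store(std::move(s)) {
        assert(agent && store);
    }
    ~Broker() {
        store.reset();  // destroys the store only if this is the last reference
        agent.reset();
    }
};

// Runs one shutdown and returns the order of events.
// pluginKeepsGlobalRef models the plugin's global shared_ptr to the store.
Events simulate(bool pluginKeepsGlobalRef) {
    Events log;
    bool agentAlive = false;
    std::shared_ptr<Store> pluginRef;  // models the global shared_ptr
    {
        auto store = std::make_shared<Store>(log, agentAlive);
        if (pluginKeepsGlobalRef) pluginRef = store;
        Broker broker(std::unique_ptr<Agent>(new Agent(log, agentAlive)),
                      std::move(store));
    }  // broker goes out of scope here
    log.push_back("broker destroyed");
    pluginRef.reset();  // models destruction of the global at process exit
    return log;
}
```

In this model, simulate(true) leaves the store alive until after the agent is gone, which corresponds to the invalid writes valgrind reports above; simulate(false) destroys the store while the agent still exists.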
Verified on RHEL5.5 x86_64:
qpid-cpp-server-store-0.7.946106-2.el5
qpid-cpp-server-cluster-0.7.946106-2.el5

Very easily reproduced on the same system with -1 build packages.
Also verified with the same package versions on i386 RHEL5.5.