Bug 490855 - clustered qpidd segfaults in qpid::broker::Exchange::propagateFedOp
Summary: clustered qpidd segfaults in qpid::broker::Exchange::propagateFedOp
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: 1.3
: ---
Assignee: Ted Ross
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Duplicates: 509212 509436
Depends On:
Blocks:
 
Reported: 2009-03-18 10:45 UTC by Frantisek Reznicek
Modified: 2015-11-16 00:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The qpidd service no longer terminates with a segmentation fault due to aisexec assertion.
Clone Of:
Environment:
Last Closed: 2010-10-14 16:01:53 UTC
Target Upstream Version:
Embargoed:


Attachments
failover_soak test (13.79 KB, application/x-bzip2)
2009-03-18 10:45 UTC, Frantisek Reznicek


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0773 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 2010-10-14 15:56:44 UTC

Description Frantisek Reznicek 2009-03-18 10:45:17 UTC
Created attachment 335676
failover_soak test

Description of problem:
When running a slightly modified failover_soak test on the MRG packages, a qpidd crash was observed (most probably as a consequence of the aisexec assertion).

aisexec assertion:
  aisexec: ../include/sq.h:171: sq_item_add: Assertion `sq->items_inuse[sq_position] == 0' failed.
  A separate BZ will be filed for the aisexec assertion.
qpidd backtraces (2 observations, RHEL 5.3 i386 / x86_64):

[root@hp-dl385-01 fsoak]# file /root/_bzs/fsoak/core.8263
/root/_bzs/fsoak/core.8263: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'qpidd'
[root@hp-dl385-01 fsoak]# gdb `which qpidd` /root/_bzs/fsoak/core.8263
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libqpidbroker.so.0...(no debugging symbols found)...done.
...
Loaded symbols for /usr/lib/librdmacm.so.1
Reading symbols from /usr/lib/libibverbs.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libibverbs.so.1

(no debugging symbols found)
Core was generated by `qpidd --no-module-dir --load-module /usr/lib/qpid/daemon/cluster.so --cluster-n'.
Program terminated with signal 11, Segmentation fault.
[New process 8263]
[New process 8265]
[New process 8264]
#0  0x0070a0ac in memcpy () from /lib/libc.so.6
(gdb) thread apply all bt

Thread 3 (process 8264):
#0  0x00ef4402 in __kernel_vsyscall ()
#1  0x007ee8c2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00776b84 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0x00b24e70 in qpid::broker::Timer::run () from /usr/lib/libqpidbroker.so.0
#4  0x00381611 in ?? () from /usr/lib/libqpidcommon.so.0
#5  0x007ea49b in start_thread () from /lib/libpthread.so.0
#6  0x0076a42e in clone () from /lib/libc.so.6

Thread 2 (process 8265):
#0  0x00ef4402 in __kernel_vsyscall ()
#1  0x007ee8c2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x00776b84 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3  0x00b24e70 in qpid::broker::Timer::run () from /usr/lib/libqpidbroker.so.0
#4  0x00381611 in ?? () from /usr/lib/libqpidcommon.so.0
#5  0x007ea49b in start_thread () from /lib/libpthread.so.0
#6  0x0076a42e in clone () from /lib/libc.so.6

Thread 1 (process 8263):
#0  0x0070a0ac in memcpy () from /lib/libc.so.6
#1  0x00d50cb4 in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
#2  0x00d51617 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string () from /usr/lib/libstdc++.so.6
#3  0x00a66e59 in qpid::broker::Exchange::propagateFedOp () from /usr/lib/libqpidbroker.so.0
#4  0x00a98075 in qpid::broker::DirectExchange::unbind () from /usr/lib/libqpidbroker.so.0
#5  0x00adbbc2 in qpid::broker::QueueBindings::unbind () from /usr/lib/libqpidbroker.so.0
#6  0x00a6b16e in qpid::broker::Queue::unbind () from /usr/lib/libqpidbroker.so.0
#7  0x00a723eb in qpid::broker::Queue::tryAutoDelete () from /usr/lib/libqpidbroker.so.0
#8  0x00b009f5 in qpid::broker::SemanticState::cancel () from /usr/lib/libqpidbroker.so.0
#9  0x00b01521 in qpid::broker::SemanticState::~SemanticState () from /usr/lib/libqpidbroker.so.0
#10 0x00b197b9 in qpid::broker::SessionState::~SessionState () from /usr/lib/libqpidbroker.so.0
#11 0x00b21a12 in qpid::broker::SessionHandler::~SessionHandler () from /usr/lib/libqpidbroker.so.0
#12 0x00a884af in qpid::broker::Connection::~Connection () from /usr/lib/libqpidbroker.so.0
#13 0x00e0f8cc in qpid::cluster::Connection::~Connection () from /usr/lib/qpid/daemon/cluster.so
#14 0x00a5aa95 in qpid::RefCounted::released () from /usr/lib/libqpidbroker.so.0
#15 0x00df8231 in std::_Rb_tree<qpid::cluster::ConnectionId, std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> >, std::_Select1st<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > >, std::less<qpid::cluster::ConnectionId>, std::allocator<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > > >::_M_erase ()
   from /usr/lib/qpid/daemon/cluster.so
#16 0x00de141d in qpid::cluster::Cluster::~Cluster () from /usr/lib/qpid/daemon/cluster.so
#17 0x00dddf57 in qpid::cluster::Cluster::brokerShutdown () from /usr/lib/qpid/daemon/cluster.so
#18 0x00ded9c6 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, qpid::cluster::Cluster>, boost::_bi::list1<boost::_bi::value<qpid::cluster::Cluster*> > >, void>::invoke () from /usr/lib/qpid/daemon/cluster.so
#19 0x0039e57c in boost::function0<void, std::allocator<void> >::operator() () from /usr/lib/libqpidcommon.so.0
#20 0x0039da8d in ?? () from /usr/lib/libqpidcommon.so.0
#21 0x0039e3db in std::for_each<__gnu_cxx::__normal_iterator<boost::function<void ()(), std::allocator<void> >*, std::vector<boost::function<void ()(), std::allocator<void> >, std::allocator<boost::function<void ()(), std::allocator<void> > > > >, void (*)(boost::function<void ()(), std::allocator<void> >)> () from /usr/lib/libqpidcommon.so.0
#22 0x0039d9f9 in qpid::Plugin::Target::finalize () from /usr/lib/libqpidcommon.so.0
#23 0x00a533f0 in qpid::broker::Broker::~Broker () from /usr/lib/libqpidbroker.so.0
#24 0x00a5aa95 in qpid::RefCounted::released () from /usr/lib/libqpidbroker.so.0
#25 0x00b23379 in ?? () from /usr/lib/libqpidbroker.so.0
#26 0x006c4fe9 in __cxa_finalize () from /lib/libc.so.6
#27 0x00a06664 in ?? () from /usr/lib/libqpidbroker.so.0
#28 0x00b961a0 in ?? () from /usr/lib/libqpidbroker.so.0
#29 0x00000022 in ?? ()
#30 0x007da140 in ?? () from /lib/libc.so.6
#31 0x00a0663a in ?? () from /usr/lib/libqpidbroker.so.0
#32 0x00b9b684 in ?? () from /usr/lib/libqpidbroker.so.0
#33 0x00696240 in _rtld_local () from /lib/ld-linux.so.2
#34 0xbfd09d18 in ?? ()
#35 0x00b4a05c in _fini () from /usr/lib/libqpidbroker.so.0
Backtrace stopped: frame did not save the PC
(gdb)


--------------------

[root@hp-ml370g4-01 fsoak]# file /root/_bzs/fsoak/core.10220
/root/_bzs/fsoak/core.10220: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'qpidd'
[root@hp-ml370g4-01 fsoak]# gdb `which qpidd` /root/_bzs/fsoak/core.10220
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(no debugging symbols found)
Reading symbols from /usr/lib64/libqpidbroker.so.0...(no debugging symbols found)...done.
...
Reading symbols from /usr/lib64/libibverbs.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libibverbs.so.1

Core was generated by `qpidd --no-module-dir --load-module /usr/lib64/qpid/daemon/cluster.so --cluster'.
Program terminated with signal 11, Segmentation fault.
[New process 10220]
[New process 10222]
[New process 10221]
#0  0x0000003446e7b7ec in memcpy () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 3 (process 10221):
#0  0x0000003447a0ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000378e188e6f in qpid::broker::Timer::run () from /usr/lib64/libqpidbroker.so.0
#2  0x000000378d76ac4a in ?? () from /usr/lib64/libqpidcommon.so.0
#3  0x0000003447a06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003446ed30ad in clone () from /lib64/libc.so.6

Thread 2 (process 10222):
#0  0x0000003447a0ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000000378e188e6f in qpid::broker::Timer::run () from /usr/lib64/libqpidbroker.so.0
#2  0x000000378d76ac4a in ?? () from /usr/lib64/libqpidcommon.so.0
#3  0x0000003447a06367 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003446ed30ad in clone () from /lib64/libc.so.6

Thread 1 (process 10220):
#0  0x0000003446e7b7ec in memcpy () from /lib64/libc.so.6
#1  0x0000003447e9c200 in std::string::_Rep::_M_clone () from /usr/lib64/libstdc++.so.6
#2  0x0000003447e9c8ff in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string () from /usr/lib64/libstdc++.so.6
#3  0x000000378e0de5cf in qpid::broker::Exchange::propagateFedOp () from /usr/lib64/libqpidbroker.so.0
#4  0x000000378e109c70 in qpid::broker::DirectExchange::unbind () from /usr/lib64/libqpidbroker.so.0
#5  0x000000378e146ac7 in qpid::broker::QueueBindings::unbind () from /usr/lib64/libqpidbroker.so.0
#6  0x000000378e0dffeb in qpid::broker::Queue::unbind () from /usr/lib64/libqpidbroker.so.0
#7  0x000000378e0e74b6 in qpid::broker::Queue::tryAutoDelete () from /usr/lib64/libqpidbroker.so.0
#8  0x000000378e161000 in qpid::broker::SemanticState::cancel () from /usr/lib64/libqpidbroker.so.0
#9  0x000000378e1696ec in qpid::broker::SemanticState::~SemanticState () from /usr/lib64/libqpidbroker.so.0
#10 0x000000378e17eeaa in qpid::broker::SessionState::~SessionState () from /usr/lib64/libqpidbroker.so.0
#11 0x000000378e185c55 in qpid::broker::SessionHandler::~SessionHandler () from /usr/lib64/libqpidbroker.so.0
#12 0x000000378e0f9f5f in qpid::broker::Connection::~Connection () from /usr/lib64/libqpidbroker.so.0
#13 0x00002b46b5eaae60 in qpid::cluster::Connection::~Connection () from /usr/lib64/qpid/daemon/cluster.so
#14 0x00002b46b5e926ba in std::_Rb_tree<qpid::cluster::ConnectionId, std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> >, std::_Select1st<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > >, std::less<qpid::cluster::ConnectionId>, std::allocator<std::pair<qpid::cluster::ConnectionId const, boost::intrusive_ptr<qpid::cluster::Connection> > > >::_M_erase ()
   from /usr/lib64/qpid/daemon/cluster.so
#15 0x00002b46b5e847f1 in qpid::cluster::Cluster::~Cluster () from /usr/lib64/qpid/daemon/cluster.so
#16 0x00002b46b5e7bed5 in qpid::cluster::Cluster::brokerShutdown () from /usr/lib64/qpid/daemon/cluster.so
#17 0x000000378d785a2f in boost::function0<void, std::allocator<void> >::operator() () from /usr/lib64/libqpidcommon.so.0
#18 0x000000378d7858b6 in std::for_each<__gnu_cxx::__normal_iterator<boost::function<void ()(), std::allocator<void> >*, std::vector<boost::function<void ()(), std::allocator<void> >, std::allocator<boost::function<void ()(), std::allocator<void> > > > >, void (*)(boost::function<void ()(), std::allocator<void> >)> () from /usr/lib64/libqpidcommon.so.0
#19 0x000000378d784f80 in qpid::Plugin::Target::finalize () from /usr/lib64/libqpidcommon.so.0
#20 0x000000378e0c9b69 in qpid::broker::Broker::~Broker () from /usr/lib64/libqpidbroker.so.0
#21 0x0000003446e3363e in __cxa_finalize () from /lib64/libc.so.6
#22 0x000000378e088e06 in ?? () from /usr/lib64/libqpidbroker.so.0
#23 0x0000003446c1c000 in ?? () from /lib64/ld-linux-x86-64.so.2
#24 0x0000000000000000 in ?? ()
(gdb) quit
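
Both cores show the same call chain in Thread 1, ending in memcpy under qpid::broker::Exchange::propagateFedOp. The sketch below is one way to make that comparison mechanical when many cores are collected: it reduces saved `thread apply all bt` output to a list of function names per thread. The helper name `frames_by_thread` and the regular expressions are assumptions made for illustration and are not part of qpid or gdb. The parsing is deliberately simple: it keeps only the first whitespace-free token after "in", so long template names are cut short.

```python
import re

# Matches frame lines such as:
#   "#3  0x00a66e59 in qpid::broker::Exchange::propagateFedOp () from ..."
# Only the first whitespace-free token after "in" is captured.
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (\S+)")
# Matches thread headers such as "Thread 1 (process 8263):"
THREAD_RE = re.compile(r"^Thread (\d+) ")


def frames_by_thread(gdb_output):
    """Map thread number -> list of function names, innermost frame first."""
    threads = {}
    current = None
    for raw in gdb_output.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        # Frames printed before the first "Thread N" header are ignored.
        if frame and current is not None:
            threads[current].append(frame.group(2))
    return threads


# A few lines copied from the i386 trace above, used as sample input.
SAMPLE = """\
Thread 2 (process 8265):
#0  0x00ef4402 in __kernel_vsyscall ()
#3  0x00b24e70 in qpid::broker::Timer::run () from /usr/lib/libqpidbroker.so.0

Thread 1 (process 8263):
#0  0x0070a0ac in memcpy () from /lib/libc.so.6
#3  0x00a66e59 in qpid::broker::Exchange::propagateFedOp () from /usr/lib/libqpidbroker.so.0
#4  0x00a98075 in qpid::broker::DirectExchange::unbind () from /usr/lib/libqpidbroker.so.0
"""

if __name__ == "__main__":
    for thread, names in sorted(frames_by_thread(SAMPLE).items()):
        print(thread, " <- ".join(names))
```

Comparing the per-thread lists from two cores ignores the addresses, which differ between the i386 and x86_64 builds, and leaves only the function names.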


Version-Release number of selected component (if applicable):
[root@hp-ml370g4-01 fsoak]# rpm -qa | egrep '(ais|qpid|rhm)'
qpidc-0.5.752581-1.el5
qpidd-acl-0.5.752581-1.el5
rhm-docs-0.5.753238-1.el5
qpid-java-common-0.5.751061-1.el5
qpidc-rdma-0.5.752581-1.el5
qpidd-ssl-0.5.752581-1.el5
qpidc-perftest-0.5.752581-1.el5
openais-0.80.3-22.el5_3.3
qpidd-0.5.752581-1.el5
qpidc-devel-0.5.752581-1.el5
qpidd-rdma-0.5.752581-1.el5
rhm-0.5.3153-1.el5
python-qpid-0.5.752581-1.el5
qpid-java-client-0.5.751061-1.el5
qpidd-cluster-0.5.752581-1.el5
qpidc-ssl-0.5.752581-1.el5
qpidd-xml-0.5.752581-1.el5
qpidd-devel-0.5.752581-1.el5


How reproducible:
>50% (sometimes it shows up after only ~30 runs, sometimes after ~800 runs)

Steps to Reproduce:
1. run the failover_soak test in a loop (see attachment for reproducer)
2. watch results
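
Since the crash may need hundreds of runs to appear, the loop in step 1 is easier to leave unattended if it stops at the first failure. A minimal sketch follows. It assumes the reproducer from the attachment can be started as a single command and exits non-zero when a run fails; the report confirms neither. `run_until_failure` is a hypothetical helper and is not part of the attached test.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs):
    """Run cmd repeatedly.

    Returns the 1-based number of the first run that exits with a non-zero
    status, or None if all max_runs runs exit with status 0.
    """
    for run in range(1, max_runs + 1):
        print(f"run {run}/{max_runs}", file=sys.stderr)
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return run
    return None
```

For example, `run_until_failure(["./failover_soak"], 1000)`, where `./failover_soak` is a placeholder for however the attached test is launched. A qpidd crash may not change the exit status of the test, so checking for new core files after each run would be a natural extension.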
  
Actual results:
(aisexec exits with above mentioned assertion)
clustered qpidd crashes

Expected results:
Both aisexec and qpidd should continue working without any issue.

Additional info:

Comment 1 Gordon Sim 2009-03-18 13:35:09 UTC
A bit more detail on Thread 1 above, with debuginfo installed:

...
#22 0x0039d9f9 in qpid::Plugin::Target::finalize (this=<value optimized out>) at qpid/Plugin.cpp:45
#23 0x00a533f0 in ~Broker (this=<value optimized out>) at qpid/broker/Broker.cpp:337
#24 0x00a5aa95 in qpid::RefCounted::released (this=<value optimized out>) at qpid/RefCounted.h:48
#25 0x00b23379 in __tcf_1 () at qpid/RefCounted.h:42
#26 0x006c4fe9 in __cxa_finalize () from /lib/libc.so.6
#27 0x00a06664 in __do_global_dtors_aux () from /usr/lib/libqpidbroker.so.0
#28 0x00b4a05c in _fini () from /usr/lib/libqpidbroker.so.0
#29 0x006897ee in _dl_fini () from /lib/ld-linux.so.2
#30 0x006c4d39 in exit () from /lib/libc.so.6
#31 0x006aee94 in __libc_start_main () from /lib/libc.so.6
#32 0x0804c001 in _start ()

Comment 3 Ted Ross 2009-10-27 15:14:52 UTC
Fixed upstream in revision 823258.

Comment 4 Ted Ross 2010-04-26 19:02:06 UTC
*** Bug 509212 has been marked as a duplicate of this bug. ***

Comment 5 Kim van der Riet 2010-07-08 12:31:31 UTC
*** Bug 509436 has been marked as a duplicate of this bug. ***

Comment 6 Frantisek Reznicek 2010-09-08 14:46:04 UTC
The issue has been verified as fixed (no segfaults/aborts); retested on RHEL 5.5 i386 / x86_64 with packages:
python-qmf-0.7.946106-12.el5
python-qpid-0.7.946106-13.el5
qmf-0.7.946106-12.el5
qmf-devel-0.7.946106-12.el5
qpid-cpp-*-0.7.946106-12.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-*-0.7.946106-8.el5
qpid-tests-0.7.946106-1.el5
qpid-tools-0.7.946106-10.el5
ruby-qmf-0.7.946106-12.el5

-> VERIFIED

Comment 7 Jaromir Hradilek 2010-10-07 14:59:21 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The qpidd service no longer terminates with a segmentation fault due to aisexec assertion.

Comment 9 errata-xmlrpc 2010-10-14 16:01:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

