Bug 711833 - qpidd segfault with signal 11, condor with qmf tested (case: qpid stopped, then condor daemons with qmf plugins stopped)
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 2.1
Target Release: ---
Assignee: messaging-bugs
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-06-08 17:06 UTC by Tomas Rusnak
Modified: 2013-02-24 15:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-20 12:14:58 UTC
Target Upstream Version:



Description Tomas Rusnak 2011-06-08 17:06:35 UTC
Description of problem:
While testing condor with qmf plugins, I found several core files from qpidd, all caused by a segmentation fault.

Version-Release number of selected component (if applicable):
qpid-cpp-client-devel-0.10-6.el6.x86_64
qpid-cpp-server-devel-0.10-6.el6.x86_64
qpid-cpp-server-store-0.10-6.el6.x86_64
qpid-cpp-server-0.10-6.el6.x86_64
qpid-cpp-client-rdma-0.10-6.el6.x86_64
qpid-cpp-server-rdma-0.10-6.el6.x86_64
rh-qpid-cpp-tests-0.10-6.el6.x86_64
qpid-cpp-server-xml-0.10-6.el6.x86_64
qpid-cpp-client-devel-docs-0.10-6.el6.noarch
qpid-cpp-debuginfo-0.10-6.el6.x86_64
qpid-cpp-client-0.10-6.el6.x86_64
qpid-cpp-client-ssl-0.10-6.el6.x86_64
qpid-cpp-server-ssl-0.10-6.el6.x86_64
qpid-cpp-server-cluster-0.10-6.el6.x86_64
condor-7.6.1-0.10.el6.x86_64
condor-qmf-7.6.1-0.10.el6.x86_64
qpid-qmf-0.10-10.el6.x86_64
ruby-qpid-qmf-0.10-10.el6.x86_64
python-condorutils-1.5-3.el6.noarch
condor-wallaby-tools-4.0-6.el6.noarch
condor-classads-7.6.1-0.10.el6.x86_64
condor-aviary-7.6.1-0.10.el6.x86_64
condor-kbdd-7.6.1-0.10.el6.x86_64
condor-debuginfo-7.6.1-0.10.el6.x86_64
python-qpid-qmf-0.10-10.el6.x86_64
condor-wallaby-base-db-1.13-1.el6.noarch
condor-wallaby-client-4.0-6.el6.noarch
condor-vm-gahp-7.6.1-0.10.el6.x86_64

How reproducible:
About 20% of restarts, while condor is shutting down.

Steps to Reproduce:
1. Set up condor with qmf (I'm not sure whether the crash depends on it).
2. Repeatedly restart condor and qpidd.
3. Check for core files in /var/lib/qpidd/.qpidd/core*

-rw-------. 1 qpidd qpidd 61415424 Jun  8 18:42 /var/lib/qpidd/.qpidd/core.22486
-rw-------. 1 qpidd qpidd 55955456 Jun  8 18:10 /var/lib/qpidd/.qpidd/core.24074
-rw-------. 1 qpidd qpidd 75616256 Jun  8 18:14 /var/lib/qpidd/.qpidd/core.27705
-rw-------. 1 qpidd qpidd 54710272 Jun  8 18:16 /var/lib/qpidd/.qpidd/core.30336
-rw-------. 1 qpidd qpidd 68251648 Jun  8 18:56 /var/lib/qpidd/.qpidd/core.6791
-rw-------. 1 qpidd qpidd 81121280 Jun  8 18:28 /var/lib/qpidd/.qpidd/core.9356
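
The restart loop from the steps above could be automated roughly as follows. This is only a sketch, not the script used for the original test: the `service` command sequence and the service names `qpidd` and `condor` are assumptions based on the packages listed, and the helper names `list_cores` and `run_cycles` are hypothetical. Only the core file path is taken from this report.

```python
import glob
import subprocess
import time

# Core file location as given in this report.
CORE_GLOB = "/var/lib/qpidd/.qpidd/core*"

# ASSUMPTION: one restart cycle stops qpidd first, then condor (the case named
# in the bug title), and starts both again. Adjust to the real init scripts.
DEFAULT_CYCLE = (
    ["service", "qpidd", "stop"],
    ["service", "condor", "stop"],
    ["service", "qpidd", "start"],
    ["service", "condor", "start"],
)


def list_cores(pattern=CORE_GLOB):
    """Return the set of core file paths currently matching the pattern."""
    return set(glob.glob(pattern))


def run_cycles(cycles, run=subprocess.call, commands=DEFAULT_CYCLE,
               pattern=CORE_GLOB, settle_seconds=0):
    """Run restart cycles; return how many cycles left at least one new core file."""
    crashed = 0
    seen = list_cores(pattern)
    for _ in range(cycles):
        for cmd in commands:
            run(list(cmd))
        # Optionally give the daemons time to finish starting or dumping core.
        if settle_seconds:
            time.sleep(settle_seconds)
        now = list_cores(pattern)
        if now - seen:
            crashed += 1
        seen = now
    return crashed
```

Run as root, `run_cycles(100, settle_seconds=5)` would return the number of cycles that produced a new core file. A core file whose name is reused by a later crash with the same PID is not counted again, so the result is a lower bound.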
  
Actual results:
qpidd crashes with a segmentation fault after it is stopped.

Expected results:
No segfault

Additional info:
(gdb) info threads
* 1 Thread 0x7fb416aa17a0 (LWP 22486)  0x00007fb4145ff76e in memcpy () from /lib64/libc.so.6
(gdb) bt
#0  0x00007fb4145ff76e in memcpy () from /lib64/libc.so.6
#1  0x00007fb414e3f1e6 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_clone(std::allocator<char> const&, unsigned long) () from /usr/lib64/libstdc++.so.6
#2  0x00007fb414e3f28c in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
#3  0x00007fb4165f6a60 in ObjectId (this=0x1854e60, __in_chrg=<value optimized out>) at ../include/qpid/management/ManagementObject.h:51
#4  getObjectId (this=0x1854e60, __in_chrg=<value optimized out>) at ../include/qpid/management/ManagementObject.h:199
#5  qpid::management::ManagementAgent::RemoteAgent::~RemoteAgent (this=0x1854e60, __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:113
#6  0x00007fb4165f6c29 in qpid::management::ManagementAgent::RemoteAgent::~RemoteAgent (this=0x1854e60, __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:115
#7  0x00007fb4165133c9 in release (this=<value optimized out>, __in_chrg=<value optimized out>) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:145
#8  boost::detail::shared_count::~shared_count (this=<value optimized out>, __in_chrg=<value optimized out>) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:217
#9  0x00007fb416610a7e in std::_Rb_tree<qpid::management::ObjectId, std::pair<qpid::management::ObjectId const, boost::shared_ptr<qpid::management::ManagementAgent::RemoteAgent> >, std::_Select1st<std::pair<qpid::management::ObjectId const, boost::shared_ptr<qpid::management::ManagementAgent::RemoteAgent> > >, std::less<qpid::management::ObjectId>, std::allocator<std::pair<qpid::management::ObjectId const, boost::shared_ptr<qpid::management::ManagementAgent::RemoteAgent> > > >::_M_erase(std::_Rb_tree_node<std::pair<qpid::management::ObjectId const, boost::shared_ptr<qpid::management::ManagementAgent::RemoteAgent> > >*) () from /usr/lib64/libqpidbroker.so.5.0.0
#10 0x00007fb416602750 in ~_Rb_tree (this=0x7fb416a66010, __in_chrg=<value optimized out>) at /usr/include/c++/4.4.5/bits/stl_tree.h:614
#11 ~map (this=0x7fb416a66010, __in_chrg=<value optimized out>) at /usr/include/c++/4.4.5/bits/stl_map.h:87
#12 qpid::management::ManagementAgent::~ManagementAgent (this=0x7fb416a66010, __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:158
#13 0x00007fb416602939 in qpid::management::ManagementAgent::~ManagementAgent (this=0x7fb416a66010, __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:158
#14 0x00007fb416520103 in ~auto_ptr (this=0x170d220, __in_chrg=<value optimized out>) at /usr/include/c++/4.4.5/backward/auto_ptr.h:168
#15 qpid::broker::Broker::~Broker (this=0x170d220, __in_chrg=<value optimized out>) at qpid/broker/Broker.cpp:405
#16 0x00007fb4165206b9 in qpid::broker::Broker::~Broker (this=0x170d220, __in_chrg=<value optimized out>) at qpid/broker/Broker.cpp:405
#17 0x000000000040ee5b in QpiddDaemon::child() ()
#18 0x00007fb41653be43 in qpid::broker::Daemon::fork (this=0x7fffff138020) at qpid/broker/Daemon.cpp:91
#19 0x000000000040ddfd in QpiddBroker::execute (this=<value optimized out>, options=<value optimized out>) at posix/QpiddBroker.cpp:179
#20 0x000000000040a1f2 in main (argc=4, argv=0x7fffff1385e8) at qpidd.cpp:80

Note:
I have only finished testing on RHEL6/x86_64 and am still waiting for results from other platforms. I will post a comment with additional information once I have them.

Comment 1 Tomas Rusnak 2011-06-09 09:55:15 UTC
I tested on a KVM virtual guest: x86_64, RHEL6.1 (Santiago), 1 core, 512MB RAM. My other tests on physical systems (RHEL5/6, both platforms) with the same RHEL and CPU count >= 2 were all negative.
It looks harder to reproduce. In 100 restarts I found 7 core dumps on the virtual system, all at the same place in the code, while qpidd was shutting down.

Comment 2 Tomas Rusnak 2011-06-20 12:14:58 UTC
I can't reproduce this after a fresh installation of qpidd and about 10000 attempts. If you see this again, please reopen.

