Bug 756446 - Qpid vs. Wallaby SEGFAULT
Summary: Qpid vs. Wallaby SEGFAULT
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 2.3
Target Release: ---
Assignee: Ken Giusti
QA Contact: Stanislav Graf
URL:
Whiteboard:
Depends On: 790390
Blocks: 698425 783492
 
Reported: 2011-11-23 16:04 UTC by ppecka
Modified: 2013-03-19 16:40 UTC (History)

Fixed In Version: qpid-cpp-mrg-0.14-3.el5
Doc Type: Bug Fix
Doc Text:
Cause: Shutting down the broker while QMF V1 agents are connected to it.
Consequence: The broker may incorrectly reference freed memory, and crash.
Fix: Zero out the memory pointer after freeing the memory.
Result: The broker will detect that the memory is freed, and will not access it.
Clone Of:
Environment:
Last Closed: 2013-03-19 16:40:05 UTC
Target Upstream Version:
Embargoed:


Attachments
Corefiles from broker restarts (6.77 MB, application/x-bzip2)
2012-08-22 09:16 UTC, Tomas Rusnak


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 886143 1 None None None 2021-03-16 12:45:55 UTC

Internal Links: 886143

Description ppecka 2011-11-23 16:04:40 UTC
Description of problem:
Restarting the qpidd service while the wallaby daemon is running terminates qpidd with a segmentation fault.


Version-Release number of selected component (if applicable):
# rpm -qa | grep -P '(wallaby|qpid|sesame|condor|qmf)' | sort -u
condor-7.6.5-0.7.el5
condor-aviary-7.6.5-0.7.el5
condor-classads-7.6.5-0.7.el5
condor-debuginfo-7.6.5-0.7.el5
condor-qmf-7.6.5-0.7.el5
condor-wallaby-base-db-1.16-2.el5
condor-wallaby-client-4.1.2-1.el5
condor-wallaby-tools-4.1.2-1.el5
python-condorutils-1.5-4.el5
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-11.el5
python-wallaby-0.12.1-1.el5
python-wallabyclient-4.1.2-1.el5
qpid-cpp-client-0.10-9.el5
qpid-cpp-client-devel-0.10-9.el5
qpid-cpp-client-devel-docs-0.10-9.el5
qpid-cpp-client-ssl-0.10-9.el5
qpid-cpp-mrg-debuginfo-0.10-9.el5
qpid-cpp-server-0.10-9.el5
qpid-cpp-server-cluster-0.10-9.el5
qpid-cpp-server-devel-0.10-9.el5
qpid-cpp-server-ssl-0.10-9.el5
qpid-cpp-server-store-0.10-9.el5
qpid-cpp-server-xml-0.10-9.el5
qpid-java-client-0.10-11.el5
qpid-java-common-0.10-11.el5
qpid-java-example-0.10-11.el5
qpid-qmf-0.10-11.el5
qpid-qmf-0.10-2.el5
qpid-qmf-devel-0.10-11.el5
qpid-tools-0.10-6.el5
rh-tests-distribution-MRG-Grid-grid_ptest_unit_wallaby-1.0-5
ruby-qpid-qmf-0.10-11.el5
ruby-qpid-qmf-0.10-2.el5
ruby-wallaby-0.12.1-1.el5
sesame-1.0-1.el5
wallaby-0.12.1-1.el5
wallaby-utils-0.12.1-1.el5


How reproducible:
90%

Steps to Reproduce:
1. service qpidd restart
2. service wallaby start
3. watch 'service qpidd restart'
4. ls -lSrh /var/lib/qpidd/.qpidd/ 
  
Actual results:
Core was generated by `/usr/sbin/qpidd --data-dir /var/lib/qpidd --daemon'.
Program terminated with signal 11, Segmentation fault.
#0  0x006b0b6c in memcpy () from /lib/libc.so.6
(gdb) info threads 
* 1 Thread 0xb7f74960 (LWP 3509)  0x006b0b6c in memcpy () from /lib/libc.so.6

(gdb) thread apply all bt

Thread 1 (Thread 0xb7f74960 (LWP 3509)):
#0  0x006b0b6c in memcpy () from /lib/libc.so.6
#1  0x00423d04 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_clone(std::allocator<char> const&, unsigned int) () from /usr/lib/libstdc++.so.6
#2  0x00424667 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /usr/lib/libstdc++.so.6
#3  0x0565ce5f in ObjectId (this=0x9807498, __in_chrg=<value optimized out>)
    at ../include/qpid/management/ManagementObject.h:51
#4  getObjectId (this=0x9807498, __in_chrg=<value optimized out>)
    at ../include/qpid/management/ManagementObject.h:199
#5  qpid::management::ManagementAgent::RemoteAgent::~RemoteAgent (this=0x9807498, 
    __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:113
#6  0x056644e8 in checked_delete<qpid::management::ManagementAgent::RemoteAgent> (
    this=0x9806320) at /usr/include/boost/checked_delete.hpp:34
#7  boost::detail::sp_counted_impl_p<qpid::management::ManagementAgent::RemoteAgent>::dispose (
    this=0x9806320) at /usr/include/boost/detail/sp_counted_impl.hpp:76
#8  0x05664a18 in ~shared_count (this=0x98075c8, __in_chrg=<value optimized out>)
    at /usr/include/boost/detail/sp_counted_base_gcc_x86.hpp:145
#9  ~shared_ptr (this=0x98075c8, __in_chrg=<value optimized out>)
    at /usr/include/boost/shared_ptr.hpp:106
#10 std::pair<qpid::management::ObjectId const, boost::shared_ptr<qpid::management::ManagementAgent::RemoteAgent> >::~pair (this=0x98075c8, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.1.2/bits/stl_pair.h:69
#11 0x05656ed4 in _M_erase (this=0xb7542008, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.1.2/ext/new_allocator.h:107
#12 ~_Rb_tree (this=0xb7542008, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.1.2/bits/stl_tree.h:578
#13 ~map (this=0xb7542008, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.1.2/bits/stl_map.h:93
#14 qpid::management::ManagementAgent::~ManagementAgent (this=0xb7542008, 
    __in_chrg=<value optimized out>) at qpid/management/ManagementAgent.cpp:158
#15 0x05512552 in ~auto_ptr (this=0x970b708, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.1.2/memory:259
#16 qpid::broker::Broker::~Broker (this=0x970b708, __in_chrg=<value optimized out>)
    at qpid/broker/Broker.cpp:405
#17 0x0550b505 in qpid::RefCounted::released (this=0xa3b7e008) at qpid/RefCounted.h:48
#18 0x05540630 in qpid::broker::Daemon::fork (this=0xbf8a5498) at qpid/broker/Daemon.cpp:91
#19 0x0804e12d in QpiddBroker::execute (this=0xbf8a5725, options=0x9704520)
    at posix/QpiddBroker.cpp:179
#20 0x0804c811 in main (argc=4, argv=0xbf8a57d4) at qpidd.cpp:80
(gdb

Expected results:
qpidd restarts without segfaults

Additional info:
#cat /etc/qpidd.conf
cluster-mechanism=ANONYMOUS
log-to-file=/var/lib/qpidd/qpidd.log
log-enable=debug+



#>/var/lib/qpidd/qpidd.log ; service qpidd restart; service wallaby restart;sleep 3.5; echo -ne 'Y\ny\n*\n*\nlocalhost\nn\ny\n' | condor_configure_pool -a -n `hostname` -f Master,ExecuteNode,NodeAccess; service qpidd restart

triggered by Grid QE trusnak with grid_ptest_triggerd test.

Comment 1 Tomas Rusnak 2011-11-23 16:07:09 UTC
This can be reproduced on all supported platforms: RHEL5 and RHEL6, on both x86 and x86_64.

Comment 2 Justin Ross 2011-11-28 15:06:02 UTC
Ken, please assess.

Comment 3 Ken Giusti 2011-11-28 15:48:57 UTC
I *think* the crash is happening during shutdown, when the broker object is being destroyed.

It appears that the ~RemoteAgent() destructor is being called after the ~ManagementAgent() destructor has released the management object the remote agent was pointing at.  


The ~ManagementAgent destructor probably needs to manually clear out the remote agents while it holds the lock, *before* the management agent clears out the management objects.
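
The ordering described above can be sketched roughly as follows. This is only an illustration and not the actual Qpid code or the upstream fix: AgentSketch, RemoteAgentSketch, ManagedObject and runShutdownSketch are hypothetical names, and the real ManagementAgent is considerably more involved. The point is that the owner's destructor drops the remote agents explicitly, under the lock, while the objects they point at are still alive, instead of relying on implicit member destruction order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a management object owned by the agent.
struct ManagedObject {
    std::string id;
    explicit ManagedObject(std::string i) : id(std::move(i)) {}
};

// Hypothetical stand-in for a remote agent: it only borrows the object.
struct RemoteAgentSketch {
    ManagedObject* mgmtObject;      // not owned by the remote agent
    std::vector<std::string>* log;  // records what the destructor touched

    RemoteAgentSketch(ManagedObject* o, std::vector<std::string>* l)
        : mgmtObject(o), log(l) {}

    ~RemoteAgentSketch() {
        // Dereferencing mgmtObject is only safe while its owner still holds it.
        if (mgmtObject) log->push_back("released " + mgmtObject->id);
    }
};

class AgentSketch {
public:
    explicit AgentSketch(std::vector<std::string>* log) : log_(log) {}

    void connect(const std::string& id) {
        std::lock_guard<std::mutex> guard(lock_);
        objects_.push_back(std::unique_ptr<ManagedObject>(new ManagedObject(id)));
        remoteAgents_[id] =
            std::make_shared<RemoteAgentSketch>(objects_.back().get(), log_);
    }

    ~AgentSketch() {
        std::lock_guard<std::mutex> guard(lock_);
        remoteAgents_.clear();          // 1. destroy remote agents first
        assert(remoteAgents_.empty());  //    nothing borrows the objects now
        objects_.clear();               // 2. only then release the objects
    }

private:
    std::mutex lock_;
    std::vector<std::string>* log_;
    std::vector<std::unique_ptr<ManagedObject>> objects_;
    std::map<std::string, std::shared_ptr<RemoteAgentSketch>> remoteAgents_;
};

// Builds an agent with two remote agents, destroys it, and returns the
// messages logged by the remote-agent destructors.
std::vector<std::string> runShutdownSketch() {
    std::vector<std::string> log;
    {
        AgentSketch agent(&log);
        agent.connect("agent-1");
        agent.connect("agent-2");
    }
    return log;
}
```

With the two clear() calls swapped, each RemoteAgentSketch destructor would read through a dangling pointer, which is the kind of access the backtrace in the description suggests.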

Comment 4 Ken Giusti 2011-11-28 21:53:22 UTC
Confirmed this is indeed a failure to clean up resources on system exit.

Upstream JIRA:

https://issues.apache.org/jira/browse/QPID-3648

Comment 5 ppecka 2011-11-29 00:28:20 UTC
I can confirm that the segfaults are seen only on qpidd shutdown.
The clustered service seems to be unaffected, and stopped/segfaulted nodes are able to rejoin the cluster (even after a segfault).

Comment 6 Ken Giusti 2011-11-29 13:39:26 UTC
I've checked in a fix upstream that resolves the crashes on shutdown that I've been able to reproduce using trunk:

http://svn.apache.org/viewvc?view=revision&revision=1207877

Comment 7 ppecka 2011-11-29 15:48:05 UTC
It also happens with the 0.12 packages on RHEL6:

rpm -qa | grep -P '(wallaby|qpid|sesame|condor|qmf)' | sort -u
condor-7.6.5-0.7.el6.i686
condor-aviary-7.6.5-0.7.el6.i686
condor-classads-7.6.5-0.7.el6.i686
condor-classads-devel-7.6.5-0.7.el6.i686
condor-debuginfo-7.6.5-0.7.el6.i686
condor-kbdd-7.6.5-0.7.el6.i686
condor-qmf-7.6.5-0.7.el6.i686
condor-wallaby-base-db-1.16-2.el6.noarch
condor-wallaby-client-4.1.2-1.el6.noarch
condor-wallaby-tools-4.1.2-1.el6.noarch
python-condorutils-1.5-4.el6.noarch
python-qpid-0.12-1.el6.noarch
python-qpid-qmf-0.12-6.el6.i686
python-wallaby-0.12.1-1.el6.noarch
python-wallabyclient-4.1.2-1.el6.noarch
qpid-cpp-client-0.12-6.el6.i686
qpid-cpp-client-ssl-0.12-6.el6.i686
qpid-cpp-server-0.12-6.el6.i686
qpid-cpp-server-ssl-0.12-6.el6.i686
qpid-qmf-0.12-6.el6.i686
qpid-tests-0.12-1.el6.noarch
qpid-tools-0.12-2.el6.noarch
rh-qpid-cpp-tests-0.12-6.el6.i686
rh-tests-distribution-MRG-Grid-grid_ptest_unit_wallaby-1.0-5.noarch
ruby-qpid-0.7.946106-2.el6.i686
ruby-qpid-qmf-0.12-6.el6.i686
ruby-wallaby-0.12.1-1.el6.noarch
sesame-debuginfo-1.0-1.el6.i686
wallaby-0.12.1-1.el6.noarch
wallaby-utils-0.12.1-1.el6.noarch

Comment 8 Justin Ross 2011-11-29 18:39:09 UTC
Fixed upstream at http://svn.apache.org/viewvc?view=rev&rev=1207877

Comment 10 ppecka 2012-02-14 09:47:06 UTC


#0  0x2e656863 in ?? ()
#1  0x0562a64a in qpid::management::ManagementAgent::DeletedObject::DeletedObject (this=0x852ea70, src=0xb5f57df8, v1=true, v2=true)
    at qpid/management/ManagementAgent.cpp:2822
#2  0x05635e59 in qpid::management::ManagementAgent::moveNewObjectsLH (
    this=0xb6e23008) at qpid/management/ManagementAgent.cpp:679
#3  0x056491f9 in qpid::management::ManagementAgent::~ManagementAgent (
    this=0xb6e23008, __in_chrg=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:153
#4  0x05649863 in qpid::management::ManagementAgent::~ManagementAgent (
    this=0xb6e23008, __in_chrg=<value optimized out>)
    at qpid/management/ManagementAgent.cpp:162
#5  0x055338e2 in ~auto_ptr (this=0x84ed4f8, __in_chrg=<value optimized out>)
    at /usr/include/c++/4.4.6/backward/auto_ptr.h:168
#6  qpid::broker::Broker::~Broker (this=0x84ed4f8, 
    __in_chrg=<value optimized out>) at qpid/broker/Broker.cpp:426
#7  0x05533f53 in qpid::broker::Broker::~Broker (this=0x84ed4f8, 
    __in_chrg=<value optimized out>) at qpid/broker/Broker.cpp:426
#8  0x05522e16 in qpid::RefCounted::released (this=0x84ed510)
    at qpid/RefCounted.h:48
#9  0x08055ec4 in release (this=0xbfabe60c) at qpid/RefCounted.h:42
#10 intrusive_ptr_release<qpid::broker::Broker> (this=0xbfabe60c)
    at qpid/RefCounted.h:59
#11 ~intrusive_ptr (this=0xbfabe60c)
    at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:101
#12 QpiddDaemon::child (this=0xbfabe60c) at posix/QpiddBroker.cpp:144
#13 0x055513f0 in qpid::broker::Daemon::fork (this=0xbfabe60c)
    at qpid/broker/Daemon.cpp:91
#14 0x08054778 in QpiddBroker::execute (this=0xbfabe87f, options=0x84e8a20)
    at posix/QpiddBroker.cpp:182
#15 0x08050d79 in run_broker (argc=4, argv=0xbfabe964, hidden=false)
    at qpidd.cpp:83
#16 0x08054074 in main (argc=4, argv=0xbfabe964) at posix/QpiddBroker.cpp:202


rpm -qa | grep -P '(wallaby|qpid|sesame|condor|qmf)' | sort -u
condor-7.6.5-0.12.el6.i686
condor-classads-7.6.5-0.12.el6.i686
condor-qmf-7.6.5-0.12.el6.i686
condor-wallaby-base-db-1.19-1.el6.noarch
condor-wallaby-client-4.1.2-1.el6.noarch
condor-wallaby-tools-4.1.2-1.el6.noarch
python-condorutils-1.5-4.el6.noarch
python-qpid-0.14-2.el6.noarch
python-qpid-qmf-0.14-3.el6.i686
python-wallaby-0.12.5-1.el6.noarch
python-wallabyclient-4.1.2-1.el6.noarch
qpid-cpp-client-0.14-6.el6.i686
qpid-cpp-debuginfo-0.14-6.el6.i686
qpid-cpp-server-0.14-6.el6.i686
qpid-qmf-0.14-3.el6.i686
qpid-tools-0.14-1.el6.noarch
ruby-qpid-qmf-0.14-3.el6.i686
ruby-wallaby-0.12.5-1.el6.noarch
sesame-1.0-2.el6.i686
wallaby-0.12.5-1.el6.noarch
wallaby-utils-0.12.5-1.el6.noarch

Comment 11 Ken Giusti 2012-02-14 14:50:34 UTC
Crash has a different signature - I think you found another race in the cleanup code for ManagementAgent.

This crash involves the DeletedObject class, I'll look into it.

Comment 12 ppecka 2012-03-06 12:28:48 UTC
There is a blocker, https://bugzilla.redhat.com/show_bug.cgi?id=790390, with an unconfirmed target milestone. Proposing to skip the errata.

Comment 13 Ken Giusti 2012-03-07 21:39:18 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
    Shutting down the broker while QMF V1 agents are connected to it.
Consequence
    The broker may incorrectly reference freed memory, and crash.
Fix
    Zero out the memory pointer after freeing the memory.
Result
    The broker will detect that the memory is freed, and will not access it.
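
A minimal sketch of the "zero out the pointer after freeing" idea from the note above. It is an illustration under assumptions, not the code that was committed: RemoteAgentSketch, ManagedObject, release(), hasObject() and objectName() are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the management object a remote agent refers to.
struct ManagedObject {
    std::string name;
};

class RemoteAgentSketch {
public:
    explicit RemoteAgentSketch(const std::string& name)
        : mgmtObject_(new ManagedObject{name}) {}

    ~RemoteAgentSketch() {
        release();
        assert(!hasObject());
    }

    // Non-copyable, so two instances never free the same object.
    RemoteAgentSketch(const RemoteAgentSketch&) = delete;
    RemoteAgentSketch& operator=(const RemoteAgentSketch&) = delete;

    // Frees the object and zeroes the pointer, so later code can detect
    // that the memory is gone. Calling it twice is harmless, because
    // deleting a null pointer is a no-op.
    void release() {
        delete mgmtObject_;
        mgmtObject_ = nullptr;
    }

    bool hasObject() const { return mgmtObject_ != nullptr; }

    // Returns the object's name, or an empty string once it was released.
    std::string objectName() const {
        return mgmtObject_ ? mgmtObject_->name : std::string();
    }

private:
    ManagedObject* mgmtObject_;
};
```

The null check only helps code that reads the pointer through this one holder; other copies of the raw pointer would still dangle, so it complements the ordering of the cleanup rather than replacing it.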

Comment 16 Stanislav Graf 2012-04-05 15:16:34 UTC
Bug 802807 is a possible duplicate of this one.
See bug 802807, comment 2

Comment 17 Stanislav Graf 2012-04-10 14:33:49 UTC
Scenario:
for i in $(seq 1000); do service qpidd restart; done
While wallaby is configured to connect to qpidd.

Reproduced:
RHEL5 i386/x86_64
python-qpid-0.10-1.el5.noarch
python-qpid-qmf-0.10-11.el5.x86_64
qpid-cpp-client-0.10-9.el5.x86_64
qpid-cpp-client-ssl-0.10-9.el5.x86_64
qpid-cpp-server-0.10-9.el5.x86_64
qpid-cpp-server-ssl-0.10-9.el5.x86_64
qpid-qmf-0.10-11.el5.x86_64
qpid-tools-0.10-6.el5.noarch
ruby-qpid-qmf-0.10-11.el5.x86_64

RHEL6 i386/x86_64
python-qpid-0.12-1.el6.noarch
python-qpid-qmf-0.12-6.el6.x86_64
qpid-cpp-client-0.12-6.el6.x86_64
qpid-cpp-client-ssl-0.12-6.el6.x86_64
qpid-cpp-server-0.12-6.el6.x86_64
qpid-cpp-server-ssl-0.12-6.el6.x86_64
qpid-qmf-0.12-6.el6.x86_64
qpid-tools-0.12-2.el6.noarch
ruby-qpid-qmf-0.12-6.el6.x86_64

Verify:
python-qpid-0.14-6.el5.noarch
python-qpid-qmf-0.14-6.el5.x86_64
qpid-cpp-client-0.14-14.el5.x86_64
qpid-cpp-client-ssl-0.14-14.el5.x86_64
qpid-cpp-server-0.14-14.el5.x86_64
qpid-cpp-server-ssl-0.14-14.el5.x86_64
qpid-qmf-0.14-6.el5.x86_64
qpid-tools-0.14-1.el5.noarch
ruby-qpid-qmf-0.14-6.el5.x86_64

python-qpid-0.14-7.el6_2.noarch
python-qpid-qmf-0.14-6.el6_2.x86_64
qpid-cpp-client-0.14-14.el6_2.x86_64
qpid-cpp-client-ssl-0.14-14.el6_2.x86_64
qpid-cpp-server-0.14-14.el6_2.x86_64
qpid-cpp-server-ssl-0.14-14.el6_2.x86_64
qpid-qmf-0.14-6.el6_2.x86_64
qpid-tools-0.14-1.el6_2.noarch
ruby-qpid-qmf-0.14-6.el6_2.x86_64

---> FAIL, back to ASSIGNED

Comment 25 Tomas Rusnak 2012-08-22 09:14:29 UTC
Retested with x86_64/RHEL6 (100 restarts):

qpid-cpp-client-devel-0.14-21.el6_3.x86_64
qpid-tools-0.14-5.el6_3.noarch
qpid-cpp-server-0.14-21.el6_3.x86_64
qpid-cpp-server-store-0.14-21.el6_3.x86_64
python-qpid-qmf-0.14-14.el6_3.x86_64
qpid-java-client-0.18-1.el6.noarch
qpid-qmf-devel-0.14-14.el6_3.x86_64
qpid-cpp-client-0.14-21.el6_3.x86_64
qpid-java-example-0.18-1.el6.noarch
qpid-qmf-debuginfo-0.14-14.el6_3.x86_64
qpid-qmf-0.14-14.el6_3.x86_64
qpid-cpp-server-xml-0.14-21.el6_3.x86_64
python-qpid-0.14-11.el6_3.noarch
qpid-java-common-0.18-1.el6.noarch
qpid-cpp-server-devel-0.14-21.el6_3.x86_64
qpid-cpp-client-devel-docs-0.14-21.el6_3.noarch
ruby-qpid-qmf-0.14-14.el6_3.x86_64
qpid-cpp-server-cluster-0.14-21.el6_3.x86_64
wallaby-utils-0.12.5-10.el6.noarch
condor-wallaby-client-4.1.3-1.el6.noarch
ruby-wallaby-0.12.5-10.el6.noarch
python-wallabyclient-4.1.3-1.el6.noarch
condor-wallaby-tools-4.1.3-1.el6.noarch
wallaby-0.12.5-10.el6.noarch
python-wallaby-0.12.5-10.el6.noarch
condor-wallaby-base-db-1.22-5.el6.noarch


# ls -la /var/lib/qpidd/.qpidd/
total 42936
drwxr-xr-x. 2 qpidd qpidd     4096 Aug 22 04:44 .
drwxr-xr-x. 4 qpidd qpidd     4096 Aug 13 18:12 ..
-rw-------. 1 qpidd qpidd 37101568 Aug 21 13:33 core.13410
-rw-------. 1 qpidd qpidd 37421056 Aug 21 21:23 core.14410
-rw-------. 1 qpidd qpidd 37404672 Aug 21 22:06 core.19768
-rw-------. 1 qpidd qpidd 37220352 Aug 21 22:27 core.21627
-rw-------. 1 qpidd qpidd 52408320 Aug 21 12:55 core.2668
-rw-------. 1 qpidd qpidd 37568512 Aug 21 23:09 core.26885
-rw-------. 1 qpidd qpidd 36716544 Aug 21 23:10 core.27187
-rw-------. 1 qpidd qpidd 37179392 Aug 21 23:10 core.27490
-rw-------. 1 qpidd qpidd 37175296 Aug 21 19:36 core.32330
-rw-------. 1 qpidd qpidd 36986880 Aug 21 20:40 core.9198


[04:45:39] Core file: /var/lib/qpidd/.qpidd/core.27490 generated by /usr/sbin/qpidd ----------------------9/9-
-rw-------. 1 qpidd qpidd 37179392 Aug 21 23:10 /var/lib/qpidd/.qpidd/core.27490
/var/lib/qpidd/.qpidd/core.27490: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/usr/sbin/qpidd --data-dir /var/lib/qpidd --daemon'
  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
  Copyright (C) 2010 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  [New Thread 27490]
  [Thread debugging using libthread_db enabled]
  Core was generated by `/usr/sbin/qpidd --data-dir /var/lib/qpidd --daemon'.
  Program terminated with signal 11, Segmentation fault.
  #0  0x00007f79100b0f18 in ?? ()
  Missing separate debuginfos, use: debuginfo-install qpid-cpp-server-0.14-21.el6_3.x86_64
  (gdb) rax            0x7f79100b16d0	140157936932560
  rbx            0x80f080	8450176
  rcx            0x1	1
  rdx            0x1	1
  rsi            0x7f7910024a90	140157936355984
  rdi            0x7f7910024a90	140157936355984
  rbp            0x7f7910024a90	0x7f7910024a90
  rsp            0x7fffd53548f8	0x7fffd53548f8
  r8             0x7f7946ae4ee8	140158853598952
  r9             0x1	1
  r10            0x58	88
  r11            0x200	512
  r12            0x7fffd5354ed0	140736770428624
  r13            0x7f7910024a90	140157936355984
  r14            0x7f7947eed010	140158874603536
  r15            0x7f7910009010	140157936242704
  rip            0x7f79100b0f18	0x7f79100b0f18
  eflags         0x10206	[ PF IF RF ]
  cs             0x33	51
  ss             0x2b	43
  ds             0x0	0
  es             0x0	0
  fs             0x0	0
  gs             0x0	0
  (gdb) Using memory regions provided by the target.
  There are no memory regions defined.
  (gdb) 33   AT_SYSINFO_EHDR      System-supplied DSO's ELF header 0x7fffd53ff000
  16   AT_HWCAP             Machine-dependent CPU capability hints 0xbfebf3ff
  6    AT_PAGESZ            System page size               4096
  17   AT_CLKTCK            Frequency of times()           100
  3    AT_PHDR              Program headers for program    0x400040
  4    AT_PHENT             Size of program header entry   56
  5    AT_PHNUM             Number of program headers      8
  7    AT_BASE              Base address of interpreter    0x7f7947d23000
  8    AT_FLAGS             Flags                          0x0
  9    AT_ENTRY             Entry point of program         0x409f50
  11   AT_UID               Real user ID                   498
  12   AT_EUID              Effective user ID              498
  13   AT_GID               Real group ID                  499
  14   AT_EGID              Effective group ID             499
  23   AT_SECURE            Boolean, was exec setuid-like? 0
  25   AT_RANDOM            Address of 16 random bytes     0x7fffd5355ef9
  31   AT_EXECFN            File name of executable        0x7fffd5356fe8 "/usr/sbin/qpidd"
  15   AT_PLATFORM          String identifying platform    0x7fffd5355f09 "x86_64"
  0    AT_NULL              End of vector                  0x0
  (gdb) Stack level 0, frame at 0x7fffd5354900:
   rip = 0x7f79100b0f18; saved rip 0x7f7947a73068
   called by frame at 0x7fffd5354ca0
   Arglist at 0x7fffd53548f0, args: 
   Locals at 0x7fffd53548f0, Previous frame's sp is 0x7fffd5354900
   Saved registers:
    rip at 0x7fffd53548f8
  (gdb) From                To                  Syms Read   Shared Object Library
  0x00007f794790efa0  0x00007f7947a9c948  Yes (*)     /usr/lib64/libqpidbroker.so.7
  0x00007f79474c5b00  0x00007f79475afcf8  Yes (*)     /usr/lib64/libqpidcommon.so.7
  0x00007f7947182a50  0x00007f794718fa08  Yes (*)     /usr/lib64/libqpidtypes.so.1
  0x0000003eea021b20  0x0000003eea03dc18  Yes (*)     /usr/lib64/libboost_program_options.so.5
  0x0000003eeb009a90  0x0000003eeb010b18  Yes (*)     /usr/lib64/libboost_filesystem.so.5
  0x0000003eeec015a0  0x0000003eeec02cc8  Yes (*)     /lib64/libuuid.so.1
  0x00007f7946f75de0  0x00007f7946f76998  Yes (*)     /lib64/libdl.so.2
  0x00007f7946d6f140  0x00007f7946d724f8  Yes (*)     /lib64/librt.so.1
  0x0000003ef08046e0  0x0000003ef0814578  Yes (*)     /usr/lib64/libsasl2.so.2
  0x0000003eeac563f0  0x0000003eeacc3376  Yes (*)     /usr/lib64/libstdc++.so.6
  0x00007f7946aecea0  0x00007f7946b2cfe8  Yes (*)     /lib64/libm.so.6
  0x0000003eea802910  0x0000003eea812f18  Yes (*)     /lib64/libgcc_s.so.1
  0x00007f7946775a20  0x00007f794689552c  Yes (*)     /lib64/libc.so.6
  0x00007f794653f660  0x00007f794654aeb8  Yes (*)     /lib64/libpthread.so.0
  0x00007f7947d23b00  0x00007f7947d3c85b  Yes (*)     /lib64/ld-linux-x86-64.so.2
  0x0000003ee9c01220  0x0000003ee9c01a08  Yes (*)     /usr/lib64/libboost_system.so.5
  0x00007f7946323930  0x00007f79463328e8  Yes (*)     /lib64/libresolv.so.2
  0x00007f79460e9c00  0x00007f79460ee9a8  Yes (*)     /lib64/libcrypt.so.1
  0x0000003eee4032b0  0x0000003eee442078  Yes (*)     /lib64/libfreebl3.so
  0x00007f7945edf220  0x00007f7945ee3f88  Yes (*)     /usr/lib64/qpid/daemon/watchdog.so
  0x00007f7945ccbf20  0x00007f7945cd05d8  Yes (*)     /usr/lib64/qpid/daemon/replication_exchange.so
  0x00007f7945a35690  0x00007f7945a9aa58  Yes (*)     /usr/lib64/qpid/daemon/cluster.so
  0x00007f79457da160  0x00007f79457de418  Yes (*)     /usr/lib64/libcpg.so.4
  0x00007f79455d5340  0x00007f79455d74d8  Yes (*)     /usr/lib64/libcman.so.3
  0x00007f794535ec80  0x00007f79453af318  Yes (*)     /usr/lib64/libqpidclient.so.7
  0x0000003ee9801290  0x0000003ee98046d8  Yes (*)     /usr/lib64/libcoroipcc.so.4
  0x00007f794510a1c0  0x00007f794510f1e8  Yes (*)     /usr/lib64/qpid/daemon/replicating_listener.so
  0x00007f7944ef1110  0x00007f7944efb158  Yes (*)     /usr/lib64/qpid/daemon/xml.so
  0x00007f7944a7f9c0  0x00007f7944bd8b78  Yes (*)     /usr/lib64/libxerces-c-3.0.so
  0x00007f7944429a80  0x00007f79445e9438  Yes (*)     /usr/lib64/libxqilla.so.5
  0x00007f7944079070  0x00007f79440869f8  Yes (*)     /lib64/libnsl.so.1
  0x00007f7943e71580  0x00007f7943e72cd8  Yes (*)     /usr/lib64/gconv/UTF-16.so
  0x00007f7943c3ff30  0x00007f7943c628b8  Yes (*)     /usr/lib64/qpid/daemon/acl.so
  0x00007f7943975960  0x00007f79439fd608  Yes (*)     /usr/lib64/qpid/daemon/msgstore.so
  0x00007f79435d72a0  0x00007f79436f88c8  Yes (*)     /usr/lib64/libdb_cxx-4.7.so
  0x0000003ee8c00570  0x0000003ee8c00721  Yes (*)     /lib64/libaio.so.1
  (*): Shared library is missing debugging information.
  (gdb) * 1 Thread 0x7f7947f287a0 (LWP 27490)  0x00007f79100b0f18 in ?? ()
  Thread 1 (Thread 0x7f7947f287a0 (LWP 27490)):
  #0  0x00007f79100b0f18 in ?? ()
  #1  0x00007f7947a73068 in qpid::management::ManagementAgent::DeletedObject::DeletedObject(qpid::management::ManagementObject*, bool, bool) () from /usr/lib64/libqpidbroker.so.7
  #2  0x00007f7947a74dc6 in qpid::management::ManagementAgent::moveNewObjectsLH() () from /usr/lib64/libqpidbroker.so.7
  #3  0x00007f7947a76ee0 in qpid::management::ManagementAgent::~ManagementAgent() () from /usr/lib64/libqpidbroker.so.7
  #4  0x00007f7947a77389 in qpid::management::ManagementAgent::~ManagementAgent() () from /usr/lib64/libqpidbroker.so.7
  #5  0x00007f7947983451 in qpid::broker::Broker::~Broker() () from /usr/lib64/libqpidbroker.so.7
  #6  0x00007f7947983ad9 in qpid::broker::Broker::~Broker() () from /usr/lib64/libqpidbroker.so.7
  #7  0x000000000040f1c3 in ?? ()
  #8  0x00007f79479a9d53 in qpid::broker::Daemon::fork() () from /usr/lib64/libqpidbroker.so.7
  #9  0x000000000040dded in ?? ()
  #10 0x000000000040a32a in ?? ()
  #11 0x00007f7946775cdd in __libc_start_main () from /lib64/libc.so.6
  #12 0x0000000000409f79 in ?? ()
  #13 0x00007fffd5355d38 in ?? ()
  #14 0x000000000000001c in ?? ()
  #15 0x0000000000000004 in ?? ()
  #16 0x00007fffd5356f19 in ?? ()
  #17 0x00007fffd5356f29 in ?? ()
  #18 0x00007fffd5356f34 in ?? ()
  #19 0x00007fffd5356f43 in ?? ()
  #20 0x0000000000000000 in ?? ()
  (gdb) quit

Comment 26 Tomas Rusnak 2012-08-22 09:16:51 UTC
Created attachment 606182 [details]
Corefiles from broker restarts

Comment 28 Stanislav Graf 2012-12-11 15:47:14 UTC
Reproduction scenario:
# rm -f /var/lib/qpidd/.qpidd/core.*
# ls /var/lib/qpidd/.qpidd/core.* | wc -w
# for i in $(seq 1000); do service qpidd restart; done
# ls /var/lib/qpidd/.qpidd/core.* | wc -w

Reproduced on the RHN version: qpid-cpp-server-0.14-22
Upgraded to qpid-cpp-server-0.18-12 and rebooted.
Checked with 'qpid-stat -c' that all daemons are connected.

Verification:
# ls /var/lib/qpidd/.qpidd/core.* | wc -w
ls: /var/lib/qpidd/.qpidd/core.*: No such file or directory
0

New issue separated into Bug 886143.

---> VERIFIED

