Bug 510241 - clustered qpidd crash in qpid::sys::Poller::run()
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Target Release: ---
Assigned To: Andrew Stitcher
QA Contact: ppecka
Reported: 2009-07-08 09:13 EDT by Frantisek Reznicek
Modified: 2015-11-15 20:11 EST (History)
Doc Type: Bug Fix
Doc Text:
The clustered qpidd service no longer terminates unexpectedly in qpid::sys::Poller::run().
Last Closed: 2010-10-14 11:59:20 EDT
Attachments
reproducer (11.58 KB, application/x-tbz)
2009-07-08 09:13 EDT, Frantisek Reznicek


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2010:0773
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3
Last Updated: 2010-10-14 11:56:44 EDT

Description Frantisek Reznicek 2009-07-08 09:13:39 EDT
Created attachment 350930
reproducer

Description of problem:
This crash in qpid::sys::Poller::run() was seen during validation of BZ 506758, on RHEL 5.3 x86_64
running on a Quad-Core AMD Opteron(tm) Processor 2376.


Version-Release number of selected component (if applicable):
[root@mrg-qe-02 bz506758]# rpm -qa | egrep '(qpid|rhm|openais)' | sort -u
openais-0.80.3-22.el5_3.8
openais-debuginfo-0.80.3-22.el5_3.8
python-qpid-0.5.752581-3.el5
qpidc-0.5.752581-22.el5
qpidc-debuginfo-0.5.752581-22.el5
qpidc-devel-0.5.752581-22.el5
qpidc-perftest-0.5.752581-22.el5
qpidc-rdma-0.5.752581-22.el5
qpidc-ssl-0.5.752581-22.el5
qpidd-0.5.752581-22.el5
qpidd-acl-0.5.752581-22.el5
qpidd-cluster-0.5.752581-22.el5
qpidd-devel-0.5.752581-22.el5
qpid-dotnet-0.4.738274-2.el5
qpidd-rdma-0.5.752581-22.el5
qpidd-ssl-0.5.752581-22.el5
qpidd-xml-0.5.752581-22.el5
qpid-java-client-0.5.751061-8.el5
qpid-java-common-0.5.751061-8.el5
rhm-0.5.3206-5.el5
rhm-docs-0.5.756148-1.el5


How reproducible:
Very hard to reproduce (<1%)
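
For planning a reproduction attempt, the "<1%" figure above can be turned into a rough run count. This is a sketch under assumptions that are not in the report: each run of the reproducer is treated as independent, and p = 0.01 is used as an upper bound on the per-run crash probability, so the results are lower bounds on the number of runs. The function name runs_needed is illustrative only.

```python
import math


def runs_needed(p_crash: float, confidence: float) -> int:
    """Smallest n such that 1 - (1 - p_crash)**n >= confidence."""
    if not (0.0 < p_crash < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p_crash and confidence must both be in (0, 1)")
    # Solve (1 - p)^n <= 1 - c for n; both logarithms are negative.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_crash))


if __name__ == "__main__":
    # p = 0.01 is an assumed upper bound taken from "<1%".
    for c in (0.50, 0.95, 0.99):
        print(f"{c:.0%} chance of at least one crash: {runs_needed(0.01, c)} runs")
```

If the real crash rate is well below 1%, the required number of runs grows roughly in proportion to 1/p.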

Steps to Reproduce:
0. Install and set up openais.
1. Run the attached reproducer: ./run.sh 5 100
   (5-node cluster, 100 instances of subscribe running in parallel)
2. Wait for the crash.
  
Actual results:
Crash

Expected results:
No crash

Additional info (threaded backtrace):

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /usr/lib64/libqpidbroker.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libqpidbroker.so.0.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/libqpidbroker.so.0
Reading symbols from /usr/lib64/libqpidcommon.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libqpidcommon.so.0.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/libqpidcommon.so.0
Reading symbols from /usr/lib64/libboost_program_options.so.2...done.
Loaded symbols for /usr/lib64/libboost_program_options.so.2
Reading symbols from /usr/lib64/libboost_filesystem.so.2...done.
Loaded symbols for /usr/lib64/libboost_filesystem.so.2
Reading symbols from /lib64/libuuid.so.1...done.
Loaded symbols for /lib64/libuuid.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/librt.so.1...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /usr/lib64/libsasl2.so.2...done.
Loaded symbols for /usr/lib64/libsasl2.so.2
Reading symbols from /usr/lib64/libstdc++.so.6...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libpthread.so.0...done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libresolv.so.2...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /usr/lib64/qpid/daemon/replicating_listener.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/replicating_listener.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/replicating_listener.so
Reading symbols from /usr/lib64/qpid/daemon/rdma.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/rdma.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/rdma.so
Reading symbols from /usr/lib64/librdmawrap.so.0...Reading symbols from /usr/lib/debug/usr/lib64/librdmawrap.so.0.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/librdmawrap.so.0
Reading symbols from /usr/lib64/librdmacm.so.1...done.
Loaded symbols for /usr/lib64/librdmacm.so.1
Reading symbols from /usr/lib64/libibverbs.so.1...done.
Loaded symbols for /usr/lib64/libibverbs.so.1
Reading symbols from /usr/lib64/qpid/daemon/cluster.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/cluster.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/cluster.so
Reading symbols from /usr/lib64/openais/libcpg.so.2...Reading symbols from /usr/lib/debug/usr/lib64/openais/libcpg.so.2.0.0.debug...done.
done.
Loaded symbols for /usr/lib64/openais/libcpg.so.2
Reading symbols from /usr/lib64/libcman.so.2...done.
Loaded symbols for /usr/lib64/libcman.so.2
Reading symbols from /usr/lib64/libqpidclient.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libqpidclient.so.0.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/libqpidclient.so.0
Reading symbols from /usr/lib64/qpid/client/rdmaconnector.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/client/rdmaconnector.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/client/rdmaconnector.so
Reading symbols from /usr/lib64/qpid/client/sslconnector.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/client/sslconnector.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/client/sslconnector.so
Reading symbols from /usr/lib64/libsslcommon.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libsslcommon.so.0.1.0.debug...done.
done.
Loaded symbols for /usr/lib64/libsslcommon.so.0
Reading symbols from /usr/lib64/libnss3.so...done.
Loaded symbols for /usr/lib64/libnss3.so
Reading symbols from /usr/lib64/libssl3.so...done.
Loaded symbols for /usr/lib64/libssl3.so
Reading symbols from /usr/lib64/libnspr4.so...done.
Loaded symbols for /usr/lib64/libnspr4.so
Reading symbols from /usr/lib64/libnssutil3.so...done.
Loaded symbols for /usr/lib64/libnssutil3.so
Reading symbols from /usr/lib64/libplc4.so...done.
Loaded symbols for /usr/lib64/libplc4.so
Reading symbols from /usr/lib64/libplds4.so...done.
Loaded symbols for /usr/lib64/libplds4.so
Reading symbols from /usr/lib64/qpid/daemon/replication_exchange.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/replication_exchange.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/replication_exchange.so
Reading symbols from /usr/lib64/qpid/daemon/xml.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/xml.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/xml.so
Reading symbols from /usr/lib64/libxerces-c.so.28...done.
Loaded symbols for /usr/lib64/libxerces-c.so.28
Reading symbols from /usr/lib64/libxqilla.so.3...done.
Loaded symbols for /usr/lib64/libxqilla.so.3
Reading symbols from /usr/lib64/qpid/daemon/acl.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/acl.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/acl.so
Reading symbols from /usr/lib64/qpid/daemon/ssl.so...Reading symbols from /usr/lib/debug/usr/lib64/qpid/daemon/ssl.so.debug...done.
done.
Loaded symbols for /usr/lib64/qpid/daemon/ssl.so
Reading symbols from /usr/lib64/qpid/daemon/msgstore.so...done.
Loaded symbols for /usr/lib64/qpid/daemon/msgstore.so
Reading symbols from /usr/lib64/libdb_cxx-4.3.so...done.
Loaded symbols for /usr/lib64/libdb_cxx-4.3.so
Reading symbols from /usr/lib64/libaio.so.1...done.
Loaded symbols for /usr/lib64/libaio.so.1
Reading symbols from /usr/lib64/sasl2/libplain.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libplain.so.2
Reading symbols from /usr/lib64/sasl2/libsasldb.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libsasldb.so.2
Reading symbols from /usr/lib64/sasl2/libanonymous.so.2...done.
Loaded symbols for /usr/lib64/sasl2/libanonymous.so.2
Reading symbols from /usr/lib64/sasl2/liblogin.so.2...done.
Loaded symbols for /usr/lib64/sasl2/liblogin.so.2
Core was generated by `qpidd -p 5672 --auth no --log-enable info+ --cluster-name mrg-qe-02.lab.eng.brq'.
Program terminated with signal 11, Segmentation fault.
[New process 25133]
[New process 25132]
[New process 25131]
[New process 25130]
[New process 25129]
[New process 25128]
[New process 25127]
[New process 25126]
[New process 25125]
[New process 25123]
[New process 25122]
[New process 25121]
[New process 25120]
#0  0x0000000002058740 in ?? ()
(gdb)
Thread 13 (process 25120):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030f04c941e in qpid::broker::Broker::run (this=<value optimized out>) at qpid/broker/Broker.cpp:319
#4  0x00000000004069b8 in QpiddBroker::execute (this=<value optimized out>, options=0x1241d30) at posix/QpiddBroker.cpp:166
#5  0x00000000004054a8 in main (argc=11, argv=0x7fffed422d38) at qpidd.cpp:77

Thread 12 (process 25121):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000030f0591379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 11 (process 25122):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000030f0591379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 10 (process 25123):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000030f0591379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 9 (process 25125):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000030f0591379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 8 (process 25126):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 7 (process 25127):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 6 (process 25128):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 5 (process 25129):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 4 (process 25130):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 3 (process 25131):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 2 (process 25132):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00000030eff7dc87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 1 (process 25133):
#0  0x0000000002058740 in ?? ()
#1  0x00000030eff7dcb3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
#2  0x00000030eff73cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6
(gdb) quit

[09:08:45] get_cpu_info():CPU information:
processor       : 0 1 2 3 4 5 6 7
vendor_id       : AuthenticAMD
model name      : Quad-Core AMD Opteron(tm) Processor 2376
cpu MHz         : 800.000
cpu cores       : 4
bogomips        : 4592.50 4588.47 4588.55 4588.32 4589.26 4587.98 4588.54 4590.52
[09:08:45] Memory info:
             total       used       free     shared    buffers     cached
Mem:       8247168     549744    7697424          0      30376     361600
-/+ buffers/cache:     157768    8089400
Swap:     10289144          0   10289144
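
When many cores have to be checked, the threaded backtrace above can be triaged mechanically. The following is a minimal sketch, not part of the reproducer: it assumes the "Thread N (process PID):" and "#N 0x... in func (" line shapes seen in this gdb 6.8 output, and flags threads whose innermost frame has no symbol ("??"), as Thread 1 does here. The names top_frames and suspicious_threads are hypothetical.

```python
from __future__ import annotations

import re

# Line shapes assumed from the gdb output in this report.
THREAD_RE = re.compile(r"^Thread (\d+) \(process (\d+)\):")
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+ in (.+?) \(")


def top_frames(backtrace: str) -> dict[int, list[str]]:
    """Map thread number -> function names, innermost frame first."""
    threads: dict[int, list[str]] = {}
    current = None
    for line in backtrace.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = int(m.group(1))
            threads[current] = []
            continue
        m = FRAME_RE.match(line)
        # Frames printed before any thread header are ignored.
        if m and current is not None:
            threads[current].append(m.group(2))
    return threads


def suspicious_threads(backtrace: str) -> list[int]:
    """Threads whose innermost frame is unresolved ('??')."""
    return [t for t, frames in top_frames(backtrace).items()
            if frames and frames[0] == "??"]


SAMPLE = """\
Thread 2 (process 25132):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000030eff7d0dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:439

Thread 1 (process 25133):
#0  0x0000000002058740 in ?? ()
#1  0x00000030eff7dcb3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
"""

if __name__ == "__main__":
    print(suspicious_threads(SAMPLE))
```

An unresolved innermost frame called from Poller::run is consistent with a jump through an invalid pointer, but the script only locates such threads; it does not establish the cause.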
Comment 3 Gordon Sim 2010-06-08 04:21:26 EDT
This is believed to have been fixed by changes in the 1.3 rebase (as part of the cleanup of deletion of dispatch handles) and we are requesting verification of that.
Comment 5 ppecka 2010-09-20 04:12:03 EDT
VERIFIED on RHEL 5.5 both i386 / x86_64:
(tested for over 4 days)

# rpm -qa | grep qpid | sort -u
python-qpid-0.7.946106-14.el5
qpid-cpp-client-0.7.946106-15.el5
qpid-cpp-client-devel-0.7.946106-15.el5
qpid-cpp-client-devel-docs-0.7.946106-15.el5
qpid-cpp-client-ssl-0.7.946106-15.el5
qpid-cpp-mrg-debuginfo-0.7.946106-15.el5
qpid-cpp-server-0.7.946106-15.el5
qpid-cpp-server-cluster-0.7.946106-15.el5
qpid-cpp-server-devel-0.7.946106-15.el5
qpid-cpp-server-ssl-0.7.946106-15.el5
qpid-cpp-server-store-0.7.946106-15.el5
qpid-cpp-server-xml-0.7.946106-15.el5
qpid-java-client-0.7.946106-9.el5
qpid-java-common-0.7.946106-9.el5
qpid-tests-0.7.946106-1.el5
qpid-tools-0.7.946106-10.el5

--> VERIFIED
Comment 6 Jaromir Hradilek 2010-10-08 06:18:14 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The clustered qpidd service no longer terminates unexpectedly in qpid::sys::Poller::run().
Comment 8 errata-xmlrpc 2010-10-14 11:59:20 EDT
An advisory has been issued that should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html
