Bug 490457
| Summary: | clustered broker crashes in DispatchHandle II | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Frantisek Reznicek <freznice> |
| Component: | qpid-cpp | Assignee: | Andrew Stitcher <astitcher> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Frantisek Reznicek <freznice> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 1.1 | CC: | esammons, jross |
| Target Milestone: | 1.1.1 | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | 1.2 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | --- | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 577362 | ||
| Bug Blocks: | |||
|
Description
Frantisek Reznicek
2009-03-16 14:55:03 UTC
This bug appears to occur only when the qpidd command-line parameter --cluster-read-max is set to 1 or 2. For values of 3 or above I cannot get this failure to occur. So it appears that a workaround is never to set the value to 1 or 2.

Removed blocking connection to previous bug as that bug was probabilistic and this one doesn't appear to have the same cause. As long as you don't choose 1 or 2 as the cluster read max that bug appears fixed.

Fixed by refactoring DispatchHandle/Poller code responsibilities.

*** Bug 492327 has been marked as a duplicate of this bug. ***

Status: issue still visible on qpid*-0.5.752581-10.el5 using openais-0.80.3-22.el5_3.7. See backtrace below...

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /usr/lib64/libqpidbroker.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libqpidbroker.so.0.1.0.debug...done.
done.
...
Loaded symbols for /usr/lib64/libibverbs.so.1
Core was generated by `qpidd --cluster-name soakTestCluster_5cec5ec2-5279-4dd4-87e5-126dbd50eb96 --aut'.
Program terminated with signal 6, Aborted.
[New process 20140]
[New process 20153]
[New process 20152]
[New process 20151]
[New process 20150]
[New process 20149]
[New process 20148]
[New process 20147]
[New process 20146]
[New process 20145]
[New process 20144]
[New process 20143]
[New process 20142]
[New process 20141]
[New process 20139]
[New process 20137]
[New process 20135]
[New process 20134]
[New process 20133]
[New process 20128]
#0 0x00000036ffc30215 in raise () from /lib64/libc.so.6
(gdb)

Thread 20 (process 20128):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b54cd116 in qpid::broker::Broker::run (this=<value optimized out>) at qpid/broker/Broker.cpp:319
#4 0x00000000004069b8 in QpiddBroker::execute (this=<value optimized out>, options=0x174f7d40) at posix/QpiddBroker.cpp:166
#5 0x00000000004054a8 in main (argc=17, argv=0x7fff84e51328) at qpidd.cpp:77

Thread 19 (process 20133):
#0 0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 18 (process 20134):
#0 0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 17 (process 20135):
#0 0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 16 (process 20137):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 15 (process 20139):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 14 (process 20141):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 13 (process 20142):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 12 (process 20143):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 11 (process 20144):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 10 (process 20145):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 9 (process 20146):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 8 (process 20147):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 7 (process 20148):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 6 (process 20149):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 5 (process 20150):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 4 (process 20151):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 3 (process 20152):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 2 (process 20153):
#0 0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:432
#2 0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 1 (process 20140):
#0 0x00000036ffc30215 in raise () from /lib64/libc.so.6
#1 0x00000036ffc31cc0 in abort () from /lib64/libc.so.6
#2 0x0000003ee44bec44 in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3 0x0000003ee44bcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4 0x0000003ee44bcde3 in std::terminate () from /usr/lib64/libstdc++.so.6
#5 0x0000003ee44bceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 0x00000032b4fc352e in ~ScopedLock (this=<value optimized out>) at qpid/sys/posix/Mutex.h:120
#7 0x00000032b4fc2df6 in qpid::sys::DispatchHandle::processEvent (this=<value optimized out>, type=<value optimized out>) at qpid/sys/DispatchHandle.cpp:420
#8 0x00000032b4f75aa3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
#9 0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#10 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#11 0x00000036ffcd30ad in clone () from /lib64/libc.so.6
(gdb) quit

There is another variant of the issue, detected on RHEL 5.3 x86_64 by a long-term failover soak run (just to have more info about the issue):

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
...
Reading symbols from /usr/lib64/libplds4.so...done.
Loaded symbols for /usr/lib64/libplds4.so
Core was generated by `qpidd --cluster-name soakTestCluster_43b4edc7-7e88-4fe8-9e6c-a5aff382ffa7 --aut'.
Program terminated with signal 6, Aborted.
[New process 4404]
[New process 4410]
[New process 4409]
[New process 4408]
[New process 4407]
[New process 4406]
[New process 4405]
[New process 4403]
[New process 4401]
[New process 4400]
[New process 4399]
#0 0x00000036e0430215 in raise () from /lib64/libc.so.6
(gdb)

Thread 11 (process 4399):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc519c41e in qpid::broker::Broker::run (this=<value optimized out>) at qpid/broker/Broker.cpp:319
#4 0x00000000004069b8 in QpiddBroker::execute (this=<value optimized out>, options=0xb40870) at posix/QpiddBroker.cpp:166
#5 0x00000000004054a8 in main (argc=18, argv=0x7fffe59ecec8) at qpidd.cpp:77

Thread 10 (process 4400):
#0 0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00002aafc5264379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 9 (process 4401):
#0 0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00002aafc5264379 in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 8 (process 4403):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 7 (process 4405):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 6 (process 4406):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 5 (process 4407):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 4 (process 4408):
#0 0x00000036e04d4477 in semop () from /lib64/libc.so.6
#1 0x00002aafc5bf9bc4 in openais_msg_send_reply_receive (ipc_context=<value optimized out>, iov=<value optimized out>, iov_len=<value optimized out>, res_msg=<value optimized out>, res_len=<value optimized out>) at util.c:623
#2 0x00002aafc5bfa68e in cpg_mcast_joined (handle=<value optimized out>, guarantee=<value optimized out>, iovec=<value optimized out>, iov_len=<value optimized out>) at cpg.c:513
#3 0x00002aafc59a2754 in qpid::cluster::Cpg::mcast (this=0xb4ca10, iov=0x454f5560, iovLen=1) at qpid/cluster/Cpg.cpp:124
#4 0x00002aafc59bd19e in qpid::cluster::Multicaster::sendMcast (this=0xb4cb48, values=@0xb4cc48) at qpid/cluster/Multicaster.cpp:79
#5 0x00002aafc59bef96 in boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, boost::_mfi::mf1<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, qpid::cluster::Multicaster, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&>, boost::_bi::list2<boost::_bi::value<qpid::cluster::Multicaster*>, boost::arg<1> > >, __gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&>::invoke (function_obj_ptr=<value optimized out>, a0=@0x454f4ec0) at /usr/include/boost/bind/mem_fn_template.hpp:149
#6 0x00002aafc5981aea in boost::function1<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&, std::allocator<void> >::operator() (this=0x0, a0=@0x454f4ec0) at /usr/include/boost/function/function_template.hpp:576
#7 0x00002aafc598829c in qpid::sys::PollableQueue<qpid::cluster::Event>::process (this=0xb4cb90) at qpid/sys/PollableQueue.h:153
#8 0x00002aafc5989c0d in qpid::sys::PollableQueue<qpid::cluster::Event>::dispatch (this=0xb4cb90, cond=@0xb4cc00) at qpid/sys/PollableQueue.h:138
#9 0x00002aafc566010f in boost::function1<void, qpid::sys::PollableCondition&, std::allocator<boost::function_base> >::operator() (this=<value optimized out>, a0=<value optimized out>) at /usr/include/boost/function/function_template.hpp:576
#10 0x00002aafc56aed07 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() (this=<value optimized out>, a0=<value optimized out>) at /usr/include/boost/function/function_template.hpp:576
#11 0x00002aafc56aca0b in qpid::sys::DispatchHandle::processEvent (this=<value optimized out>, type=<value optimized out>) at qpid/sys/DispatchHandle.cpp:432
#12 0x00002aafc5662cb3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
#13 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#14 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#15 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 3 (process 4409):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 2 (process 4410):
#0 0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2 0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:405
#3 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 1 (process 4404):
#0 0x00000036e0430215 in raise () from /lib64/libc.so.6
#1 0x00000036e0431cc0 in abort () from /lib64/libc.so.6
#2 0x00000036ecabec44 in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3 0x00000036ecabcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4 0x00000036ecabcde3 in std::terminate () from /usr/lib64/libstdc++.so.6
#5 0x00000036ecabceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6 0x00002aafc56ac8d3 in qpid::sys::DispatchHandle::processEvent (this=<value optimized out>, type=<value optimized out>) at qpid/sys/posix/Mutex.h:120
#7 0x00002aafc5662cb3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
#8 0x00002aafc5658cea in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#9 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#10 0x00000036e04d30ad in clone () from /lib64/libc.so.6
(gdb) quit

Reproducibility is very low (~1%), seen on packages:

[root@mrg-qe-02 qpid_ptest_cluster_failover_soak]# rpm -qa | egrep '(qpid|rhm|openais)' | sort -u
openais-0.80.3-22.el5_3.8
openais-debuginfo-0.80.3-22.el5_3.8
python-qpid-0.5.752581-3.el5
qpidc-0.5.752581-22.el5
qpidc-debuginfo-0.5.752581-22.el5
qpidc-devel-0.5.752581-22.el5
qpidc-perftest-0.5.752581-22.el5
qpidc-rdma-0.5.752581-22.el5
qpidc-ssl-0.5.752581-22.el5
qpidd-0.5.752581-22.el5
qpidd-acl-0.5.752581-22.el5
qpidd-cluster-0.5.752581-22.el5
qpidd-devel-0.5.752581-22.el5
qpid-dotnet-0.4.738274-2.el5
qpidd-rdma-0.5.752581-22.el5
qpidd-ssl-0.5.752581-22.el5
qpidd-xml-0.5.752581-22.el5
qpid-java-client-0.5.751061-8.el5
qpid-java-common-0.5.751061-8.el5
rhm-0.5.3206-5.el5
rhm-docs-0.5.756148-1.el5

Is this fixed and verified?

The issue is going to be re-tested in a stress test (low reproducibility).

Long-term testing on RHEL5.6 i386/x86_64 proved that the issue has been resolved.

python-qpid-0.10-1.el5.noarch
python-qpid-qmf-0.10-10.el5.x86_64
qpid-cpp-client-0.10-8.el5.x86_64
qpid-cpp-client-devel-0.10-8.el5.x86_64
qpid-cpp-client-devel-docs-0.10-8.el5.x86_64
qpid-cpp-client-ssl-0.10-8.el5.x86_64
qpid-cpp-mrg-debuginfo-0.9.1073306-1.el5.x86_64
qpid-cpp-server-0.10-8.el5.x86_64
qpid-cpp-server-cluster-0.10-8.el5.x86_64
qpid-cpp-server-devel-0.10-8.el5.x86_64
qpid-cpp-server-ssl-0.10-8.el5.x86_64
qpid-cpp-server-store-0.10-8.el5.x86_64
qpid-cpp-server-xml-0.10-8.el5.x86_64
qpid-java-client-0.10-6.el5.noarch
qpid-java-common-0.10-6.el5.noarch
qpid-java-example-0.10-6.el5.noarch
qpid-qmf-0.10-10.el5.x86_64
qpid-qmf-devel-0.10-10.el5.x86_64
qpid-tools-0.10-6.el5.noarch
rh-qpid-cpp-tests-0.10-8.el5.x86_64

-> VERIFIED |
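In both backtraces the aborting thread (Thread 1) throws from qpid/sys/posix/Mutex.h:120 while inside DispatchHandle::processEvent, and __cxa_throw goes directly to std::terminate. The report does not say what the exception was. The sketch below assumes, purely for illustration, that it came from a lock wrapper that treats a nonzero pthread_mutex_unlock return code as an error. It shows only that POSIX reports such a failure through the return code. The names makeErrorCheckMutex, UnlockResult and runUnlockDemo are hypothetical and are not part of qpid.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not qpid code): initialise an error-checking mutex so
// that misuse is reported through the return code.
inline int makeErrorCheckMutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical result type for the demonstration below.
struct UnlockResult {
    int paired;    // return code of an unlock that follows a lock
    int unpaired;  // return code of an unlock with no matching lock
};

inline UnlockResult runUnlockDemo() {
    pthread_mutex_t m;
    int rc = makeErrorCheckMutex(&m);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined

    UnlockResult r;
    pthread_mutex_lock(&m);
    r.paired = pthread_mutex_unlock(&m);    // lock is held: succeeds with 0
    r.unpaired = pthread_mutex_unlock(&m);  // lock is not held: EPERM
    pthread_mutex_destroy(&m);
    return r;
}
```

Calling runUnlockDemo() returns 0 for the paired unlock and EPERM for the unpaired one. If a scoped-lock destructor converted such a code into an exception and nothing on the poller thread caught it, the process would end in std::terminate, which would be consistent with frames #4 and #5 of both Thread 1 traces. Whether that is what happened here is an assumption, not something the report states.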