Bug 490457 - clustered broker crashes in DispatchHandle II
Summary: clustered broker crashes in DispatchHandle II
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.1.1
Assignee: Andrew Stitcher
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Duplicates: 492327
Depends On: 577362
Blocks:
 
Reported: 2009-03-16 14:55 UTC by Frantisek Reznicek
Modified: 2015-11-16 00:07 UTC (History)

Fixed In Version: 1.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:



Description Frantisek Reznicek 2009-03-16 14:55:03 UTC
Description of problem:
While validating bug 479326, I found another crash that occurs when a clustered qpidd is loaded by perftest.

qpidd --auth no --log-enable debug+ --cluster-name foobar \
      --cluster-read-max 1 --no-module-dir --no-data-dir \
      --load-module ${cluster_so} &

perftest --count 10 --npubs 10 --nsubs 10 -s 
is then run in a loop.

Very close to bug 479326.


Version-Release number of selected component (if applicable):
  [root@dhcp-lab-200 bz479326]# rpm -qa | egrep '(qpidd|openais)'
  openais-devel-0.80.3-22.el5_3.3
  qpidd-devel-0.5.752581-1.el5
  openais-0.80.3-22.el5_3.3
  qpidd-0.5.752581-1.el5
  qpidd-rdma-0.5.752581-1.el5
  qpidd-ssl-0.5.752581-1.el5
  qpidd-acl-0.5.752581-1.el5
  qpidd-xml-0.5.752581-1.el5
  qpidd-cluster-0.5.752581-1.el5

How reproducible:
100%, quickly

Steps to Reproduce:
1. Run the test reproducer below.
  
Actual results:
qpidd crashes shortly after the test reproducer is launched (during the second or third perftest run).

Expected results:
The qpidd should not crash.

Additional info:

test reproducer:
~~~~~~~~~~~~~~~
#!/bin/bash

service openais restart

ulimit -c unlimited

cluster_so=$(rpm -ql qpidd-cluster 2>/dev/null | head -1)

qpidd --auth no --log-enable debug+ --cluster-name foobar --cluster-read-max 1 \
      --no-module-dir --no-data-dir --load-module ${cluster_so} >qpidd.log 2>&1 &

netstat -nlp | grep qpidd


sleep 5

for ((i=0;i<10;i++)); do 
  perftest --count 10 --npubs 10 --nsubs 10 -s >>perftest.log
  echo -n "$?"

  pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')
  if [ -z "${pid}"  ]; then
    break
  fi


done
echo

pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')
if [ -n "${pid}"  ]; then
  kill %1
  sleep 5
fi

pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')

if [ -n "${pid}"  ]; then
  echo -n "killing ..."
  kill ${pid}
fi
echo "exiting"

cat perftest.log

# eof
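
The reproducer above repeats the same netstat/awk pipeline three times to find the broker PID. As a minimal sketch only (the function name qpidd_pid_from_netstat is hypothetical and not part of the original script), the lookup could be factored into one helper. It assumes `netstat -nlp` output in which the last column has the form PID/program-name.

```bash
#!/bin/bash
# Hypothetical helper, not part of the original reproducer.
# Reads `netstat -nlp`-style lines on stdin and prints the PID of the first
# line whose last column looks like "<pid>/qpidd". Prints nothing if no
# such line exists.
qpidd_pid_from_netstat() {
  awk '$NF ~ /^[0-9]+\/qpidd/ { split($NF, f, "/"); print f[1]; exit }'
}

# Intended use inside the reproducer:
#   pid=$(netstat -nlp | qpidd_pid_from_netstat)
#   if [ -z "${pid}" ]; then break; fi
```

Matching on the last column rather than grepping the whole line avoids picking up unrelated lines that merely contain the string "qpidd", such as a socket path.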



transcript: which results in core-dump (backtrace included):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[root@dhcp-lab-200 bz479326]# ./run.sh
Stopping OpenAIS daemon (aisexec):                         [  OK  ]
Starting OpenAIS daemon (aisexec):                         [  OK  ]
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
Error in shutdown: Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
./run.sh: line 30:  2823 Aborted                 (core dumped) qpidd --auth no
--log-enable debug+ --cluster-name foobar --cluster-read-max 1 --no-module-dir
--no-data-dir --load-module ${cluster_so} > qpidd.log 2>&1
0
exiting
2408.44 28.9542 2270.51 2.21729
3320.2  21.9707 1665.36 1.62633

Connection closed

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)
1256.62 37.4777 2133.65 2.08364

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)
1143.96 35.388  2994.51 2.92432
[root@dhcp-lab-200 bz479326]# rpm -q qpidd
qpidd-0.5.752581-1.el5
[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2
core.2674  core.2823
[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2674
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols
found)
Using host libthread_db library "/lib64/libthread_db.so.1".

...
Core was generated by `qpidd --auth no --log-enable debug+ --cluster-name
foobar --cluster-read-max 1'.
Program terminated with signal 6, Aborted.
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 12 (process 2674):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x0000003549eccb86 in qpid::broker::Broker::run () from
/usr/lib64/libqpidbroker.so.0
#4  0x0000000000406948 in qpid::log::Options::~Options ()
#5  0x0000000000405438 in __cxa_pure_virtual ()
#6  0x000000350b01d8b4 in __libc_start_main () from /lib64/libc.so.6
#7  0x0000000000404eb9 in __cxa_pure_virtual ()
#8  0x00007fff6f2b4ad8 in ?? ()
#9  0x0000000000000000 in ?? ()

Thread 11 (process 2677):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from
/usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 10 (process 2678):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from
/usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 9 (process 2679):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from
/usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 8 (process 2681):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 7 (process 2682):
#0  0x000000350bc0c999 in __lll_mutex_unlock_wake () from
/lib64/libpthread.so.0
#1  0x000000350bc09a59 in _L_mutex_unlock_59 () from /lib64/libpthread.so.0
#2  0x000000350bc0971b in __pthread_mutex_unlock_usercnt () from
/lib64/libpthread.so.0
#3  0x000000350b105da2 in dl_iterate_phdr () from /lib64/libc.so.6
#4  0x000000350fc0a626 in _Unwind_Find_FDE () from /lib64/libgcc_s.so.1
#5  0x000000350fc075b5 in _Unwind_GetIPInfo () from /lib64/libgcc_s.so.1
#6  0x000000350fc08ee9 in _Unwind_RaiseException () from /lib64/libgcc_s.so.1
#7  0x000000350fc09071 in _Unwind_Resume_or_Rethrow () from
/lib64/libgcc_s.so.1
#8  0x00000035108bced8 in __cxa_rethrow () from /usr/lib64/libstdc++.so.6
#9  0x00000035108bec5e in __gnu_cxx::__verbose_terminate_handler () from
/usr/lib64/libstdc++.so.6
#10 0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#11 0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#12 0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#13 0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock
() from /usr/lib64/libqpidcommon.so.0
#14 0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from
/usr/lib64/libqpidcommon.so.0
#15 0x0000003549973c93 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#16 0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#17 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#18 0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 6 (process 2683):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 5 (process 2684):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 4 (process 2685):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 2686):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 2688):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from
/usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 1 (process 2687):
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
#1  0x000000350b031bf0 in abort () from /lib64/libc.so.6
#2  0x00000035108bec9f in __gnu_cxx::__verbose_terminate_handler () from
/usr/lib64/libstdc++.so.6
#3  0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock
() from /usr/lib64/libqpidcommon.so.0
#7  0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from
/usr/lib64/libqpidcommon.so.0
#8  0x0000003549973c93 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#9  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#10 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000350b0d1b6d in clone () from /lib64/libc.so.6


[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2674
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols
found)
Using host libthread_db library "/lib64/libthread_db.so.1".

Reading symbols from /usr/lib64/libqpidbroker.so.0...(no debugging symbols
found)...done.
...
found)...done.
Loaded symbols for /usr/lib64/libibverbs.so.1

Core was generated by `qpidd --auth no --log-enable debug+ --cluster-name
foobar --cluster-read-max 1'.
Program terminated with signal 6, Aborted.
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
#1  0x000000350b031bf0 in abort () from /lib64/libc.so.6
#2  0x00000035108bec9f in __gnu_cxx::__verbose_terminate_handler () from
/usr/lib64/libstdc++.so.6
#3  0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock
() from /usr/lib64/libqpidcommon.so.0
#7  0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from
/usr/lib64/libqpidcommon.so.0
#8  0x0000003549973c93 in qpid::sys::Poller::run () from
/usr/lib64/libqpidcommon.so.0
#9  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from
/usr/lib64/libqpidcommon.so.0
#10 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000350b0d1b6d in clone () from /lib64/libc.so.6
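
Most threads in the "thread apply all bt" output above are idle pollers or timers, so the thread that called abort() is easy to miss. The following is a rough sketch, assuming the gdb output has been saved to a file and that thread blocks are separated by blank lines as in this transcript. The function name aborting_threads is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: reads saved gdb "thread apply all bt" output on stdin
# and prints the "Thread N (process P):" header of each thread block that
# contains an abort() frame.
aborting_threads() {
  awk '
    /^$/                { hdr = "" }                # blank line ends a thread block
    /^Thread [0-9]+ \(/ { hdr = $0 }                # remember the current thread header
    /in abort \(\)/ && hdr != "" { print hdr; hdr = "" }  # report each block once
  '
}

# Intended use, with gdb.log as an assumed file name:
#   aborting_threads < gdb.log
```

A backtrace printed by a plain "bt" has no "Thread N" header, so this sketch does not report it.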

Comment 1 Andrew Stitcher 2009-03-19 15:38:57 UTC
This bug appears to occur only when the qpidd command line parameter --cluster-read-max is set to 1 or 2. For values of 3 or above I cannot get this failure to occur.

So it appears that a workaround is never to set the value to 1 or 2.

I removed the blocking link to the previous bug, as that bug was probabilistic and this one does not appear to have the same cause. As long as you don't choose 1 or 2 as the cluster read max, that bug appears to be fixed.
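
As an illustration only, the workaround described in this comment could be expressed as a small guard in a launcher script. The function name safe_cluster_read_max and the choice to substitute 3 are assumptions. 3 is simply the smallest value reported here as not reproducing the failure.

```bash
#!/bin/bash
# Hypothetical guard: maps a requested --cluster-read-max value away from
# the two values reported to trigger the crash (1 and 2). Any other value
# is passed through unchanged.
safe_cluster_read_max() {
  case "$1" in
    1|2) echo 3 ;;
    *)   echo "$1" ;;
  esac
}

# Intended use, with read_max as an assumed variable name:
#   qpidd --cluster-name foobar \
#         --cluster-read-max "$(safe_cluster_read_max "${read_max}")" ...
```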

Comment 2 Andrew Stitcher 2009-05-05 13:19:16 UTC
Fixed by refactoring DispatchHandle/Poller code responsibilities.

Comment 3 Andrew Stitcher 2009-05-05 14:30:00 UTC
*** Bug 492327 has been marked as a duplicate of this bug. ***

Comment 4 Frantisek Reznicek 2009-06-03 06:57:50 UTC
Status: the issue is still visible on qpid*-0.5.752581-10.el5 with openais-0.80.3-22.el5_3.7. See the backtrace below.

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Reading symbols from /usr/lib64/libqpidbroker.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libqpidbroker.so.0.1.0.debug...done.
done.
...
Loaded symbols for /usr/lib64/libibverbs.so.1
Core was generated by `qpidd --cluster-name soakTestCluster_5cec5ec2-5279-4dd4-87e5-126dbd50eb96 --aut'.
Program terminated with signal 6, Aborted.
[New process 20140]
[New process 20153]
[New process 20152]
[New process 20151]
[New process 20150]
[New process 20149]
[New process 20148]
[New process 20147]
[New process 20146]
[New process 20145]
[New process 20144]
[New process 20143]
[New process 20142]
[New process 20141]
[New process 20139]
[New process 20137]
[New process 20135]
[New process 20134]
[New process 20133]
[New process 20128]
#0  0x00000036ffc30215 in raise () from /lib64/libc.so.6
(gdb)
Thread 20 (process 20128):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b54cd116 in qpid::broker::Broker::run (this=<value optimized out>) at qpid/broker/Broker.cpp:319
#4  0x00000000004069b8 in QpiddBroker::execute (this=<value optimized out>, options=0x174f7d40) at posix/QpiddBroker.cpp:166
#5  0x00000000004054a8 in main (argc=17, argv=0x7fff84e51328) at qpidd.cpp:77

Thread 19 (process 20133):
#0  0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 18 (process 20134):
#0  0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 17 (process 20135):
#0  0x000000370040ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000032b558f46f in qpid::broker::Timer::run (this=<value optimized out>) at qpid/sys/posix/Condition.h:69
#2  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 16 (process 20137):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 15 (process 20139):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 14 (process 20141):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 13 (process 20142):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 12 (process 20143):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 11 (process 20144):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 10 (process 20145):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 9 (process 20146):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 8 (process 20147):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 7 (process 20148):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 6 (process 20149):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 5 (process 20150):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 4 (process 20151):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 3 (process 20152):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 2 (process 20153):
#0  0x00000036ffcd3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00000032b4f74c9d in qpid::sys::Poller::wait (this=<value optimized out>, timeout=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:432
#2  0x00000032b4f75a77 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:398
#3  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036ffcd30ad in clone () from /lib64/libc.so.6

Thread 1 (process 20140):
#0  0x00000036ffc30215 in raise () from /lib64/libc.so.6
#1  0x00000036ffc31cc0 in abort () from /lib64/libc.so.6
#2  0x0000003ee44bec44 in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3  0x0000003ee44bcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x0000003ee44bcde3 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x0000003ee44bceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00000032b4fc352e in ~ScopedLock (this=<value optimized out>) at qpid/sys/posix/Mutex.h:120
#7  0x00000032b4fc2df6 in qpid::sys::DispatchHandle::processEvent (this=<value optimized out>, type=<value optimized out>)
    at qpid/sys/DispatchHandle.cpp:420
#8  0x00000032b4f75aa3 in qpid::sys::Poller::run (this=<value optimized out>) at qpid/sys/Poller.h:122
#9  0x00000032b4f6ca5a in runRunnable (p=<value optimized out>) at qpid/sys/posix/Thread.cpp:35
#10 0x0000003700406367 in start_thread () from /lib64/libpthread.so.0
#11 0x00000036ffcd30ad in clone () from /lib64/libc.so.6
(gdb) quit

Comment 5 Frantisek Reznicek 2009-07-07 08:40:00 UTC
Another variant of the issue was detected on RHEL 5.3 x86_64 during a long-term failover soak run (recorded here to provide more information about the issue):

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...

...
Reading symbols from /usr/lib64/libplds4.so...done.
Loaded symbols for /usr/lib64/libplds4.so
Core was generated by `qpidd --cluster-name soakTestCluster_43b4edc7-7e88-4fe8-9e6c-a5aff382ffa7 --aut'.
Program terminated with signal 6, Aborted.
[New process 4404]
[New process 4410]
[New process 4409]
[New process 4408]
[New process 4407]
[New process 4406]
[New process 4405]
[New process 4403]
[New process 4401]
[New process 4400]
[New process 4399]
#0  0x00000036e0430215 in raise () from /lib64/libc.so.6
(gdb)
Thread 11 (process 4399):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc519c41e in qpid::broker::Broker::run (
    this=<value optimized out>) at qpid/broker/Broker.cpp:319
#4  0x00000000004069b8 in QpiddBroker::execute (this=<value optimized out>,
    options=0xb40870) at posix/QpiddBroker.cpp:166
#5  0x00000000004054a8 in main (argc=18, argv=0x7fffe59ecec8) at qpidd.cpp:77

Thread 10 (process 4400):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002aafc5264379 in qpid::broker::Timer::run (this=<value optimized out>)
    at qpid/sys/posix/Condition.h:69
#2  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 9 (process 4401):
#0  0x00000036e100ab00 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002aafc5264379 in qpid::broker::Timer::run (this=<value optimized out>)
    at qpid/sys/posix/Condition.h:69
#2  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#3  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#4  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 8 (process 4403):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 7 (process 4405):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 6 (process 4406):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 5 (process 4407):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 4 (process 4408):
#0  0x00000036e04d4477 in semop () from /lib64/libc.so.6
#1  0x00002aafc5bf9bc4 in openais_msg_send_reply_receive (
    ipc_context=<value optimized out>, iov=<value optimized out>,
    iov_len=<value optimized out>, res_msg=<value optimized out>,
    res_len=<value optimized out>) at util.c:623
#2  0x00002aafc5bfa68e in cpg_mcast_joined (handle=<value optimized out>,
    guarantee=<value optimized out>, iovec=<value optimized out>,
    iov_len=<value optimized out>) at cpg.c:513
#3  0x00002aafc59a2754 in qpid::cluster::Cpg::mcast (this=0xb4ca10,
    iov=0x454f5560, iovLen=1) at qpid/cluster/Cpg.cpp:124
#4  0x00002aafc59bd19e in qpid::cluster::Multicaster::sendMcast (
    this=0xb4cb48, values=@0xb4cc48) at qpid/cluster/Multicaster.cpp:79
#5  0x00002aafc59bef96 in boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, boost::_mfi::mf1<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, qpid::cluster::Multicaster, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&>, boost::_bi::list2<boost::_bi::value<qpid::cluster::Multicaster*>, boost::arg<1> > >, __gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&>::invoke (
    function_obj_ptr=<value optimized out>, a0=@0x454f4ec0)
    at /usr/include/boost/bind/mem_fn_template.hpp:149
#6  0x00002aafc5981aea in boost::function1<__gnu_cxx::__normal_iterator<qpid::cluster::Event const*, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > >, std::vector<qpid::cluster::Event, std::allocator<qpid::cluster::Event> > const&, std::allocator<void> >::operator() (this=0x0, a0=@0x454f4ec0)
    at /usr/include/boost/function/function_template.hpp:576
#7  0x00002aafc598829c in qpid::sys::PollableQueue<qpid::cluster::Event>::process (this=0xb4cb90) at qpid/sys/PollableQueue.h:153
#8  0x00002aafc5989c0d in qpid::sys::PollableQueue<qpid::cluster::Event>::dispatch (this=0xb4cb90, cond=@0xb4cc00) at qpid/sys/PollableQueue.h:138
#9  0x00002aafc566010f in boost::function1<void, qpid::sys::PollableCondition&, std::allocator<boost::function_base> >::operator() (
    this=<value optimized out>, a0=<value optimized out>)
    at /usr/include/boost/function/function_template.hpp:576
#10 0x00002aafc56aed07 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() (this=<value optimized out>,
    a0=<value optimized out>)
    at /usr/include/boost/function/function_template.hpp:576
#11 0x00002aafc56aca0b in qpid::sys::DispatchHandle::processEvent (
    this=<value optimized out>, type=<value optimized out>)
    at qpid/sys/DispatchHandle.cpp:432
#12 0x00002aafc5662cb3 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/Poller.h:122
#13 0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#14 0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#15 0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 3 (process 4409):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 2 (process 4410):
#0  0x00000036e04d3498 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aafc56620dd in qpid::sys::Poller::wait (this=<value optimized out>,
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:439
#2  0x00002aafc5662c87 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/epoll/EpollPoller.cpp:405
#3  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#4  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#5  0x00000036e04d30ad in clone () from /lib64/libc.so.6

Thread 1 (process 4404):
#0  0x00000036e0430215 in raise () from /lib64/libc.so.6
#1  0x00000036e0431cc0 in abort () from /lib64/libc.so.6
#2  0x00000036ecabec44 in __gnu_cxx::__verbose_terminate_handler ()
   from /usr/lib64/libstdc++.so.6
#3  0x00000036ecabcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x00000036ecabcde3 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000036ecabceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00002aafc56ac8d3 in qpid::sys::DispatchHandle::processEvent (
    this=<value optimized out>, type=<value optimized out>)
    at qpid/sys/posix/Mutex.h:120
#7  0x00002aafc5662cb3 in qpid::sys::Poller::run (this=<value optimized out>)
    at qpid/sys/Poller.h:122
#8  0x00002aafc5658cea in runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#9  0x00000036e1006367 in start_thread () from /lib64/libpthread.so.0
#10 0x00000036e04d30ad in clone () from /lib64/libc.so.6
(gdb) quit
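
Reading Thread 1 above: the abort comes from __cxa_throw calling std::terminate, i.e. an exception thrown at qpid/sys/posix/Mutex.h:120 (inlined into DispatchHandle::processEvent) found no handler on that poller thread. The trace does not show which pthread call failed or why. The sketch below is only an illustration of that failure pattern, not the qpid code: ThrowingMutex and relockThrows are hypothetical names, and the error-checking mutex type is used only to provoke an error code deterministically.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a mutex wrapper that converts pthread
// error codes into C++ exceptions (assumption: not the qpid class).
class ThrowingMutex {
public:
    ThrowingMutex() {
        pthread_mutexattr_t attr;
        check(pthread_mutexattr_init(&attr), "pthread_mutexattr_init");
        // Error-checking type: misuse returns an error code instead of deadlocking.
        check(pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK),
              "pthread_mutexattr_settype");
        check(pthread_mutex_init(&mutex_, &attr), "pthread_mutex_init");
        pthread_mutexattr_destroy(&attr);
    }
    ~ThrowingMutex() {
        // Destructors should not throw, so only assert here.
        int rc = pthread_mutex_destroy(&mutex_);
        assert(rc == 0);
        (void)rc;
    }
    ThrowingMutex(const ThrowingMutex&) = delete;
    ThrowingMutex& operator=(const ThrowingMutex&) = delete;

    void lock()   { check(pthread_mutex_lock(&mutex_), "pthread_mutex_lock"); }
    void unlock() { check(pthread_mutex_unlock(&mutex_), "pthread_mutex_unlock"); }

private:
    // Any nonzero pthread return code becomes an exception.
    static void check(int rc, const char* what) {
        if (rc != 0)
            throw std::runtime_error(std::string(what) + ": " + std::strerror(rc));
    }
    pthread_mutex_t mutex_;
};

// Returns true if a second lock() by the owning thread throws.
// With an error-checking mutex the relock fails with EDEADLK.
bool relockThrows() {
    ThrowingMutex m;
    m.lock();
    bool threw = false;
    try {
        m.lock();
    } catch (const std::runtime_error&) {
        threw = true;  // caught here; without a handler this would reach std::terminate
    }
    m.unlock();
    return threw;
}
```

Here the exception is caught, so the program continues. If the same throw escaped a thread entry function with no try/catch, the runtime would call std::terminate and then abort, which matches frames #0 to #5 of Thread 1.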

Reproducibility is very low (~1%), seen on packages:
[root@mrg-qe-02 qpid_ptest_cluster_failover_soak]# rpm -qa | egrep '(qpid|rhm|openais)' | sort -u
openais-0.80.3-22.el5_3.8
openais-debuginfo-0.80.3-22.el5_3.8
python-qpid-0.5.752581-3.el5
qpidc-0.5.752581-22.el5
qpidc-debuginfo-0.5.752581-22.el5
qpidc-devel-0.5.752581-22.el5
qpidc-perftest-0.5.752581-22.el5
qpidc-rdma-0.5.752581-22.el5
qpidc-ssl-0.5.752581-22.el5
qpidd-0.5.752581-22.el5
qpidd-acl-0.5.752581-22.el5
qpidd-cluster-0.5.752581-22.el5
qpidd-devel-0.5.752581-22.el5
qpid-dotnet-0.4.738274-2.el5
qpidd-rdma-0.5.752581-22.el5
qpidd-ssl-0.5.752581-22.el5
qpidd-xml-0.5.752581-22.el5
qpid-java-client-0.5.751061-8.el5
qpid-java-common-0.5.751061-8.el5
rhm-0.5.3206-5.el5
rhm-docs-0.5.756148-1.el5

Comment 6 Justin Ross 2011-06-28 18:59:55 UTC
Is this fixed and verified?

Comment 7 Frantisek Reznicek 2011-06-30 10:30:40 UTC
The issue is going to be retested in a stress test (low reproducibility).

Comment 8 Frantisek Reznicek 2011-07-18 07:09:42 UTC
Long-term testing on RHEL 5.6 i386/x86_64 showed that the issue has been resolved.

python-qpid-0.10-1.el5.noarch
python-qpid-qmf-0.10-10.el5.x86_64
qpid-cpp-client-0.10-8.el5.x86_64
qpid-cpp-client-devel-0.10-8.el5.x86_64
qpid-cpp-client-devel-docs-0.10-8.el5.x86_64
qpid-cpp-client-ssl-0.10-8.el5.x86_64
qpid-cpp-mrg-debuginfo-0.9.1073306-1.el5.x86_64
qpid-cpp-server-0.10-8.el5.x86_64
qpid-cpp-server-cluster-0.10-8.el5.x86_64
qpid-cpp-server-devel-0.10-8.el5.x86_64
qpid-cpp-server-ssl-0.10-8.el5.x86_64
qpid-cpp-server-store-0.10-8.el5.x86_64
qpid-cpp-server-xml-0.10-8.el5.x86_64
qpid-java-client-0.10-6.el5.noarch
qpid-java-common-0.10-6.el5.noarch
qpid-java-example-0.10-6.el5.noarch
qpid-qmf-0.10-10.el5.x86_64
qpid-qmf-devel-0.10-10.el5.x86_64
qpid-tools-0.10-6.el5.noarch
rh-qpid-cpp-tests-0.10-8.el5.x86_64

-> VERIFIED

