Bug 476822 - clustered broker crash in Mutex dtor or unlock
Status: CLOSED NOTABUG
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.1.1
Target Release: ---
Assigned To: Alan Conway
QA Contact: Frantisek Reznicek
 
Reported: 2008-12-17 06:06 EST by mick
Modified: 2015-11-15 19:06 EST

Doc Type: Bug Fix
Last Closed: 2009-01-08 10:19:47 EST


Description mick 2008-12-17 06:06:18 EST
1.  I started a 4-broker cluster on MRG 7, 8, 9, and 10 with this command on each:

    /home/mgoulish/trunk/qpid/cpp/src/qpidd --no-module-dir          \
    --load-module /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so \
    --cluster-name=mick_1 -p 5357 --no-data-dir --auth=no            \
    --mgmt-enable=no  --log-enable info+ --log-to-file ~/qpidd.log   \
    --cluster-read-max=1  --tcp-nodelay


2. I ran tsxtest on mrg10 in a loop like this:
   (NOTE: tsxtest is available in mrg-team/testing/ironside.)


    count=1
    while [ "$count" -lt 100 ]
    do
        echo "TEST  $count"
        count=$(( count + 1 ))
        ./tsxtest --report --messages 30000 --rate 10000 --host 127.0.0.1 \
            -p 5357
        echo " "
        echo " "
        echo " "
    done



3. Whenever there was a crash, I would gather data, then halt all 4 
   brokers, then restart all 4 brokers before restarting the testing loop.



4. Out of 225 runs of tsxtest, the following cores appeared a total of
   6 times, a frequency of about 3%.
   Half of them are in the Mutex dtor (see thread 1 in the core below,
   for process 27235) and half are in Mutex::unlock (see thread 1 in
   the next core, for process 3359).



  (gdb) thread apply all where

Thread 11 (process 27235):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad64781eb6e in qpid::broker::Broker::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#4  0x0000000000406898 in QpiddBroker::execute ()
#5  0x0000000000405478 in main ()

Thread 10 (process 27249):
#0  0x000000399580a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ad6478c74a6 in qpid::broker::Timer::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#2  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#4  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 9 (process 27250):
#0  0x000000399580a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002ad6478c74a6 in qpid::broker::Timer::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#2  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#4  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 8 (process 27254):
#0  0x000000399580c999 in __lll_mutex_unlock_wake () from /lib64/libpthread.so.0
#1  0x0000003995809a59 in _L_mutex_unlock_59 () from /lib64/libpthread.so.0
#2  0x000000399580971b in __pthread_mutex_unlock_usercnt (mutex=0x2aaaac181380, decr=1)
    at pthread_mutex_unlock.c:63
#3  0x00002ad647cfbfa8 in qpid::sys::DispatchHandle::rewatchRead ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00002ad647cf88e0 in qpid::sys::AsynchIOHandler::giveReadCredit ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#5  0x00002ad647fc1513 in qpid::cluster::Cluster::deliveredEvent ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so
#6  0x00002ad647fc4790 in qpid::cluster::Cluster::delivered ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so
#7  0x00002ad647fcaeaa in boost::function1<void, std::deque<qpid::cluster::Event, std::allocator<qpid::cluster::Event> >&, std::allocator<void> >::operator() ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so
#8  0x00002ad647fcece4 in qpid::sys::PollableQueue<qpid::cluster::Event>::dispatch ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so
#9  0x00002ad647cfc8e9 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#10 0x00002ad647cfac74 in qpid::sys::DispatchHandle::processEvent ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#11 0x00002ad647cfa72e in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#12 0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#13 0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#14 0x0000003994cd1b6d in clone () from /lib64/libc.so.6


Thread 7 (process 27255):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 6 (process 27256):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 5 (process 27257):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 4 (process 27258):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 27259):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 27260):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002ad647cb6c3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002ad647cfa708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 1 (process 27253):
#0  0x0000003994c30155 in *__GI_raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003994c31bf0 in *__GI_abort () at abort.c:88
#2  0x00000039a46becc4 in __gnu_cxx::__verbose_terminate_handler ()
   from /usr/lib64/libstdc++.so.6
#3  0x00000039a46bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000039a46bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000039a46bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00002ad64781d55e in qpid::sys::Mutex::~Mutex ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#7  0x00002ad647cfa48c in qpid::sys::AsynchIOHandler::~AsynchIOHandler$delete ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#8  0x00002ad647cf9110 in qpid::sys::AsynchIOHandler::closedSocket ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#9  0x00002ad647cac0ba in boost::function2<void, qpid::sys::AsynchIO&, qpid::sys::Socket const&, std::allocator<boost::function_base> >::operator() ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#10 0x00002ad647ca56c1 in qpid::sys::posix::AsynchIO::writeable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#11 0x00002ad647cfc8e9 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#12 0x00002ad647cfabcc in qpid::sys::DispatchHandle::processEvent ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#13 0x00002ad647cfa72e in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#14 0x00002ad647caea3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#15 0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#16 0x0000003994cd1b6d in clone () from /lib64/libc.so.6








Thread 11 (process 3359):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3933b6e in qpid::broker::Broker::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#4  0x0000000000406898 in QpiddBroker::execute ()
#5  0x0000000000405478 in main ()

Thread 10 (process 3373):
#0  0x000000399580a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002b62c39dc4a6 in qpid::broker::Timer::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#2  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#4  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 9 (process 3374):
#0  0x000000399580a687 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002b62c39dc4a6 in qpid::broker::Timer::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#2  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#4  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 8 (process 3377):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 7 (process 3378):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 6 (process 3379):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 5 (process 3380):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 4 (process 3382):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 3383):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 3384):
#0  0x0000003994cd1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b62c3dcbc3d in qpid::sys::Poller::wait ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#2  0x00002b62c3e0f708 in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#3  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#5  0x0000003994cd1b6d in clone () from /lib64/libc.so.6

Thread 1 (process 3381):
#0  0x0000003994c30155 in *__GI_raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003994c31bf0 in *__GI_abort () at abort.c:88
#2  0x00000039a46becc4 in __gnu_cxx::__verbose_terminate_handler ()
   from /usr/lib64/libstdc++.so.6
#3  0x00000039a46bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000039a46bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000039a46bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00002b62c38fc87b in qpid::sys::Mutex::unlock ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidbroker.so.0
#7  0x00002b62c3e0fbe9 in qpid::sys::DispatchHandle::processEvent ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#8  0x00002b62c3e0f72e in qpid::sys::Dispatcher::run ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#9  0x00002b62c3dc3a3a in qpid::sys::(anonymous namespace)::runRunnable ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#10 0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#11 0x0000003994cd1b6d in clone () from /lib64/libc.so.6
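
In both cores, thread 1 reaches std::terminate through __cxa_throw called from a Mutex member (the dtor in one, unlock in the other). That is consistent with a mutex wrapper that turns a failed pthread call into a C++ exception. The following is only a minimal sketch of that pattern, not the qpid::sys::Mutex source: CheckedMutex and unlockWithoutLockThrows are hypothetical names, and the use of an error-checking mutex type is an assumption made so that misuse is reported as an error code.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical wrapper for illustration; not qpid::sys::Mutex.
class CheckedMutex {
public:
    CheckedMutex() {
        pthread_mutexattr_t attr;
        check(pthread_mutexattr_init(&attr), "pthread_mutexattr_init");
        // Assumption: ERRORCHECK makes misuse return an error code
        // instead of being undefined behaviour.
        int rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
        if (rc == 0) rc = pthread_mutex_init(&mutex_, &attr);
        pthread_mutexattr_destroy(&attr);
        check(rc, "pthread_mutex_init");
    }

    // This sketch ignores the result of destroy; the trace above suggests
    // the broker's wrapper throws from its dtor on failure instead.
    ~CheckedMutex() { pthread_mutex_destroy(&mutex_); }

    CheckedMutex(const CheckedMutex&) = delete;
    CheckedMutex& operator=(const CheckedMutex&) = delete;

    void lock() { check(pthread_mutex_lock(&mutex_), "pthread_mutex_lock"); }
    void unlock() { check(pthread_mutex_unlock(&mutex_), "pthread_mutex_unlock"); }

private:
    // Convert a non-zero pthread return code into an exception.
    static void check(int rc, const char* what) {
        if (rc != 0)
            throw std::runtime_error(std::string(what) + ": " + std::strerror(rc));
    }

    pthread_mutex_t mutex_;
};

// Returns true if unlocking a mutex that is not held raises an exception.
bool unlockWithoutLockThrows() {
    CheckedMutex m;
    try {
        m.unlock();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

If an exception like this propagates out of a thread with no handler, the C++ runtime calls std::terminate, which by default aborts the process and produces a core like the ones above.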

Comment 1 Alan Conway 2008-12-19 15:17:12 EST
Note that in the first trace above, one thread (thread 8) is in giveReadCredit at the time of the crash:

#0  0x000000399580c999 in __lll_mutex_unlock_wake () from /lib64/libpthread.so.0
#1  0x0000003995809a59 in _L_mutex_unlock_59 () from /lib64/libpthread.so.0
#2  0x000000399580971b in __pthread_mutex_unlock_usercnt (mutex=0x2aaaac181380, decr=1)
    at pthread_mutex_unlock.c:63
#3  0x00002ad647cfbfa8 in qpid::sys::DispatchHandle::rewatchRead ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#4  0x00002ad647cf88e0 in qpid::sys::AsynchIOHandler::giveReadCredit ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/libqpidcommon.so.0
#5  0x00002ad647fc1513 in qpid::cluster::Cluster::deliveredEvent ()
   from /home/mgoulish/trunk/qpid/cpp/src/.libs/cluster.so

Here's what happens: an IO thread wakes to read from client connection C, reducing its read credit. That thread initiates a multicast and returns. Later, an IO thread wakes for a multicast delivery, identifies the event as originating from connection C, processes it, and calls giveReadCredit to let C continue reading.

How can we safely call giveReadCredit from a threading context other than an IO thread serving C?
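
The read-credit handoff described in this comment can be sketched roughly as follows. This is an illustration only, not the AsynchIOHandler code: ReadCredit, tryConsume, give, and available are hypothetical names, and the initial credit of 1 simply mirrors the --cluster-read-max=1 setting from the reproduction steps.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a connection's read throttling.
class ReadCredit {
public:
    explicit ReadCredit(int initial) : credit_(initial) {}

    // IO thread serving the connection: take one unit of credit before
    // reading. Returns false when no credit is left and reading must pause.
    bool tryConsume() {
        std::lock_guard<std::mutex> guard(lock_);
        if (credit_ == 0) return false;
        --credit_;
        return true;
    }

    // Delivery thread: return credit once the multicast event has been
    // processed. Returns true if the connection had run out of credit,
    // i.e. the caller should re-enable reading on it.
    bool give(int n) {
        std::lock_guard<std::mutex> guard(lock_);
        bool wasStalled = (credit_ == 0);
        credit_ += n;
        return wasStalled && n > 0;
    }

    int available() const {
        std::lock_guard<std::mutex> guard(lock_);
        return credit_;
    }

private:
    mutable std::mutex lock_;
    int credit_;
};
```

The lock makes the counter itself safe to update from either thread. It does not address whether the connection object still exists when give is called from the delivery thread.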
Comment 2 Andrew Stitcher 2008-12-22 08:45:53 EST
By design, it is safe to call giveReadCredit() from any thread, and as far as I can see from inspecting the code it really is safe.

I think what is happening here is that a connection is being deleted while it is still in use, and there is a race between the delete and giving credit to the connection.
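
One common way to close this kind of delete-versus-use race is to have the delivery side hold a weak reference and promote it to an owning reference only for the duration of the call. The sketch below illustrates that general technique under stated assumptions; it is not the change made for this bug, and Connection, CreditReturner, and giveCredit are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical connection with a read-credit counter.
struct Connection {
    int credit = 0;
    void giveReadCredit(int n) { credit += n; }
};

// Held by the delivery side. It does not keep the connection alive,
// but it can tell whether the connection has already been destroyed.
class CreditReturner {
public:
    explicit CreditReturner(const std::shared_ptr<Connection>& c) : conn_(c) {}

    // Returns true if credit was given, false if the connection is gone.
    bool giveCredit(int n) {
        // lock() returns an owning pointer, or null if the connection
        // was already destroyed; the object cannot be deleted while
        // the returned pointer is in scope.
        if (std::shared_ptr<Connection> c = conn_.lock()) {
            c->giveReadCredit(n);
            return true;
        }
        return false;
    }

private:
    std::weak_ptr<Connection> conn_;
};
```

With this arrangement, a delivery that arrives after the connection has closed becomes a no-op rather than a call on freed memory.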
Comment 3 Alan Conway 2009-01-07 17:38:37 EST
Race condition fixed in r732153.
