Bug 479326 - cluster broker crashes with race condition in DispatchHandle
Summary: cluster broker crashes with race condition in DispatchHandle
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: 1.3
Assignee: Andrew Stitcher
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-01-08 21:12 UTC by Alan Conway
Modified: 2015-11-16 00:06 UTC
CC List: 4 users

Fixed In Version: 1.2
Doc Type: Bug Fix
Doc Text:
An error in the 'DispatchHandle' class implementation may have caused a race condition to occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such an event, and a clustered broker no longer crashes.
Clone Of:
Environment:
Last Closed: 2010-10-14 15:59:18 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links:
Red Hat Product Errata RHSA-2010:0773 (normal, SHIPPED_LIVE): Moderate: Red Hat Enterprise MRG Messaging and Grid Version 1.3 (last updated 2010-10-14 15:56:44 UTC)

Description Alan Conway 2009-01-08 21:12:21 UTC
Description of problem:

Core dump when run as described below. 

Version-Release number of selected component (if applicable): revision 732768

How reproducible: easy

Steps to Reproduce:
In an svn build, cd to cpp/src/tests and run:

[aconway@mrg10 tests]$ pkill qpidd ; ../qpidd --cluster-name foobar --cluster-read-max 1 --load-module ../.libs/cluster.so&
[aconway@mrg10 tests]$ while ./perftest --count 10  --npubs 10 --nsubs 10 -s ; do true ; done

Usually happens in one or two iterations.
  
Actual results:

[1]+  Aborted                 (core dumped) ../qpidd --cluster-name foobar --cluster-read-max 1 --load-module ../.libs/cluster.so

Info from core:

(gdb) where
#0  0x0000003994c30155 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003994c31bf0 in *__GI_abort () at abort.c:88
#2  0x00000039a46becc4 in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3  0x00000039a46bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000039a46bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000039a46bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00002b02b225e217 in ~ScopedLock (this=<value optimized out>) at qpid/sys/posix/Mutex.h:120
#7  0x00002b02b225c67d in qpid::sys::DispatchHandle::processEvent (this=0x2aaaac0c6e98, type=qpid::sys::Poller::WRITABLE) at qpid/sys/DispatchHandle.cpp:353
#8  0x00002b02b2217943 in qpid::sys::Poller::run (this=0x6437b80) at ./qpid/sys/Poller.h:121
#9  0x00002b02b220e7da in qpid::sys::(anonymous namespace)::runRunnable () at /usr/include/boost/any.hpp:46
#10 0x00000039958062f7 in start_thread (arg=<value optimized out>) at pthread_create.c:296
#11 0x0000003994cd1b6d in clone () from /lib64/libc.so.6
(gdb) f 7
#7  0x00002b02b225c67d in qpid::sys::DispatchHandle::processEvent (this=0x2aaaac0c6e98, type=qpid::sys::Poller::WRITABLE) at qpid/sys/DispatchHandle.cpp:353
Current language:  auto; currently c++
(gdb) print type
$1 = qpid::sys::Poller::WRITABLE
(gdb) print state
$2 = qpid::sys::DispatchHandle::DELAYED_DELETE
(gdb) 


Expected results:

Test passes.

Additional info:

The key factors for reproducing this are a low value of --cluster-read-max (1 above) and a large number of connections, given by high --nsubs/--npubs parameters. You still get the crash with --cluster-read-max 2; if you push it up to 10, things run reliably. Also, if you omit --nsubs/--npubs so there are just 2 connections, we don't see it.

It looks like this is caused by the cluster calling giveReadCredit concurrently with the IO layer closing the connection. There is locking in cluster/OutputInterceptor.cpp intended to ensure that the real output handler is never called after we have received a closed() call from the network layer, but it appears there is still a race.
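
For illustration, here is a minimal standalone C++ sketch of the pattern described above. The class and member names are hypothetical and this is not the real qpid/cluster code; it only shows why a closed-flag guard around the call is not by itself enough when the handler object can be deleted by another thread.

#include <mutex>

// A stand-in for the real output handler owned by the connection.
struct OutputHandler {
    void activateOutput() { /* would schedule a write on the poller */ }
};

// Sketch of an interceptor that tries to stop calls after closed().
class InterceptorSketch {
    std::mutex lock;
    bool closing;
    OutputHandler* next;   // owned elsewhere; may be destroyed some time after closed()
public:
    explicit InterceptorSketch(OutputHandler* n) : closing(false), next(n) {}

    // Called from a cluster thread when read credit is given back.
    void giveReadCredit() {
        std::lock_guard<std::mutex> l(lock);
        if (!closing)                 // guard: never call through after close
            next->activateOutput();   // still unsafe if the object behind 'next' is deleted concurrently
    }

    // Called from the IO thread when the network layer reports closed().
    void closed() {
        std::lock_guard<std::mutex> l(lock);
        closing = true;               // later teardown code may delete the handler
    }
};

int main() {
    OutputHandler h;
    InterceptorSketch i(&h);
    i.giveReadCredit();   // would normally come from a cluster thread
    i.closed();           // would normally come from the IO thread
    return 0;
}

The flag serialises the two calls, but it says nothing about the lifetime of the object reached through next, which is what appears to go wrong here.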

Comment 1 Carl Trieloff 2009-01-21 20:11:08 UTC
Life cycle fix for pollableQueue.

Committed revision 736398.

Not the only issue here.

Comment 2 Andrew Stitcher 2009-03-02 18:53:27 UTC
The underlying issue here turned out to be a subtle problem in the lifecycle behaviour of the DispatchHandle.

Essentially, it was unsafe to delete it from outside an IO callback. This pretty much never happened without the cluster code.
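
For illustration only, here is a minimal standalone sketch of that deferred-deletion idea. The names are hypothetical and this is not the actual DispatchHandle code (which also has to cope with a delete racing against entry into the callback itself); it just shows the hand-over of the final delete to the IO callback thread.

#include <mutex>

class HandleSketch {
    enum State { IDLE, IN_CALLBACK, DELETE_REQUESTED };
    State state;
    std::mutex lock;

    ~HandleSketch() {}   // private: destruction only happens via the two paths below

public:
    HandleSketch() : state(IDLE) {}

    // Any thread: ask for destruction.  If a callback is in progress, defer
    // the delete to the IO thread instead of destroying the handle under it.
    void requestDelete() {
        bool deleteNow;
        {
            std::lock_guard<std::mutex> l(lock);
            if (state == IN_CALLBACK) {
                state = DELETE_REQUESTED;
                deleteNow = false;
            } else {
                deleteNow = true;
            }
        }
        if (deleteNow) delete this;
    }

    // IO thread only: run one callback, then honour any deferred delete.
    void processEvent() {
        {
            std::lock_guard<std::mutex> l(lock);
            state = IN_CALLBACK;
        }
        // ... invoke the readable/writable callback here ...
        bool deleteNow;
        {
            std::lock_guard<std::mutex> l(lock);
            deleteNow = (state == DELETE_REQUESTED);
            if (!deleteNow) state = IDLE;
        }
        if (deleteNow) delete this;
    }
};

int main() {
    HandleSketch* h = new HandleSketch();
    h->processEvent();     // IO thread runs one callback
    h->requestDelete();    // no callback in progress here, so it deletes at once
    return 0;
}

In the crash above, processEvent() is running on a handle whose state is already DELAYED_DELETE, which is exactly the kind of situation a hand-over like this is meant to make safe.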

Comment 4 Frantisek Reznicek 2009-03-16 14:33:43 UTC
I can confirm the issue is not fixed, because I can still see it on RHEL 5.2 x86_64 when running the script below against these packages:
  [root@dhcp-lab-200 bz479326]# rpm -qa | egrep '(qpidd|openais)'
  openais-devel-0.80.3-22.el5_3.3
  qpidd-devel-0.5.752581-1.el5
  openais-0.80.3-22.el5_3.3
  qpidd-0.5.752581-1.el5
  qpidd-rdma-0.5.752581-1.el5
  qpidd-ssl-0.5.752581-1.el5
  qpidd-acl-0.5.752581-1.el5
  qpidd-xml-0.5.752581-1.el5
  qpidd-cluster-0.5.752581-1.el5


#!/bin/bash
# Reproducer for bz479326: start a clustered qpidd with --cluster-read-max 1
# and hammer it with perftest until it crashes.

service openais restart

ulimit -c unlimited

# Locate the cluster plugin shipped by the qpidd-cluster package.
cluster_so=$(rpm -ql qpidd-cluster 2>/dev/null | head -1)

qpidd --auth no --log-enable debug+ --cluster-name foobar --cluster-read-max 1 \
      --no-module-dir --no-data-dir --load-module ${cluster_so} >qpidd.log 2>&1 &

netstat -nlp | grep qpidd

sleep 5

for ((i=0;i<10;i++)); do
  perftest --count 10 --npubs 10 --nsubs 10 -s >>perftest.log
  echo -n "$?"

  # Stop iterating as soon as the broker is no longer listening (i.e. it crashed).
  pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')
  if [ -z "${pid}" ]; then
    break
  fi
done
echo

# Try a clean shutdown of the backgrounded broker first.
pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')
if [ -n "${pid}" ]; then
  kill %1
  sleep 5
fi

# If it is still around, kill it by pid.
pid=$(netstat -nlp | grep qpidd | awk '{print $NF}' | awk -F/ '{print $(NF-1)}')
if [ -n "${pid}" ]; then
  echo -n "killing ..."
  kill ${pid}
fi
echo "exiting"

cat perftest.log

# eof



which results in a core dump (backtrace included):


[root@dhcp-lab-200 bz479326]# ./run.sh
Stopping OpenAIS daemon (aisexec):                         [  OK  ]
Starting OpenAIS daemon (aisexec):                         [  OK  ]
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
Error in shutdown: Connection closed
2009-mar-16 15:32:36 warning Connection closed
2009-mar-16 15:32:36 warning Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
Error in shutdown: Connection closed
./run.sh: line 30:  2823 Aborted                 (core dumped) qpidd --auth no --log-enable debug+ --cluster-name foobar --cluster-read-max 1 --no-module-dir --no-data-dir --load-module ${cluster_so} > qpidd.log 2>&1
0
exiting
2408.44 28.9542 2270.51 2.21729
3320.2  21.9707 1665.36 1.62633

Connection closed

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)
1256.62 37.4777 2133.65 2.08364

Connection refused: localhost:5672 (qpid/sys/posix/Socket.cpp:162)
1143.96 35.388  2994.51 2.92432
[root@dhcp-lab-200 bz479326]# rpm -q qpidd
qpidd-0.5.752581-1.el5
[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2
core.2674  core.2823
[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2674
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

...
Core was generated by `qpidd --auth no --log-enable debug+ --cluster-name foobar --cluster-read-max 1'.
Program terminated with signal 6, Aborted.
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 12 (process 2674):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x0000003549eccb86 in qpid::broker::Broker::run () from /usr/lib64/libqpidbroker.so.0
#4  0x0000000000406948 in qpid::log::Options::~Options ()
#5  0x0000000000405438 in __cxa_pure_virtual ()
#6  0x000000350b01d8b4 in __libc_start_main () from /lib64/libc.so.6
#7  0x0000000000404eb9 in __cxa_pure_virtual ()
#8  0x00007fff6f2b4ad8 in ?? ()
#9  0x0000000000000000 in ?? ()

Thread 11 (process 2677):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from /usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 10 (process 2678):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from /usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 9 (process 2679):
#0  0x000000350bc0a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000003549f88e6f in qpid::broker::Timer::run () from /usr/lib64/libqpidbroker.so.0
#2  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#3  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#4  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 8 (process 2681):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 7 (process 2682):
#0  0x000000350bc0c999 in __lll_mutex_unlock_wake () from /lib64/libpthread.so.0
#1  0x000000350bc09a59 in _L_mutex_unlock_59 () from /lib64/libpthread.so.0
#2  0x000000350bc0971b in __pthread_mutex_unlock_usercnt () from /lib64/libpthread.so.0
#3  0x000000350b105da2 in dl_iterate_phdr () from /lib64/libc.so.6
#4  0x000000350fc0a626 in _Unwind_Find_FDE () from /lib64/libgcc_s.so.1
#5  0x000000350fc075b5 in _Unwind_GetIPInfo () from /lib64/libgcc_s.so.1
#6  0x000000350fc08ee9 in _Unwind_RaiseException () from /lib64/libgcc_s.so.1
#7  0x000000350fc09071 in _Unwind_Resume_or_Rethrow () from /lib64/libgcc_s.so.1
#8  0x00000035108bced8 in __cxa_rethrow () from /usr/lib64/libstdc++.so.6
#9  0x00000035108bec5e in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#10 0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#11 0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#12 0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#13 0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock () from /usr/lib64/libqpidcommon.so.0
#14 0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#15 0x0000003549973c93 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#16 0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#17 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#18 0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 6 (process 2683):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 5 (process 2684):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 4 (process 2685):
---Type <return> to continue, or q <return> to quit---
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 3 (process 2686):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 2 (process 2688):
#0  0x000000350b0d1f58 in epoll_wait () from /lib64/libc.so.6
#1  0x0000003549972e8d in qpid::sys::Poller::wait () from /usr/lib64/libqpidcommon.so.0
#2  0x0000003549973c67 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#3  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#4  0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#5  0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Thread 1 (process 2687):
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
#1  0x000000350b031bf0 in abort () from /lib64/libc.so.6
#2  0x00000035108bec9f in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3  0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock () from /usr/lib64/libqpidcommon.so.0
#7  0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#8  0x0000003549973c93 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#9  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#10 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000350b0d1b6d in clone () from /lib64/libc.so.6


[root@dhcp-lab-200 bz479326]# gdb `which qpidd` core.2674
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

Reading symbols from /usr/lib64/libqpidbroker.so.0...(no debugging symbols found)...done.
...
found)...done.
Loaded symbols for /usr/lib64/libibverbs.so.1

Core was generated by `qpidd --auth no --log-enable debug+ --cluster-name foobar --cluster-read-max 1'.
Program terminated with signal 6, Aborted.
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x000000350b030155 in raise () from /lib64/libc.so.6
#1  0x000000350b031bf0 in abort () from /lib64/libc.so.6
#2  0x00000035108bec9f in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3  0x00000035108bce36 in std::set_unexpected () from /usr/lib64/libstdc++.so.6
#4  0x00000035108bce63 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x00000035108bcf4a in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00000035499c0eee in qpid::sys::ScopedLock<qpid::sys::Mutex>::~ScopedLock () from /usr/lib64/libqpidcommon.so.0
#7  0x00000035499c07b6 in qpid::sys::DispatchHandle::processEvent () from /usr/lib64/libqpidcommon.so.0
#8  0x0000003549973c93 in qpid::sys::Poller::run () from /usr/lib64/libqpidcommon.so.0
#9  0x000000354996ac4a in qpid::sys::AbsTime::AbsTime () from /usr/lib64/libqpidcommon.so.0
#10 0x000000350bc062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000350b0d1b6d in clone () from /lib64/libc.so.6

Comment 5 Frantisek Reznicek 2009-03-16 14:57:14 UTC
The issue in comment #4 is tracked as separate bug 490457; bug 490457 blocks this bug.

Comment 7 Frantisek Reznicek 2010-08-25 11:56:17 UTC
The issue has been fixed; tested on RHEL 5.5 i386 / x86_64 (as it involves --cluster-read-max) with the following packages:
python-qmf-0.7.946106-11.el5
python-qpid-0.7.946106-13.el5
qmf-*0.7.946106-12.el5
qpid-cpp-*0.7.946106-12.el5
qpid-java-*-0.7.946106-7.el5
qpid-tools-0.7.946106-9.el5
sesame-0.7.3918-6.el5


Reproduction procedure (old packages vs new packages tested):
qpid*-0.4.732838-1.el5 vs qpid-cpp-*0.7.946106-12.el5
# the message store must not be loaded (i.e. --no-data-dir)
qpidd --cluster-name foo --cluster-read-max 1 --no-data-dir --auth no &> /dev/null &
while perftest --count 10 --npubs 10 --nsubs 10 -s; do true; done


-> VERIFIED

Comment 8 Jaromir Hradilek 2010-10-07 14:20:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Due to incorrect implementation of the DispatchHandle, a race condition may have occurred, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.

Comment 9 Douglas Silas 2010-10-11 13:44:28 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-Due to incorrect implementation of the DispatchHandle, a race condition may have occurred, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.+An error in the 'DispatchHandle' function may have caused a race condition to occur occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.

Comment 10 Andrew Stitcher 2010-10-11 19:52:14 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-An error in the 'DispatchHandle' function may have caused a race condition to occur occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.+An error in the 'DispatchHandle' class implementation may have caused a race condition to occur occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.

Comment 11 Jan Sarenik 2010-10-13 13:40:52 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-An error in the 'DispatchHandle' class implementation may have caused a race condition to occur occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.+An error in the 'DispatchHandle' class implementation may have caused a race condition to occur, causing a broker in a cluster to terminate unexpectedly with a core dump. With this update, the underlying code has been modified to prevent such event, and a clustered broker no longer crashes.

Comment 13 errata-xmlrpc 2010-10-14 15:59:18 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html

