Bug 882149 - Active-active clustered qpidd broker crashes under cluster stress by failover_soak in ha.so around qpid::ha::HaPlugin::initialize -> qpid::ha::HaBroker::initialize()
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: 2.3
Target Release: ---
Assigned To: Alan Conway
QA Contact: Frantisek Reznicek
Keywords: OtherQA
Depends On:
Blocks: 875660
Reported: 2012-11-30 04:36 EST by Frantisek Reznicek
Modified: 2015-11-15 20:14 EST

See Also:
Fixed In Version: qpid-cpp-0.18-13
Doc Type: Bug Fix
Doc Text:
Skip-errata: introduced & repaired since 2.2, not visible to customers.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-19 12:37:46 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---


Attachments: None
Description Frantisek Reznicek 2012-11-30 04:36:32 EST
Description of problem:

An active-active clustered qpidd broker crashes in ha.so under cluster stress generated by failover_soak.

The crashes are observed during active-active clustering when the new HA plugin is installed but not requested (ha-cluster defaults to 0/no) and qpidd management is disabled.

This behavior seems to be caused by the modification made for bug 875660. This defect and bug 875660 are similar, but the crash location and stack trace are different.


All qpidd crashes (SIGSEGV) are located around qpid::ha::HaPlugin::initialize -> qpid::ha::HaBroker::initialize():

  (gdb)   4 Thread 11416  0x00e59402 in __kernel_vsyscall ()
    3 Thread 11418  0x00e59402 in __kernel_vsyscall ()
    2 Thread 11419  0x00e59402 in __kernel_vsyscall ()
  * 1 Thread 0xb7fe5950 (LWP 11415)  qpid::ha::HaBroker::initialize (this=0x0)
      at qpid/ha/HaBroker.cpp:89
  Thread 4 (Thread 11416):
  #0  0x00e59402 in __kernel_vsyscall ()
  #1  0x00589ff2 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
     from /lib/libpthread.so.0
  #2  0x004fbd14 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libc.so.6
  #3  0x003162b3 in qpid::sys::Timer::run (this=0x94da3f0)
      at ../include/qpid/sys/posix/Condition.h:69
  #4  0x001f7a71 in qpid::sys::(anonymous namespace)::runRunnable(void*) ()
     from /usr/lib/libqpidcommon.so.8
  #5  0x00585912 in start_thread () from /lib/libpthread.so.0
  #6  0x004ef60e in clone () from /lib/libc.so.6
  Thread 3 (Thread 11418):
  #0  0x00e59402 in __kernel_vsyscall ()
  #1  0x00589cc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
  #2  0x004fbccd in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libc.so.6
  #3  0x00315f67 in wait (this=0x953e120)
      at ../include/qpid/sys/posix/Condition.h:63
  #4  wait (this=0x953e120) at ../include/qpid/sys/Monitor.h:41
  #5  qpid::sys::Timer::run (this=0x953e120) at qpid/sys/Timer.cpp:142
  #6  0x001f7a71 in qpid::sys::(anonymous namespace)::runRunnable(void*) ()
     from /usr/lib/libqpidcommon.so.8
  #7  0x00585912 in start_thread () from /lib/libpthread.so.0
  #8  0x004ef60e in clone () from /lib/libc.so.6
  Thread 2 (Thread 11419):
  #0  0x00e59402 in __kernel_vsyscall ()
  #1  0x00589cc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
  #2  0x004fbccd in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libc.so.6
  #3  0x00315f67 in wait (this=0x9540d40)
      at ../include/qpid/sys/posix/Condition.h:63
  #4  wait (this=0x9540d40) at ../include/qpid/sys/Monitor.h:41
  #5  qpid::sys::Timer::run (this=0x9540d40) at qpid/sys/Timer.cpp:142
  #6  0x001f7a71 in qpid::sys::(anonymous namespace)::runRunnable(void*) ()
     from /usr/lib/libqpidcommon.so.8
  #7  0x00585912 in start_thread () from /lib/libpthread.so.0
  #8  0x004ef60e in clone () from /lib/libc.so.6
  Thread 1 (Thread 0xb7fe5950 (LWP 11415)):
  #0  qpid::ha::HaBroker::initialize (this=0x0) at qpid/ha/HaBroker.cpp:89
  #1  0x00ca158f in qpid::ha::HaPlugin::initialize (this=0xce98e0, target=...)
      at qpid/ha/HaPlugin.cpp:87
  #2  0x002b6b81 in qpid::Plugin::initializeAll(qpid::Plugin::Target&) ()
     from /usr/lib/libqpidcommon.so.8
  #3  0x0362c600 in qpid::broker::Broker::Broker (this=0x94da3c8, conf=...)
      at qpid/broker/Broker.cpp:358
  #4  0x0804e71d in qpid::broker::QpiddBroker::execute (this=0xbf898ae7, 
      options=0x94d4fc8) at posix/QpiddBroker.cpp:196
  #5  0x0804cc18 in qpid::broker::run_broker (argc=22, argv=0xbf898bb4, 
      hidden=false) at qpidd.cpp:106
  #6  0x0804e33a in main (argc=59151144, argv=0x386934c)
      at posix/QpiddBroker.cpp:215
  (gdb) quit
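
Frame #0 of Thread 1 shows this=0x0, which is consistent with HaBroker::initialize() being called through a null pointer. The following is a minimal sketch of a guard against that pattern, assuming the HA object is simply never created when management is disabled. HaBrokerSketch, HaPluginSketch and their methods are hypothetical stand-ins, not the real qpid classes and not the actual fix.

```cpp
#include <memory>

// Hypothetical stand-in for qpid::ha::HaBroker (not the real class).
class HaBrokerSketch {
public:
    void initialize() { initialized_ = true; }
    bool initialized() const { return initialized_; }
private:
    bool initialized_ = false;
};

// Hypothetical stand-in for qpid::ha::HaPlugin (not the real class).
class HaPluginSketch {
public:
    // Assumption: the HA object is only created when management is enabled.
    void earlyInitialize(bool managementEnabled) {
        if (managementEnabled) haBroker_.reset(new HaBrokerSketch());
    }

    // Returns true if the HA object was initialized, false if it was skipped.
    bool initialize() {
        if (!haBroker_) return false;  // guard: never call through a null pointer
        haBroker_->initialize();
        return true;
    }

    // True only when an HA object exists and has been initialized.
    bool haInitialized() const {
        return haBroker_ && haBroker_->initialized();
    }

private:
    std::unique_ptr<HaBrokerSketch> haBroker_;  // stays null when HA is not set up
};
```

With this guard, a plugin set up via earlyInitialize(false) returns false from initialize() instead of dereferencing a null pointer, while earlyInitialize(true) followed by initialize() initializes the HA object as before.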




Version-Release number of selected component (if applicable):
 python-qpid-0.18-4.el5
 python-qpid-qmf-0.18-9.el5
 python-saslwrapper-0.18-1.el5
 qpid-cpp-client-0.18-10.el5
 qpid-cpp-client-devel-0.18-10.el5
 qpid-cpp-client-devel-docs-0.18-10.el5
 qpid-cpp-client-rdma-0.18-10.el5
 qpid-cpp-client-ssl-0.18-10.el5
 qpid-cpp-server-0.18-10.el5
 qpid-cpp-server-cluster-0.18-10.el5
 qpid-cpp-server-devel-0.18-10.el5
 qpid-cpp-server-ha-0.18-10.el5
 qpid-cpp-server-rdma-0.18-10.el5
 qpid-cpp-server-ssl-0.18-10.el5
 qpid-cpp-server-store-0.18-10.el5
 qpid-cpp-server-xml-0.18-10.el5
 qpid-java-client-0.18-5.el5
 qpid-java-common-0.18-5.el5
 qpid-java-example-0.18-5.el5
 qpid-qmf-0.18-9.el5
 qpid-qmf-debuginfo-0.18-9.el5
 qpid-qmf-devel-0.18-9.el5
 qpid-tests-0.18-2.el5
 qpid-tools-0.18-7.el5
 ruby-qpid-qmf-0.18-9.el5
 ruby-saslwrapper-0.18-1.el5
 saslwrapper-0.18-1.el5
 saslwrapper-devel-0.18-1.el5
 sesame-1.0-7.el5


How reproducible:
80%

Steps to Reproduce:
See bug 875660 steps for details.
  
Actual results:
qpidd crashes.

Expected results:
qpidd should not crash.

Additional info:
Comment 1 Frantisek Reznicek 2012-11-30 04:46:47 EST
The issue reproduces quite quickly on all supported platforms.
Comment 5 Frantisek Reznicek 2012-12-16 06:14:10 EST
The issue has been fixed; no other crashes were detected.
Tested on RHEL 5.9rc/6.3, i[36]86/x86_64, using packages:

  python-qpid-0.18-4.el6.noarch
  python-qpid-qmf-0.18-13.el6.i686
  python-saslwrapper-0.18-1.el6_3.i686
  qpid-cpp-client-0.18-13.el6.i686
  qpid-cpp-client-devel-0.18-13.el6.i686
  qpid-cpp-client-devel-docs-0.14-22.el6_3.noarch
  qpid-cpp-client-rdma-0.18-13.el6.i686
  qpid-cpp-client-ssl-0.18-13.el6.i686
  qpid-cpp-debuginfo-0.18-13.el6.i686
  qpid-cpp-server-0.18-13.el6.i686
  qpid-cpp-server-cluster-0.18-13.el6.i686
  qpid-cpp-server-devel-0.18-13.el6.i686
  qpid-cpp-server-ha-0.18-13.el6.i686
  qpid-cpp-server-rdma-0.18-13.el6.i686
  qpid-cpp-server-ssl-0.18-13.el6.i686
  qpid-cpp-server-store-0.18-13.el6.i686
  qpid-cpp-server-xml-0.18-13.el6.i686
  qpid-java-client-0.18-6.el6.noarch
  qpid-java-common-0.18-6.el6.noarch
  qpid-java-example-0.18-6.el6.noarch
  qpid-qmf-0.18-13.el6.i686
  qpid-qmf-debuginfo-0.18-13.el6.i686
  qpid-qmf-devel-0.18-13.el6.i686
  qpid-tests-0.18-2.el6.noarch
  qpid-tools-0.18-7.el6_3.noarch
  rhm-docs-0.10-2.el6.noarch
  rh-qpid-cpp-tests-0.18-13.el6.i686
  ruby-qpid-qmf-0.18-13.el6.i686
  ruby-saslwrapper-0.18-1.el6_3.i686
  saslwrapper-0.18-1.el6_3.i686
  saslwrapper-debuginfo-0.18-1.el6_3.i686
  saslwrapper-devel-0.18-1.el6_3.i686
  sesame-1.0-8.el6.i686
  sesame-debuginfo-1.0-8.el6.i686


-> VERIFIED
