Bug 875660
Summary: | Active-active clustered qpidd broker crashes under cluster stress by failover_soak in ha.so around qpid::ha::HaPlugin::earlyInitialize() -> qpid::ha::HaBroker::HaBroker()
---|---
Product: | Red Hat Enterprise MRG
Reporter: | Frantisek Reznicek <freznice>
Component: | qpid-cpp
Assignee: | Alan Conway <aconway>
Status: | CLOSED CURRENTRELEASE
QA Contact: | Frantisek Reznicek <freznice>
Severity: | urgent
Priority: | high
Version: | Development
CC: | esammons, iboverma, jross, lzhaldyb, mcressma
Target Milestone: | 2.3
Keywords: | OtherQA
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | qpid-cpp-0.18-10
Doc Type: | Bug Fix
Last Closed: | 2013-03-19 16:37:34 UTC
Type: | Bug
Bug Depends On: | 882149
Description
Frantisek Reznicek
2012-11-12 10:18:28 UTC
Is it possible that management was disabled on the broker where these crashes occurred? I.e. configuration setting mgmt-enable=no

There is a bug in the case where mgmt-enable=no that would give exactly these results.

(In reply to comment #4)
> Is it possible that management was disabled on the broker where these
> crashes occurred? I.e. configuration setting mgmt-enable=no
>
> There is a bug in the case where mgmt-enable=no that would give exactly these
> results.

I can confirm that in all the cases when we saw it:
- management was turned off (mgmt-enable=no)
- tcp-no-delay was used

[15:25:46] .running core test (./failover_soak ... ) MSG:256352, DUR:1, MDL:/usr/lib/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:3, N_BROKERS:3
./runtest.sh: line 294: 8881 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[15:25:47] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[15:32:42] .running core test (./failover_soak ... ) MSG:208389, DUR:0, MDL:/usr/lib/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:1, N_BROKERS:3
./runtest.sh: line 294: 12637 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[15:32:43] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[17:28:35] .running core test (./failover_soak ... ) MSG:277935, DUR:1, MDL:/usr/lib64/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:1, N_BROKERS:3
./runtest.sh: line 294: 92195 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[17:28:36] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[17:31:02] .running core test (./failover_soak ... ) MSG:119763, DUR:0, MDL:/usr/lib64/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:3, N_BROKERS:3
./runtest.sh: line 294: 93937 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[17:31:03] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[11:33:45] .running core test (./failover_soak ... ) MSG:239310, DUR:1, MDL:/usr/lib64/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:2, N_BROKERS:3
./runtest.sh: line 294: 13690 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[11:33:45] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[11:34:46] .running core test (./failover_soak ... ) MSG:162971, DUR:0, MDL:/usr/lib64/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:2, N_BROKERS:3
./runtest.sh: line 294: 14418 Aborted (core dumped) ./failover_soak $MODULES ./declare_queues ./replaying_sender ./resuming_receiver $MESSAGES $REPORT_FREQUENCY $VERBOSITY $DURABILITY $N_QUEUES $N_BROKERS > ${TEMP_FILE} 2>&1
[11:34:46] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
...

The comment 8 patch seems to have moved the problem to another place, which is tracked as bug 882149.

The issue has been fixed, no other crashes detected.
Tested on RHEL5.9rc/6.3 i[36]86/x86_64 using packages:
python-qpid-0.18-4.el6.noarch
python-qpid-qmf-0.18-13.el6.i686
python-saslwrapper-0.18-1.el6_3.i686
qpid-cpp-client-0.18-13.el6.i686
qpid-cpp-client-devel-0.18-13.el6.i686
qpid-cpp-client-devel-docs-0.14-22.el6_3.noarch
qpid-cpp-client-rdma-0.18-13.el6.i686
qpid-cpp-client-ssl-0.18-13.el6.i686
qpid-cpp-debuginfo-0.18-13.el6.i686
qpid-cpp-server-0.18-13.el6.i686
qpid-cpp-server-cluster-0.18-13.el6.i686
qpid-cpp-server-devel-0.18-13.el6.i686
qpid-cpp-server-ha-0.18-13.el6.i686
qpid-cpp-server-rdma-0.18-13.el6.i686
qpid-cpp-server-ssl-0.18-13.el6.i686
qpid-cpp-server-store-0.18-13.el6.i686
qpid-cpp-server-xml-0.18-13.el6.i686
qpid-java-client-0.18-6.el6.noarch
qpid-java-common-0.18-6.el6.noarch
qpid-java-example-0.18-6.el6.noarch
qpid-qmf-0.18-13.el6.i686
qpid-qmf-debuginfo-0.18-13.el6.i686
qpid-qmf-devel-0.18-13.el6.i686
qpid-tests-0.18-2.el6.noarch
qpid-tools-0.18-7.el6_3.noarch
rhm-docs-0.10-2.el6.noarch
rh-qpid-cpp-tests-0.18-13.el6.i686
ruby-qpid-qmf-0.18-13.el6.i686
ruby-saslwrapper-0.18-1.el6_3.i686
saslwrapper-0.18-1.el6_3.i686
saslwrapper-debuginfo-0.18-1.el6_3.i686
saslwrapper-devel-0.18-1.el6_3.i686
sesame-1.0-8.el6.i686
sesame-debuginfo-1.0-8.el6.i686

-> VERIFIED
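
For anyone triaging a similar run, the observation earlier in this thread (every crashing run had MAN:no and QPIDD_NO_TCPDELAY:yes) can be checked mechanically rather than by eye. The following is only a sketch, assuming the run header format shown in the log excerpt in this bug; the helper names `parse_run_settings` and `common_settings` are made up for illustration and are not part of runtest.sh or the qpid test suite.

```python
import re

# Header line printed before each failover_soak run.
# The format is an assumption taken from the log excerpt in this bug.
RUN_HEADER = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] \.running core test \(.*?\) (.*)")


def parse_run_settings(line):
    """Return (timestamp, settings dict) for a run header line, else None."""
    match = RUN_HEADER.search(line)
    if match is None:
        return None
    timestamp, fields = match.groups()
    settings = {}
    for field in fields.split(", "):
        # Split on the first colon only, so path values stay intact.
        key, _, value = field.partition(":")
        settings[key.strip()] = value.strip()
    return timestamp, settings


def common_settings(lines):
    """Return the settings whose value is identical in every parsed run."""
    runs = [parsed[1] for parsed in map(parse_run_settings, lines) if parsed]
    if not runs:
        return {}
    shared = dict(runs[0])
    for run in runs[1:]:
        shared = {k: v for k, v in shared.items() if run.get(k) == v}
    return shared


# Three of the failing runs quoted in this bug (header and ERROR lines only).
LOG = """\
[15:25:46] .running core test (./failover_soak ... ) MSG:256352, DUR:1, MDL:/usr/lib/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:3, N_BROKERS:3
[15:25:47] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[15:32:42] .running core test (./failover_soak ... ) MSG:208389, DUR:0, MDL:/usr/lib/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:1, N_BROKERS:3
[15:32:43] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
[17:28:35] .running core test (./failover_soak ... ) MSG:277935, DUR:1, MDL:/usr/lib64/qpid/daemon, MAN:no, QPIDD_CONFIG:all, QPIDD_NO_TCPDELAY:yes, N_QUEUES:1, N_BROKERS:3
[17:28:36] ..ERROR:core test failed! (ecode:1340000,client err_cnt:0|0,broker err_cnt:0|0)
"""

if __name__ == "__main__":
    print(common_settings(LOG.splitlines()))
```

Settings that vary between runs (message count, durability, module path, queue count) drop out of the result, so what remains is the set of conditions shared by all failures. A shared setting is only a candidate cause; passing runs would need the same check before drawing a conclusion.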