Red Hat Bugzilla – Bug 531833
FailoverExchangeMethod getNextBrokerDetails() loops infinitely after a total cluster failure or if the initial connect node is down
Last modified: 2010-10-14 12:10:05 EDT
Description of problem:
If a total cluster failure happens, or if the initial node connects but fails before it has had a chance to update the brokers list, the FailoverExchangeMethod will loop infinitely.

Version-Release number of selected component (if applicable):
1.1.7

How reproducible:
Always

Steps to Reproduce:
1. Start a JMS client with FailoverExchangeMethod as the failover_method
2. Wait till the JMS client is connected properly
3. Kill all brokers in the cluster

Actual results:
The JMS client goes into an infinite loop trying to get the next broker details.

Expected results:
The JMS client should throw a connection exception once it has tried all known brokers.

Additional info:
This is tracked upstream via QPID-1956. A fix for this is committed at rev 817487 on qpid trunk.
Tested: on qpid-java-client-0.5.751061-9.el5 the bug appears; on qpid-java-client-0.7.916826-2 it does not.
It has been fixed.

Validated on RHEL 5.5 i386 / x86_64 and RHEL 4.8 i386 / x86_64 (cluster not possible, but client behaviour reproducible).

packages:
# rpm -qa | grep -E '(qpid|openais|rhm)' | sort -u
openais-0.80.6-16.el5
openais-debuginfo-0.80.6-8.el5_4.1
python-qpid-0.7.917557-4.el5
qpid-cpp-client-0.7.916826-2.el5
qpid-cpp-client-devel-0.7.916826-2.el5
qpid-cpp-client-devel-docs-0.7.916826-2.el5
qpid-cpp-client-ssl-0.7.916826-2.el5
qpid-cpp-mrg-debuginfo-0.7.916826-2.el5
qpid-cpp-server-0.7.916826-2.el5
qpid-cpp-server-cluster-0.7.916826-2.el5
qpid-cpp-server-devel-0.7.916826-2.el5
qpid-cpp-server-ssl-0.7.916826-2.el5
qpid-cpp-server-store-0.7.916826-2.el5
qpid-cpp-server-xml-0.7.916826-2.el5
qpid-dotnet-0.4.738274-2.el5
qpid-java-client-0.7.918215-1.el5
qpid-java-common-0.7.918215-1.el5
qpid-tests-0.7.917717-4.el5
qpid-tools-0.7.917557-4.el5

->VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
If a total cluster failure occurred or if the initial node connected but failed before it could update the brokers list, the 'FailoverExchangeMethod' method looped infinitely. With this update, the JMS client throws a connection exception once it tries all known brokers.
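
For readers who want to see the shape of the fixed behaviour, here is a minimal Java sketch of a bounded failover loop: each known broker is tried at most once per pass, and an exception is raised when none accepts the connection. This is not the Qpid source and not the change committed at rev 817487; the names BoundedFailoverSketch, Connector, connectToAny and AllBrokersFailedException are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; this is NOT the Qpid FailoverExchangeMethod code. */
final class BoundedFailoverSketch {

    /** Hypothetical exception: raised once every known broker has been tried. */
    static class AllBrokersFailedException extends Exception {
        AllBrokersFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a real connection attempt. */
    interface Connector {
        boolean tryConnect(String brokerUrl);
    }

    private final List<String> brokers;
    private int nextIndex = 0;

    BoundedFailoverSketch(List<String> knownBrokers) {
        // Copy the list so later changes by the caller do not affect the loop bound.
        this.brokers = new ArrayList<>(knownBrokers);
    }

    /**
     * Tries each known broker at most once, starting from the current position.
     * Returns the URL of the first broker that accepts the connection.
     */
    String connectToAny(Connector connector) throws AllBrokersFailedException {
        int attempts = brokers.size(); // bound on the loop: one pass over the list
        for (int i = 0; i < attempts; i++) {
            String candidate = brokers.get(nextIndex);
            nextIndex = (nextIndex + 1) % brokers.size();
            if (connector.tryConnect(candidate)) {
                return candidate;
            }
        }
        // Every broker was tried (or the list was empty): fail instead of looping.
        throw new AllBrokersFailedException(
                "Tried all " + attempts + " known brokers; none reachable");
    }
}
```

The point of the sketch is the loop bound: the number of attempts is fixed to the size of the known-brokers list before iterating, so an empty or stale list ends in an exception rather than an endless retry.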
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html