Description of problem:
The federated link's connection heartbeat interval appears to be hard coded to 120 seconds. This timeout will result in a recovery after approximately 240 seconds in the worst case scenario. In a high-availability environment, the heartbeat interval needs to be configurable to a lower value to reduce system unavailability.

Version-Release number of selected component (if applicable):
qpid-cpp-server-0.12-6_ptc_hotfix_3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Establish a federated link between two brokers over two hosts
2. Hard-kill one of the hosts

Actual results:
The surviving broker will take ~240 seconds to declare the link dead.

Expected results:
In a clustered HA environment, the surviving broker will failover to another broker within the cluster in a short period of time.

Additional info:
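The ~240 second figure is consistent with the link being declared dead only after two heartbeat intervals pass with no traffic. A minimal sketch of that arithmetic follows. The factor of two is an assumption inferred from the 120 s and ~240 s numbers in this report, not confirmed from the broker source, and the function name is hypothetical.

```python
# Assumption: the link is declared dead after this many heartbeat
# intervals pass without traffic. The value 2 is inferred from the
# 120 s -> ~240 s figures in this report, not from the broker source.
MISSED_INTERVALS_BEFORE_DEAD = 2


def worst_case_detection_seconds(heartbeat_interval_s):
    """Upper bound on the time needed to notice a hard-killed peer."""
    if heartbeat_interval_s <= 0:
        raise ValueError("heartbeat interval must be positive")
    return MISSED_INTERVALS_BEFORE_DEAD * heartbeat_interval_s


# Compare the hard-coded interval with two smaller example values.
for interval in (120, 30, 10):
    print(interval, worst_case_detection_seconds(interval))
```

Under this assumption, lowering the interval shortens the worst-case detection time proportionally, which is why a configurable value matters for HA deployments.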
Fixed upstream at revision 1347044.
Tested on RHEL6.5 (both i386 and x86_64). This feature was implemented and works as expected. The qpidd broker now has a link-heartbeat-interval option, which sets the heartbeat interval in seconds.

Packages used for testing:
python-qpid-0.22-9.el6
python-qpid-qmf-0.22-25.el6
qpid-cpp-client-0.22-31.el6
qpid-cpp-client-devel-0.22-31.el6
qpid-cpp-client-devel-docs-0.22-31.el6
qpid-cpp-client-rdma-0.22-31.el6
qpid-cpp-client-ssl-0.22-31.el6
qpid-cpp-server-0.22-31.el6
qpid-cpp-server-devel-0.22-31.el6
qpid-cpp-server-ha-0.22-31.el6
qpid-cpp-server-linearstore-0.22-31.el6
qpid-cpp-server-rdma-0.22-31.el6
qpid-cpp-server-ssl-0.22-31.el6
qpid-cpp-server-xml-0.22-31.el6
qpid-proton-c-0.6-1.el6
qpid-qmf-0.22-25.el6
qpid-tools-0.22-7.el6

-> VERIFIED
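For anyone reproducing the verification, here is a rough sketch of building a broker command line with the new option from a test script. It assumes qpidd is on PATH and accepts the option in the usual `--name value` command-line form. The 10-second value is an arbitrary example, and `qpidd_command` is a hypothetical helper, not part of any qpid package.

```python
import shlex


def qpidd_command(link_heartbeat_interval_s, port=5672):
    """Build an argv list for qpidd with a custom link heartbeat.

    Hypothetical helper; the option name comes from this report, and
    the command-line form is assumed.
    """
    if link_heartbeat_interval_s < 1:
        raise ValueError("interval must be at least 1 second")
    return [
        "qpidd",
        "--port", str(port),
        "--link-heartbeat-interval", str(link_heartbeat_interval_s),
    ]


cmd = qpidd_command(10)
# Print a shell-safe rendering of the command instead of running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be passed to `subprocess.Popen(cmd)` on a host where the broker is installed; the sketch only prints the command so it has no side effects.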
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1296.html