Bug 824988 - [RFE] Federated link heartbeat interval is hard coded to 120 seconds
Summary: [RFE] Federated link heartbeat interval is hard coded to 120 seconds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: 3.0
Target Release: ---
Assignee: Ted Ross
QA Contact: Leonid Zhaldybin
URL:
Whiteboard:
Depends On:
Blocks: 698367 957950
Reported: 2012-05-24 18:28 UTC by Jason Dillaman
Modified: 2014-11-09 22:38 UTC

Fixed In Version: qpid-0.18
Doc Type: Enhancement
Doc Text:
This enhancement introduces a configurable link heartbeat interval for the qpidd broker. Previously the interval was fixed at 120 seconds, so in the worst case a failed federated link took approximately 240 seconds to be detected and recovered. For High Availability environments this was considered too long, and a user-configurable interval was required. The qpidd broker now has an option, `link-heartbeat-interval`, which sets the heartbeat interval in seconds. This feature is documented in the "Broker HA Options" section of the Messaging Installation and Configuration Guide.
Clone Of:
Clones: 957950
Environment:
Last Closed: 2014-09-24 15:04:19 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Apache JIRA | QPID-4040 | 0 | None | None | None | 2012-06-05 15:40:06 UTC
Red Hat Bugzilla | 850802 | 0 | medium | CLOSED | Incorrect timing of destination federated broker connection closure | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHEA-2014:1296 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Messaging 3.0 Release | 2014-09-24 19:00:06 UTC

Internal Links: 850802

Description Jason Dillaman 2012-05-24 18:28:27 UTC
Description of problem:
The federated link's connection heartbeat interval appears to be hard coded to 120 seconds. In the worst-case scenario, this means a dead link is only detected and recovered after approximately 240 seconds. In a high-availability environment, the heartbeat interval needs to be configurable to a lower value to reduce system unavailability.
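
The arithmetic behind the ~240-second figure can be sketched as below. This assumes the link is declared dead once two full heartbeat intervals pass without traffic; that factor of two is inferred from the 120 s / 240 s numbers in this report, not taken from the broker source, and the function name is purely illustrative.

```python
def worst_case_detection_seconds(heartbeat_interval, missed_intervals=2):
    """Estimate the worst-case time to notice a dead federated link.

    Assumption (inferred from this report, not from qpidd's code):
    the link is declared dead after `missed_intervals` heartbeat
    intervals elapse with no traffic from the peer.
    """
    if heartbeat_interval <= 0:
        raise ValueError("heartbeat interval must be a positive number of seconds")
    return heartbeat_interval * missed_intervals


if __name__ == "__main__":
    # Compare the hard-coded 120 s interval with some shorter candidates.
    for interval in (120, 30, 10):
        detection = worst_case_detection_seconds(interval)
        print(f"interval={interval}s -> worst-case detection ~{detection}s")
```

Under that assumption, bringing the worst-case detection time down to around 20 seconds would require an interval of about 10 seconds.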

Version-Release number of selected component (if applicable):
qpid-cpp-server-0.12-6_ptc_hotfix_3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Establish a federated link between two brokers running on two separate hosts
2. Hard-kill one of the hosts
  
Actual results:
The surviving broker will take ~240 seconds to declare the link dead.

Expected results:
In a clustered HA environment, the surviving broker should fail over to another broker within the cluster in a short period of time.

Additional info:

Comment 1 Ted Ross 2012-06-06 18:36:53 UTC
Fixed upstream at revision 1347044.

Comment 6 Leonid Zhaldybin 2014-01-09 16:20:55 UTC
Tested on RHEL6.5 (both i386 and x86_64). This feature was implemented and works as expected. The qpidd broker now has an option, link-heartbeat-interval, which allows the heartbeat interval to be set in seconds.
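
For illustration only, a minimal sketch of how a test harness might build a broker command line with this option. The helper name `qpidd_args` and the value 10 are hypothetical. The only detail taken from this report is the option name link-heartbeat-interval, and the `--name value` form is assumed to follow qpidd's usual long-option style.

```python
def qpidd_args(link_heartbeat_interval, extra_args=()):
    """Build an argument list for starting qpidd with a custom link heartbeat.

    Hypothetical helper: it only assembles the argument list and does not
    start a broker. The option name comes from this bug report; the
    `--name value` spelling is an assumption about qpidd's option style.
    """
    interval = int(link_heartbeat_interval)
    if interval <= 0:
        raise ValueError("link heartbeat interval must be a positive number of seconds")
    return ["qpidd", "--link-heartbeat-interval", str(interval), *extra_args]


if __name__ == "__main__":
    # Print the command a harness would run; 10 is an example value only.
    print(" ".join(qpidd_args(10)))
```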

Packages used for testing:

python-qpid-0.22-9.el6
python-qpid-qmf-0.22-25.el6
qpid-cpp-client-0.22-31.el6
qpid-cpp-client-devel-0.22-31.el6
qpid-cpp-client-devel-docs-0.22-31.el6
qpid-cpp-client-rdma-0.22-31.el6
qpid-cpp-client-ssl-0.22-31.el6
qpid-cpp-server-0.22-31.el6
qpid-cpp-server-devel-0.22-31.el6
qpid-cpp-server-ha-0.22-31.el6
qpid-cpp-server-linearstore-0.22-31.el6
qpid-cpp-server-rdma-0.22-31.el6
qpid-cpp-server-ssl-0.22-31.el6
qpid-cpp-server-xml-0.22-31.el6
qpid-proton-c-0.6-1.el6
qpid-qmf-0.22-25.el6
qpid-tools-0.22-7.el6

-> VERIFIED

Comment 8 errata-xmlrpc 2014-09-24 15:04:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1296.html

