Bug 605332

Summary: RFE: Need to support cluster infrastructure running over network links with higher than LAN latency conditions
Product: Red Hat Enterprise Linux 5
Reporter: Perry Myers <pmyers>
Component: openais
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED WONTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: low
Version: 5.5
CC: ccaulfie, cluster-maint, edamato, lhh, rpeterso, ssaha, teigland
Target Milestone: rc
Keywords: FutureFeature, TestOnly
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Enhancement
Clone Of: 605331
Last Closed: 2011-01-17 22:51:45 UTC
Bug Depends On: 605331

Description Perry Myers 2010-06-17 16:29:21 UTC
+++ This bug was initially created as a clone of Bug #605331 +++

Description of problem:
Right now, RHEL HA (both the RHCS and Pacemaker-based stacks) requires networks with LAN-like latency, which we have defined as <= 2ms.

The primary latency constraints are in membership, which is handled by Corosync.  In addition, plocks (POSIX locks) on GFS are latency sensitive.

In the context of this bug, we are not concerned with GFS over high-latency links, only with the core cluster infrastructure.

We need to simulate high-latency links and test the HA stacks to determine the highest latency we can support without making significant code or configuration (timeout) changes.
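One common way to simulate such a link on Linux is the netem queueing discipline, driven through `tc`. The sketch below only builds the command strings; it does not run them. The interface name `eth1` and the 10 ms delay are illustrative assumptions, not values from this bug. The commands need root, and netem delays egress traffic only, so the delay would have to be applied on the nodes at both ends to affect both directions.

```python
def netem_delay_commands(interface, delay_ms, jitter_ms=0):
    """Return (add, remove) tc commands for an egress delay on `interface`.

    The commands are returned as strings so they can be reviewed before
    being run as root on a test node.
    """
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    delay = f"delay {delay_ms}ms"
    if jitter_ms > 0:
        # netem takes an optional jitter value right after the base delay
        delay += f" {jitter_ms}ms"
    add = f"tc qdisc add dev {interface} root netem {delay}"
    remove = f"tc qdisc del dev {interface} root"
    return add, remove


# Hypothetical example: eth1 is assumed to carry the inter-site cluster traffic.
add_cmd, del_cmd = netem_delay_commands("eth1", 10)
print(add_cmd)
print(del_cmd)
```

Raising `delay_ms` step by step while watching membership stability would be one way to look for the highest latency that works with unchanged timeouts.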

Then QE can begin official testing at this higher latency, and we can support links with up to this delay.

For the time being, this bug should be considered TestOnly, but it needs testing from a development perspective before QE can begin running more comprehensive tests.

The initial use case is to run stretch clusters with 2 sites and between 1 and 8 nodes at each site.  The membership list should be configured so that the Totem token does not bounce back and forth across the high-latency link, but crosses it as few times as possible (i.e. nodes 1-8 on SiteA and 9-16 on SiteB, meaning the token only crosses the high-latency link between nodes 8 and 9 and between nodes 16 and 1).
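The effect of ring ordering can be illustrated with a small sketch that counts how many hops per token rotation cross between sites. The function names and the latency model are assumptions made for illustration. The rotation estimate simply sums per-hop latencies and ignores processing time, retransmits and message traffic, so treat it as a rough lower bound only.

```python
def wan_crossings(ring, site_of):
    """Count hops in one token rotation whose endpoints are at different sites."""
    n = len(ring)
    return sum(
        1
        for i in range(n)
        if site_of[ring[i]] != site_of[ring[(i + 1) % n]]
    )


def rotation_latency_ms(ring, site_of, lan_ms, wan_ms):
    """Rough lower bound on one token rotation: sum of per-hop latencies."""
    crossings = wan_crossings(ring, site_of)
    return crossings * wan_ms + (len(ring) - crossings) * lan_ms


# Nodes 1-8 on SiteA, nodes 9-16 on SiteB, as in the description.
site_of = {node: ("A" if node <= 8 else "B") for node in range(1, 17)}

grouped = list(range(1, 17))  # 1..8, then 9..16
interleaved = [n for pair in zip(range(1, 9), range(9, 17)) for n in pair]

print(wan_crossings(grouped, site_of))      # 2 (hops 8->9 and 16->1)
print(wan_crossings(interleaved, site_of))  # 16 (every hop crosses)
```

With the grouped ordering only two of the sixteen hops pay the inter-site latency. An interleaved ordering pays it on every hop.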

If specific code changes are required to support this, the engineers testing this feature should file dependent bugs against their components (for example, a bug against Corosync).

Comment 4 Sayan Saha 2010-07-29 19:01:58 UTC
Based on further customer input, RHEL PM has decided to change the proposed initial split-site (stretched) cluster configuration for QE to a 2x2 cluster, stretched between two sites with a maximum of two nodes each. This configuration holds irrespective of the latency between the sites. The membership list should be configured so that the Totem token does not bounce back and forth across the high-latency link, but crosses it as few times as possible (i.e. nodes 1-2 on SiteA and 3-4 on SiteB, meaning the token only crosses the high-latency link between nodes 2 and 3 and between nodes 4 and 1).

Comment 5 RHEL Program Management 2011-01-11 20:45:02 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 6 RHEL Program Management 2011-01-11 23:15:50 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 7 Subhendu Ghosh 2011-01-17 22:51:45 UTC
This enhancement request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release.

Red Hat does not currently plan to provide this enhanced functionality in a Red Hat Enterprise Linux update for currently deployed products.

With the goal of minimizing risk of change for deployed systems, and in response to customer and partner requirements, Red Hat takes a conservative approach when evaluating enhancements for inclusion in maintenance updates for currently deployed products. The primary objectives of update releases are to enable new hardware platform support and to resolve critical defects.

For more information on Red Hat Enterprise Linux maintenance policies, please consult: https://access.redhat.com/support/policy/updates/errata/

Red Hat values your feedback and will take this enhancement request into consideration for future major releases of Red Hat Enterprise Linux.