Bug 591197

Summary: Need flag to enable quorum mode for even split stretch clusters
Product: Red Hat Enterprise Linux 6 Reporter: Perry Myers <pmyers>
Component: cluster Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED CANTFIX QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: ccaulfie, cluster-maint, lhh, rpeterso, teigland
Target Milestone: rc Keywords: FutureFeature
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 591198 (view as bug list) Environment:
Last Closed: 2010-05-11 17:26:21 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 591198    

Description Perry Myers 2010-05-11 16:11:36 UTC
Description of problem:
Today a two-node cluster can be used in a stretch cluster configuration using cman's two-node mode, which effectively lets both nodes win and uses a fence race to break the tie.  If the fencing network is the same as the heartbeat network, neither node can fence the other once the link is down, so the admin chooses which node wins by running fence_ack_manual.

For 2x2, 3x3... stretch clusters we have the same problem, but two-node mode cannot work here.  Instead we need to allow a mode where quorum can be maintained with expected_votes/2 votes instead of the usual (expected_votes/2)+1.
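To make the arithmetic concrete, here is a minimal sketch of the two thresholds, assuming the usual majority rule of expected_votes/2 + 1 with integer division and one vote per node. The function names are hypothetical and are not part of cman.

```python
def majority_quorum(expected_votes: int) -> int:
    """Standard majority threshold: more than half of the expected votes."""
    return expected_votes // 2 + 1


def even_split_quorum(expected_votes: int) -> int:
    """Relaxed threshold requested in this bug: exactly half is enough."""
    return expected_votes // 2


if __name__ == "__main__":
    # 2x2 and 3x3 stretch clusters, one vote per node (an assumption).
    for nodes_per_site in (2, 3):
        expected = 2 * nodes_per_site
        print(
            f"{nodes_per_site}x{nodes_per_site}: "
            f"site votes={nodes_per_site}, "
            f"majority quorum={majority_quorum(expected)}, "
            f"even-split quorum={even_split_quorum(expected)}"
        )
```

With the majority rule, a single surviving site (half the votes) always falls one vote short; with the relaxed rule it exactly meets the threshold.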

Normally in a non-stretch environment qdisk would be used to handle this number of nodes failing, but qdisk cannot be used across a stretch cluster: it does not work with storage replication technologies, and it cannot be located at just a single site.

So the suggestion is to create a cluster.conf option to enable 'stretch cluster mode', which can be used with all two-site stretch clusters with an even node count.  (Note: stretch clusters by definition have to have an even node count.)  This new option would change the number of required votes as described above. It can _only_ be used on a stretch cluster configuration, and it requires that the customer get engineering review of their configuration/architecture.

Comment 1 Perry Myers 2010-05-11 17:26:21 UTC
Further discussion uncovered that this cannot be done: a temporary ISL failure leaves both sides quorate, so each side enters a fencing race against the other, which will then obliterate the cluster.  So closing this as CANTFIX.
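The failure mode can be illustrated with a small sketch. It assumes one vote per node, two equally sized sites, and a link failure that splits the cluster exactly along the site boundary; the function and site names are hypothetical and only model the vote counting, not cman or fenced.

```python
def quorate_sides(site_votes: dict, threshold: int) -> list:
    """Return the names of the partitions that still meet the quorum threshold."""
    return [name for name, votes in site_votes.items() if votes >= threshold]


def isl_failure_outcome(nodes_per_site: int, relaxed: bool) -> list:
    """Model a link failure between two equal sites and report who stays quorate.

    relaxed=True uses the proposed expected_votes/2 threshold;
    relaxed=False uses the majority threshold expected_votes/2 + 1.
    """
    expected_votes = 2 * nodes_per_site
    threshold = expected_votes // 2 if relaxed else expected_votes // 2 + 1
    partitions = {"site_a": nodes_per_site, "site_b": nodes_per_site}
    return quorate_sides(partitions, threshold)


if __name__ == "__main__":
    for relaxed in (False, True):
        survivors = isl_failure_outcome(nodes_per_site=2, relaxed=relaxed)
        rule = "relaxed" if relaxed else "majority"
        # Two quorate partitions means both sides believe they may fence the other.
        print(f"{rule} rule: quorate partitions = {survivors}")
```

Under the majority rule neither half stays quorate, so the cluster stops but nothing is fenced. Under the relaxed rule both halves stay quorate at once, which is the precondition for the mutual fencing race described in this comment.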