Bug 144838

Summary: Split brain if tiebreaker IP is not on network used for cluster communication
Product: [Retired] Red Hat Cluster Suite
Reporter: Lon Hohberger <lhh>
Component: clumanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED NOTABUG
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 3
CC: cluster-maint
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-01-11 20:55:33 UTC

Description Lon Hohberger 2005-01-11 20:55:19 UTC
Description of problem:
A split brain can occur if the tiebreaker IP address is on one network
while the cluster is communicating over another.


Version-Release number of selected component (if applicable):
All, up to and including 1.2.22


How reproducible:
100%


Steps to Reproduce:
1. Configure a 2-node cluster using either multicast or broadcast with
"clumembd%primary_only" enabled.
2. Set the communication paths (node names) on one network.
3. Set the tiebreaker IP on a separate network, one that uses
different physical NICs from the communication network.
4. Start both nodes.
5. Wait for quorum.
6. Unplug one node's cluster communication link.

  
Actual results:
Split brain.  Because the tiebreaker is reachable over a network other
than the one that failed, both nodes can still ping it, and each
believes it is the sole member of the cluster.
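
The failure can be illustrated with a toy model. This is only a sketch,
not Cluster Manager's actual membership logic: the function name
surviving_nodes, the network labels "cluster" and "other", and the
link table are all hypothetical. It assumes a node that loses its peer
keeps running if it can still ping the tiebreaker.

```python
def surviving_nodes(links, heartbeat_net, tiebreaker_net):
    """Return the nodes that believe they hold quorum alone.

    links maps node name -> {network label: link is up?}.
    All names here are illustrative assumptions.
    """
    nodes = sorted(links)
    # Heartbeats are exchanged only if every node's heartbeat link is up.
    if all(links[n][heartbeat_net] for n in nodes):
        return []  # no partition, so the tiebreaker is never consulted
    # After a partition, each node pings the tiebreaker over whichever
    # network the tiebreaker lives on.
    return [n for n in nodes if links[n][tiebreaker_net]]


# node1's cluster communication link has been unplugged (step 6).
links = {
    "node1": {"cluster": False, "other": True},
    "node2": {"cluster": True, "other": True},
}

# Tiebreaker on a separate network: both nodes reach it.
split = surviving_nodes(links, "cluster", "other")

# Tiebreaker on the cluster communication network: only node2 reaches it.
correct = surviving_nodes(links, "cluster", "cluster")
```

In this model the misconfigured case leaves two survivors, while
placing the tiebreaker on the communication network leaves one.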


Expected results:
Unknown; this is a misconfiguration.  Cluster Manager requires that
all members coexist on the same fully connected subnet and that the
link(s) used for cluster communication are the same link(s) used to
monitor the tiebreaker IP address.


Additional info:
We could possibly detect whether the IP tiebreaker is on the same
network as the cluster communication links and refuse to use it if it
is not, but this would be undesirable in cases where users deliberately
use something outside their LAN for the tiebreaker IP address (for
instance, their ISP's router).
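
A minimal sketch of such a check, using Python's standard ipaddress
module. The function name and the example addresses are hypothetical,
and nothing like this exists in clumanager. A subnet match also does
not prove that the same physical link is used, so a check of this kind
would more plausibly produce a warning than a refusal.

```python
import ipaddress


def tiebreaker_on_cluster_network(tiebreaker_ip, cluster_interface):
    """Return True if tiebreaker_ip lies in the subnet of cluster_interface.

    cluster_interface is the node's cluster communication address in
    CIDR form, e.g. "10.0.0.11/24" (an illustrative value).
    """
    iface = ipaddress.ip_interface(cluster_interface)
    return ipaddress.ip_address(tiebreaker_ip) in iface.network


# Hypothetical addresses: the first tiebreaker shares the cluster
# subnet, the second sits on an unrelated network.
same_subnet = tiebreaker_on_cluster_network("10.0.0.254", "10.0.0.11/24")
other_subnet = tiebreaker_on_cluster_network("192.168.1.1", "10.0.0.11/24")
```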

* This behavior does not occur in 2-node clusters where the disk
tiebreaker is in use.

* This behavior does not occur in 2-node clusters utilizing broadcast
heartbeat (with clumembd%primary_only disabled) in Cluster Manager
versions 1.2.23 and higher.