Bug 472979 - [RFE] - Documentation - Further explanation of failover domains is needed such as more detail on the behavior of exclusive services.
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: doc
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Adam Strong
QA Contact: ecs-bugs
Reported: 2008-11-25 21:48 UTC by Stuart R. Kirk
Modified: 2018-11-14 20:19 UTC (History)

Doc Type: Bug Fix
Last Closed: 2016-01-05 19:01:10 UTC

Description Stuart R. Kirk 2008-11-25 21:48:36 UTC
* Given a 3 node cluster:  "Node A", "Node B", "Node C"
* Node "A" and Node "B" are part of failover domain "FD1"
* Node "B" and Node "C" are part of failover domain "FD2"

* An HTTP resource and an IP address resource make up service "SV1", which is restricted to failover domain "FD1" and runs as exclusive.

* A GFS filesystem resource makes up service "SV2", which is restricted to failover domain "FD2" and runs as exclusive.

With this scenario in mind, assume "SV1" is running on "Node A" and "SV2" is running on "Node B". What happens if "Node A" suffers a hardware failure? Given the failover domains and the 'run-exclusive' restriction, the expected behavior is that "SV2" relocates to "Node C" and "SV1" relocates to "Node B".

Is this the case? If so, can documentation be provided to illustrate this example? In essence, the desired function is to define a service as exclusive while also marking it as more critical than another service. When a critical exclusive service needs to relocate to a node that is running a less critical service, the less critical service would either relocate to another node in the cluster or disable itself, so that the critical service can start on the failover node.

Comment 4 Lon Hohberger 2013-09-17 15:40:31 UTC
rgmanager does not run iterative service placement in the event of failures; it lacks the dependency engine processing required. Consider the scenario from the description:

* Given a 3 node cluster:  "Node A", "Node B", "Node C"
* Node "A" and Node "B" are part of failover domain "FD1"
* Node "B" and Node "C" are part of failover domain "FD2"
* SV1 is in FD1 and running on Node A
* SV2 is in FD2 and running on Node B

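If A fails, SV1 will not recover. Node B, the only remaining member of FD1, is already running the exclusive service SV2, and rgmanager does not build a list of related actions (such as first relocating SV2 to Node C) to take when a service fails.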
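
To make the outcome concrete, here is a minimal Python sketch of single-pass recovery for this scenario. It is an illustrative model only, not rgmanager code: the names DOMAINS, SERVICES, candidates and recover_single_pass are hypothetical, and the exclusivity rule is simplified to "an exclusive service needs an empty node, and a node hosting an exclusive service accepts nothing else".

```python
# Illustrative model only; this is not rgmanager code.
# All names below are hypothetical and chosen for this example.
DOMAINS = {"FD1": ["A", "B"], "FD2": ["B", "C"]}
SERVICES = {
    "SV1": {"domain": "FD1", "exclusive": True},
    "SV2": {"domain": "FD2", "exclusive": True},
}


def candidates(service, running, online):
    """Return online nodes in the service's domain that can host it right now."""
    spec = SERVICES[service]
    result = []
    for node in DOMAINS[spec["domain"]]:
        if node not in online:
            continue
        others = [s for s, n in running.items() if n == node and s != service]
        # Assumed rule: an exclusive service needs an empty node ...
        if spec["exclusive"] and others:
            continue
        # ... and a node hosting an exclusive service accepts nothing else.
        if any(SERVICES[s]["exclusive"] for s in others):
            continue
        result.append(node)
    return result


def recover_single_pass(failed_node, running, online):
    """Try once to restart each service from the failed node; move nothing else."""
    online = online - {failed_node}
    new_running = {s: n for s, n in running.items() if n != failed_node}
    stopped = []
    for service, node in running.items():
        if node != failed_node:
            continue
        nodes = candidates(service, new_running, online)
        if nodes:
            new_running[service] = nodes[0]
        else:
            # No eligible node, and no attempt is made to relocate other services.
            stopped.append(service)
    return new_running, stopped


running, stopped = recover_single_pass(
    "A", {"SV1": "A", "SV2": "B"}, {"A", "B", "C"}
)
print(running)  # {'SV2': 'B'}
print(stopped)  # ['SV1']
```

In this model SV1 stays stopped because nothing relocates SV2 to Node C first. Getting the outcome the reporter expected would require planning two related moves together, which is the multi-action ability described as missing.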

There was code upstream at one point to give this sort of multi-action ability to rgmanager, but it was removed long ago:

https://git.fedorahosted.org/cgit/cluster.git/commit/?h=STABLE32&id=52685e0f57063583d6af537d1ea2e1539e437372

Instead, the Pacemaker Cluster Resource manager will provide functionality similar to this.

Comment 6 Lon Hohberger 2016-01-05 19:01:10 UTC
Cluster Suite is no longer supported.

