1. Proposed title of this feature request
A way to implement a more dynamic fence delay so that, in the event of a network split, the cluster will fence the node running the fewest resources.
2. Who is the customer behind the request?
Account: EBS Financial Technologies Ltd (5695203)
TAM customer: no
SRM customer: no
3. What is the nature and description of the request?
The customer has a fence delay as described in https://access.redhat.com/solutions/54829 to avoid fence races.
The customer described a scenario in which a network split results in the fencing of the node hosting the majority of cluster resources. This happens when those resources are running on the node whose stonith resource is not configured with a delay. We discussed that the delay attribute currently must be configured ahead of time, so the winner of a fence race is decided in advance. The customer has requested that we file an RFE to look into ways for the delay to apply to the node that is hosting the majority of resources, to minimize the impact of a fence event that results from a network split.
4. Why does the customer need this? (List the business requirements here)
The goal would be to minimize disruption caused by fence events.
5. How would the customer like to achieve this? (List the functional requirements here)
We discussed this on the phone. For now their team is interested in setting up a cron job that will check where resources are running, check that the node running them has a stonith delay set, and set one if it does not. For the RFE, it may be good to have this as an option somewhere in the cluster configuration so that we are not relying on a custom cron job.
6. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
Once a solution is devised, a series of network failure tests with varying cluster resource placement should confirm whether the solution works.
7. Is there already an existing RFE upstream or in Red Hat Bugzilla?
I was unable to locate one.
8. Does the customer have any specific time-line dependencies and which release would they like to target?
The customer is running RHEL 7. I'm filing this for RHEL 8 because RHEL 7 is in Maintenance Support 1.
9. Is the sales team involved in this request and do they have any additional input?
10. List any affected packages or components.
pcs, pacemaker, fence-agents
11. Would the customer be able to assist in testing this functionality if implemented?
Coincidentally, there was a discussion upstream about this topic recently. The proposed solution design is:
* Users would use the existing "priority" resource meta-attribute to weight each resource.
* A new cluster property "priority-fencing-delay" would set the specified delay on all cluster-initiated fencing targeting the node with the highest cumulative priority of all resources active on it. (It would not apply to manual fencing via stonith_admin, or externally initiated fencing by something like DLM.)
For a simple resource count as described here, it would be sufficient to give every resource a priority of 1. Of course if some resources are more important, they could be given higher priority.
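To illustrate the "cumulative priority" idea in the proposal, here is a minimal Python sketch. It is not Pacemaker code: the function name and the (resource, node, priority) tuple shape are hypothetical and exist only for this example.

```python
from collections import defaultdict


def cumulative_priority(active_resources):
    """Sum the "priority" of every resource active on each node.

    active_resources: iterable of (resource_name, node_name, priority)
    tuples. This input shape is an assumption made for illustration,
    not a Pacemaker data structure.
    """
    totals = defaultdict(int)
    for _name, node, priority in active_resources:
        totals[node] += priority
    return dict(totals)


# Every resource is weighted 1, so each total is simply a resource count.
placement = [
    ("vip", "node1", 1),
    ("db", "node1", 1),
    ("web", "node2", 1),
]
print(cumulative_priority(placement))  # {'node1': 2, 'node2': 1}
```

With equal weights node1 has the higher total. Giving "web" a priority of 5 instead would make node2 the higher-priority node even though it runs fewer resources.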
This has been fixed upstream as of commit 3dec930.
A new cluster property, priority-fencing-delay, will default to -1 (meaning disabled) and can be set to a time duration.
This is really only useful for 2-node clusters. If the new property is set, and the "priority" meta-attribute is configured for at least one resource, then in a split-brain situation, the node with the highest combined priority of all resources running on it will be more likely to survive.
As an example, with:
pcs resource defaults priority=1
pcs property set priority-fencing-delay=15s
and no other priorities, then the node running the most resources will be more likely to survive, because the other node will wait 15 seconds before initiating fencing. If a particular resource is more important than the rest, you can give it a higher priority.
The node running the master role of a promotable clone will get an extra 1 point, if a priority has been configured for that clone.
If both nodes have equal priority, or fencing is needed for some reason other than node loss (e.g. on-fail=fencing for some monitor), then the usual delay properties apply (pcmk_delay_base, etc.). Otherwise priority-fencing-delay takes precedence over other delay properties.
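The rules in this comment can be summarized in a short Python sketch. It only models the description given here and is not Pacemaker's implementation. The function name, its arguments, and the use of plain seconds for the delay are assumptions made for illustration.

```python
def priority_delayed_target(node_priority, priority_fencing_delay, node_lost=True):
    """Return the node whose fencing is delayed by priority-fencing-delay.

    Returns None when the usual delay properties (pcmk_delay_base, etc.)
    apply instead.

    node_priority: dict of node name -> combined resource priority
        (hypothetical input; any promoted-role bonus is already included).
    priority_fencing_delay: delay in seconds; a negative value means disabled.
    node_lost: False when fencing is needed for another reason,
        e.g. on-fail=fencing for some monitor.
    """
    if priority_fencing_delay < 0:
        return None  # property left at its default of -1 (disabled)
    if not node_lost:
        return None  # not a node-loss situation
    if not node_priority:
        return None
    highest = max(node_priority.values())
    leaders = [node for node, score in node_priority.items() if score == highest]
    if len(leaders) != 1:
        return None  # equal priority: usual delay properties apply
    return leaders[0]


# node1 runs the promoted instance of a clone with priority 1: 1 + 1 bonus = 2.
# node2 runs an unpromoted instance of the same clone: 1.
print(priority_delayed_target({"node1": 2, "node2": 1}, 15))  # node1
print(priority_delayed_target({"node1": 1, "node2": 1}, 15))  # None
```

In the first call, fencing that targets node1 is the one delayed, so node1 is more likely to survive the split. In the second, neither node has the higher priority, so the sketch falls back to the usual delay properties.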
qa_ack+, feature described in comment#2
The Doc Text has been updated with the final behavior, which is slightly different from the description in Comment 2
We might be able to get this into RHEL 7.9 as well, which will be tracked as Bug 1840407
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.