Red Hat Bugzilla – Bug 1290850
High Availability "start-failure-is-fatal" definition is unclear
Last modified: 2016-11-07 16:35:26 EST
Document URL: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/ch-clusteropts-HAAR.html
Section Number and Name: 10.1. Summary of Cluster Properties and Options
Describe the issue: The definition of start-failure-is-fatal is unclear about what "fatal" means.
Suggestions for improvement: Indicates whether a failure to start a resource on a particular node prevents further start attempts on that node. If FALSE, the cluster will decide whether to try starting on the same node again based on the resource's current failure count and migration-threshold.
Additional information: Also, section "7.2. Moving Resources Due to Failure" says: "Start failures cause the failcount to be set to INFINITY and thus always cause the resource to move immediately." That should begin, "If the cluster property start-failure-is-fatal is set to true (which is the default), ...".
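To illustrate the proposed wording, here is a minimal Python sketch of the decision it describes. It is a simplified model and not Pacemaker's actual implementation. The function names, and the use of `math.inf` as a stand-in for INFINITY, are assumptions made for illustration only.

```python
import math

# Stand-in for Pacemaker's INFINITY score (an assumption of this sketch).
INFINITY = math.inf


def failcount_after_start_failure(failcount, start_failure_is_fatal=True):
    """Return the resource's failcount on a node after a failed start.

    With start-failure-is-fatal set to true (the default), a start failure
    sets the failcount to INFINITY. Otherwise it is counted like any other
    failure and the failcount goes up by one.
    """
    if start_failure_is_fatal:
        return INFINITY
    return failcount + 1


def may_retry_on_same_node(failcount, migration_threshold):
    """Return True while the failcount is still below migration-threshold."""
    return failcount < migration_threshold


# Hypothetical example: migration-threshold of 3, first start failure.
fatal = failcount_after_start_failure(0, start_failure_is_fatal=True)
non_fatal = failcount_after_start_failure(0, start_failure_is_fatal=False)
print(may_retry_on_same_node(fatal, 3))      # no further start attempts on this node
print(may_retry_on_same_node(non_fatal, 3))  # another attempt on this node is allowed
```

In this model, "fatal" means only that the node is ruled out for further start attempts of that resource. It says nothing about the state of the node or of the cluster as a whole.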
When this is updated and reviewed, we need to make the same changes to the RHEL 6 Pacemaker document as well. (I'll either clone this BZ or add this as a topic to the more general RHEL 6.8 BZ on general backport of RHEL 7 updates to RHEL 6.)
I've made these changes to the draft of the document. When I next build the document I will send them for review, after which I will make the same changes to the RHEL 6 document.
The changes noted above are now in the draft of the document for both RHEL 7 and RHEL 6.8.