Bug 2062359

Summary: [RFE] Additional configurable failure recovery options for pacemaker managed resources
Product: Red Hat Enterprise Linux 9 Reporter: Shane Bradley <sbradley>
Component: pacemaker Assignee: Ken Gaillot <kgaillot>
Status: NEW --- QA Contact: cluster-qe <cluster-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 9.0 CC: cluster-maint, nwahl, sbradley
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Feature Request
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Shane Bradley 2022-03-09 15:56:54 UTC
Description of problem:
Requesting additional configurable failure recovery options for pacemaker managed resources.

For example, a customer requested:
  "RFE to add something like a retry and/or retry_attempts option for pacemaker 
   resource monitor operations."


Version-Release number of selected component (if applicable):
Latest 8.5 pacemaker

How reproducible:
Does not apply

Steps to Reproduce:
Does not apply

Actual results:
Currently, a monitor failure of a resource causes Pacemaker to take the action set by the operation's "on-fail" option (restart, ignore, fence, etc.).

Expected results:
Provide more options for handling resource monitor failures in Pacemaker, such as "retry X times before considering the resource monitor a failure".
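
As a rough illustration of the requested semantics only (this is not Pacemaker code, and it does not reflect any existing Pacemaker option), the sketch below treats a monitor as failed only after the first check and every retry have failed. The names monitor_with_retries, probe, and retry_attempts are hypothetical and were chosen to mirror the wording of the customer request.

```python
from typing import Callable


def monitor_with_retries(probe: Callable[[], bool], retry_attempts: int) -> bool:
    """Return True if the resource should be considered healthy.

    probe is a hypothetical callable that runs one monitor check and
    returns True on success. retry_attempts is the hypothetical number of
    extra checks allowed after the first one fails.
    """
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")
    # One initial check plus up to retry_attempts retries.
    for _ in range(1 + retry_attempts):
        if probe():
            return True
    # Every attempt failed: only now would the "on-fail" action apply.
    return False


# Example: two transient failures followed by a success.
results = iter([False, False, True])
print(monitor_with_retries(lambda: next(results), retry_attempts=2))  # True
```

With retry_attempts=0 the sketch reduces to the current behavior described under "Actual results": a single failed monitor is treated as a failure.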

Additional info:

We spoke with engineering about this issue, and they stated that some other Bugzilla reports are related to this RFE:

  - 1747559 – Allow operation failure timeouts to be configured per operation in Pacemaker 
    https://bugzilla.redhat.com/show_bug.cgi?id=1747559

  - 1328448 – RFE: start-failure-is-fatal as per-resource parameter instead of global property 
    https://bugzilla.redhat.com/show_bug.cgi?id=1328448