Bug 1707069 - A `ping` resource does not use the default timeout value when an operation (monitor, start) is not declared or set
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pacemaker
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: pre-dev-freeze
Target Release: ---
Assignee: Ken Gaillot
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-06 17:53 UTC by Shane Bradley
Modified: 2023-08-10 15:40 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Feature Request
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1831953 0 unspecified NEW [RFE] 'pcs resource create' does not copy 'op timeout' value from resource agent's metadata to CIB if 'op interval' is s... 2023-08-10 15:41:36 UTC
Red Hat Knowledge Base (Solution) 4109851 0 None None None 2019-05-06 17:53:51 UTC

Internal Links: 1831953

Comment 2 Ken Gaillot 2019-05-08 18:15:55 UTC
It's definitely confusing, but pacemaker ignores the timeouts listed in a resource agent's meta-data.

The timeouts (and intervals) in the meta-data are "hints" to the user (and to UIs such as pcs) as to what's a reasonable value to use. Suitable values are expected to vary by deployment, so the user should test and adjust them.

By contrast, pacemaker uses (in order of preference): any timeout explicitly set in the operation configuration in the CIB; any timeout set in op_defaults in the CIB (i.e. pcs resource op defaults); or 20 seconds.
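The order of preference above can be illustrated with a minimal Python sketch. This is not pacemaker code: the function name and parameters are hypothetical, and it only models the three-step fallback described in this comment (operation timeout, then op_defaults, then 20 seconds). Note that the agent's meta-data hint is not an input at all.

```python
from typing import Optional

# Built-in fallback stated in the comment above (seconds).
FALLBACK_TIMEOUT_S = 20


def effective_timeout(op_timeout: Optional[int],
                      op_defaults_timeout: Optional[int]) -> int:
    """Return the timeout (seconds) that would apply to an operation.

    Hypothetical helper for illustration only:
      op_timeout          -- timeout set explicitly on the operation in the CIB
      op_defaults_timeout -- timeout set in op_defaults in the CIB
    Resource agent meta-data is deliberately not consulted.
    """
    if op_timeout is not None:
        return op_timeout
    if op_defaults_timeout is not None:
        return op_defaults_timeout
    return FALLBACK_TIMEOUT_S


if __name__ == "__main__":
    print(effective_timeout(30, 60))      # explicit operation timeout wins
    print(effective_timeout(None, 60))    # op_defaults applies
    print(effective_timeout(None, None))  # neither is set, so the fallback applies
```

Under this model, a `ping` resource created without a monitor or start timeout ends up with the 20-second fallback regardless of what the agent advertises.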

Partly this is imposed by pacemaker's scheduling model -- the scheduler only has access to the CIB, which currently doesn't include agent meta-data. We could consider having pacemaker save agent meta-data hints in the CIB, but that could hurt scalability in clusters with many different resource types. There's also a problem with different versions of the same agent installed on different nodes -- we'd potentially need to store the hints per node, which would be even worse for scalability.

This would be considered a new feature, so it would be RHEL 8 only.

(BTW, the ocf:pacemaker: agents are part of the pacemaker component, not resource-agents.)

Comment 4 Ken Gaillot 2021-07-26 15:48:01 UTC
I have thought of an implementation that could scale: instead of feeding agent meta-data to the scheduler, the controller could keep the default timeout values in its meta-data cache, and the scheduler could add a flag to scheduled actions when the timeout is the "default default" (i.e. not explicitly specified in either the action configuration or op_defaults). The controller would then override the scheduler's timeout value with the one from cache when available.
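A rough Python sketch of the idea described above, purely as an illustration and not as pacemaker's actual design or API. All names here (`ScheduledAction`, `timeout_is_fallback`, `schedule`, `controller_timeout`) are hypothetical, and the cached value of 60 seconds is a made-up example rather than a figure taken from any agent.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

FALLBACK_TIMEOUT_S = 20  # the "default default" used when nothing is configured


@dataclass
class ScheduledAction:
    agent: str
    operation: str
    timeout_s: int
    timeout_is_fallback: bool  # hypothetical flag set by the scheduler


def schedule(agent: str, operation: str,
             op_timeout: Optional[int] = None,
             op_defaults_timeout: Optional[int] = None) -> ScheduledAction:
    """Scheduler side: choose a timeout from the CIB and flag the fallback case."""
    for configured in (op_timeout, op_defaults_timeout):
        if configured is not None:
            return ScheduledAction(agent, operation, configured, False)
    return ScheduledAction(agent, operation, FALLBACK_TIMEOUT_S, True)


def controller_timeout(action: ScheduledAction,
                       metadata_cache: Dict[Tuple[str, str], int]) -> int:
    """Controller side: override only flagged timeouts, and only on a cache hit."""
    if action.timeout_is_fallback:
        return metadata_cache.get((action.agent, action.operation),
                                  action.timeout_s)
    return action.timeout_s


if __name__ == "__main__":
    # Hypothetical cache contents; 60 is an illustrative value only.
    cache = {("ocf:pacemaker:ping", "monitor"): 60}
    flagged = schedule("ocf:pacemaker:ping", "monitor")
    print(controller_timeout(flagged, cache))
```

The point of the flag is that the scheduler never needs to see agent meta-data: it only reports that no timeout was configured, and the controller, which already caches meta-data, decides whether a hint is available to substitute.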

Unfortunately there is still a large backlog, so I would not expect a fix in the next couple of point releases.

