+++ This bug was initially created as a clone of Bug #2209433 +++

Description of problem:

Monitor and stop operations for the "ocf:heartbeat:Delay" resource fail with default settings (out-of-the-box config). This is because the default "mondelay" and "stopdelay" values are exactly the same as the timeout periods for monitor and stop operations in the cluster.

Version-Release number of selected component (if applicable):

# rpm -q resource-agents kernel
resource-agents-4.9.0-29.el8_7.3.x86_64
kernel-4.18.0-425.3.1.el8.x86_64

How reproducible:

Monitor operations appear to fail 100% of the time. I have had a couple of successful stop operations, but most fail with the default configuration.

Steps to Reproduce:

1. Created a resource with default settings (no additional options), and disabled it to run "debug-<operation>" tests:
~~~
[root@rhel8-node1 ~]# pcs resource create test-delay Delay
Assumed agent name 'ocf:heartbeat:Delay' (deduced from 'Delay')
[root@rhel8-node1 ~]# pcs resource disable test-delay
~~~

2. Start operation is successful with default settings:
~~~
[root@rhel8-node2 ~]# pcs resource debug-start test-delay
Operation force-start for test-delay (ocf:heartbeat:Delay) returned 0 (ok)
~~~

3. Monitor operation times out with default settings:
~~~
[root@rhel8-node1 ~]# pcs resource debug-monitor test-delay
Operation force-check for test-delay (ocf:heartbeat:Delay) could not be executed (Timed Out: Resource agent did not exit within specified timeout)
crm_resource: Error performing operation: Error occurred
~~~

4. Stop operations time out with default settings:
~~~
# Can only be run after a "debug-start" to start the resource.
# Otherwise reports as already down:
[root@rhel8-node2 ~]# pcs resource debug-stop test-delay
Operation force-stop for test-delay (ocf:heartbeat:Delay) could not be executed (Timed Out: Resource agent did not exit within specified timeout)
crm_resource: Error performing operation: Error occurred
~~~

- The current default monitor and stop delay times in the RA match the default timeout periods for "monitor" and "stop" operations:
~~~
[root@rhel8-node1 ~]# rpm -q resource-agents kernel
resource-agents-4.9.0-29.el8_7.3.x86_64
~~~
~~~
$ vim /usr/lib/ocf/resource.d/heartbeat/Delay
----------------------->8--------------------------
OCF_RESKEY_startdelay_default="20"
OCF_RESKEY_stopdelay_default="30"
OCF_RESKEY_mondelay_default="30"

: ${OCF_RESKEY_startdelay=${OCF_RESKEY_startdelay_default}}
: ${OCF_RESKEY_stopdelay=${OCF_RESKEY_stopdelay_default}}
: ${OCF_RESKEY_mondelay=${OCF_RESKEY_mondelay_default}}
~~~
~~~
[root@rhel8-node2 ~]# pcs config show
----------------------->8--------------------------
Resource: test-delay (class=ocf provider=heartbeat type=Delay)
  Attributes: test-delay-instance_attributes
    mondelay=10
  Operations:
    monitor: test-delay-monitor-interval-10s
      interval=10s
      timeout=30s   <---
    start: test-delay-start-interval-0s
      interval=0s
      timeout=30s   <---
    stop: test-delay-stop-interval-0s
      interval=0s
      timeout=30s   <---
~~~

So we should probably reduce the default delay in the resource agent for both of these operations. Otherwise they will fail out of the box.

Actual results:

Start operations are successful.
Monitor operations timed out.
Stop operations timed out.

Expected results:

All operations (start, stop, monitor) should be successful with a default configuration.

Additional info:

- Shane Bradley++ has pointed out that the resource description is also incorrect. The stop and monitor delays do not default to the start delay.
Not sure if this should be updated here or in another Bugzilla:
~~~
[root@rhel8-node1 ~]# pcs resource describe Delay
Assumed agent name 'ocf:heartbeat:Delay' (deduced from 'Delay')
ocf:heartbeat:Delay - Waits for a defined timespan

This script is a test resource for introducing delay.

Resource options:
  startdelay: How long in seconds to delay on start operation.
  stopdelay: How long in seconds to delay on stop operation. Defaults to "startdelay" if unspecified.   <---
  mondelay: How long in seconds to delay on monitor operation. Defaults to "startdelay" if unspecified.   <---
~~~

- Both the description discrepancy and this issue were likely introduced in this change, which set the default values for stopdelay and mondelay to 30s:
https://github.com/ClusterLabs/resource-agents/commit/baa4cdf6afb9df801d40895f2a9ffcf7d2c8fdae
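
- To make the mismatch explicit, here is a minimal sketch that compares the agent's default delays against the 30s operation timeouts shown above. The function name "delay_fits_timeout" is hypothetical and is not part of resource-agents or pcs. It assumes the delay must be strictly shorter than the timeout, since the agent needs some time beyond the sleep itself to exit:

```shell
#!/bin/sh
# Hypothetical helper (not part of resource-agents or pcs).
# Reports whether an agent delay leaves any headroom under the
# operation timeout. Both arguments are whole seconds.
delay_fits_timeout() {
    delay="$1"
    timeout="$2"
    if [ "$delay" -lt "$timeout" ]; then
        echo "ok: ${delay}s delay finishes within ${timeout}s timeout"
    else
        echo "fail: ${delay}s delay cannot finish within ${timeout}s timeout"
        return 1
    fi
}

# Values taken from this report: agent defaults vs. the 30s op timeouts.
delay_fits_timeout 20 30 || true   # start:   startdelay default
delay_fits_timeout 30 30 || true   # stop:    stopdelay default
delay_fits_timeout 30 30 || true   # monitor: mondelay default
```

As a possible workaround until the defaults change, explicit delays could be set with something like "pcs resource update test-delay stopdelay=10 mondelay=10", or the operation timeouts raised with "pcs resource update test-delay op monitor timeout=60s". The values here are illustrative only and have not been verified on this cluster.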