Bug 1371576

Summary: RFE: Add equivalent functionality of __independent_subtree="2" and __max_restarts="n" to pacemaker (Critical Resources)
Product: Red Hat Enterprise Linux 8
Reporter: Josef Zimek <pzimek>
Component: pacemaker
Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Docs Contact: Steven J. Levine <slevine>
Priority: high    
Version: 8.0
CC: cfeist, christianelwin.romein, cluster-maint, ctowsley, jruemker, kgaillot, ldelouw, mnovacek, msmazova, redhat-bugzilla, robert.scheck
Target Milestone: pre-dev-freeze
Keywords: FutureFeature, Triaged
Target Release: 8.4   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: pacemaker-2.0.5-5.el8
Doc Type: Enhancement
Doc Text:
.Noncritical resources in colocation constraints are now supported

With this enhancement, you can configure a colocation constraint such that if the dependent resource of the constraint reaches its migration threshold for failures, Pacemaker will leave that resource offline and keep the primary resource on its current node rather than attempting to move both resources to another node.

To support this behavior, colocation constraints now have an `influence` option, which can be set to `true` or `false`, and resources have a `critical` meta-attribute, which can also be set to `true` or `false`. The value of the `critical` resource meta option determines the default value of the `influence` option for all colocation constraints involving the resource as a dependent resource.

When the `influence` colocation constraint option has a value of `true`, Pacemaker will attempt to keep both the primary and dependent resources active. If the dependent resource reaches its migration threshold for failures, both resources will move to another node, if possible. When the `influence` colocation option has a value of `false`, Pacemaker will avoid moving the primary resource as a result of the status of the dependent resource. In this case, if the dependent resource reaches its migration threshold for failures, it will stop if the primary resource is active and can remain on its current node.

By default, the value of the `critical` resource meta option is set to `true`, which in turn determines that the default value of the `influence` option is `true`. This preserves the previous behavior in which Pacemaker attempted to keep both resources active.
Story Points: ---
Clone Of:
: 1482621 (view as bug list)
Last Closed: 2021-05-18 15:26:41 UTC
Type: Feature Request
Bug Blocks: 1420851, 1482621, 1546815, 1672748, 1679810, 1894575, 1916011    

Description Josef Zimek 2016-08-30 13:58:54 UTC
Description of problem:


Rgmanager has the possibility to set the following parameters on a resource:

__independent_subtree="2" 
__max_restarts="n"

Some customers require the same in pacemaker-based clusters. So far we have not been able to achieve this with existing pacemaker parameters.

The aim is to make a resource non-critical, allowing the cluster to restart it locally a maximum of n times. If all the local restarts fail, the resource should simply be left offline without causing a service switch.


How do I make a RHEL HA resource non-critical? 
https://access.redhat.com/solutions/30187


The customer needs a non-critical resource to be disabled after failure, but before that they expect the resource to be restarted 'n' times. In rgmanager this functionality was provided by the max_restarts option along with independent_subtree=2.

The nearest equivalent in pacemaker is migration-threshold, but it requires on-fail to be set to restart in order to work. With on-fail set to restart, after the resource has been restarted 'n' times on the node where it failed, it is relocated to the other node along with the entire service group, which the customer does not want. The service should not be relocated; only the resource should be kept disabled on the node where it was previously running.


The pacemaker options below do not mention any possibility of restarting the resource before disabling it.

===
pcs resource create <name> op monitor on-fail=ignore stop on-fail=stop
or
pcs resource update <name> op monitor on-fail=ignore stop on-fail=stop
===

Tests done so far:

I tried adding a non-critical resource (VirtualIP in the test environment) outside the service group, setting its migration-threshold to 3 along with on-fail=restart. In addition, I set one colocation constraint stating that the non-critical resource should run where the entire service runs. For example, node1 runs the service and this non-critical resource; I bring the ethernet interface down, and on identifying the failure pacemaker restarts the non-critical resource three times. I expected the non-critical resource to fail and stay put after attempting three restarts on the node where it was running (node1), since the service group is running on node1, but after three restarts the non-critical resource still moved to the other node along with the entire service group. This means the colocation constraint works in the reverse direction as well (run the entire service group in colocation with the non-critical resource).
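A minimal sketch of the kind of setup described above, using illustrative resource, group, and address names (not taken from the actual environment):

    pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.122.100 cidr_netmask=24 \
        op monitor interval=10s on-fail=restart meta migration-threshold=3
    pcs constraint colocation add VirtualIP with myservicegroup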

Is there any option we can suggest to the customer to achieve the max_restarts behaviour of independent_subtree?

// Additional Testing Done //

We tried removing the group, so that all the resources were outside the group. We also set start-failure-is-fatal (to false) so that migration-threshold would be honoured, since when the monitor fails the cluster attempts a stop and start on the resource, and when the start operation fails it is considered fatal and the resource relocates without honouring the migration-threshold.

We noticed that when migration-threshold is set for the monitor operation with on-fail=restart, when the first monitor operation for the resource fails, the resource is stopped and started, which is the restart operation. This is done three times, as migration-threshold is set to 3, after which the resource gets relocated to the other node.

So we tried setting on-fail=stop on the start operation for the resource. When the first monitor operation fails, the cluster restarts the resource, which is a stop and start of the resource; the start then fails due to the unavailability of the IP NIC, and because on-fail is set to stop for the start operation, the resource is stopped on the node. Hence the migration-threshold on the monitor operation is not honoured.

We thought of setting migration-threshold on the start operation instead of the monitor operation, but it would require on-fail=restart for the start operation, and hence when the start operation fails three times it restarts, and after three failed start attempts it would still relocate to the other node; there is no way to stop the resource on the node where it was running.

Comment 3 Ken Gaillot 2016-08-31 17:43:59 UTC
I'm thinking we could potentially replace migration-threshold and on-fail with:


hard-fail-threshold: Until this count is reached, the cluster will consider an operation failure to be "soft" and try to recover the operation on the same node. After this many failures, the cluster will consider a failure to be "hard", and recover according to hard-fail-action. For monitor operations, a soft recovery is a stop+start; for all other operations, a soft recovery is repeating the operation.

hard-fail-action: What to do when the operation reaches hard-fail-threshold. "ban" would work like current "restart" (i.e. move the resource to another node), and ignore/block/stop/standby/fence would work the same as on-fail now.


The difference is that hard-fail-threshold would apply regardless of hard-fail-action (as opposed to just on-fail=restart currently). Also, hard-fail-threshold would probably be an operation property rather than a resource property. The options have new names to better reflect the new behavior. With these, you could achieve your goal with op monitor hard-fail-threshold=3 hard-fail-action=ignore (or block or stop).
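Purely as an illustration of this proposed syntax (these option names come from the proposal only and were never part of any released Pacemaker), the suggestion above might look roughly like:

    # hypothetical, proposal-only option names -- not valid in any released version
    pcs resource update <name> op monitor interval=10s hard-fail-threshold=3 hard-fail-action=ignore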

We could possibly use this to replace start-failure-is-fatal as well; hard-fail-threshold could default to 1 for start and stop, and INFINITY for all other operations. That would have the additional benefit of allowing it to be set per resource (Bug 1328448).

The old options would be deprecated, but would retain their behavior as closely as possible.

Comment 4 Andrew Beekhof 2016-09-05 03:00:26 UTC
(In reply to Josef Zimek from comment #0)
> The aim is to make a resource non critical, allowing the cluster to restart
> it locally for a maximum of n times. If all the local restart fails the
> resource should be simply left offline without causing a service switch.

Why would you not allow one of the other nodes to attempt to host the service?
Do any other services depend on this one?

It doesn't sound too different from a service with no dependants, migration-threshold=n, and a location constraint that fixes it to a particular node.

Comment 5 Sridhar 2016-09-06 16:56:36 UTC
(In reply to Andrew Beekhof from comment #4)
> (In reply to Josef Zimek from comment #0)
> > The aim is to make a resource non critical, allowing the cluster to restart
> > it locally for a maximum of n times. If all the local restart fails the
> > resource should be simply left offline without causing a service switch.
> 
> Why would you not allow one of the other nodes to attempt to host the
> service?
> Do any other services depend on this one?
> 
> It doesn't so too different from a service with no dependants,
> migration-threshold=n and a location constraint that fixes it to a
> particular node.

Hello Andrew,

By setting a location constraint we can restrict the resource from running on the other node, but the problem is that we don't know in advance which node is going to run this service group.

For example, if node1 is running the service and we set migration-threshold on the non-critical resource in the service, with a location constraint stating that it should only run on node1, then yes, I expect the resource to be restarted 'n' times on failure and then, due to the location constraint, to be stopped on node1, since it is not allowed to run on the other node.
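A rough sketch of that approach, with illustrative resource and node names (the location constraint pins the non-critical resource to node1 by banning node2):

    pcs resource update VirtualIP meta migration-threshold=3
    pcs constraint location VirtualIP avoids node2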

But in the scenario where node1 fails to run the service, the entire service should fail over to node2. That will be prevented, because the location constraint set on the non-critical resource bans node2, and hence the entire service will refuse to start on that node.

While starting the service, even though the resource is non-critical, the cluster property start-failure-is-fatal=true means that a start failure of the non-critical resource in the group will lead to a relocation of the entire service group back to node1, which is already down or rebooting at this point.

So I believe it adds a lot of additional complexity. Is there anything else we can try to achieve this behaviour?

Comment 7 Andrew Beekhof 2016-09-07 02:44:51 UTC
Ok, I think I understand now

Comment 9 Ken Gaillot 2016-10-05 20:34:17 UTC
After discussion upstream, the tentative plan is to replace the "on-fail", "migration-threshold" and "start-failure-is-fatal" options with a new design:

Each operation would take new options max-fail-ignore, max-fail-restart, and fail-escalation.

Fail counts would be tracked per operation (not per-resource as currently). That is, when determining how to handle a failure, Pacemaker would consider only how many times this particular operation has failed for this resource on this node, not how many times *any* operation has failed for this resource on this node.

The first "max-fail-ignore" failures would be reported but ignored.

Once max-fail-ignore failures occurred, the next "max-fail-restart" failures would be handled by attempting to restart the resource.

Once that threshold was reached, the handling specified by fail-escalation would be taken. This would accept the current on-fail values, except "restart", with the addition of "ban" to force the resource off the node with the failures.

Defaults would mimic current default behavior: max-fail-ignore would default to 0, max-fail-restart would default to 0 for stop and start and INFINITY for other operations, and fail-escalation would default to block or fence for stop and ban for other operations.

Examples of how current options would translate to the new ones:

 on-fail=ignore -> max-fail-ignore=INFINITY
 on-fail=restart migration-threshold=3 -> max-fail-restart=3

The original example of attempting 'n' restarts then leaving the resource stopped would be:

 max-fail-ignore='n' fail-escalation=stop

The details are still up for discussion, if any changes are desired.

Comment 10 Sridhar 2016-10-06 17:34:51 UTC
Hello Ken,

Thanks for making us aware of the new options that are under discussion. I have one query about your update, though.

>>> The first "max-fail-ignore" failures would be reported but ignored.

>>> The original example of attempting 'n' restarts then leaving the resource stopped would be:

 max-fail-ignore='n' fail-escalation=stop

So of the above two statements, the first states that "max-fail-ignore" failures would be ignored, not that the resource would be restarted.

But in the second statement you mention max-fail-ignore='n' as the option for attempting 'n' restarts, which contradicts the first statement.

Please correct me if I am wrong in my understanding.

Comment 11 Ken Gaillot 2016-10-06 18:37:59 UTC
> The original example of attempting 'n' restarts then leaving the resource stopped would be:
>
>  max-fail-ignore='n' fail-escalation=stop

Whoops, typo. That should be:

The original example of attempting 'n' restarts then leaving the resource stopped would be:

 max-fail-restart='n' fail-escalation=stop

Comment 12 Sridhar 2016-10-06 18:40:07 UTC
Thanks Ken. That makes it clear.

Comment 15 Ken Gaillot 2017-03-15 17:20:30 UTC
While a serious effort was made at implementing this, and a substantial amount of prerequisite work has been integrated upstream, the user-visible portion will not be ready in the 7.4 timeframe, so I am pushing this back to 7.5.

The part that will be included in 7.4 is per-operation failure tracking. This will be mostly invisible to the end user, but instead of the previous per-resource fail count attributes such as fail-count-httpd, there will be a fail count attribute for each operation and recurring interval, such as fail-count-httpd#start_0 and fail-count-httpd#monitor_20000. The cluster will still use only the per-resource fail count (by summing the individual operation fail counts), but the per-operation counts will be visible via tools such as crm_resource --cleanup and crm_failcount. Bug 1427273 will expose the equivalent functionality in pcs resource cleanup and pcs resource failcount.
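As a rough illustration (a sketch from memory; the exact option spellings should be verified against the crm_failcount and crm_resource man pages on the installed version), querying and then cleaning a single operation's fail count might look like:

    crm_failcount --query --resource httpd --operation monitor --interval 20s --node node1
    crm_resource --cleanup --resource httpd --operation monitor --interval 20s --node node1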

Comment 17 John Ruemker 2017-08-08 22:06:12 UTC
Ken,
This may already be part of your plan, but I wanted to mention something that occurred to me while considering this today...

In order to avoid breaking any existing setups, I would hope it could be possible to have the new settings translate any value that may already exist in the CIB for the previously-available settings, and to have a period of deprecation for those settings before they go away.  That is:

- start-failure-is-fatal=false would cause op start max-fail-restart to be treated as INFINITY. 

- op * on-fail=<non-restart value> would cause fail-escalation to take that value

- migration-threshold=<non-zero> would cause max-fail-restart= that value

Each one should probably result in a logged message mentioning the deprecation and/or translation when it's checked. 

Do you have something like this in mind, or if not do you think you can include something like that?  If you have an alternate plan that attempts to avoid breaking existing settings, that could be fine instead. 

Side note: planning anything resembling failure-timeout per-op?  

Thanks

-John

Comment 18 Ken Gaillot 2017-08-09 16:46:57 UTC
Definitely, we always strive for full backward compatibility. I haven't decided yet whether to formally deprecate the old options, or just say they provide resource-wide defaults. But either way, there will be an intuitive mapping from the old options to the new (and the new will take precedence if both are specified).

Yes, there will be a per-op "fail-timeout" that will actually be the first new option implemented, since it will be easiest.

Due to the short 7.5 horizon and the upcoming HA summit, I'm afraid this is likely to be bumped to 7.6, though.

Also for completeness, I should mention that the per-operation fail counts mentioned in Comment 15 were held back from 7.4, but are likely to make it into 7.5.

Comment 20 Ken Gaillot 2017-10-09 17:13:31 UTC
Due to time constraints, this will not be completed for 7.5.

Per-operation fail counts will make it into 7.5.

Comment 23 christianelwin.romein 2018-07-25 10:44:32 UTC
Do you have any update on this bug?

When will it be fully implemented?

Thank you in advance !

Comment 24 Chris Feist 2018-07-25 13:20:29 UTC
We are working on it, but it is a complex fix.  Unfortunately we do not have an ETA as to when it will be fixed.

Comment 25 Robert Scheck 2018-07-25 14:17:39 UTC
Chris, it's really sad to read this, given we opened this issue about two
years ago already (including GSS ticket). We would like to hear about the
reasons for "no progress" either here or via GSS.

Comment 26 John Ruemker 2018-08-01 21:03:53 UTC
(In reply to Robert Scheck from comment #25)
> Chris, it's really sad to read this, given we opened this issue about two
> years ago already (including GSS ticket). We would like to hear about the
> reasons for "no progress" either here or via GSS.

Hi Robert, 
Please reach out to your Red Hat Support representatives to discuss the status of this request for enhancement and your needs.  Your support case team will be able to provide you a status update and work with you on a plan for addressing your needs within the product's existing functionality while development on the product continues.  If you have questions or concerns, the support team will be able to assist you with those.  

We need to keep this bugzilla focused on the technical details of this development effort.  

Thanks,
John Ruemker
Principal Software Maintenance Engineer
Red Hat Platform Product Experience for RHEL High Availability

Comment 27 Ken Gaillot 2018-11-19 19:19:51 UTC
Because this will require new configuration syntax, for technical reasons it will only be addressed in RHEL 8.

Comment 29 Ken Gaillot 2019-08-30 20:16:33 UTC
Due to the complexity of this proposed feature, I am separating it into multiple BZs. The currently planned design is somewhat different from earlier comments. In planned order of implementation:

Bug 1747553 - pacemaker will calculate the cluster-recheck-interval dynamically so users don't have to worry about it when configuring a failure timeout. This is not strictly a requirement, but will simplify using failure timeouts and make them more intuitive.

Bug 1747559 - allow failure timeout to be configured per operation as well as per resource. The option might be renamed as part of this change. Bug 1747560 covers the pcs interface.

This BZ - Implement a new colocation constraint option to indicate that the colocated resource should not be considered when placing the colocated-with resource. This usual consideration is why groups move when one member is unable to stay on the current node. With the new option, when the colocated resource reaches its migration-threshold but must stay colocated with another resource, it will be unable to be placed on a node, causing it to stay stopped. In the example use case for this bz, the "noncritical" resource would need to be taken out of the group and instead colocated with the group (and ordered relative to it as desired).
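For the use case in this BZ, that reconfiguration might look roughly like the following (group and resource names are illustrative):

    pcs resource group remove dummy-group noncritical-rsc
    pcs constraint colocation add noncritical-rsc with dummy-group
    pcs constraint order dummy-group then noncritical-rsc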

Bug 1328448 - Allow start-failure-is-fatal to be configurable per resource. The current plan is to implement 2 new operation meta-attributes, failure-restart and failure-escalation, to replace start-failure-is-fatal, migration-threshold, and on-fail (which would still be supported for backward compatibility). The first failure-restart=<N> failures would result in restart attempts, and if all failed, the response in failure-escalation would be taken (equivalent to the current on-fail values, except "restart", and adding "ban" to force the resource off its current node). Thus a start action with failure-restart set to 0 would be equivalent to start-failure-is-fatal="true", and a start action with failure-restart set to a positive number would be equivalent to start-failure-is-fatal="false" with migration-threshold set to that number. Bug 1747563 covers the pcs interface.

Comment 30 Steven J. Levine 2019-08-30 21:43:36 UTC
Ken,

Even with the splitting up into multiple BZs, would it still be reasonable to summarize the feature as a single release note?  Were you planning on closing out this BZ now that it's split like this?  

Steven

Comment 31 Ken Gaillot 2019-09-03 16:25:57 UTC
(In reply to Steven J. Levine from comment #30)
> Ken,
> 
> Even with the splitting up into multiple BZs, would it still be reasonable
> to summarize the feature as a single release note?  Were you planning on
> closing out this BZ now that it's split like this?  
> 
> Steven

All 4 BZs mentioned in Comment 29 are still under development. They're quite independent from the user's perspective, and each is significant enough for its own release note.

Comment 37 Ken Gaillot 2021-01-14 22:10:23 UTC
Feature added upstream as of commit 7ae21be9

Comment 38 Ken Gaillot 2021-01-15 15:28:45 UTC
QA: The interface ended up like this:

A new colocation constraint option "influence", which defaults to "true" to match existing behavior, determines whether the location preferences of the dependent resource affect the placement of the primary resource. influence=true maximizes the chance that both resources will be able to run; if the dependent resource reaches its migration-threshold in failures and wants to move to another node, both resources will move. influence=false minimizes the impact on the primary resource; if the dependent resource wants to move, it will instead have to stop so the primary resource can remain where it is.

(pcs does not yet support influence, so you'll have to create the constraint then edit the CIB XML to add influence="true/false".)
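For example, after creating the constraint normally, the resulting rsc_colocation element in the CIB could be edited (for instance with `pcs cluster edit` or `cibadmin`) to look roughly like this; the ID and resource names here are illustrative:

    <rsc_colocation id="r2-with-r1" rsc="rsc_2" with-rsc="rsc_1" score="INFINITY" influence="false"/>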

A new resource meta-attribute "critical" can be used as a resource-wide default for influence in colocation constraints involving this resource as the dependent resource, as well as the implicit colocation created when the resource is in a group (which is the primary use case desired for this bz).

(critical can be set with pcs since pcs does not limit what meta-attributes can be specified.)
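For example, marking a resource as noncritical across all of its colocations (resource name illustrative):

    pcs resource update rsc_2 meta critical=false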

Comment 56 Markéta Smazová 2021-02-15 18:32:12 UTC
>   [root@virt-175 ~]# rpm -q pacemaker
>   pacemaker-2.0.5-6.el8.x86_64


CASE 1
--------
Testing default behavior - `influence=true` on colocation constraint.

Set up the cluster, create two Dummy resources, and create a colocation constraint: resource rsc_2 is dependent on rsc_1

>   [root@virt-175 ~]# pcs resource create rsc_1 ocf:pacemaker:Dummy
>   [root@virt-175 ~]# pcs resource create rsc_2 ocf:pacemaker:Dummy
>   [root@virt-175 ~]# pcs constraint colocation add rsc_2 with rsc_1 id=r2-with-r1

Update `migration-threshold` to 3 on rsc_2 (the resource may fail only 3 times on a given node, after which it moves 
to another node), and set the cluster property `start-failure-is-fatal` to false:

>   [root@virt-175 ~]# pcs resource update rsc_2 meta migration-threshold=3
>   [root@virt-175 ~]# pcs property set start-failure-is-fatal=false
>   [root@virt-175 ~]# pcs property list
>   Cluster Properties:
>    cluster-infrastructure: corosync
>    cluster-name: STSRHTS6310
>    dc-version: 2.0.5-6.el8-ba59be7122
>    have-watchdog: false
>    last-lrm-refresh: 1613381633
>    start-failure-is-fatal: false

Check colocation constraint setup and resource rsc_2 configuration:

>   [root@virt-175 ~]# pcs constraint colocation show --full
>   Colocation Constraints:
>     rsc_2 with rsc_1 (score:INFINITY) (id:r2-with-r1)

>   [root@virt-175 ~]# pcs resource config rsc_2
>    Resource: rsc_2 (class=ocf provider=pacemaker type=Dummy)
>     Meta Attrs: migration-threshold=3
>     Operations: migrate_from interval=0s timeout=20s (rsc_2-migrate_from-interval-0s)
>                 migrate_to interval=0s timeout=20s (rsc_2-migrate_to-interval-0s)
>                 monitor interval=10s timeout=20s (rsc_2-monitor-interval-10s)
>                 reload interval=0s timeout=20s (rsc_2-reload-interval-0s)
>                 start interval=0s timeout=20s (rsc_2-start-interval-0s)
>                 stop interval=0s timeout=20s (rsc_2-stop-interval-0s)

Resources rsc_1 and rsc_2 are colocated on node virt-175:

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:29:36 2021
>     * Last change:  Mon Feb 15 11:29:29 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>     * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Fail rsc_2 three times on node virt-175:

>   [root@virt-175 ~]# for i in {1..3}; do rm -f /run/Dummy-rsc_2.state; sleep 30; done

After failing three times, rsc_2 moved together with rsc_1 to node virt-176.

>   [root@virt-175 ~]# pcs status --full
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (2) (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:31:07 2021
>     * Last change:  Mon Feb 15 11:29:29 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 (1) virt-176 (2) ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>     * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-176

>   Migration Summary:
>     * Node: virt-175 (1):
>       * rsc_2: migration-threshold=3 fail-count=3 last-failure='Mon Feb 15 11:30:41 2021'

>   Failed Resource Actions:
>     * rsc_2_monitor_10000 on virt-175 'not running' (7): call=1162, status='complete', exitreason='', last-rc-change='2021-02-15 11:30:41 +01:00', queued=0ms, exec=0ms

>   Tickets:

>   PCSD Status:
>     virt-175: Online
>     virt-176: Online

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Refresh the resources and delete the colocation constraint:

>   [root@virt-175 ~]# pcs resource refresh
>   Waiting for 1 reply from the controller
>   ... got reply (done)

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:31:42 2021
>     * Last change:  Mon Feb 15 11:29:29 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>     * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-176

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

>   [root@virt-175 ~]# pcs constraint delete r2-with-r1

CASE 2
--------
Testing new behavior: `influence=false` on colocation constraint.

Same configuration as in CASE 1: two Dummy resources with colocation constraint: Resource rsc_2 is dependent on rsc_1, but
`influence` is set to false on that colocation constraint.

Create the colocation constraint with `influence=false`:

>   [root@virt-175 ~]# pcs cluster cib > cib-original.xml
>   [root@virt-175 ~]# cp cib-original.xml cib-new.xml
>   [root@virt-175 ~]# pcs -f cib-new.xml constraint colocation add rsc_2 with rsc_1 influence=false id=r2-with-r1-inf-false
>   [root@virt-175 ~]# pcs cluster cib-push cib-new.xml diff-against=cib-original.xml
>   CIB updated

Influence setting is now visible in colocation constraint view:

>   [root@virt-175 ~]# pcs constraint colocation show --full
>   Colocation Constraints:
>     rsc_2 with rsc_1 (score:INFINITY) (influence:false) (id:r2-with-r1-inf-false)

Check resource rsc_2 configuration and cluster properties. Resource rsc_2 still has `migration-threshold` set to 3 
and cluster property `start-failure-is-fatal` is set to false.

>   [root@virt-175 ~]# pcs resource config rsc_2
>    Resource: rsc_2 (class=ocf provider=pacemaker type=Dummy)
>     Meta Attrs: migration-threshold=3
>     Operations: migrate_from interval=0s timeout=20s (rsc_2-migrate_from-interval-0s)
>                 migrate_to interval=0s timeout=20s (rsc_2-migrate_to-interval-0s)
>                 monitor interval=10s timeout=20s (rsc_2-monitor-interval-10s)
>                 reload interval=0s timeout=20s (rsc_2-reload-interval-0s)
>                 start interval=0s timeout=20s (rsc_2-start-interval-0s)
>                 stop interval=0s timeout=20s (rsc_2-stop-interval-0s)

>   [root@virt-175 ~]# pcs property list
>   Cluster Properties:
>    cluster-infrastructure: corosync
>    cluster-name: STSRHTS6310
>    dc-version: 2.0.5-6.el8-ba59be7122
>    have-watchdog: false
>    last-lrm-refresh: 1613381633
>    start-failure-is-fatal: false

Resources rsc_1 and rsc_2 are colocated on node virt-176:

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:31:50 2021
>     * Last change:  Mon Feb 15 11:31:47 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>     * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-176

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Fail rsc_2 three times on node virt-176:

>   [root@virt-176 ~]# for i in {1..3}; do rm -f /run/Dummy-rsc_2.state; sleep 30; done

After failing three times, rsc_2 was stopped on node virt-176 instead of moving together with rsc_1 to another node.

>   [root@virt-175 ~]# pcs status --full
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (2) (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:33:22 2021
>     * Last change:  Mon Feb 15 11:31:47 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 (1) virt-176 (2) ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>     * rsc_2	(ocf::pacemaker:Dummy):	 Stopped

>   Migration Summary:
>     * Node: virt-176 (2):
>       * rsc_2: migration-threshold=3 fail-count=3 last-failure='Mon Feb 15 11:33:00 2021'

>   Failed Resource Actions:
>     * rsc_2_monitor_10000 on virt-176 'not running' (7): call=1102, status='complete', exitreason='', last-rc-change='2021-02-15 11:33:00 +01:00', queued=0ms, exec=0ms

>   Tickets:

>   PCSD Status:
>     virt-175: Online
>     virt-176: Online

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

After refreshing the resources, rsc_2 is started again on virt-176:

>   [root@virt-175 ~]# pcs resource refresh
>   Waiting for 1 reply from the controller
>   ... got reply (done)
>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:33:56 2021
>     * Last change:  Mon Feb 15 11:31:47 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 4 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>     * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-176

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Delete the colocation constraint:

>   [root@virt-175 ~]# pcs constraint delete r2-with-r1-inf-false

CASE 3
-------
Testing default behavior (`critical=true`) on a resource that is part of a resource group:

Create another Dummy resource and put all 3 resources into a group:

>   [root@virt-175 ~]# pcs resource create rsc_3 ocf:pacemaker:Dummy
>   [root@virt-175 ~]# pcs resource group add dummy-group rsc_1 rsc_2 rsc_3

Check resource configuration and cluster properties. Resource rsc_2 still has `migration-threshold` set to 3 
and cluster property `start-failure-is-fatal` is set to false.

>   [root@virt-175 ~]# pcs resource config
>    Group: dummy-group
>     Resource: rsc_1 (class=ocf provider=pacemaker type=Dummy)
>      Operations: migrate_from interval=0s timeout=20s (rsc_1-migrate_from-interval-0s)
>                  migrate_to interval=0s timeout=20s (rsc_1-migrate_to-interval-0s)
>                  monitor interval=10s timeout=20s (rsc_1-monitor-interval-10s)
>                  reload interval=0s timeout=20s (rsc_1-reload-interval-0s)
>                  start interval=0s timeout=20s (rsc_1-start-interval-0s)
>                  stop interval=0s timeout=20s (rsc_1-stop-interval-0s)
>     Resource: rsc_2 (class=ocf provider=pacemaker type=Dummy)
>      Meta Attrs: migration-threshold=3
>      Operations: migrate_from interval=0s timeout=20s (rsc_2-migrate_from-interval-0s)
>                  migrate_to interval=0s timeout=20s (rsc_2-migrate_to-interval-0s)
>                  monitor interval=10s timeout=20s (rsc_2-monitor-interval-10s)
>                  reload interval=0s timeout=20s (rsc_2-reload-interval-0s)
>                  start interval=0s timeout=20s (rsc_2-start-interval-0s)
>                  stop interval=0s timeout=20s (rsc_2-stop-interval-0s)
>     Resource: rsc_3 (class=ocf provider=pacemaker type=Dummy)
>      Operations: migrate_from interval=0s timeout=20s (rsc_3-migrate_from-interval-0s)
>                  migrate_to interval=0s timeout=20s (rsc_3-migrate_to-interval-0s)
>                  monitor interval=10s timeout=20s (rsc_3-monitor-interval-10s)
>                  reload interval=0s timeout=20s (rsc_3-reload-interval-0s)
>                  start interval=0s timeout=20s (rsc_3-start-interval-0s)
>                  stop interval=0s timeout=20s (rsc_3-stop-interval-0s)

>   [root@virt-175 ~]# pcs property list
>   Cluster Properties:
>    cluster-infrastructure: corosync
>    cluster-name: STSRHTS6310
>    dc-version: 2.0.5-6.el8-ba59be7122
>    have-watchdog: false
>    last-lrm-refresh: 1613381633
>    start-failure-is-fatal: false

An implicit colocation is created when a resource is part of a resource group; `pcs constraint colocation` doesn't display it:

>   [root@virt-175 ~]# pcs constraint colocation show --full
>   Colocation Constraints:

Resource group is running on node virt-176:

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:34:05 2021
>     * Last change:  Mon Feb 15 11:34:01 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-176
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-176
>       * rsc_3	(ocf::pacemaker:Dummy):	 Started virt-176

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Fail rsc_2 three times on node virt-176:

>   [root@virt-176 ~]# for i in {1..3}; do rm -f /run/Dummy-rsc_2.state; sleep 30; done

Resource rsc_2 failed three times and resource group moved to node virt-175:

>   [root@virt-175 ~]# pcs status --full
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (2) (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:35:37 2021
>     * Last change:  Mon Feb 15 11:34:01 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 (1) virt-176 (2) ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_3	(ocf::pacemaker:Dummy):	 Started virt-175

>   Migration Summary:
>     * Node: virt-176 (2):
>       * rsc_2: migration-threshold=3 fail-count=3 last-failure='Mon Feb 15 11:35:14 2021'

>   Failed Resource Actions:
>     * rsc_2_monitor_10000 on virt-176 'not running' (7): call=1188, status='complete', exitreason='', last-rc-change='2021-02-15 11:35:14 +01:00', queued=0ms, exec=0ms

>   Tickets:

>   PCSD Status:
>     virt-175: Online
>     virt-176: Online

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Refresh the resources:

>   [root@virt-175 ~]# pcs resource refresh
>   Waiting for 1 reply from the controller
>   ... got reply (done)

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:36:12 2021
>     * Last change:  Mon Feb 15 11:34:01 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_3	(ocf::pacemaker:Dummy):	 Started virt-175

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

CASE 4
-------
Testing new behavior - a resource meta-attribute `critical` that can be used as the default for `influence` in the 
implicit colocation created when the resource is part of a resource group. 

Update rsc_2 `migration-threshold` back to the default value (INFINITY). Set rsc_3 `migration-threshold` to 3 and also update
its meta-attribute `critical` to false:

>   [root@virt-175 ~]# pcs resource update rsc_2 meta migration-threshold=INFINITY
>   [root@virt-175 ~]# pcs resource update rsc_3 meta migration-threshold=3 critical=false
>   [root@virt-175 ~]# pcs resource config rsc_3
>    Resource: rsc_3 (class=ocf provider=pacemaker type=Dummy)
>     Meta Attrs: critical=false migration-threshold=3
>     Operations: migrate_from interval=0s timeout=20s (rsc_3-migrate_from-interval-0s)
>                 migrate_to interval=0s timeout=20s (rsc_3-migrate_to-interval-0s)
>                 monitor interval=10s timeout=20s (rsc_3-monitor-interval-10s)
>                 reload interval=0s timeout=20s (rsc_3-reload-interval-0s)
>                 start interval=0s timeout=20s (rsc_3-start-interval-0s)
>                 stop interval=0s timeout=20s (rsc_3-stop-interval-0s)

Check that property `start-failure-is-fatal` is false:

>   [root@virt-175 ~]# pcs property list
>   Cluster Properties:
>    cluster-infrastructure: corosync
>    cluster-name: STSRHTS6310
>    dc-version: 2.0.5-6.el8-ba59be7122
>    have-watchdog: false
>    last-lrm-refresh: 1613381633
>    start-failure-is-fatal: false

Meta-attribute `critical` is set to false, but `pcs constraint colocation` does not show any indication that `influence` has been disabled.

>   [root@virt-175 ~]# pcs constraint colocation show --full
>   Colocation Constraints:

Resource group is running on node virt-175:

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:36:19 2021
>     * Last change:  Mon Feb 15 11:36:15 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_3	(ocf::pacemaker:Dummy):	 Started virt-175

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

Fail rsc_3 three times on node virt-175:

>   [root@virt-175 ~]# for i in {1..3}; do rm -f /run/Dummy-rsc_3.state; sleep 30; done

After failing three times, rsc_3 was stopped on node virt-175. The group did not move to another node.

>   [root@virt-175 ~]# pcs status --full
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (2) (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:37:50 2021
>     * Last change:  Mon Feb 15 11:36:15 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 (1) virt-176 (2) ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_3	(ocf::pacemaker:Dummy):	 Stopped

>   Migration Summary:
>     * Node: virt-175 (1):
>       * rsc_3: migration-threshold=3 fail-count=3 last-failure='Mon Feb 15 11:37:24 2021'

>   Failed Resource Actions:
>     * rsc_3_monitor_10000 on virt-175 'not running' (7): call=1327, status='complete', exitreason='', last-rc-change='2021-02-15 11:37:24 +01:00', queued=0ms, exec=0ms

>   Tickets:

>   PCSD Status:
>     virt-175: Online
>     virt-176: Online

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

After refreshing the resources, rsc_3 is started again on virt-175:

>   [root@virt-175 ~]# pcs resource refresh
>   Waiting for 1 reply from the controller
>   ... got reply (done)

>   [root@virt-175 ~]# pcs status
>   Cluster name: STSRHTS6310
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-176 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>     * Last updated: Mon Feb 15 11:38:25 2021
>     * Last change:  Mon Feb 15 11:36:15 2021 by root via cibadmin on virt-175
>     * 2 nodes configured
>     * 5 resource instances configured

>   Node List:
>     * Online: [ virt-175 virt-176 ]

>   Full List of Resources:
>     * fence-virt-175	(stonith:fence_xvm):	 Started virt-175
>     * fence-virt-176	(stonith:fence_xvm):	 Started virt-176
>     * Resource Group: dummy-group:
>       * rsc_1	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_2	(ocf::pacemaker:Dummy):	 Started virt-175
>       * rsc_3	(ocf::pacemaker:Dummy):	 Started virt-175

>   Daemon Status:
>     corosync: active/disabled
>     pacemaker: active/disabled
>     pcsd: active/enabled

marking verified in pacemaker-2.0.5-6.el8

Comment 63 errata-xmlrpc 2021-05-18 15:26:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1782