Bug 1578789

Summary: Fix moving / banning clone and bundle resources
Product: Red Hat Enterprise Linux 8
Component: pcs
Version: 8.3
Status: NEW
Severity: unspecified
Priority: medium
Reporter: Tomas Jelinek <tojeline>
Assignee: Tomas Jelinek <tojeline>
QA Contact: cluster-qe <cluster-qe>
CC: cluster-maint, idevat, mlisik, mpospisi, nwahl, omular, sbradley, tojeline
Target Milestone: rc
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Type: Bug
Bug Depends On: 1578820
Bug Blocks: 1520665, 1621899

Description Tomas Jelinek 2018-05-16 11:55:03 UTC
Description of problem:
* moving and banning an inner bundle resource
  * pcs should exit with an error saying moving / banning inner bundle resources is not allowed
* moving / banning a pacemaker internal bundle resource
  * works OK, pcs says the resource doesn't exist (it's not in the CIB)
* moving and banning a bundle resource
  * does not currently work in crm_resource even for 1-instance bundles:
    Resource 'DummyBundle' not moved: active in 2 locations.
* moving and banning a clone resource
  * works in crm_resource for 1-instance clones for both a clone and its inner resource
  * for multi-instance clones, moving a resource in a clone creates a constraint, whereas moving a clone resource exits with an error
  * pcs should not allow moving / banning a resource in a clone; it can then simply run crm_resource every time and, based on its output, inform the user whether the clone could be moved / banned
  * another option is to automatically switch to moving / banning a bundle / clone if the user tries to move / ban a resource in the bundle / clone
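
The cases listed above differ only in where the targeted ID sits in the CIB. A minimal sketch of telling them apart follows. It is not pcs code: the function name, the returned labels and the trimmed sample XML are all assumptions made for illustration, and only the element names (bundle, clone, primitive) come from the Pacemaker CIB schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed <resources> section modelled on the
# resources named in this report; a real CIB carries far more detail.
SAMPLE_RESOURCES = """
<resources>
  <bundle id="DummyBundle">
    <docker image="example/image" replicas="1"/>
    <primitive id="dummy0" class="ocf" provider="pacemaker" type="Dummy"/>
  </bundle>
  <clone id="dummy1-clone">
    <primitive id="dummy1" class="ocf" provider="pacemaker" type="Dummy"/>
  </clone>
</resources>
"""


def classify(resources_xml: str, rsc_id: str) -> str:
    """Return an illustrative label describing where rsc_id lives."""
    root = ET.fromstring(resources_xml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for element in root.iter():
        if element.get("id") != rsc_id:
            continue
        # Walk upwards looking for an enclosing bundle or clone.
        ancestor = parent_of.get(element)
        while ancestor is not None:
            if ancestor.tag == "bundle":
                return "inner-bundle"
            if ancestor.tag == "clone":
                return "inner-clone"
            ancestor = parent_of.get(ancestor)
        return element.tag
    # Pacemaker-internal IDs such as DummyBundle-0 are not in the CIB.
    return "not-in-cib"


for rsc in ("dummy0", "DummyBundle", "DummyBundle-0", "dummy1", "dummy1-clone"):
    print(rsc, "->", classify(SAMPLE_RESOURCES, rsc))
```

With such a label in hand, either proposed behavior (rejecting the inner resource, or switching to its bundle / clone) becomes a simple branch.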


Version-Release number of selected component (if applicable):
pcs-0.9.162-5.el7.x86_64


How reproducible:
always, easily


Steps to Reproduce:

dummy0 is a resource inside DummyBundle bundle with replicas=1
--------------------------------------------------------------

[root@virt-143 ~]# pcs resource move dummy0
Warning: Creating location constraint cli-ban-dummy0-on-DummyBundle-0 with a score of -INFINITY for resource dummy0 on node DummyBundle-0.
This will prevent dummy0 from running on DummyBundle-0 until the constraint is removed. This will be the case even if DummyBundle-0 is the last node in the cluster.
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy0
    Disabled on: DummyBundle-0 (score:-INFINITY) (role: Started) (id:cli-ban-dummy0-on-DummyBundle-0)

[root@virt-143 ~]# crm_resource --resource dummy0 --move
WARNING: Creating rsc_location constraint 'cli-ban-dummy0-on-DummyBundle-0' with a score of -INFINITY for resource dummy0 on DummyBundle-0.
	This will prevent dummy0 from running on DummyBundle-0 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if DummyBundle-0 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy0
    Disabled on: DummyBundle-0 (score:-INFINITY) (role: Started) (id:cli-ban-dummy0-on-DummyBundle-0)

> Constraint has been created, but resource WAS NOT moved. The reason is obvious as DummyBundle-0 is not an actual node. This is a possible place for improvement.


[root@virt-143 ~]# pcs resource move DummyBundle-0
Error: DummyBundle-0 is not a valid resource

[root@virt-143 ~]# crm_resource --resource DummyBundle-0 --move
WARNING: Creating rsc_location constraint 'cli-ban-DummyBundle-0-on-virt-143' with a score of -INFINITY for resource DummyBundle-0 on virt-143.
	This will prevent DummyBundle-0 from running on virt-143 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-143 is the last node in the cluster
	This message can be disabled with --quiet
Error performing operation: Update does not conform to the configured schema

> Neither pcs nor crm_resource is able to target the DummyBundle-0 pseudo resource, which is expected.


[root@virt-143 ~]# pcs resource move DummyBundle
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource DummyBundle --move
Resource 'DummyBundle' not moved: active in 2 locations.
You can prevent 'DummyBundle' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

> In this situation pcs simply refuses to move a clone, while crm_resource reports something else: misleading information about the number of active locations. Only one location should be active because replicas=1. Let's see how pure clones do:


dummy1 is a clone with clone-max=1
----------------------------------

[root@virt-143 ~]# pcs resource move dummy1
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1 --move
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-on-virt-145' with a score of -INFINITY for resource dummy1 on virt-145.
	This will prevent dummy1 from running on virt-145 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-145 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-145 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-145)

> pcs refused to move a single-instance clone, but crm_resource accepted the request and the resource WAS moved. There is room for improvement in pcs.


[root@virt-143 ~]# pcs resource move dummy1-clone
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1-clone --move
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-clone-on-virt-143' with a score of -INFINITY for resource dummy1-clone on virt-143.
	This will prevent dummy1-clone from running on virt-143 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-143 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1-clone
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-clone-on-virt-143)

> When targeting the clone ID, pcs again refused to move a single-instance clone, but crm_resource accepted the request and the resource WAS moved.



dummy1 is a clone with clone-max=2
----------------------------------

[root@virt-143 ~]# pcs resource move dummy1
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1 --move
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-on-virt-143' with a score of -INFINITY for resource dummy1 on virt-143.
	This will prevent dummy1 from running on virt-143 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-143 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-143)


[root@virt-143 ~]# crm_resource --resource dummy1 --move
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-on-virt-145' with a score of -INFINITY for resource dummy1 on virt-145.
	This will prevent dummy1 from running on virt-145 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-145 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-143)
    Disabled on: virt-145 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-145)

> With more than one clone instance, crm_resource surprisingly does perform a constraint-based move, moving ONE of the instances, even when run twice in a row.


[root@virt-143 ~]# pcs resource move dummy1-clone
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1-clone --move
Resource 'dummy1-clone' not moved: active in 2 locations.
You can prevent 'dummy1-clone' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

> With the clone ID and two instances, crm_resource no longer performs the move. pcs keeps printing the same error as before.


So, the conclusion for 'resource move':
 - pcs is not in sync with crm_resource behavior when it comes to moving a single clone instance
 - support for moving a bundle in crm_resource is not on par with being able to move a clone, at least in case of just one instance being present due to:
     - 2 active locations being improperly reported 
     - the move constraint of a primitive is incorrectly targeted to a non-existent DummyBundle-0 node.

It is worth pointing out that creating a pure location constraint for DummyBundle will move the only existing bundle instance to the desired node, as one would expect. I have yet to test the behavior of 'resource ban'.




Performing tests of move with node specification. 

dummy0 is a resource inside DummyBundle bundle with replicas=1
--------------------------------------------------------------

[root@virt-143 ~]# pcs resource move dummy0 virt-145
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy0
    Enabled on: virt-145 (score:INFINITY) (role: Started) (id:cli-prefer-dummy0)

[root@virt-143 ~]# crm_resource --resource dummy0 --move --node virt-145
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy0
    Enabled on: virt-145 (score:INFINITY) (role: Started) (id:cli-prefer-dummy0)

> Constraint has been created, but resource WAS NOT moved.


[root@virt-143 ~]# pcs resource move DummyBundle-0 virt-145
Error: DummyBundle-0 is not a valid resource

[root@virt-143 ~]# crm_resource --resource DummyBundle-0 --move --node virt-145
Error performing operation: Update does not conform to the configured schema

> Neither pcs nor crm_resource is able to target the DummyBundle-0 pseudo resource, which is expected.


[root@virt-143 ~]# pcs resource move DummyBundle virt-145
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource DummyBundle --move --node virt-145
Resource 'DummyBundle' not moved: active on multiple nodes
Error performing operation: Invalid argument

> Just like in the previous case, crm_resource gives misleading information about the number of active locations. Only one location should be active because replicas=1.


dummy1 is a clone with clone-max=1
----------------------------------

[root@virt-143 ~]# pcs resource move dummy1 virt-145
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1 --move --node virt-145
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Enabled on: virt-145 (score:INFINITY) (role: Started) (id:cli-prefer-dummy1)

> pcs refused to move a single-instance clone, but crm_resource accepted the request and the resource WAS moved. There is room for improvement in pcs.


[root@virt-143 ~]# pcs resource move dummy1-clone virt-146
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1-clone --move --node virt-146
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1-clone
    Enabled on: virt-146 (score:INFINITY) (role: Started) (id:cli-prefer-dummy1-clone)

> When targeting the clone ID, pcs again refused to move a single-instance clone, but crm_resource accepted the request and the resource WAS moved.


dummy1 is a clone with clone-max=2
----------------------------------

[root@virt-143 ~]# pcs resource move dummy1 virt-145
Error: cannot move cloned resources

[root@virt-143 ~]# crm_resource --resource dummy1 --move --node virt-145
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Enabled on: virt-145 (score:INFINITY) (role: Started) (id:cli-prefer-dummy1)

> With more than one clone instance, crm_resource does perform a constraint-based move, moving ONE of the instances.


[root@virt-143 ~]# pcs resource move dummy1-clone virt-146
Error: cannot move cloned resources
[root@virt-143 ~]# 
[root@virt-143 ~]# crm_resource --resource dummy1-clone --move --node virt-146
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1-clone
    Enabled on: virt-146 (score:INFINITY) (role: Started) (id:cli-prefer-dummy1-clone)

> Here with clone ID and two instances crm_resource won't do us a favor anymore. Pcs keeps printing the same error as before.


The conclusion for 'resource move' with node specification is pretty much aligned with the previous one:
 - pcs is not in sync with crm_resource behavior when it comes to moving a single clone instance
 - support for moving a bundle in crm_resource is not on par with being able to move a clone, at least in case of just one instance being present due to:
     - 2 active locations being improperly reported 
     - the move constraint of a primitive is incorrectly targeted to a non-existent DummyBundle-0 node.
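
The transcripts above show two ID shapes for constraints created by move: cli-ban-<resource>-on-<node> when no node is given and cli-prefer-<resource> when one is. As a rough sketch, the helper below splits such IDs back into their parts. The function and type names are hypothetical, and the split is only a heuristic: a resource or node name that itself contains "-on-" would be divided at the last occurrence.

```python
import re
from typing import NamedTuple, Optional


class CliConstraint(NamedTuple):
    kind: str             # "ban" or "prefer"
    resource: str
    node: Optional[str]   # only cli-ban IDs embed the node name


# Patterns inferred from the IDs printed in this report.
_BAN = re.compile(r"^cli-ban-(?P<rsc>.+)-on-(?P<node>.+)$")
_PREFER = re.compile(r"^cli-prefer-(?P<rsc>.+)$")


def parse_cli_constraint_id(constraint_id: str) -> Optional[CliConstraint]:
    """Split a move/ban constraint ID into kind, resource and node."""
    match = _BAN.match(constraint_id)
    if match:
        return CliConstraint("ban", match["rsc"], match["node"])
    match = _PREFER.match(constraint_id)
    if match:
        return CliConstraint("prefer", match["rsc"], None)
    return None


print(parse_cli_constraint_id("cli-ban-dummy0-on-DummyBundle-0"))
print(parse_cli_constraint_id("cli-prefer-dummy1-clone"))
```

Applied to cli-ban-dummy0-on-DummyBundle-0, this makes the problem visible at a glance: the "node" part is a bundle replica name, not a cluster node.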




Testing ban with crm_resource gives results nearly identical to a simple move (with no node specified), so I will only mention the differences in pcs:

dummy0 is a resource inside DummyBundle bundle with replicas=1
--------------------------------------------------------------

[root@virt-143 ~]# pcs resource ban DummyBundle
Error: error moving/banning/clearing resource
Resource 'DummyBundle' not moved: active in 2 locations.
You can prevent 'DummyBundle' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

[root@virt-143 ~]# crm_resource --resource DummyBundle --ban
Resource 'DummyBundle' not moved: active in 2 locations.
You can prevent 'DummyBundle' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

> The difference is in the pcs output when targeting DummyBundle: pcs no longer complains about a clone but passes the crm_resource message through as is. We'll need a more user-friendly message in pcs.


dummy1 is a clone with clone-max=1
----------------------------------

[root@virt-143 ~]# pcs resource ban dummy1
Warning: Creating location constraint cli-ban-dummy1-on-virt-143 with a score of -INFINITY for resource dummy1 on node virt-143.
This will prevent dummy1 from running on virt-143 until the constraint is removed. This will be the case even if virt-143 is the last node in the cluster.
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-143)

[root@virt-143 ~]# crm_resource --resource dummy1 --ban
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-on-virt-144' with a score of -INFINITY for resource dummy1 on virt-144.
	This will prevent dummy1 from running on virt-144 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-144 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-144 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-144)

> The resource WAS moved in both cases so pcs and crm_resource are in sync regarding behavior.


[root@virt-143 ~]# pcs resource ban dummy1-clone
Warning: Creating location constraint cli-ban-dummy1-clone-on-virt-143 with a score of -INFINITY for resource dummy1-clone on node virt-143.
This will prevent dummy1-clone from running on virt-143 until the constraint is removed. This will be the case even if virt-143 is the last node in the cluster.
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1-clone
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-clone-on-virt-143)

[root@virt-143 ~]# crm_resource --resource dummy1-clone --ban
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-clone-on-virt-144' with a score of -INFINITY for resource dummy1-clone on virt-144.
	This will prevent dummy1-clone from running on virt-144 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-144 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1-clone
    Disabled on: virt-144 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-clone-on-virt-144)

> Again the resource WAS moved in both cases so pcs and crm_resource are in sync regarding behavior.



dummy1 is a clone with clone-max=2
----------------------------------

[root@virt-143 ~]# pcs resource ban dummy1
Warning: Creating location constraint cli-ban-dummy1-on-virt-145 with a score of -INFINITY for resource dummy1 on node virt-145.
This will prevent dummy1 from running on virt-145 until the constraint is removed. This will be the case even if virt-145 is the last node in the cluster.
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-145 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-145)

[root@virt-143 ~]# crm_resource --resource dummy1 --ban
WARNING: Creating rsc_location constraint 'cli-ban-dummy1-on-virt-143' with a score of -INFINITY for resource dummy1 on virt-143.
	This will prevent dummy1 from running on virt-143 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-143 is the last node in the cluster
	This message can be disabled with --quiet
[root@virt-143 ~]# pcs constraint --full
Location Constraints:
  Resource: dummy1
    Disabled on: virt-143 (score:-INFINITY) (role: Started) (id:cli-ban-dummy1-on-virt-143)

> One of the resource instances WAS moved in both cases so pcs and crm_resource are in sync regarding behavior.


[root@virt-143 ~]# pcs resource ban dummy1-clone
Error: error moving/banning/clearing resource
Resource 'dummy1-clone' not moved: active in 2 locations.
You can prevent 'dummy1-clone' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

[root@virt-143 ~]# 
[root@virt-143 ~]# crm_resource --resource dummy1-clone --ban
Resource 'dummy1-clone' not moved: active in 2 locations.
You can prevent 'dummy1-clone' from running on a specific location with: --ban --node <name>
Error performing operation: Invalid argument

> The difference is in the pcs output when targeting dummy1-clone: pcs no longer complains about a clone but passes the crm_resource message through as is. We'll need a more user-friendly message in pcs.


The conclusion for 'resource ban' is the same for crm_resource; only pcs behaves differently:
 - pcs no longer complains about clones but passes crm_resource's error message through directly, which we might need to modify a little
 - support for banning a bundle in crm_resource is not on par with being able to move a clone, at least in case of just one instance being present due to:
     - 2 active locations being improperly reported 
     - the move constraint of a primitive is incorrectly targeted to a non-existent DummyBundle-0 node
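
One way to make the passed-through message friendlier is to recognize the crm_resource text and rephrase it. The sketch below is only an illustration: the function name and the replacement wording are assumptions, and the pattern is derived solely from the two message variants quoted in this report, so other crm_resource versions may word the error differently.

```python
import re
from typing import Optional

# Matches the two variants seen in this report:
#   "Resource 'X' not moved: active in 2 locations."
#   "Resource 'X' not moved: active on multiple nodes"
_MULTI_LOCATION = re.compile(
    r"Resource '(?P<rsc>[^']+)' not moved: "
    r"active (?:in (?P<count>\d+) locations|on multiple nodes)"
)


def explain_crm_resource_error(stderr: str) -> Optional[str]:
    """Return a rephrased message, or None if the error is not recognized."""
    match = _MULTI_LOCATION.search(stderr)
    if match is None:
        return None
    if match["count"]:
        where = f"in {match['count']} locations"
    else:
        where = "on multiple nodes"
    # Hypothetical wording, not an actual pcs message.
    return (
        f"Resource '{match['rsc']}' is active {where}; "
        "specify the node it should be banned from."
    )


sample = (
    "Resource 'dummy1-clone' not moved: active in 2 locations.\n"
    "Error performing operation: Invalid argument\n"
)
print(explain_crm_resource_error(sample))
```

Returning None for unrecognized text lets the caller fall back to printing the original crm_resource output unchanged.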

Comment 3 Tomas Jelinek 2020-05-06 07:41:57 UTC
The bz1578820 we depend on got moved to RHEL 8 only.

Comment 7 Tomas Jelinek 2023-07-25 13:23:54 UTC
Summary after bz1578820 has been resolved:

'pcs resource ban' works as described in comment 0. Such behavior is fine and there is nothing to be fixed. Pcs delegates the banning to crm_resource, including detection of the number of locations a resource is active in.

'pcs resource move-with-constraint' and 'pcs resource move' need to be fixed:
* Moving / banning an inner bundle resource from its bundle should be disallowed in pcs. Currently, pcs doesn't check for this and creates a constraint. The constraint has no effect, as expected.
* Moving a 1-replica bundle works, but pcs does not allow it. There is a check in a validator in the 'pcs resource move' command which prevents moving all bundle resources. As moving a 1-replica bundle works in pacemaker, and moving multi-replica bundles is prevented by pacemaker, the check in pcs can be removed completely.
* Moving a clone resource works the same as moving a bundle resource and is disallowed by pcs in the same way as well. Removing this limitation from pcs should be considered.
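
The direction proposed in this comment could be sketched as the validator below. It is not the actual pcs validator: the enum, the function name and the error wording are assumptions, and resources inside a clone are deliberately left out because this summary does not state how they should be handled.

```python
from enum import Enum
from typing import List


class Kind(Enum):
    PRIMITIVE = "primitive"
    CLONE = "clone"
    BUNDLE = "bundle"
    INNER_BUNDLE = "primitive inside a bundle"


def validate_move(rsc_id: str, kind: Kind) -> List[str]:
    """Return error messages; an empty list means 'let pacemaker decide'."""
    if kind is Kind.INNER_BUNDLE:
        # The only case pcs itself would reject under this proposal.
        return [
            f"'{rsc_id}' is an inner bundle resource; "
            "moving or banning it is not allowed"
        ]
    # Clones and bundles are no longer rejected up front: pacemaker
    # already refuses the multi-instance cases by itself.
    return []


print(validate_move("dummy0", Kind.INNER_BUNDLE))
print(validate_move("DummyBundle", Kind.BUNDLE))
print(validate_move("dummy1-clone", Kind.CLONE))
```

Keeping the multi-instance check in pacemaker avoids duplicating its notion of "active locations" in pcs, which is the same delegation already described for 'pcs resource ban'.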