Bug 1816852 - [RFE] 'pcs (resource|stonith) (create|update)' commands fail to call the respective agents 'validate-all' action
Summary: [RFE] 'pcs (resource|stonith) (create|update)' commands fail to call the respective agents 'validate-all' action
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: pcs
Version: 8.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.8
Assignee: Ondrej Mular
QA Contact: cluster-qe@redhat.com
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Duplicates: 1553712 1954085 2149113 (view as bug list)
Depends On: 1553712 1636036 1644628 1955792 2102292 2112271 2157873
Blocks: 1377970 2112270 2159455
 
Reported: 2020-03-24 21:43 UTC by Heinz Mauelshagen
Modified: 2024-12-20 19:01 UTC (History)
CC List: 14 users

Fixed In Version: pcs-0.10.14-6.el8
Doc Type: Enhancement
Doc Text:
.`pcs` can now run the `validate-all` action of resource and stonith agents

When creating or updating a resource or a STONITH device, you can now specify the `--agent-validation` option. With this option, `pcs` uses an agent's `validate-all` action, when it is available, in addition to the validation done by `pcs` based on the agent's metadata.
(An illustrative usage sketch follows the Links section below.)
Clone Of:
Clones: 2112270 (view as bug list)
Environment:
Last Closed: 2023-05-16 08:12:42 UTC
Type: ---
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CLUSTERQE-6137 0 None None None 2022-11-11 20:31:20 UTC
Red Hat Knowledge Base (Solution) 6110181 0 None None None 2022-08-09 15:08:29 UTC
Red Hat Product Errata RHBA-2023:2738 0 None None None 2023-05-16 08:13:16 UTC
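
For illustration, usage of the `--agent-validation` option described in the Doc Text above might look like the following sketch (resource name, agent, and values are only examples, taken loosely from the test transcripts further down; the device path is hypothetical):

pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=192.168.1.5 --agent-validation
pcs stonith create sbd_fencing fence_sbd devices=/dev/sdb1 --agent-validation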

Description Heinz Mauelshagen 2020-03-24 21:43:41 UTC
Description of problem:

The pcs commands above should call the resource agent's 'validate-all' action so that
the environment checks intended by OCF can be performed and, more importantly, so that
consistency checks on the resource are possible before it is started, e.g. whether it
is an active-active resource and thus allows cloning, or whether an active-passive
resource to be created is already active and thus has to be deactivated to avoid
data corruption.

Comment 1 Ken Gaillot 2020-03-24 21:59:27 UTC
A question that should be resolved before having pcs call validate-all is what that action should do. (The OCF standard is not completely clear.)

validate-all should definitely check for legal syntax of proposed parameter values, as well as self-consistency (e.g. if parameter X is specified then parameter Y can only take such-and-such values).

However, whether validate-all should also check that the parameter values are suitable for the local host is another question. A resource can legitimately be created that can't run on the local host (for example, because the resource is constrained to run only on certain nodes, or because the local host is a GUI host that's not part of the cluster). Also, validating that the resource could run on the local host doesn't guarantee it can run on any other node.

Comment 2 Heinz Mauelshagen 2020-03-24 22:28:35 UTC
A misconfigured, unstartable and, in the worst case, data-corrupting resource should be rejected at create, update or clone time.

This is what we want to achieve using 'validate-all' in the MD resource agent Nigel is developing: checking
that creating a resource for a non-clustered RAID array is rejected when the array is already active on a node
(as the result of an "mdadm -C ..." command), and that cloning such a resource is rejected as well. Mind that an
MD array created this way can pre-exist and then become a pacemaker cluster resource, potentially causing
data corruption by being active, open and updated on one node while pacemaker resource management decides to bring it up on another.
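
For concreteness, the kind of pre-existing, non-clustered array meant here would be one created outside the cluster with something like the following (device names purely illustrative):

mdadm -C /dev/md0 --level=1 --raid-devices=2 /dev/vdb /dev/vdc

If such an array stays active and open on one node while a resource for it is created (or cloned) and pacemaker later starts it on another node, the corruption scenario described above can occur.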

Comment 4 Tomas Jelinek 2020-04-06 12:19:39 UTC
I agree with Ken on this one. Validate-all should definitely check for syntax and consistency.

I believe that doing "live" checks, i.e. interacting with the underlying cluster / OS, should be avoided for several reasons. Adding to the reasons presented by Ken: the CIB can be edited in a file (pcs -f), even outside the cluster. In that case, interacting with the underlying OS or the (in this case non-existent) cluster is simply wrong and will give false results and possibly cause more trouble.

If those live checks are really needed, they should be implemented outside of the validate-all and validate-all-xml actions. Also, it looks like it may be desirable to run those checks in cases other than just modifying the CIB (i.e. when a resource is being started by pacemaker), which is another argument for implementing them somewhere else.

Comment 5 Heinz Mauelshagen 2020-04-08 12:06:32 UTC
(In reply to Tomas Jelinek from comment #4)
> I agree with Ken on this one. Validate-all should definitely check for
> syntax and consistency.
> 
> I believe, that doing "live" checks, i.e. interacting with an underlying
> cluster / OS, should be avoided for several reasons. Adding to reasons
> presented by Ken: CIB can be edited in-file (pcs -f) even outside the
> cluster. In that case, interacting with underlying OS or (in this case)
> non-existing cluster is just wrong and it will present false responses and
> possibly cause more troubles.
> 
> If those live checks are really needed, they should be implemented outside
> of validate-all and validate-all-xml actions. Also, it looks like it may be
> wanted to run those checks in other cases than just modifying CIB (i.e. when
> a resource is being started by pacemaker), which is another argument for
> implementing them somewhere else.

How does this argue against my comment #2, which aims to protect from data corruption as early
as possible in the CIB update process, apart from the fact that "pcs -f ..." followed by a
"pcs cluster cib-push ..." would run validate-all at CIB push time? (Any resource relying on
live checks, e.g. on devices only available on the live cluster, will have to carry them out
when deployed there, which I presume to be covered by this bz request to call validate-all
from pacemaker.)

FWIW: I'm not arguing against the validate-all-xml action being valuable to catch problems early
(though I did not grep any existing (upstream) resource agent implementing it, nor any pacemaker
executable providing an implementation of it; this bz therefore also applies to the latter).
As a result of those deficiencies, neither "pcs -f ..." nor "pcs cluster cib-push ..."
causes the validate-all-xml action to be called, as of my testing.


IOW: if a user misconfigures a resource on create/update, either by direct means (e.g. via pcs),
by delayed ones using "pcs -f ...; pcs cluster cib-push ...", or by cloning a resource that is
not suitable for active-active deployment, this should be caught ASAP rather than at start time,
when it can be too late to avoid data corruption; I presume validate-all and validate-all-xml
to be the means to ensure this.


In general terms, the rationale for this bz is that validate-all and validate-all-xml
(btw: hard to find a spec for the latter in the OCF docs?) are actions defined as part of the
resource agent API which help to avoid worst-case scenarios such as data corruption.
As they are part of the resource agent API, they should be called by pacemaker directly and
not just as internal workaround calls by agents which then fail at e.g. start time, causing
error messages that in turn require cleanup, all of which can be avoided by
fully implementing the API.

Also, supporting those missing API action calls does not rule out any of the semantics argued for at
start time (i.e. do at validate time what the respective resource agent requires to validate,
and check at start time what is mandatory for the resource agent in that context), although it will
need a patch series updating the current workaround processing of validate-all in a lot of existing
(upstream) resource agents, conditionally discontinuing those calls based on version.

Comment 6 Tomas Jelinek 2020-04-08 15:47:04 UTC
(In reply to Heinz Mauelshagen from comment #5)
I agree with you that protecting from data corruption caused by badly created / updated / cloned resources is a good thing and should be implemented. I just pointed out another corner-case which needs to be dealt with. Running validate-all-xml on 'pcs resource create' (and update, clone) has been one of our goals for some time. However, support from pacemaker and agents is needed for that, so it has not been implemented yet.

It is important to figure out what validations can and should be done in what cases and which cluster component should be running them. As Ken and I pointed out, not all validations can be run in all cases. That is what the discussion is about.
You originally started with validation in 'pcs resource create | clone | update'. Then in comment 5 you also added validating in 'pcs cluster cib-push', which is a bit of a different feature. Yes, it aims for the same goal, but the implementation and circumstances are completely different. In resource create | update, we have one resource to deal with and we know its settings because they were just provided by the pcs user. In cib-push, we have either a whole CIB or an original CIB and a diff, so it must work differently. Later on in comment 5, you are actually saying it is pacemaker, not pcs, who should be running the validation:
> do at validate time what the respective resource agent requires to validate and check at start time what's mandatory for the resource agent in that context
Pcs can handle the "validate time" part but hardly the "start time" part, as it is not involved in any way in starting resources; that's pacemaker's job.

You cannot find any traces of validate-all-xml anywhere, because it has not been implemented yet.

Comment 7 Heinz Mauelshagen 2020-04-16 13:51:34 UTC
(In reply to Tomas Jelinek from comment #6)
> (In reply to Heinz Mauelshagen from comment #5)
> I agree with you that protecting from data corruption caused by badly
> created / updated / cloned resources is a good thing and should be
> implemented. I just pointed out another corner-case which needs to be dealt
> with. Running validate-all-xml on 'pcs resource create' (and update, clone)
> has been one of our goals for some time. However, support from pacemaker and
> agents is needed for that, so it has not been implemented yet.

Ok, is there any product plan for it as yet?

> 
> It is important to figure out what validations can and should be done in
> what cases and which cluster component should be running them. As Ken and I
> pointed out, not all validations can be run in all cases. That is what the
> discussion is about.

Agreed, we need conditionals covering in which context the respective validate-all* action is processed.

> You originally started with validation in 'pcs resource create | clone |
> update'.

Yes, I did that because of the UI used, not relative to the internal technical design and implementation.
I.e., the user doesn't care which component in the architecture processes the validation; that is up
to the technical implementer. The latter has to find the adequate component to enhance or add,
ideally a central one which manages the validation checks rather than running them
in multiple places and suffering from the potential consistency issues that brings.

>  Then in comment 5 you also added validating in 'pcs cluster
> cib-push', which is a bit of a different feature.

Indeed, this came up relative to applying validation checks in any CIB create/update/... scenario,
no matter which UI causes it.

> Yes, it aims for the same
> goal, but the implementation and circumstances are completely different. In
> resource create | update, we have one resource to deal with and we know its
> settings because they were just provided by the pcs user. In cib-push, we
> have either a whole CIB or an original CIB and a diff, so it must work
> differently.

Yes, that's a 1-change scenario (simplifying here), i.e. create/update/restart/foobar a single resource,
versus an N-change scenario. The latter is a list of the former with more interrelations
because of potential bulk creations/updates on cib-push.

> Later on in comment 5, you are actually saying it is pacemaker,
> not pcs, who should be running the validation:
> do at validate time what the respective resource agent requires to validate and check at start time what's mandatory for the resource agent in that context
> Pcs can handle the "validate time" part but hardly the "start time" part as
> it is not involved in any way in starting resources, that's pacemaker's job.

Sure, I'm not arguing against having start checks. We need them to follow the same line of thought,
e.g. to prevent data corruption at start if only a start-time check is suitable to avoid it.

Pacemaker core seems to be the architecturally common place, whatever the proper component therein,
to carry out checks on either 'immediate' creates/updates/... via pcs/crm_resource/... or 'delayed'
ones via e.g. 'pcs -f ...; pcs cluster cib-push ...' (potentially running on different machines, the former
command even outside the cluster the future cib push is aimed at).

At the architectural level: what would prevent such a central component (an existing one to be enhanced, or a new one)
in pacemaker from carrying out validation checks?

> 
> You cannot find any traces of validate-all-xml anywhere, because it has not
> been implemented yet.

Thanks, I gathered as much when testing it, and mentioned it above ("though I did not grep...neither any pacemaker executable...").

Comment 11 Tomas Jelinek 2021-04-29 08:15:28 UTC
*** Bug 1954085 has been marked as a duplicate of this bug. ***

Comment 14 Ken Gaillot 2021-04-30 19:45:56 UTC
The OCF Resource Agent API 1.1 standard [1] was recently released, and addresses these issues under "Global OCF Attributes" and "Check Levels".

The standard specifies OCF_OUTPUT_FORMAT and OCF_CHECK_LEVEL environment variables that an agent can check in its validate-all action. The agent may optionally support outputting XML when OCF_OUTPUT_FORMAT is "xml". If OCF_CHECK_LEVEL is 0 or unset, the agent should do only an internal consistency check, and if OCF_CHECK_LEVEL is 10, it may additionally validate the suitability of the local host.
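
For illustration, a minimal agent-side validate-all handler honoring these variables might look like the sketch below. The helper functions check_parameter_syntax and check_local_host are hypothetical; the OCF_* return-code variables are the standard ones from ocf-shellfuncs:

validate_all() {
    # OCF_CHECK_LEVEL 0 or unset: check parameter syntax and self-consistency only
    check_parameter_syntax || return "$OCF_ERR_CONFIGURED"

    # OCF_CHECK_LEVEL 10: additionally validate suitability of the local host
    if [ "${OCF_CHECK_LEVEL:-0}" -ge 10 ]; then
        check_local_host || return "$OCF_ERR_INSTALLED"
    fi

    # If OCF_OUTPUT_FORMAT is "xml", the agent may additionally emit XML output
    # (schema still to be defined, as noted below).
    return "$OCF_SUCCESS"
}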

I believe this work remains:

* Pacemaker's crm_resource must pass OCF_OUTPUT_FORMAT="xml" to agents when called with --output-as="xml" --validate (Bug 1644628), and support an option for an OCF_CHECK_LEVEL to pass to agents (Bug 1955792).

* OCF 1.1 leaves the specifics of validate-all XML output undefined, so we need to come up with a good schema design that we can use and can be added to the next standard version.

* Any resource agents desired to have this behavior must be updated to support OCF 1.1, including OCF_OUTPUT_FORMAT, OCF_CHECK_LEVEL, and the new schema with the validate-all action. Bug 1936696 (pacemaker, including the ocf:pacemaker agents), Bug 1937025 (resource-agents), Bug 1937026 (resource-agents-sap), Bug 1937027 (resource-agents-sap-hana), and Bug 1937028 (resource-agents-sap-hana-scaleout) cover most agents shipped with RHEL as far as supporting OCF 1.1 goes, though not specifically the new validate-all behaviors since they are optional. We could comment on those bzs that the new validate-all behavior is desired, or open separate bzs if particular agents are of interest.

* pcs has to call crm_resource --validate with --output-as="xml" and the desired check level, and parse the new output.
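
For example, the kind of call pcs would issue (matching the invocation visible later in the --debug output in comment 30) looks like this; the check level would be passed once the crm_resource option tracked in Bug 1955792 exists:

/usr/sbin/crm_resource --validate --output-as xml --class ocf --provider heartbeat --agent Dummy --option fake=1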

Pacemaker itself doesn't call validate-all before starting resources, because it's at a lower level where maximum flexibility is prioritized, and because the agent itself can perform validation as part of its start action (which many do). The start should be sufficient -- doing a separate validation would just add overhead compared to letting the start fail (the end result is the same, a failed resource). If the agent returns success for start despite the parameters being invalid, that's an agent issue from Pacemaker's point of view.

[1] https://github.com/ClusterLabs/OCF-spec/blob/master/ra/1.1/resource-agent-api.md

Comment 17 RHEL Program Management 2021-09-24 07:26:57 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 18 Heinz Mauelshagen 2022-05-19 12:06:52 UTC
Still needed, as e.g. checks at start-action time are too late to impose create/update-time constraints!

Comment 24 Tomas Jelinek 2022-10-04 12:41:16 UTC
*** Bug 1553712 has been marked as a duplicate of this bug. ***

Comment 25 Ondrej Mular 2022-10-06 07:17:02 UTC
Upstream commit: https://github.com/ClusterLabs/pcs/commit/d0d9ac19ba2c3a993bff4fdbc4fdcc93d6bb1430

Test:
[root@rhel87-beta1 pcs]# pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=192.168.1.5
Error: Validation result from agent (use --force to override):
  Oct 06 09:12:00 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@rhel87-beta1 pcs]# echo $?
1

Comment 26 Miroslav Lisik 2022-10-26 07:38:10 UTC
DevTestResults:

[root@r88-1 ~]# rpm -q pcs
pcs-0.10.14-6.el8.x86_64

[root@r88-1 ~]# pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=192.168.1.5
Error: Validation result from agent (use --force to override):
  Oct 25 14:01:03 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@r88-1 ~]# echo $?
1

Comment 30 Michal Mazourek 2022-12-05 16:20:39 UTC
BEFORE:
=======

[root@virt-042 ~]# rpm -q pcs
pcs-0.10.14-5.el8.x86_64

[root@virt-042 ~]# pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=1.1.1.1
[root@virt-042 ~]# echo $?
0
[root@virt-042 ~]# pcs resource status test_ip 
  * test_ip	(ocf::heartbeat:IPaddr2):	 Stopped


[root@virt-042 ~]# pcs resource create dummy ocf:heartbeat:Dummy --debug | grep "\-\-validate"
[root@virt-042 ~]# echo $?
1

> validation is not present


AFTER:
======

[root@virt-137 ~]# rpm -q pcs
pcs-0.10.14-6.el8.x86_64


### Checking in debug mode that pcs is calling the validation

## Creating resource

[root@virt-137 ~]# pcs resource create dummy ocf:heartbeat:Dummy --debug | grep "\-\-validate"
Running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat
Finished running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat
<pacemaker-result api-version="2.25" request="/usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat">
[root@virt-137 ~]# pcs resource status dummy
  * dummy	(ocf::heartbeat:Dummy):	 Started virt-137

> OK: Validation is visible in debug mode when creating a resource


## Creating stonith device

[root@virt-137 ~]# pcs stonith create sbd_fencing fence_sbd devices=invalid --debug | grep "\-\-validate"
Running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid
Finished running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid
<pacemaker-result api-version="2.25" request="/usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid">
[root@virt-137 ~]# pcs stonith status sbd_fencing
  * sbd_fencing	(stonith:fence_sbd):	 Stopped

> OK: Validation is visible in debug mode when creating a stonith device
> NOTE: Validation of the devices for fence_sbd is currently not working, which is not a pcs problem, bz was filed for the fence-agents: bz2136227


## Updating resource

[root@virt-137 ~]# pcs resource update dummy fake=1 --debug | grep "\-\-validate"
Running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat
Finished running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat
<pacemaker-result api-version="2.25" request="/usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat">
Running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat --option fake=1
Finished running: /usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat --option fake=1
<pacemaker-result api-version="2.25" request="/usr/sbin/crm_resource --validate --output-as xml --class ocf --agent Dummy --provider heartbeat --option fake=1">
[root@virt-137 ~]# echo $?
0
[root@virt-137 ~]# pcs resource config dummy | grep fake
    fake=1

> OK: Validation is visible in debug mode when updating a resource


## Updating stonith device

[root@virt-137 ~]# pcs stonith update sbd_fencing fence_sbd devices=updated --debug | grep "\-\-validate" 
Running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid
Finished running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid
<pacemaker-result api-version="2.25" request="/usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=invalid">
Running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=updated
Finished running: /usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=updated
<pacemaker-result api-version="2.25" request="/usr/sbin/stonith_admin --validate --output-as xml --agent fence_sbd --option devices=updated">
[root@virt-137 ~]# echo $?
0
[root@virt-137 ~]# pcs stonith config sbd_fencing | grep devices
    devices=updated

> OK: Validation is visible in debug mode when updating a stonith device


### Checking cases with invalid options

## Creating resource with invalid option

[root@virt-137 ~]# pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=1.1.1.1
Error: Validation result from agent (use --force to override):
  Nov 24 17:28:46 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs resource status test_ip
Error: resource or tag id 'test_ip' not found
[root@virt-137 ~]# echo $?
1

> OK

# Overriding the validation with --force option
[root@virt-137 ~]# pcs resource create test_ip ocf:heartbeat:IPaddr2 ip=1.1.1.1 --force
Warning: Validation result from agent:
  Nov 24 17:29:45 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
0
[root@virt-137 ~]# pcs resource status test_ip 
  * test_ip	(ocf::heartbeat:IPaddr2):	 Stopped

> OK


## Creating stonith device with invalid option

[root@virt-137 ~]# pcs stonith create test_stonith fence_xvm ip_family=test
Error: Validation result from agent (use --force to override):

Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs stonith config test_stonith
Warning: Unable to find stonith device 'test_stonith'
Error: No stonith device found
[root@virt-137 ~]# echo $?
1

> OK

# with the valid option
[root@virt-137 ~]# pcs stonith create test_stonith fence_xvm ip_family=ipv4
[root@virt-137 ~]# echo $?
0
[root@virt-137 ~]# pcs stonith status test_stonith
  * test_stonith	(stonith:fence_xvm):	 Started virt-140

> OK


## Updating resource with invalid option

[root@virt-137 ~]# pcs resource create test_ip_1 ocf:heartbeat:IPaddr2 ip=192.168.2.17
[root@virt-137 ~]# pcs resource status test_ip_1
  * test_ip_1	(ocf::heartbeat:IPaddr2):	 Started virt-137
[root@virt-137 ~]# pcs resource update test_ip_1 ip=invalid
Error: Validation result from agent (use --force to override):
  Nov 28 15:10:47 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs resource update test_ip_1 ip=1.2.3.4
Error: Validation result from agent (use --force to override):
  Nov 28 15:11:09 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs resource delete test_ip_1
Attempting to stop: test_ip_1... Stopped

> OK


## Updating stonith device with invalid option

[root@virt-137 ~]# pcs stonith status test_stonith
  * test_stonith	(stonith:fence_xvm):	 Started virt-140
[root@virt-137 ~]# pcs stonith update test_stonith ip_family=test
Error: Validation result from agent (use --force to override):

[root@virt-137 ~]# echo $?
1

> OK


## Trying to create resource with invalid option in combination with group, promotable, disable

[root@virt-137 ~]# pcs resource create test_ip_1 ocf:heartbeat:IPaddr2 ip=invalid --group g1
Error: Validation result from agent (use --force to override):
  Nov 29 14:55:26 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-137 ~]# echo $?
1

[root@virt-137 ~]# pcs resource create test_ip_1 ocf:heartbeat:IPaddr2 ip=invalid promotable
Error: Validation result from agent (use --force to override):
  Nov 29 14:55:40 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-137 ~]# echo $?
1

[root@virt-137 ~]# pcs resource create test_ip_1 ocf:heartbeat:IPaddr2 ip=invalid --disable
Error: Validation result from agent (use --force to override):
  Nov 29 14:55:50 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-137 ~]# echo $?
1

> OK


## Updating disabled resource with invalid option

[root@virt-137 ~]# pcs resource create test_ip_1 ocf:heartbeat:IPaddr2 ip=192.168.2.17
[root@virt-137 ~]# pcs resource disable test_ip_1
[root@virt-137 ~]# pcs resource status test_ip_1
  * test_ip_1	(ocf::heartbeat:IPaddr2):	 Stopped (disabled)
[root@virt-137 ~]# pcs resource update test_ip_1 ip=invalid
Error: Validation result from agent (use --force to override):
  Dec 02 16:12:41 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs resource update test_ip_1 ip=1.2.3.4
Error: Validation result from agent (use --force to override):
  Dec 02 16:12:52 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1

> OK

[root@virt-137 ~]# pcs resource enable test_ip_1
[root@virt-137 ~]# pcs resource
  * test_ip_1	(ocf::heartbeat:IPaddr2):	 Started virt-137


## Restarting resource, then updating it with invalid option

[root@virt-137 ~]# pcs resource restart test_ip_1
test_ip_1 successfully restarted
[root@virt-137 ~]# echo $?
0

[root@virt-137 ~]# pcs resource update test_ip_1 ip=1.2.3.4
Error: Validation result from agent (use --force to override):
  Dec 05 11:13:07 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1

> OK


## Updating unmanaged resource with invalid option

[root@virt-137 ~]# pcs resource unmanage test_ip_1
[root@virt-137 ~]# pcs resource
  * test_ip_1	(ocf::heartbeat:IPaddr2):	 Started virt-137 (unmanaged)

[root@virt-137 ~]# pcs resource update test_ip_1 ip=invalid
Error: Validation result from agent (use --force to override):
  Dec 05 11:33:44 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1
[root@virt-137 ~]# pcs resource update test_ip_1 ip=1.2.3.4
Error: Validation result from agent (use --force to override):
  Dec 05 11:33:56 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# echo $?
1

> OK

[root@virt-137 ~]# pcs resource manage test_ip_1


## Forcing resource to have an invalid option and then updating it with another invalid option

[root@virt-137 ~]# pcs resource update test_ip_1 ip=1.2.3.4 --force
Warning: Validation result from agent:
  Dec 05 16:46:51 ERROR: Unable to find nic or netmask.
  ocf-exit-reason:[findif] failed
[root@virt-137 ~]# pcs resource config test_ip_1 | grep ip=
    ip=1.2.3.4

[root@virt-137 ~]# pcs resource update test_ip_1 ip=invalid
[root@virt-137 ~]# echo $?
0
[root@virt-137 ~]# pcs resource config test_ip_1 | grep ip=
    ip=invalid

[root@virt-137 ~]# pcs resource update test_ip_1 ip=192.168.2.17
[root@virt-137 ~]# pcs resource update test_ip_1 ip=invalid
Error: Validation result from agent (use --force to override):
  Dec 05 16:49:07 ERROR: IP address [invalid] not valid.
  ocf-exit-reason:[findif] failed

> When the resource already has an invalid option, the validation check is omitted on purpose (to have a chance to change all invalid properties independently to valid values). This is correct behavior; however, it could be improved by printing a warning message that the validation check is omitted. A new bz will be filed for this for future releases.


Marking as VERIFIED for pcs-0.10.14-6.el8.

Comment 37 Chris Feist 2023-02-28 15:23:02 UTC
*** Bug 2149113 has been marked as a duplicate of this bug. ***

Comment 39 errata-xmlrpc 2023-05-16 08:12:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2738

