Bug 2018969

Summary: Support version 1.1 of the OCF Resource Agent API standard
Product: Red Hat Enterprise Linux 9 Reporter: Tomas Jelinek <tojeline>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: high Docs Contact: Steven J. Levine <slevine>
Priority: high    
Version: 9.0 CC: cluster-maint, cluster-qe, gfialova, idevat, kgaillot, kmalyjur, lmiksik, mlisik, mmazoure, mpospisi, nhostako, omular, pgm-rhel-tools, slevine, tojeline
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: 9.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: pcs-0.11.1-5.el9 Doc Type: Enhancement
Doc Text:
.`pcs` support for OCF Resource Agent API 1.1 standard

The `pcs` command-line interface now supports OCF 1.1 resource and STONITH agents. As part of the implementation of this support, any agent's metadata must comply with the OCF schema, whether the agent is an OCF 1.0 or OCF 1.1 agent. If an agent's metadata does not comply with the OCF schema, `pcs` considers the agent invalid and will not create or update a resource of the agent unless the `--force` option is specified. The `pcsd` Web UI and `pcs` commands for listing agents now omit agents with invalid metadata from the listing.
Story Points: ---
Clone Of: 1936833 Environment:
Last Closed: 2022-05-17 12:19:34 UTC Type: Feature Request
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1384485, 1936696, 2019836    
Bug Blocks: 1936833, 2019464    

Description Tomas Jelinek 2021-11-01 11:47:37 UTC
+++ This bug was initially created as a clone of Bug #1936833 +++

+++ This bug was initially created as a clone of Bug #1936696 +++

The upstream ClusterLabs community is preparing to release version 1.1 of the OCF Resource Agent API standard, which will include a number of new features and clarifications compared to the previous version.

The RHEL High-Availability components will need to support this for RHEL 9.0, and a subset of support could be added as of 8.5 (any of the things mentioned as optional below).

Some key aspects that might require changes (agents refers to all OCF agents whether supplied by resource-agents, pacemaker, or some other package) are listed below. Some of these might warrant (or already have) their own BZs.

* The version number changes to 1.1
** In RHEL 9, agent meta-data should advertise 1.1 support. In RHEL 8, agents that use the old role names should continue to advertise 1.0 support, while agents that don't use role names could (but don't have to) advertise 1.1 support.
** pacemaker should set the OCF_RA_VERSION_MINOR environment variable to 1 instead of 0 in RHEL 9 and optionally 8

* The role names are now "promoted" and "unpromoted" instead of "Master" and "Slave".
** pacemaker should use the new names in help, logs, and output in 9 but not 8. In 9 and optionally 8, all names should be supported in user configurations; the crm_resource --master option should be renamed to --promoted, with the old option accepted for backward compatibility; the crm_master command should be renamed to pcmk_promotion, with the old name symlinked for backward compatibility; and relevant clone notification environment variables (OCF_RESKEY_CRM_meta_notify_master_resource etc.) should be provided with both the old and new names.
** Agents should use the new names in meta-data, help, etc., in 9 but not 8. Agents should use the new crm_resource --promoted option, crm_promotion command, and clone notification variables names in 9 and optionally 8 if supported by pacemaker. If agents parse pacemaker output for role names, they should look for either set of names in 9 and optionally 8.
** pcs should support all names in user input in 9 and optionally 8. The new names should be used in help and output in 9 but not 8. Any commands, options, etc., named after the old names should be renamed to the new ones with the old ones accepted for backward compatibility, in 9 and optionally 8.
** Note: promotion score node attribute names (master-*) are not part of the standard and are not changing at this time. However, anything outside pacemaker should use the crm_master or crm_promotion command instead of dealing with these attributes directly.

* The "unique" agent meta-data field has been deprecated in favor of two new fields, "unique-group" and "reloadable". Agents that support "reloadable" should support the new "reload-params" action.
** Pacemaker should support reloadable if present, otherwise unique if present, and support the reload-params action if present, otherwise reload if present, in 9 and optionally 8.
** Agents should provide both the old and new meta-data names, and the reload-params action if appropriate, in 9 and optionally 8.

* A number of new agent meta-data fields ("required", "deprecated", etc.) give additional hints for user interfaces.
** Agents should provide these in 9 and optionally 8.
** pcs can support these as desired.

* The new OCF_OUTPUT_FORMAT environment variable may be supported to indicate that the agent should output text or XML.
** Pacemaker's crm_resource and stonith_admin commands could set this according to user-specified options before calling agents (at least for validate-all, which is the target use case).
** Agents may support this as desired.

* The OCF_CHECK_LEVEL environment variable may be supported for the validate-all action, to select host-independent or host-specific validation.
** Agents may support this as desired.
** Pacemaker could add an option to crm_resource and stonith_admin for check level when performing validation or monitoring.
** pcs could use the new pacemaker tool options if supported.

* Agent exit statuses have been clarified and expanded.
** Agents may support the new usage as desired. (Pacemaker already does.)
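The requirement above that both the old and the new role names be accepted in user input can be sketched as a small normalization step. This is only an illustration, not how pacemaker or pcs actually implements it: the helper name `normalize_role` and the case-insensitive matching are assumptions, and the new names are spelled as they appear in this report.

```python
# Hypothetical helper: maps the legacy role names and the OCF 1.1 role names
# to the OCF 1.1 spelling used in this report.
_ROLE_ALIASES = {
    "master": "promoted",
    "slave": "unpromoted",
    "promoted": "promoted",
    "unpromoted": "unpromoted",
}


def normalize_role(name: str) -> str:
    """Return the OCF 1.1 role name for either an old or a new role name."""
    key = name.strip().lower()  # assumption: input is matched case-insensitively
    if key not in _ROLE_ALIASES:
        raise ValueError(f"unknown role name: {name!r}")
    return _ROLE_ALIASES[key]


print(normalize_role("Master"))      # prints promoted
print(normalize_role("unpromoted"))  # prints unpromoted
```

Keeping the mapping in one table means help and output can use the new names only, while input parsing still accepts the old ones for backward compatibility.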

I may have missed some other places changes are needed, but those should be the most important.

Once adopted, the standard will be available at https://github.com/ClusterLabs/OCF-spec/tree/master/ra/1.1

This has been filed against 8.5 in case a subset will be implemented there, but may be cloned for or reassigned to 9.0, and some items could get their own bzs if separate tracking is desired.

--- Additional comment from Ken Gaillot on 2021-04-09 01:00:43 CEST ---

Pacemaker role name changes that could affect pcs:

* Using the new role names in "role" in <op>, <rsc_location>, or <resource_set>, or in "rsc-role" or "with-rsc-role" in <rsc_ticket> or <rsc_colocation>, requires CIB schema 3.7 (i.e. "cibadmin --upgrade" or equivalent must be run on an existing cluster to use the new names).

* The crm_resource --master option has been deprecated (in help only) and replaced with a new --promoted option. The old name will continue to work for now but should be updated if pcs uses it.

* The crm_master command has been deprecated (in help only) and replaced with a new crm_attribute --promotion option that defaults to --lifetime=reboot (example: "crm_master -l reboot -v 10" becomes "crm_attribute --promotion -v 10"). The old command will continue to work for now but should be updated if pcs uses it.

* When showing ban constraints, crm_mon --output-as=xml (and --as-xml) will now show promoted-only=true/false in addition to master_only=true/false, which is now deprecated (via schema comment only). master_only will still be available but should be replaced if pcs currently checks it.

All of the above will be available in 8.5, and support can be checked by testing the CRM feature set against 3.9.0.
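A minimal sketch of the feature set test mentioned above, assuming the CRM feature set is already available as a dotted string such as "3.9.0" (how it is obtained is out of scope here). The function names are hypothetical. The point of the sketch is that the comparison has to be numeric per component, because a plain string comparison would rank "3.10.0" below "3.9.0".

```python
from typing import Tuple

# Threshold taken from the comment above.
REQUIRED_FEATURE_SET = (3, 9, 0)


def parse_feature_set(text: str) -> Tuple[int, ...]:
    """Turn a dotted feature set string into a tuple of ints, padded to 3 parts."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # "3.9" is treated as "3.9.0"
    return tuple(parts)


def supports_new_role_names(crm_feature_set: str) -> bool:
    """True if the given feature set is at least 3.9.0."""
    return parse_feature_set(crm_feature_set) >= REQUIRED_FEATURE_SET


print(supports_new_role_names("3.7.0"))   # prints False
print(supports_new_role_names("3.10.0"))  # prints True
```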

Additionally, in 9.0, Pacemaker will use the new names in all tool output, so anywhere pcs is parsing role names from output, it will need to be updated. This will not be a feature set bump since it is a build-time option, but support can be checked by testing whether "pacemakerd --features" contains "compat-2.0" (if it does, tool output uses the old names). (The "compat-2.0" feature can also be used to check the unrelated change of whether tool output uses ocf::provider or ocf:provider.)
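The "compat-2.0" check described above could look roughly like the following. This is a sketch under assumptions: the text of "pacemakerd --features" is passed in as a string by the caller, the feature names are assumed to be separated by whitespace, and the function name and the sample strings are made up for illustration.

```python
def tool_output_uses_old_role_names(features_output: str) -> bool:
    """True if the feature list contains the "compat-2.0" build feature.

    Assumption: features are whitespace-separated tokens, so an exact token
    match is used instead of a substring search.
    """
    return "compat-2.0" in features_output.split()


# Made-up feature lists, not real pacemakerd output.
print(tool_output_uses_old_role_names("feature-a compat-2.0 feature-b"))  # prints True
print(tool_output_uses_old_role_names("feature-a feature-b"))             # prints False
```

A caller would pick the role names to look for in tool output based on the result, old names when it is True and new names otherwise.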

Comment 1 Tomas Jelinek 2021-11-03 14:03:06 UTC
Upstream patch: https://github.com/ClusterLabs/pcs/commit/5eb51289926d5e1e68092db5d1d177fddc31a766

Test:

1) OCF 1.0 agents work the same as before

2) To test OCF 1.1 support, a resource agent implementing OCF 1.1 and using its new features is required. One of the features is unique-group of attributes. For example, there can be a group "address" consisting of "ip" and "port" attributes. Pcs then reports an error when the user is trying to create two resources of such an agent with the same combination of "ip" and "port" values.
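The unique-group check from 2) can be illustrated with a short sketch. It is not the pcs implementation: the function names and data layout are hypothetical, and treating an unset parameter as a value of its own (None) is an assumption. A new resource conflicts with an existing one only when every parameter in the group has the same value in both.

```python
from typing import List, Mapping, Optional, Sequence, Tuple


def group_key(
    params: Sequence[str], attrs: Mapping[str, str]
) -> Tuple[Optional[str], ...]:
    """Values of the group's parameters, in order; None for an unset parameter."""
    return tuple(attrs.get(name) for name in params)


def find_conflicts(
    params: Sequence[str],
    new_attrs: Mapping[str, str],
    existing: Mapping[str, Mapping[str, str]],
) -> List[str]:
    """Ids of existing resources with the same combination of group values."""
    wanted = group_key(params, new_attrs)
    if all(value is None for value in wanted):
        return []  # assumption: nothing to check if no group parameter is set
    return sorted(
        rid for rid, attrs in existing.items() if group_key(params, attrs) == wanted
    )


# Group "address" consisting of "ip" and "port", as in the example above.
address = ["ip", "port"]
existing = {"r1": {"ip": "192.0.2.10", "port": "80"}}
print(find_conflicts(address, {"ip": "192.0.2.10", "port": "80"}, existing))    # prints ['r1']
print(find_conflicts(address, {"ip": "192.0.2.10", "port": "8080"}, existing))  # prints []
```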

Comment 4 Miroslav Lisik 2021-11-19 07:59:06 UTC
DevTestResults:

[root@r90-node-01 ~]# rpm -q pcs
pcs-0.11.1-5.el9.x86_64

[root@r90-node-01 ~]# crm_resource --show-metadata=ocf:pacemaker:Dummy | xmllint --xpath /resource-agent/version -
<version>1.1</version>
[root@r90-node-01 ~]# pcs resource create dummy ocf:pacemaker:Dummy
[root@r90-node-01 ~]# pcs resource
  * dummy       (ocf:pacemaker:Dummy):   Started r90-node-01
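For reference, the xmllint check above (reading /resource-agent/version from the agent metadata) can also be sketched in Python with the standard library. The function name and the trimmed sample metadata are illustrative only; real metadata contains much more than this.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def ocf_version(metadata_xml: str) -> Optional[str]:
    """Return the text of /resource-agent/version, or None if it is missing."""
    root = ET.fromstring(metadata_xml)
    if root.tag != "resource-agent":
        return None
    node = root.find("version")  # direct child, like the XPath used above
    if node is None or not node.text:
        return None
    return node.text.strip()


# Trimmed, illustrative metadata; not the full Dummy agent metadata.
SAMPLE = """<resource-agent name="Dummy">
  <version>1.1</version>
</resource-agent>"""

print(ocf_version(SAMPLE))  # prints 1.1
```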

Comment 8 Michal Mazourek 2022-02-04 16:09:36 UTC
AFTER:
======

[root@virt-540 ~]# rpm -q pcs
pcs-0.11.1-9.el9.x86_64


support of new ocf 1.1 features - unique-group
-----------------------------------------------

[root@virt-540 ~]# grep -rn /usr/lib/ocf/resource.d -e "unique-group"
/usr/lib/ocf/resource.d/pacemaker/HealthIOWait:37:<parameter name="state" unique-group="state">
/usr/lib/ocf/resource.d/pacemaker/Stateful:52:<parameter name="state" unique-group="state">
/usr/lib/ocf/resource.d/pacemaker/attribute:70:    <parameter name="state" unique-group="state">
/usr/lib/ocf/resource.d/pacemaker/attribute:78:    <parameter name="name" unique-group="name">
/usr/lib/ocf/resource.d/pacemaker/ping:49:<parameter name="pidfile" unique-group="pidfile">
/usr/lib/ocf/resource.d/pacemaker/ping:63:<parameter name="name" unique-group="name">
/usr/lib/ocf/resource.d/pacemaker/remote:31:    <parameter name="server" unique-group="address">
/usr/lib/ocf/resource.d/pacemaker/remote:38:    <parameter name="port" unique-group="address">
/usr/lib/ocf/resource.d/pacemaker/Dummy:73:<parameter name="state" unique-group="state">

> 'unique-group' replaced 'unique' in ocf 1.1
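The grep above lists the attribute line by line. A small sketch that groups parameter names by their unique-group value makes it easier to see which parameters have to be unique in combination. The function name is hypothetical, and the sample metadata is a trimmed illustration modeled on the two "remote" lines in the grep output, with a made-up third parameter.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def unique_groups(metadata_xml: str) -> Dict[str, List[str]]:
    """Map each unique-group name to the parameter names that belong to it."""
    groups: Dict[str, List[str]] = {}
    for param in ET.fromstring(metadata_xml).iter("parameter"):
        group = param.get("unique-group")
        if group:  # parameters without the attribute are not in any group
            groups.setdefault(group, []).append(param.get("name"))
    return groups


# Trimmed, illustrative metadata; "other" is a made-up parameter.
SAMPLE_REMOTE = """<resource-agent name="remote">
  <parameters>
    <parameter name="server" unique-group="address"/>
    <parameter name="port" unique-group="address"/>
    <parameter name="other"/>
  </parameters>
</resource-agent>"""

print(unique_groups(SAMPLE_REMOTE))  # prints {'address': ['server', 'port']}
```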


[root@virt-540 ~]# pcs resource create s1 ocf:pacemaker:Stateful state=/tmp/state1
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
[root@virt-540 ~]# pcs resource create s2 ocf:pacemaker:Stateful state=/tmp/state1
Error: Value '/tmp/state1' of option 'state' is not unique across 'ocf:pacemaker:Stateful' resources. Following resources are configured with the same value of the instance attribute: 's1', use --force to override
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-540 ~]# echo $?
1
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540

> OK

[root@virt-540 ~]# pcs resource create s2 ocf:pacemaker:Stateful state=/tmp/state2
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
  * s2	(ocf:pacemaker:Stateful):	 Started virt-541

> OK: 'unique-group' works the same as 'unique' for a single parameter


## Testing 'unique-group' for multiple parameters
## Adding 'unique-group' with a name of 'group-test' to 'passwd' and 'fake' parameters of ocf:pacemaker:Dummy

[root@virt-540 ~]# cat /usr/lib/ocf/resource.d/pacemaker/Dummy | grep group-test
<parameter name="passwd" reloadable="1" unique-group="group-test">
<parameter name="fake" reloadable="1" unique-group="group-test">

[root@virt-540 ~]# pcs resource create d1 ocf:pacemaker:Dummy passwd=123 fake=1
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
  * s2	(ocf:pacemaker:Stateful):	 Started virt-541
  * d1	(ocf:pacemaker:Dummy):	 Started virt-540
[root@virt-540 ~]# pcs resource create d2 ocf:pacemaker:Dummy passwd=123 fake=1
Error: Value '1', '123' of options 'fake', 'passwd' (group 'group-test') is not unique across 'ocf:pacemaker:Dummy' resources. Following resources are configured with the same values of the instance attributes: 'd1', use --force to override
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-540 ~]# echo $?
1

> OK: 'unique-group' did not allow setting the same combination of passwd and fake parameters

[root@virt-540 ~]# pcs resource create d2 ocf:pacemaker:Dummy passwd=1234 fake=1
[root@virt-540 ~]# pcs resource create d3 ocf:pacemaker:Dummy passwd=123 fake=0
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
  * s2	(ocf:pacemaker:Stateful):	 Started virt-541
  * d1	(ocf:pacemaker:Dummy):	 Started virt-540
  * d2	(ocf:pacemaker:Dummy):	 Started virt-541
  * d3	(ocf:pacemaker:Dummy):	 Started virt-540

> OK: 'unique-group' allowed creating the resource when one of the parameters is unique and the other stays the same

## Adding 'unique-group' for third parameter

[root@virt-540 ~]# cat /usr/lib/ocf/resource.d/pacemaker/Dummy | grep group-test
<parameter name="passwd" reloadable="1" unique-group="group-test">
<parameter name="fake" reloadable="1" unique-group="group-test">
<parameter name="op_sleep" reloadable="1" unique-group="group-test">

[root@virt-540 ~]# pcs resource create d4 ocf:pacemaker:Dummy passwd=123 fake=1 op_sleep=1
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
  * s2	(ocf:pacemaker:Stateful):	 Started virt-541
  * d1	(ocf:pacemaker:Dummy):	 Started virt-540
  * d2	(ocf:pacemaker:Dummy):	 Started virt-541
  * d3	(ocf:pacemaker:Dummy):	 Started virt-540
  * d4	(ocf:pacemaker:Dummy):	 Started virt-541

[root@virt-540 ~]# pcs resource create d5 ocf:pacemaker:Dummy passwd=123 fake=1 op_sleep=1
Error: Value '1', '1', '123' of options 'fake', 'op_sleep', 'passwd' (group 'group-test') is not unique across 'ocf:pacemaker:Dummy' resources. Following resources are configured with the same values of the instance attributes: 'd4', use --force to override
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-540 ~]# echo $?
1

> OK

[root@virt-540 ~]# pcs resource create d5 ocf:pacemaker:Dummy passwd=123 fake=1 op_sleep=2
[root@virt-540 ~]# echo $?
0 
[root@virt-540 ~]# pcs resource create d6 ocf:pacemaker:Dummy passwd=123 fake=0 op_sleep=1
[root@virt-540 ~]# echo $?
0
[root@virt-540 ~]# pcs resource create d7 ocf:pacemaker:Dummy passwd=1234 fake=1 op_sleep=1
[root@virt-540 ~]# echo $?
0
[root@virt-540 ~]# pcs resource
  * s1	(ocf:pacemaker:Stateful):	 Started virt-540
  * s2	(ocf:pacemaker:Stateful):	 Started virt-541
  * d1	(ocf:pacemaker:Dummy):	 Started virt-540
  * d2	(ocf:pacemaker:Dummy):	 Started virt-541
  * d3	(ocf:pacemaker:Dummy):	 Started virt-540
  * d4	(ocf:pacemaker:Dummy):	 Started virt-541
  * d5	(ocf:pacemaker:Dummy):	 Started virt-540
  * d6	(ocf:pacemaker:Dummy):	 Started virt-541
  * d7	(ocf:pacemaker:Dummy):	 Started virt-540

> OK


checking that resources are supporting ocf 1.1
-----------------------------------------------

[root@virt-560 ~]# grep -r "<version>1.1" /usr/lib/ocf/resource.d/
/usr/lib/ocf/resource.d/pacemaker/Dummy:<version>1.1</version>
/usr/lib/ocf/resource.d/pacemaker/HealthIOWait:<version>1.1</version>
/usr/lib/ocf/resource.d/pacemaker/Stateful:<version>1.1</version>
/usr/lib/ocf/resource.d/pacemaker/attribute:  <version>1.1</version>
/usr/lib/ocf/resource.d/pacemaker/ping:<version>1.1</version>
/usr/lib/ocf/resource.d/pacemaker/remote:  <version>1.1</version>

[root@virt-560 ~]# pcs resource create dummy ocf:pacemaker:Dummy
[root@virt-560 ~]# pcs resource create health ocf:pacemaker:HealthIOWait
[root@virt-560 ~]# pcs resource create stateful ocf:pacemaker:Stateful
[root@virt-560 ~]# pcs resource create attribute ocf:pacemaker:attribute
[root@virt-560 ~]# pcs resource create ping ocf:pacemaker:ping host_list=virt-560
[root@virt-560 ~]# pcs resource create renite ocf:pacemaker:remote 
Error: this command is not sufficient for creating a remote connection, use 'pcs cluster node add-remote', use --force to override
Error: Errors have occurred, therefore pcs is unable to continue
[root@virt-560 ~]# pcs resource
  * Clone Set: locking-clone [locking]:
    * Started: [ virt-560 virt-561 ]
  * dummy	(ocf:pacemaker:Dummy):	 Started virt-560
  * health	(ocf:pacemaker:HealthIOWait):	 Started virt-561
  * stateful	(ocf:pacemaker:Stateful):	 Started virt-560
  * attribute	(ocf:pacemaker:attribute):	 Started virt-561
  * ping	(ocf:pacemaker:ping):	 Started virt-560 (Monitoring)

> OK: validation of resources with ocf 1.1 and 1.0 will be done in bz1384485


checking that ocf agents don't fail because of the new standard
----------------------------------------------------------------

# to prevent fencing
[root@virt-560 ~]# pcs node maintenance --all

[root@virt-560 ~]# set -x; i=0; for s in `pcs resource list ocf --nodesc`; do pcs resource create r$i $s; ((i=i+1)); done
{...}

> Resources without any required parameter were created; resources with required parameters gave an error about the missing parameter. Every error message was inspected, and no error was related to the new ocf standard or its validation.


Other bzs were related to the new ocf standard, such as bz1384485 and bz1885293. A new bz for fixing backward compatibility was also created: bz2050274. Together with these bzs, marking as VERIFIED in pcs-0.11.1-9.el9.

Comment 11 errata-xmlrpc 2022-05-17 12:19:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: pcs), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2290