Bug 1850506

Summary:	pacemaker should not throw a hissy fit when the OCF resource to be run in a bundle does not exist on the host
Product:	Red Hat Enterprise Linux 8	Reporter:	Michele Baldessari <michele>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED WONTFIX	QA Contact:	cluster-qe <cluster-qe>
Severity:	low	Docs Contact:
Priority:	unspecified
Version:	8.2	CC:	cluster-maint
Target Milestone:	rc	Keywords:	Triaged
Target Release:	---	Flags:	pm-rhel: mirror+
Hardware:	Unspecified
OS:	Unspecified
Last Closed:	2021-12-24 07:27:00 UTC	Type:	Bug

Description Michele Baldessari 2020-06-24 12:23:27 UTC

Description of problem:
People seem to get alarmed by the following messages in the logs:
Jun 22 10:08:44 controller-0 pacemaker-controld  [37665] (do_lrm_rsc_op) 	info: Performing key=42:77:7:62589dff-c5f5-4837-8aeb-352b6c861136 op=ovndb_servers_monitor_0
Jun 22 10:08:44 controller-0 pacemaker-controld  [37665] (services_os_action_execute) 	warning: Cannot execute '/usr/lib/ocf/resource.d/ovn/ovndb-servers': No such file or directory (2)
Jun 22 10:08:44 controller-0 pacemaker-controld  [37665] (lrmd_api_get_metadata_params) 	error: Failed to retrieve meta-data for ocf:ovn:ovndb-servers
Jun 22 10:08:44 controller-0 pacemaker-controld  [37665] (build_operation_update) 	warning: Failed to get metadata for ovndb_servers (ocf:ovn:ovndb-servers)
Jun 22 10:08:44 controller-0 pacemaker-based     [37660] (cib_process_request) 	info: Forwarding cib_modify operation for section status to all (origin=local/crmd/186)

These messages are logged because /usr/lib/ocf/resource.d/ovn/ovndb-servers does not exist on the host; it exists only inside the bundle.

We should not log these messages when the OCF resource runs inside a bundle.
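
For anyone triaging the same warning, here is a minimal sketch of how to check whether an agent is present on the host. It assumes the usual mapping from `ocf:<provider>:<agent>` to `<OCF root>/resource.d/<provider>/<agent>`, which matches the path in the log above. The helper names `agent_path` and `agent_installed` are made up for illustration and are not part of Pacemaker.

```python
import os
from pathlib import PurePosixPath


def agent_path(spec: str, ocf_root: str = "/usr/lib/ocf") -> PurePosixPath:
    """Map an 'ocf:<provider>:<agent>' spec to the script path on the host.

    The layout is an assumption based on the path shown in the log above.
    """
    parts = spec.split(":")
    if len(parts) != 3 or parts[0] != "ocf":
        raise ValueError(f"not an ocf:<provider>:<agent> spec: {spec!r}")
    _, provider, agent = parts
    return PurePosixPath(ocf_root) / "resource.d" / provider / agent


def agent_installed(spec: str, ocf_root: str = "/usr/lib/ocf") -> bool:
    """Return True only if the agent script exists and is executable here."""
    path = str(agent_path(spec, ocf_root))
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    spec = "ocf:ovn:ovndb-servers"
    print(agent_path(spec), "installed:", agent_installed(spec))
```

On a host where the agent ships only in the bundle image, `agent_installed` would be expected to return False, which is the situation that produces the "No such file or directory" warning.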

Comment 1 Ken Gaillot 2020-06-24 17:14:12 UTC

Sadly this is a known issue that will require a massive overhaul of the relevant code.

Currently, the controller always executes meta-data calls locally, at the time the meta-data is needed, and caches the meta-data locally. This is true even when the meta-data is needed for a Pacemaker Remote node that happens to be connected to the controller (including remote nodes, guest nodes, and bundles). Thus the relevant resource agents must be installed on all cluster nodes, even if they won't be used there.
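
To illustrate the behaviour described above, here is a rough model of "run meta-data locally on demand, then cache". This is not Pacemaker code. The class name `LocalMetadataCache`, the function `run_meta_data`, the 30-second timeout, and the choice not to cache failures are all assumptions made for the sketch. The only real convention it relies on is that an OCF agent prints its metadata when invoked with the `meta-data` argument.

```python
import os
import subprocess
from typing import Callable, Dict, Optional


def run_meta_data(agent_path: str) -> Optional[str]:
    """Run '<agent> meta-data' on the local host; return its XML or None."""
    env = dict(os.environ, OCF_ROOT="/usr/lib/ocf")
    try:
        result = subprocess.run(
            [agent_path, "meta-data"],
            env=env,
            capture_output=True,
            text=True,
            timeout=30,  # arbitrary value chosen for this sketch
        )
    except (OSError, subprocess.TimeoutExpired):
        # A missing agent lands here (FileNotFoundError is an OSError),
        # which is the case this bug is about.
        return None
    return result.stdout if result.returncode == 0 else None


class LocalMetadataCache:
    """Fetch metadata locally the first time it is needed, then reuse it."""

    def __init__(self, fetch: Callable[[str], Optional[str]] = run_meta_data):
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def get(self, agent_path: str) -> Optional[str]:
        if agent_path not in self._cache:
            xml = self._fetch(agent_path)
            if xml is None:
                return None  # assumption: failures are not cached
            self._cache[agent_path] = xml
        return self._cache[agent_path]
```

The point of the model is that the fetch always runs on the host that asks for the metadata, never on the remote node, guest node, or bundle where the resource will actually run.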

The solution, executing meta-data actions on the node that will run the resource, involves scheduling the meta-data calls via the scheduler rather than calling them when needed. Unfortunately, an early attempt at this uncovered two unrelated bugs (no bz attached) that must be fixed first, so this amounts to three significant projects.

The effect of not being able to get meta-data, besides the log message, is to make pacemaker assume that the agent does not support the reload action, and does not have any parameters marked "private" (which are hashed separately to allow debugging of bug reports with those values removed).
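
The fallback described above can be sketched as follows. This is a simplified illustration, not the controller's actual parser. It assumes that reload support is advertised as an `<action name="reload">` element and that private parameters carry a `private="1"` attribute in the agent's metadata XML. The names `AgentCapabilities` and `capabilities` are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class AgentCapabilities:
    # Defaults are what gets assumed when no metadata is available.
    supports_reload: bool = False
    private_params: Set[str] = field(default_factory=set)


def capabilities(metadata_xml: Optional[str]) -> AgentCapabilities:
    """Derive capabilities from metadata, falling back to the defaults."""
    if not metadata_xml:
        # Metadata could not be retrieved: no reload, no private parameters.
        return AgentCapabilities()
    root = ET.fromstring(metadata_xml)
    supports_reload = any(
        action.get("name") == "reload" for action in root.iter("action")
    )
    private_params = {
        param.get("name")
        for param in root.iter("parameter")
        if param.get("private") == "1"  # attribute name is an assumption
    }
    return AgentCapabilities(supports_reload, private_params)
```

With missing metadata, `capabilities(None)` reports no reload support and an empty set of private parameters, so nothing is hashed separately.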

Comment 4 RHEL Program Management 2021-12-24 07:27:00 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 5 Ken Gaillot 2022-01-04 20:11:57 UTC

This is still a goal and will be reopened when developer time becomes available.