Bug 2182415 - azure-events-az fails with pacemaker => 2.1 with missing transition summary (RHEL9)
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: resource-agents
Version: 9.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 9.3
Assignee: Oyvind Albrigtsen
QA Contact: Brandon Perkins
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Depends On: 2181019
Blocks: 2182766 2182764 2182765
 
Reported: 2023-03-28 15:07 UTC by Oyvind Albrigtsen
Modified: 2023-08-10 15:40 UTC

Fixed In Version: resource-agents-4.10.0-37.el9
Doc Type: Bug Fix
Doc Text:
The `azure-events-az` resource agent no longer produces an error with Pacemaker 2.1 and later

The `azure-events-az` resource agent executes the `crm_simulate -Ls` command and parses the output. With Pacemaker 2.1 and later, the output of the `crm_simulate` command no longer contains the text `Transition Summary:`, which resulted in an error. With this fix, the agent no longer yields an error when this text is missing.
Clone Of: 2181019
Clones: 2182764 2182765 2182766
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | CLUSTERQE-6668 | 0 | None | None | None | 2023-04-28 17:13:39 UTC
Red Hat Issue Tracker | RHELPLAN-153298 | 0 | None | None | None | 2023-03-28 15:11:01 UTC

Description Oyvind Albrigtsen 2023-03-28 15:07:16 UTC
+++ This bug was initially created as a clone of Bug #2181019 +++

Description of problem:
The resource-agents package provides the azure-events-az agent. This agent executes 'crm_simulate -Ls' and parses the output. With pacemaker 2.1 and newer, the output of crm_simulate changed and no longer contains the text 'Transition Summary:', resulting in an error for azure-events-az.

Version-Release number of selected component (if applicable):
RHEL 8.6 (and newer)
resource-agents-4.9.0-16.el8_6.8.x86_64
pacemaker-cli-2.1.2-4.el8_6.5.x86_64

How reproducible:
The error in azure-events-az only shows after an Azure scheduled event is triggered. Examples of such events are a VM redeploy or reboot. https://learn.microsoft.com/en-us/azure/virtual-machines/linux/scheduled-events

Running azure-events-az, either as a monitor within pacemaker or manually with 'export OCF_ROOT=/usr/lib/ocf; export OCF_RESKEY_verbose=1; /usr/lib/ocf/resource.d/heartbeat/azure-events-az monitor', then fails with:

ocf-exit-reason:object of type 'bool' has no len()
ocf.py(None)[2963061]:  Mar 22 21:04:03 ERROR: object of type 'bool' has no len()

Steps to Reproduce:
See above section

Actual results:
Within /usr/lib/ocf/resource.d/heartbeat/azure-events-az, the method transitionSummary() (line 296) calls crm_simulate -Ls (line 311) and expects to find "Transition Summary:" in the output. This held up to and including RHEL 8.4 (pacemaker 2.0.5). In pacemaker 2.1.x this line is no longer part of the output.

As a result, the method allResourcesStoppedOnNode(node) in azure-events-az then fails at line 372 with the errors above.
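
The failure mode can be illustrated with a minimal sketch. This is not the agent's actual code: the function names `transition_summary` and `no_pending_actions` and the sample output strings are hypothetical, and the sketch assumes the parser returns False when the marker is missing, which is consistent with the reported error message.

```python
MARKER = "Transition Summary:"


def transition_summary(output):
    """Hypothetical parser: return the non-empty lines after the marker.

    Assumption for illustration: returns False when the marker is absent.
    """
    idx = output.find(MARKER)
    if idx == -1:
        return False  # assumed failure value, not taken from the agent
    tail = output[idx + len(MARKER):]
    return [line.strip() for line in tail.splitlines() if line.strip()]


def no_pending_actions(output):
    """Hypothetical caller that assumes the parser always returns a list."""
    summary = transition_summary(output)
    return len(summary) == 0  # raises TypeError when summary is False


def demo():
    """Return the error text produced when the marker is missing."""
    output_without_marker = "Current cluster status:\n"  # illustrative only
    try:
        no_pending_actions(output_without_marker)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(no_pending_actions("Transition Summary:\n"))
    print(demo())
```

With the marker present and nothing after it, the caller sees an empty list. Without the marker, calling `len()` on a boolean raises the same kind of TypeError as the one logged by the agent.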


Expected results:
# crm_simulate -Ls
Transition Summary:
	* Promote rsc_SAPHana_HN1_HDB03:0      (Slave -> Master hsr3-db1)
	* Stop    rsc_SAPHana_HN1_HDB03:1      (hsr3-db0)
	* Move    rsc_ip_HN1_HDB03     (Started hsr3-db0 -> hsr3-db1)
	* Start   rsc_nc_HN1_HDB03     (hsr3-db1)
# Expected result when there are no pending actions:
Transition Summary:

Additional info:

--- Additional comment from Oyvind Albrigtsen on 2023-03-23 15:38:36 CET ---

@kgaillot is there a crm_feature_set version I should check against so we can use new format where needed, and fallback to the old way for older versions?

--- Additional comment from Ken Gaillot on 2023-03-23 15:54:47 CET ---

(In reply to Oyvind Albrigtsen from comment #3)
> @kgaillot is there a crm_feature_set version I should check
> against so we can use new format where needed, and fallback to the old way
> for older versions?

3.7.4, which is also when --output-as=xml is available for crm_simulate
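
A version check against 3.7.4 could look like the following sketch. The helper names `parse_feature_set` and `supports_xml_output` are hypothetical, and how the agent would obtain the cluster's crm_feature_set value is not shown here. The sketch assumes the value is a dotted string of integers.

```python
def parse_feature_set(version):
    """Turn a dotted version string such as '3.7.4' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_xml_output(feature_set, minimum="3.7.4"):
    """Hypothetical check: True if feature_set is at least the minimum."""
    return parse_feature_set(feature_set) >= parse_feature_set(minimum)


if __name__ == "__main__":
    for candidate in ("3.7.3", "3.7.4", "3.10.0"):
        print(candidate, supports_xml_output(candidate))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would order "3.10.0" before "3.7.4".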

--- Additional comment from Oyvind Albrigtsen on 2023-03-28 17:04:08 CEST ---

Fix to treat no "Transition Summary" as no actions: https://github.com/ClusterLabs/resource-agents/pull/1854

--output-as=xml is quite new in crm_simulate, so not as backwards compatible.
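
The idea of treating a missing "Transition Summary" as no actions can be sketched as follows. This is an illustration of the approach only, not the code from the pull request. The function name `pending_actions` is hypothetical, and the sketch assumes that action lines start with "*", as in the sample output in the description.

```python
MARKER = "Transition Summary:"


def pending_actions(output):
    """Return the action lines that follow the marker in crm_simulate text.

    A missing marker is treated the same as an empty summary, so the
    result is always a list and is safe to pass to len().
    """
    _head, found, tail = output.partition(MARKER)
    if not found:
        return []  # no marker: assume no pending actions
    return [
        line.strip()
        for line in tail.splitlines()
        if line.strip().startswith("*")
    ]


if __name__ == "__main__":
    old_style = (
        "Transition Summary:\n"
        "\t* Stop    rsc_SAPHana_HN1_HDB03:1      (hsr3-db0)\n"
        "\t* Start   rsc_nc_HN1_HDB03     (hsr3-db1)\n"
    )
    print(len(pending_actions(old_style)))
    print(pending_actions("output without the marker"))
```

Because the function returns an empty list in both the "marker present, no actions" and the "marker absent" cases, callers do not need a separate branch for newer pacemaker output.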

Comment 3 Oyvind Albrigtsen 2023-05-01 09:12:06 UTC
Additional patch to improve logic: https://github.com/ClusterLabs/resource-agents/pull/1864

