Bug 2182415

Summary: azure-events-az fails with pacemaker >= 2.1 with missing transition summary (RHEL9)
Product: Red Hat Enterprise Linux 9 Reporter: Oyvind Albrigtsen <oalbrigt>
Component: resource-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA QA Contact: Brandon Perkins <bperkins>
Severity: unspecified Docs Contact: Steven J. Levine <slevine>
Priority: unspecified    
Version: 9.2 CC: agk, bperkins, cfeist, cluster-maint, fdinitto, kgaillot, ksatarin, nwahl, radeltch, robert.biro, slevine
Target Milestone: rc Keywords: Triaged, ZStream
Target Release: 9.3 Flags: pm-rhel: mirror+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: resource-agents-4.10.0-37.el9 Doc Type: Bug Fix
Doc Text:
.The `azure-events-az` resource agent no longer produces an error with Pacemaker 2.1 and later
The `azure-events-az` resource agent executes the `crm_simulate -Ls` command and parses the output. With Pacemaker 2.1 and later, the output of the `crm_simulate` command no longer contains the text `Transition Summary:`, which resulted in an error. With this fix, the agent no longer yields an error when this text is missing.
Story Points: ---
Clone Of: 2181019
: 2182764 2182765 2182766 Environment:
Last Closed: 2023-11-07 08:23:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2181019    
Bug Blocks: 2182764, 2182765, 2182766    

Description Oyvind Albrigtsen 2023-03-28 15:07:16 UTC
+++ This bug was initially created as a clone of Bug #2181019 +++

Description of problem:
The resource-agents package provides the azure-events-az agent. This agent executes 'crm_simulate -Ls' and parses the output. With pacemaker 2.1 and newer, the output of crm_simulate changed and no longer contains the text 'Transition Summary:', resulting in an error for azure-events-az.

Version-Release number of selected component (if applicable):
RHEL 8.6 (and newer)
resource-agents-4.9.0-16.el8_6.8.x86_64
pacemaker-cli-2.1.2-4.el8_6.5.x86_64

How reproducible:
The error in azure-events-az only shows after an Azure scheduled event is triggered. An example of such an event is a VM redeploy or reboot. https://learn.microsoft.com/en-us/azure/virtual-machines/linux/scheduled-events

Running azure-events-az, either as a monitor within pacemaker or manually with 'export OCF_ROOT=/usr/lib/ocf; export OCF_RESKEY_verbose=1; /usr/lib/ocf/resource.d/heartbeat/azure-events-az monitor', causes an error:

ocf-exit-reason:object of type 'bool' has no len()
ocf.py(None)[2963061]:  Mar 22 21:04:03 ERROR: object of type 'bool' has no len()

Steps to Reproduce:
See above section

Actual results:
Within /usr/lib/ocf/resource.d/heartbeat/azure-events-az, the method transitionSummary() on line 296 calls crm_simulate -Ls (line 311). This method expects to find "Transition Summary:" in the output. Up to and including RHEL 8.4 this was true (pacemaker 2.0.5 in RHEL 8.4). In pacemaker 2.1.x this line is no longer part of the output.

The result is that, within azure-events-az, the method allResourcesStoppedOnNode(node) then fails on line 372 with the above errors.
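
The failure mode described above can be illustrated with a minimal sketch. This is not the agent's actual code: the function names and the assumption that the parser returns False when the header is missing are hypothetical, inferred only from the error message in this report.

```python
MARKER = "Transition Summary:"


def transition_summary(output):
    """Hypothetical stand-in for the agent's parser.

    Returns the list of pending actions, or False when the header is
    missing (assumed pre-fix behaviour, inferred from the error text).
    """
    if MARKER not in output:
        return False
    tail = output.split(MARKER, 1)[1]
    return [line.strip() for line in tail.splitlines() if line.strip()]


def all_resources_stopped(output):
    # len(False) raises TypeError: object of type 'bool' has no len()
    return len(transition_summary(output)) == 0


def describe_failure(output):
    """Return the result as text, or the exception message on failure."""
    try:
        return str(all_resources_stopped(output))
    except TypeError as exc:
        return str(exc)
```

Calling describe_failure() with output that lacks the header returns the same message that the agent logged, because the caller assumes a list and receives a bool instead.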


Expected results:
# crm_simulate -Ls
Transition Summary:
	* Promote rsc_SAPHana_HN1_HDB03:0      (Slave -> Master hsr3-db1)
	* Stop    rsc_SAPHana_HN1_HDB03:1      (hsr3-db0)
	* Move    rsc_ip_HN1_HDB03     (Started hsr3-db0 -> hsr3-db1)
	* Start   rsc_nc_HN1_HDB03     (hsr3-db1)
# Expected result when there are no pending actions:
Transition Summary:

Additional info:

--- Additional comment from Oyvind Albrigtsen on 2023-03-23 15:38:36 CET ---

@kgaillot is there a crm_feature_set version I should check against so we can use new format where needed, and fallback to the old way for older versions?

--- Additional comment from Ken Gaillot on 2023-03-23 15:54:47 CET ---

(In reply to Oyvind Albrigtsen from comment #3)
> @kgaillot is there a crm_feature_set version I should check
> against so we can use new format where needed, and fallback to the old way
> for older versions?

3.7.4, which is also when --output-as=xml is available for crm_simulate
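
A version gate along these lines could be sketched as follows. The helper names are hypothetical, and how the agent would obtain the crm_feature_set string is not covered in this report, so the sketch takes it as a plain argument. Comparing integer tuples rather than strings avoids ordering "3.10.0" before "3.7.4".

```python
# Threshold taken from the comment above: 3.7.4
NEW_FORMAT_MIN = (3, 7, 4)


def parse_feature_set(text):
    """Turn a dotted version such as '3.7.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_new_format(feature_set):
    """True when the feature set is at least 3.7.4 (hypothetical helper)."""
    return parse_feature_set(feature_set) >= NEW_FORMAT_MIN
```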

--- Additional comment from Oyvind Albrigtsen on 2023-03-28 17:04:08 CEST ---

Fix to treat no "Transition Summary" as no actions: https://github.com/ClusterLabs/resource-agents/pull/1854

--output-as=xml is quite new in crm_simulate, so not as backwards compatible.
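
The idea of treating a missing "Transition Summary" as no actions can be sketched as below. This is an illustration only and not the code from the linked pull request; the function name is hypothetical, and the assumption that action lines start with "*" is taken from the sample output earlier in this report.

```python
MARKER = "Transition Summary:"


def pending_actions(output):
    """Return the pending actions from crm_simulate text output.

    A missing header is treated as "no pending actions", so callers
    always receive a list and can safely call len() on it.
    """
    if MARKER not in output:
        return []
    tail = output.split(MARKER, 1)[1]
    # Assumption: each action line starts with '*', as in the sample output.
    return [line.strip() for line in tail.splitlines()
            if line.strip().startswith("*")]
```

Because the return type is always a list, the same caller works with both the older output (header present, possibly empty) and the newer output (header absent).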

Comment 3 Oyvind Albrigtsen 2023-05-01 09:12:06 UTC
Additional patch to improve logic: https://github.com/ClusterLabs/resource-agents/pull/1864

Comment 16 errata-xmlrpc 2023-11-07 08:23:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (resource-agents bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6312