Description of problem:
The resource-agents package provides the azure-events-az agent. This agent executes 'crm_simulate -Ls' and parses the output. With pacemaker 2.1 and newer, the output of crm_simulate changed and no longer contains the text 'Transition Summary:', resulting in an error in azure-events-az.

Version-Release number of selected component (if applicable):
RHEL 8.6 (and newer)
resource-agents-4.9.0-16.el8_6.8.x86_64
pacemaker-cli-2.1.2-4.el8_6.5.x86_64

How reproducible:
The error in azure-events-az only shows after an Azure scheduled event is triggered. An example of such an event is a VM redeploy or reboot.
https://learn.microsoft.com/en-us/azure/virtual-machines/linux/scheduled-events

Running azure-events-az, either as a monitor within pacemaker or manually with
'export OCF_ROOT=/usr/lib/ocf; export OCF_RESKEY_verbose=1; /usr/lib/ocf/resource.d/heartbeat/azure-events-az monitor'
causes an error:
ocf-exit-reason:object of type 'bool' has no len()
ocf.py(None)[2963061]: Mar 22 21:04:03 ERROR: object of type 'bool' has no len()

Steps to Reproduce:
See above section

Actual results:
Within /usr/lib/ocf/resource.d/heartbeat/azure-events-az, method transitionSummary() on line 296 calls crm_simulate -Ls (line 311). This method expects to find "Transition Summary:" in the output. Up to and including RHEL 8.4 this was true (pacemaker 2.0.5 in RHEL 8.4). In pacemaker 2.1.x this line is no longer part of the output. As a result, method allResourcesStoppedOnNode(node) in azure-events-az then fails on line 372 with the errors above.

Expected results:
# crm_simulate -Ls
Transition Summary:
 * Promote rsc_SAPHana_HN1_HDB03:0 (Slave -> Master hsr3-db1)
 * Stop rsc_SAPHana_HN1_HDB03:1 (hsr3-db0)
 * Move rsc_ip_HN1_HDB03 (Started hsr3-db0 -> hsr3-db1)
 * Start rsc_nc_HN1_HDB03 (hsr3-db1)

# Expected result when there are no pending actions:
Transition Summary:

Additional info:
@kgaillot is there a crm_feature_set version I should check against so we can use new format where needed, and fallback to the old way for older versions?
(In reply to Oyvind Albrigtsen from comment #3)
> @kgaillot is there a crm_feature_set version I should check
> against so we can use new format where needed, and fallback to the old way
> for older versions?

3.7.4, which is also when --output-as=xml is available for crm_simulate
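For illustration only, the version gate suggested above could be sketched roughly as follows. The helper names are hypothetical, and how the agent obtains the cluster's crm_feature_set value is deliberately left out; the only figure taken from this thread is the 3.7.4 threshold.

```python
def parse_feature_set(text):
    """Turn a dotted version such as '3.7.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_xml_output(feature_set, minimum="3.7.4"):
    """Return True if feature_set is at least the threshold quoted above.

    Hypothetical helper: the caller is assumed to pass in the
    crm_feature_set string it has already read from the cluster.
    """
    return parse_feature_set(feature_set) >= parse_feature_set(minimum)
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank "3.10.0" below "3.7.4".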
Fix to treat no "Transition Summary" as no actions: https://github.com/ClusterLabs/resource-agents/pull/1854

--output-as=xml is quite new in crm_simulate, so relying on it would not be as backwards compatible.
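As a rough sketch of the idea (this is not the code from the pull request, and the function name is hypothetical), a parser can return an empty list both when the "Transition Summary:" header is present with nothing under it and when the header is missing entirely, so that callers never receive a bool they then try to take the length of:

```python
def transition_actions(output):
    """Return the action lines listed under 'Transition Summary:'.

    An empty list means no pending actions, whether the header is
    present with nothing below it or missing altogether.
    """
    marker = "Transition Summary:"
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith(marker):
            break
    else:
        # Header not found: treat as "no actions" instead of failing.
        return []
    actions = []
    for line in lines[index + 1:]:
        stripped = line.strip()
        if stripped.startswith("*"):
            # Drop the leading bullet and surrounding whitespace.
            actions.append(stripped.lstrip("* ").strip())
    return actions
```

Because the return type is always a list, a check such as len(transition_actions(text)) == 0 behaves the same on old and new pacemaker text output.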
Additional patch to improve logic: https://github.com/ClusterLabs/resource-agents/pull/1864
Thank you for the fix. We will await release of the package in the repository channels for full integration tests on our side. Quick tests on RHEL 9.0 and 8.6 are positive with the checked-in modifications.

To clarify on the proposed fix: is there any concern about using crm_simulate -LS and depending on the output containing the expected text string in future releases, potentially breaking again down the line? Certainly, future changes cannot be predicted; I am trying to understand whether any changes are already anticipated. Since the 'Transition Summary:' text block in the -Ls output was likely unintended and thus removed in the later/current releases, can we assume -LS will keep it going forward? Any guidance on this would be appreciated.
(In reply to robbiro from comment #9)
> Thank you for the fix. Will await release of package in repository channels
> for full integration tests on our side. Quick tests on RH9.0 and 8.6 are
> positive with the checked in modifications.
>
> To clarify on the proposed fix - is there any concern of using crm_simulate
> -LS and depending on the output containing the expected text string in
> future releases, potentially breaking again down the line again? Certainly,
> future changes cannot be predicted, am trying to understand if there are
> assumed changes. Since -Ls output containing the 'Transition Summary:" text
> block was likely unintended and thus removed in the later/current releases,
> assuming -LS keeping it going forward. Any guidance on this would be
> appreciated.

To address such issues, Pacemaker has been gradually adding support for XML output for all command-line tools. The idea is that the text output may change from release to release, but the XML output will change as little as possible, and remain backward-compatible as much as possible, for parsing by scripts.

All commands will take the same --output-as option, which may be set to "none", "text", or "xml". The schema for the XML output is installed as /usr/share/pacemaker/api/api-result.rng (which includes RNGs for each individual command). You can use that to figure out what to parse.

crm_simulate supports --output-as as of the Pacemaker 2.1.0 release (RHEL 8.5 and later, and all of RHEL 9). Most agents haven't switched to parsing XML yet in order to remain compatible with older versions, but if that's not a concern, I'd recommend the XML.

FYI, we've also been gradually adding high-level C APIs corresponding to each command-line tool, and those generate the same XML output that --output-as=xml would.
Commands that currently support --output-as=xml:
* Since 2.0.2 (8.1+/9.0+): stonith_admin
* Since 2.0.3 (8.2+/9.0+): crm_mon
* Since 2.1.0 (8.5+/9.0+): crmadmin, crm_resource, crm_simulate, crm_verify
* Since 2.1.3 (8.7+/9.1+): attrd_updater, crm_attribute, crm_rule
* Since 2.1.5 (8.8+/9.2+): crm_error
* Since 2.1.6 (8.9+/9.3+): crm_shadow
* Not yet supported: cibadmin, crm_diff, crm_node, crm_ticket, iso8601
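A minimal sketch of how an agent written in Python might start consuming the XML output, for anyone who wants to experiment. The sample document below is hand-written for illustration and is not captured tool output; the element and attribute names in it (pacemaker-result, status, code, message) are assumptions that should be checked against the installed api-result.rng, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only, NOT captured tool output.
# Element and attribute names are assumptions to verify against the RNG.
SAMPLE = """<pacemaker-result api-version="2.0" request="crm_simulate">
  <status code="0" message="OK"/>
</pacemaker-result>"""


def result_status(xml_text):
    """Return (exit code, message) from the <status> element, or None."""
    root = ET.fromstring(xml_text)
    status = root.find("status")
    if status is None:
        return None
    return int(status.get("code", "-1")), status.get("message", "")


def section_names(xml_text):
    """List top-level element names so they can be matched to the RNG."""
    return [child.tag for child in ET.fromstring(xml_text)]
```

Listing the top-level sections first, and only then writing parsing code for the ones the schema documents, avoids hard-coding guesses about where pending actions appear.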