Bug 1985981 - reflect changes in crm_mon --as-xml
Summary: reflect changes in crm_mon --as-xml
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: pcs
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 9.0 Beta
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1885302
Blocks: 1999022
 
Reported: 2021-07-26 12:26 UTC by Tomas Jelinek
Modified: 2021-12-07 21:22 UTC
CC List: 12 users

Fixed In Version: pcs-0.11.1-1.el9
Doc Type: Enhancement
Doc Text:
The 'pcs status xml' command now provides output in the new format implemented by pacemaker.
Clone Of: 1885302
Clones: 1999022
Environment:
Last Closed: 2021-12-07 21:20:54 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Tomas Jelinek 2021-07-26 12:26:57 UTC
+++ This bug was initially created as a clone of Bug #1885302 +++

From upstream info: https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes

The deprecated crm_mon --as-html, --as-xml, --web-cgi, and --disable-ncurses options now print a deprecation warning when used (and show the currently supported equivalents). crm_mon.rng, the XML schema corresponding to --as-xml, is also deprecated.


We need to check whether pcs works with the deprecation warning in place, or move to the new XML output right away.

--- Additional comment from Tomas Jelinek on 2021-03-04 16:00:09 CET ---

Error reporting has changed:

old version:
# crm_mon --as-xml
Not connected
crm_mon: Error: cluster is not available on this node
<crm_mon version="2.0.5"/>
# echo $?
102

new version:
# crm_mon --output-as=xml
<pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
  <status code="102" message="Not connected">
    <errors>
      <error>crm_mon: Error: cluster is not available on this node</error>
    </errors>
  </status>
</pacemaker-result>
# echo $?
102

In the new version, the XML must be parsed to obtain the error message.
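
For illustration, a minimal sketch (not pcs's actual code) of extracting the error from the new output, assuming the <pacemaker-result> document lands on stdout as in the transcript above:

import subprocess
import xml.etree.ElementTree as ET

# Run crm_mon in the new mode; the XML goes to stdout even on failure.
result = subprocess.run(
    ["crm_mon", "--output-as=xml"],
    capture_output=True, text=True,
)
root = ET.fromstring(result.stdout)

# <status code="..." message="..."> carries the outcome; <errors> the details.
status = root.find("status")
code = int(status.get("code", "0"))
message = status.get("message", "")
errors = [e.text for e in status.findall("errors/error")]

if code != 0:
    print(f"crm_mon failed ({code}): {message}")
    for line in errors:
        print(line)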

--- Additional comment from Tomas Jelinek on 2021-05-11 16:29:15 CEST ---

Adapt pcs to the new crm_mon XML format. No change should be visible to users except minor error message changes.

--- Additional comment from Tomas Jelinek on 2021-05-11 16:31:41 CEST ---

We still need to take a look at the 'pcs status xml' command to assess and deal with any impact caused by switching to the new format.

--- Additional comment from Miroslav Lisik on 2021-06-14 16:09:43 CEST ---

Test:

[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.8-2.el8.x86_64

This is an internal change and there should be no visible changes. Some
commands with the --wait option use this output to check the cluster state.

[root@r8-node-01 ~]# pcs cluster start --wait --debug | grep 'Running' | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
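
The --help-all run above is presumably a capability probe: inspect crm_mon's help text to decide which XML option the installed pacemaker supports. A hedged sketch of such a probe (the helper name and fallback are hypothetical, not pcs's actual implementation):

import subprocess

def crm_mon_status_cmd():
    # Hypothetical helper: check crm_mon's help text for the new option.
    help_text = subprocess.run(
        ["/usr/sbin/crm_mon", "--help-all"],
        capture_output=True, text=True,
    ).stdout
    if "--output-as" in help_text:
        # New pacemaker: <pacemaker-result> schema.
        return ["/usr/sbin/crm_mon", "--one-shot", "--inactive",
                "--output-as", "xml"]
    # Old pacemaker: deprecated <crm_mon> schema via --as-xml.
    return ["/usr/sbin/crm_mon", "--one-shot", "--inactive", "--as-xml"]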

--- Additional comment from Michal Mazourek on 2021-07-02 14:33:57 CEST ---

BEFORE:
=======

[root@virt-514 ~]# rpm -q pcs
pcs-0.10.8-1.el8.x86_64


[root@virt-514 ~]# pcs cluster start --wait --debug | grep 'crm_mon'
Running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Finished running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
<crm_mon version="2.0.5">
</crm_mon>
Running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Finished running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
<crm_mon version="2.0.5">
</crm_mon>
{...}

> pcs is using the crm_mon --as-xml format


AFTER:
======

[root@virt-042 ~]# rpm -q pcs
pcs-0.10.8-2.el8.x86_64


## Checking that the internal change is present and the command functionality is preserved

[root@virt-042 ~]# pcs cluster start --wait --debug | grep 'Running' | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
{...}
[root@virt-042 ~]# echo $?
0

[root@virt-049 ~]# pcs resource enable res --wait --debug | grep "Running" | grep "crm_mon"
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
[root@virt-049 ~]# echo $?
0

> pcs is using the new format (crm_mon --output-as xml) and the functionality of the command is preserved
> tested with various pcs commands that use the --wait flag (cluster start, resource enable/disable/safe-disable/move/ban/group add, stonith enable/disable)


## Checking error output

[root@virt-042 ~]# pcs cluster destroy
Shutting down pacemaker/corosync services...
Killing any remaining services...
Removing all cluster configuration files...

[root@virt-042 ~]# pcs cluster start --wait --debug 
Error: cluster is not currently configured on this node
[root@virt-042 ~]# echo $?
1

> OK

## on other node

[root@virt-049 ~]# pcs cluster stop
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

[root@virt-049 ~]# pcs cluster start --wait=1 --debug | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Finished running: /usr/sbin/crm_mon --help-all
  crm_mon [OPTION?]
If this program is called as crm_mon.cgi, --output-as=html --html-cgi will
When run interactively, crm_mon can be told to hide and display various sections
	crm_mon
	crm_mon -1
	crm_mon --group-by-node --inactive
Start crm_mon as a background daemon and have it write the cluster status to an HTML file:
	crm_mon --daemonize --output-as html --output-to /path/to/docroot/filename.html
Start crm_mon and export the current cluster status as XML to stdout, then exit:
	crm_mon --output-as xml
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Finished running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
<pacemaker-result api-version="2.9" request="/usr/sbin/crm_mon --one-shot --inactive --output-as xml">
Error: Waiting timeout

> OK, the error messages are preserved


Marking as VERIFIED SanityOnly for pcs-0.10.8-2.el8

Comment 1 Tomas Jelinek 2021-07-26 12:29:17 UTC
This is a counterpart to bz1885302. On top of what has been implemented in bz1885302, we are also switching 'pcs status xml' to the new format.

Comment 4 Tomas Jelinek 2021-08-20 12:38:27 UTC
Upstream patch: https://github.com/ClusterLabs/pcs/commit/11e758ed5e9879c35ce9e0b6c97ed8e7cdac3a03

Test:

Run 'pcs status xml' and observe that it returns the new format XML (<pacemaker-result...) instead of the old one (<crm_mon...).
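
One way to automate that check (a sketch, not the official test):

import subprocess
import xml.etree.ElementTree as ET

out = subprocess.run(
    ["pcs", "status", "xml"],
    capture_output=True, text=True, check=True,
).stdout
root = ET.fromstring(out)
# Old format had <crm_mon> as the root element; new format has <pacemaker-result>.
assert root.tag == "pacemaker-result", f"old format detected: <{root.tag}>"
print("new format OK, api-version", root.get("api-version"))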

Comment 5 Miroslav Lisik 2021-08-24 13:54:15 UTC
DevTestResults:

[root@r90-node-01 ~]# rpm -q pcs
pcs-0.11.0.alpha.1-1.el9.x86_64

[root@r90-node-01 ~]# pcs status xml
<pacemaker-result api-version="2.12" request="/usr/sbin/crm_mon --one-shot --inactive --output-as xml">
  <summary>
    <stack type="corosync"/>
    <current_dc present="true" version="2.1.0-11.el9-7c3f660707" name="r90-node-02" id="2" with_quorum="true"/>
    <last_update time="Tue Aug 24 10:59:00 2021"/>
    <last_change time="Tue Aug 24 10:58:52 2021" user="root" client="cibadmin" origin="r90-node-01"/>
    <nodes_configured number="2"/>
    <resources_configured number="2" disabled="0" blocked="0"/>
    <cluster_options stonith-enabled="true" symmetric-cluster="true" no-quorum-policy="stop" maintenance-mode="false" stop-all-resources="false"/>
  </summary>
  <nodes>
    <node name="r90-node-01" id="1" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="false" resources_running="1" type="member"/>
    <node name="r90-node-02" id="2" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="true" resources_running="1" type="member"/>
  </nodes>
  <resources>
    <resource id="fence-r90-node-01" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
      <node name="r90-node-01" id="1" cached="true"/>
    </resource>
    <resource id="fence-r90-node-02" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
      <node name="r90-node-02" id="2" cached="true"/>
    </resource>
  </resources>
  <node_history>
    <node name="r90-node-02">
      <resource_history id="fence-r90-node-02" orphan="false" migration-threshold="1000000">
        <operation_history call="10" task="start" rc="0" rc_text="ok" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="34ms" queue-time="0ms"/>
        <operation_history call="11" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="38ms" queue-time="0ms"/>
      </resource_history>
    </node>
    <node name="r90-node-01">
      <resource_history id="fence-r90-node-01" orphan="false" migration-threshold="1000000">
        <operation_history call="6" task="start" rc="0" rc_text="ok" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="32ms" queue-time="0ms"/>
        <operation_history call="7" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="28ms" queue-time="0ms"/>
      </resource_history>
    </node>
  </node_history>
  <status code="0" message="OK"/>
</pacemaker-result>
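
For consumers of the new schema, a sketch of reading the document above with ElementTree (assuming it was saved to status.xml; the file name is arbitrary):

import xml.etree.ElementTree as ET

root = ET.parse("status.xml").getroot()

# Node state lives in attributes of <nodes>/<node>.
for node in root.findall("nodes/node"):
    print(node.get("name"), "online" if node.get("online") == "true" else "offline")

# Each <resource> lists the nodes it is running on as child <node> elements.
for res in root.findall("resources/resource"):
    running_on = [n.get("name") for n in res.findall("node")]
    print(res.get("id"), res.get("role"), "on", ", ".join(running_on))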

Comment 9 Michal Mazourek 2021-08-30 09:25:46 UTC
Marking as Verified SanityOnly based on comment 6.
A new TestOnly bz was created: bz1999022

