Bug 1985981

Summary: reflect changes in crm_mon --as-xml
Product: Red Hat Enterprise Linux 9 Reporter: Tomas Jelinek <tojeline>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED CURRENTRELEASE QA Contact: cluster-qe <cluster-qe>
Severity: high Docs Contact:
Priority: high    
Version: 9.0 CC: cfeist, cluster-maint, cluster-qe, idevat, lmiksik, mlisik, mmazoure, mpospisi, nhostako, omular, slevine, tojeline
Target Milestone: beta Keywords: Triaged
Target Release: 9.0 Beta   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.11.1-1.el9 Doc Type: Enhancement
Doc Text:
The 'pcs status xml' command now provides output in the new format implemented by Pacemaker.
Story Points: ---
Clone Of: 1885302
: 1999022 (view as bug list) Environment:
Last Closed: 2021-12-07 21:20:54 UTC Type: Bug
Bug Depends On: 1885302    
Bug Blocks: 1999022    

Description Tomas Jelinek 2021-07-26 12:26:57 UTC
+++ This bug was initially created as a clone of Bug #1885302 +++

From upstream info: https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes

The deprecated crm_mon --as-html, --as-xml, --web-cgi, and --disable-ncurses options might print a deprecation warning when used (showing the currently supported equivalents). crm_mon.rng, the XML schema corresponding to --as-xml, would also be deprecated.


We need to check whether pcs works with the deprecation warning in place, or move to the new XML output right away.

--- Additional comment from Tomas Jelinek on 2021-03-04 16:00:09 CET ---

Error reporting has changed:

old version:
# crm_mon --as-xml
Not connected
crm_mon: Error: cluster is not available on this node
<crm_mon version="2.0.5"/>
# echo $?
102

new version:
# crm_mon --output-as=xml
<pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
  <status code="102" message="Not connected">
    <errors>
      <error>crm_mon: Error: cluster is not available on this node</error>
    </errors>
  </status>
</pacemaker-result>
# echo $?
102

In the new version, the XML output must be parsed to obtain the error message.
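
A minimal sketch of what such parsing could look like, using Python's standard xml.etree.ElementTree module. The helper name parse_crm_mon_error is hypothetical and is not taken from pcs; the element and attribute names are the ones visible in the "new version" output above.

```python
import xml.etree.ElementTree as ET

# Sample copied from the "new version" output above.
SAMPLE = """\
<pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
  <status code="102" message="Not connected">
    <errors>
      <error>crm_mon: Error: cluster is not available on this node</error>
    </errors>
  </status>
</pacemaker-result>
"""


def parse_crm_mon_error(xml_text):
    """Return (code, message, errors) from a <pacemaker-result> document.

    Hypothetical helper for illustration only; not the pcs implementation.
    """
    root = ET.fromstring(xml_text)
    status = root.find("status")
    if status is None:
        raise ValueError("no <status> element in crm_mon output")
    code = int(status.get("code", "0"))
    message = status.get("message", "")
    # Each <error> element carries one line of error text.
    errors = [(e.text or "").strip() for e in status.findall("errors/error")]
    return code, message, errors


code, message, errors = parse_crm_mon_error(SAMPLE)
print(code, message, errors)
```

The exit code of crm_mon is unchanged (102 in both versions), so a caller could still branch on it and use the parsed text only for reporting.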

--- Additional comment from Tomas Jelinek on 2021-05-11 16:29:15 CEST ---

Adapt pcs to the new crm_mon XML format. No change should be visible to the users except minor error message changes.

--- Additional comment from Tomas Jelinek on 2021-05-11 16:31:41 CEST ---

We still need to take a look at the 'pcs status xml' command to see and deal with any impact caused by switching to the new format.

--- Additional comment from Miroslav Lisik on 2021-06-14 16:09:43 CEST ---

Test:

[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.8-2.el8.x86_64

This is an internal change and there should be no visible changes. Some
commands with the --wait option use this output to check the cluster state.

[root@r8-node-01 ~]# pcs cluster start --wait --debug | grep 'Running' | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml

--- Additional comment from Michal Mazourek on 2021-07-02 14:33:57 CEST ---

BEFORE:
=======

[root@virt-514 ~]# rpm -q pcs
pcs-0.10.8-1.el8.x86_64


[root@virt-514 ~]# pcs cluster start --wait --debug | grep 'crm_mon'
Running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Finished running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
<crm_mon version="2.0.5">
</crm_mon>
Running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Finished running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
<crm_mon version="2.0.5">
</crm_mon>
{...}

> pcs is using the crm_mon --as-xml format


AFTER:
======

[root@virt-042 ~]# rpm -q pcs
pcs-0.10.8-2.el8.x86_64


## Checking that the internal change is present and the command functionality is preserved

[root@virt-042 ~]# pcs cluster start --wait --debug | grep 'Running' | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
{...}
[root@virt-042 ~]# echo $?
0

[root@virt-049 ~]# pcs resource enable res --wait --debug | grep "Running" | grep "crm_mon"
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Running: /usr/sbin/crm_mon --help-all
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
[root@virt-049 ~]# echo $?
0

> pcs is using the new format (crm_mon --output-as xml) and the functionality of the command is preserved
> tested with various pcs commands that use the --wait flag (cluster start, resource enable/disable/safe-disable/move/ban/group add, stonith enable/disable)


## Checking error output

[root@virt-042 ~]# pcs cluster destroy
Shutting down pacemaker/corosync services...
Killing any remaining services...
Removing all cluster configuration files...

[root@virt-042 ~]# pcs cluster start --wait --debug 
Error: cluster is not currently configured on this node
[root@virt-042 ~]# echo $?
1

> OK

## on other node

[root@virt-049 ~]# pcs cluster stop
Stopping Cluster (pacemaker)...
Stopping Cluster (corosync)...

[root@virt-049 ~]# pcs cluster start --wait=1 --debug | grep 'crm_mon'
Running: /usr/sbin/crm_mon --help-all
Finished running: /usr/sbin/crm_mon --help-all
  crm_mon [OPTION?]
If this program is called as crm_mon.cgi, --output-as=html --html-cgi will
When run interactively, crm_mon can be told to hide and display various sections
	crm_mon
	crm_mon -1
	crm_mon --group-by-node --inactive
Start crm_mon as a background daemon and have it write the cluster status to an HTML file:
	crm_mon --daemonize --output-as html --output-to /path/to/docroot/filename.html
Start crm_mon and export the current cluster status as XML to stdout, then exit:
	crm_mon --output-as xml
Running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
Finished running: /usr/sbin/crm_mon --one-shot --inactive --output-as xml
<pacemaker-result api-version="2.9" request="/usr/sbin/crm_mon --one-shot --inactive --output-as xml">
Error: Waiting timeout

> OK, the error messages are preserved


Marking as VERIFIED SanityOnly for pcs-0.10.8-2.el8

Comment 1 Tomas Jelinek 2021-07-26 12:29:17 UTC
This is a counterpart to bz1885302. On top of what has been implemented in bz1885302, we are also switching 'pcs status xml' to the new format.

Comment 4 Tomas Jelinek 2021-08-20 12:38:27 UTC
Upstream patch: https://github.com/ClusterLabs/pcs/commit/11e758ed5e9879c35ce9e0b6c97ed8e7cdac3a03

Test:

Run 'pcs status xml' and observe that it returns the new format XML: <pacemaker-result... instead of <crm_mon...
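
The check described in this test could be scripted roughly as follows. This is only a sketch: status_xml_format is a hypothetical helper name, and the two inline XML strings are shortened stand-ins for real 'pcs status xml' output.

```python
import xml.etree.ElementTree as ET


def status_xml_format(xml_text):
    """Classify status XML by its root element.

    Hypothetical helper; returns "new", "old", or "unknown".
    """
    root_tag = ET.fromstring(xml_text).tag
    if root_tag == "pacemaker-result":
        return "new"
    if root_tag == "crm_mon":
        return "old"
    return "unknown"


# Shortened stand-ins for the two formats mentioned in this bug.
new_style = (
    '<pacemaker-result api-version="2.12">'
    '<status code="0" message="OK"/>'
    "</pacemaker-result>"
)
old_style = '<crm_mon version="2.0.5"/>'

print(status_xml_format(new_style))
print(status_xml_format(old_style))
```

In practice the input would be the captured stdout of 'pcs status xml' rather than an inline string.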

Comment 5 Miroslav Lisik 2021-08-24 13:54:15 UTC
DevTestResults:

[root@r90-node-01 ~]# rpm -q pcs
pcs-0.11.0.alpha.1-1.el9.x86_64

[root@r90-node-01 ~]# pcs status xml
<pacemaker-result api-version="2.12" request="/usr/sbin/crm_mon --one-shot --inactive --output-as xml">
  <summary>
    <stack type="corosync"/>
    <current_dc present="true" version="2.1.0-11.el9-7c3f660707" name="r90-node-02" id="2" with_quorum="true"/>
    <last_update time="Tue Aug 24 10:59:00 2021"/>
    <last_change time="Tue Aug 24 10:58:52 2021" user="root" client="cibadmin" origin="r90-node-01"/>
    <nodes_configured number="2"/>
    <resources_configured number="2" disabled="0" blocked="0"/>
    <cluster_options stonith-enabled="true" symmetric-cluster="true" no-quorum-policy="stop" maintenance-mode="false" stop-all-resources="false"/>
  </summary>
  <nodes>
    <node name="r90-node-01" id="1" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="false" resources_running="1" type="member"/>
    <node name="r90-node-02" id="2" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="true" resources_running="1" type="member"/>
  </nodes>
  <resources>
    <resource id="fence-r90-node-01" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
      <node name="r90-node-01" id="1" cached="true"/>
    </resource>
    <resource id="fence-r90-node-02" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
      <node name="r90-node-02" id="2" cached="true"/>
    </resource>
  </resources>
  <node_history>
    <node name="r90-node-02">
      <resource_history id="fence-r90-node-02" orphan="false" migration-threshold="1000000">
        <operation_history call="10" task="start" rc="0" rc_text="ok" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="34ms" queue-time="0ms"/>
        <operation_history call="11" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="38ms" queue-time="0ms"/>
      </resource_history>
    </node>
    <node name="r90-node-01">
      <resource_history id="fence-r90-node-01" orphan="false" migration-threshold="1000000">
        <operation_history call="6" task="start" rc="0" rc_text="ok" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="32ms" queue-time="0ms"/>
        <operation_history call="7" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Tue Aug 24 10:58:52 2021" exec-time="28ms" queue-time="0ms"/>
      </resource_history>
    </node>
  </node_history>
  <status code="0" message="OK"/>
</pacemaker-result>
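
For consumers of this output, a minimal sketch of reading node and resource state from the new format with Python's standard library. The function summarize_status is hypothetical, the sample is an abbreviated copy of the output above with most attributes dropped, and only top-level <resource> elements are considered.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the output above; most attributes are omitted.
SAMPLE = """\
<pacemaker-result api-version="2.12" request="/usr/sbin/crm_mon --one-shot --inactive --output-as xml">
  <nodes>
    <node name="r90-node-01" id="1" online="true"/>
    <node name="r90-node-02" id="2" online="true"/>
  </nodes>
  <resources>
    <resource id="fence-r90-node-01" resource_agent="stonith:fence_xvm" role="Started">
      <node name="r90-node-01" id="1" cached="true"/>
    </resource>
    <resource id="fence-r90-node-02" resource_agent="stonith:fence_xvm" role="Started">
      <node name="r90-node-02" id="2" cached="true"/>
    </resource>
  </resources>
  <status code="0" message="OK"/>
</pacemaker-result>
"""


def summarize_status(xml_text):
    """Return (online_nodes, placement) from a <pacemaker-result> document.

    Hypothetical helper for illustration; not part of pcs.
    """
    root = ET.fromstring(xml_text)
    # Names of nodes whose online attribute is "true".
    online_nodes = [
        node.get("name")
        for node in root.findall("nodes/node")
        if node.get("online") == "true"
    ]
    # Map each top-level resource id to the nodes it is running on.
    placement = {
        res.get("id"): [node.get("name") for node in res.findall("node")]
        for res in root.findall("resources/resource")
    }
    return online_nodes, placement


online_nodes, placement = summarize_status(SAMPLE)
print(online_nodes)
print(placement)
```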

Comment 9 Michal Mazourek 2021-08-30 09:25:46 UTC
Marking as Verified SanityOnly based on comment 6.
New TestOnly bz was created - bz1999022