Bug 1300604
Summary: | [RFE] add option to crm_mon to display status of a single node | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Tomas Jelinek <tojeline>
Component: | pacemaker | Assignee: | Chris Lumens <clumens>
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe>
Severity: | medium | Docs Contact: |
Priority: | medium | Keywords: | FutureFeature
Version: | 8.0 | CC: | cfeist, cluster-maint, cluster-qe, gcase, jruemker, kgaillot, michele, msmazova, phagara, rmarigny, royoung, sbradley, tojeline
Target Milestone: | rc | Target Release: | 8.3
Hardware: | x86_64 | OS: | Linux
Fixed In Version: | pacemaker-2.0.4-2.el8 | Doc Type: | No Doc Update
Doc Text: | Any corresponding pcs functionality should be documented instead | |
Clone Of: | 1285269 | Type: | Feature Request
Last Closed: | 2020-11-04 04:00:53 UTC | |
Bug Depends On: | 1752538 | Bug Blocks: | 1285269, 1846368
Comment 1
Tomas Jelinek
2016-01-21 09:29:47 UTC
This will not be ready in the 7.3 timeframe.

Unfortunately, due to capacity constraints, this will not be ready for 7.4. It will be a priority for 7.5.

Due to a short time frame and limited capacity, this will not make 7.5.

It might be a good idea to make the "query language" equivalent to the "constraint specification language" (sans resource-context specifics). For nodes this means, for instance:

- #kind ne remote
- #uname eq mynode1 or #uname eq mynode42

On the other hand, if we are going to mix in tagging for query purposes ([bug 1513550]), we should conversely reflect that also at the level of the constraints:

- #tag eq bigiron

Bumping to RHEL 8.1 due to devel/QA capacity constraints.

Setting QA CondNAK due to capacity constraints.

To summarize the current plan, the following design will take care of this bz, Bug 1300597, and Bug 1363907:

crm_mon will gain new "--include" and "--exclude" options to select which sections are shown. For example, to show only the top summary and the node status, you could use "crm_mon --include=none,summary,nodes". Or, to show just the resources section, you could use "crm_mon --include=none,resources". (The "none" is to clear the defaults. You could equally keep the defaults and use "--exclude" to specify everything you don't want to see.)

Some of these section names will take an optional qualifier as "SECTION:QUALIFIER". For "nodes" and "resources", this qualifier will be an XML ID, either of a particular node or resource, or of a tag. When a qualifier is given, crm_mon will show only the requested nodes and/or resources. For example, to show just the status of node1, you could use "crm_mon --include=none,nodes:node1". Or, for just the status of rsc1, "crm_mon --include=none,resources:rsc1".
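The proposed section-selection syntax is straightforward to drive from scripts. A minimal sketch (the `crm_mon_args` helper is hypothetical, not part of Pacemaker; it only composes a command line using the syntax described above):

```python
def crm_mon_args(include=None, exclude=None):
    """Compose a crm_mon command line using the proposed section-selection
    syntax: a comma-separated section list, where a section may carry an
    XML-ID qualifier as SECTION:QUALIFIER, and a leading "none" clears the
    default sections."""
    args = ["crm_mon", "-1"]  # -1: one-shot output, as in the examples above
    if include:
        args.append("--include=" + ",".join(include))
    if exclude:
        args.append("--exclude=" + ",".join(exclude))
    return args

# Show only the top summary and the node status:
print(" ".join(crm_mon_args(include=["none", "summary", "nodes"])))
# Show just the status of node1 via a qualifier:
print(" ".join(crm_mon_args(include=["none", "nodes:node1"])))
```

The list could then be passed to `subprocess.run()` on a cluster node.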
For an explanation of how tagging works in the Pacemaker configuration, see the "Tagging configuration elements" section of the upstream "Pacemaker Explained" documentation: https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_tagging_configuration_elements

pcs will need the ability to set tags (Bug 1684676) as well as new options to use this new selection syntax.

Is the intention that using --include=nodes:node1 will make sure that only node1 is ever mentioned anywhere nodes would possibly show up, or only in the nodes section? For instance, nodes can be printed out as part of showing failed resource operations. Should everything except for node1 be suppressed there?

(In reply to Chris Lumens from comment #13)
> Is the intention that using --include=nodes:node1 will make sure that only
> node1 is ever mentioned anywhere nodes would possibly show up, or only in
> the nodes section? For instance, nodes can be printed out as part of
> showing failed resource operations. Should everything except for node1 be
> suppressed there?

Good question. Now that I think about it, the proposed interface is too limiting -- someone might want to specify a single node with, e.g., --include=none,attributes.

Thinking about the original customer request (Bug 1285269), they were concerned with pcs status output, which is a limited subset of crm_mon's capability -- mostly just the default text output. They were interested in showing the status of a single node in the nodes section, and only the resources active on that node in the resources section. That definitely gives a different perspective on what we need here.

I'm thinking we need a separate --node option (of course the n/N short options are already taken ...) so it can apply to all sections.
I think the sections that are relevant are:

* nodes - show only the status of the specified node
* attributes - show only node attributes for the specified node
* resources - for active resources, show only resources active on the specified node (inactive resource display should obviously ignore this option)
* failures - show only failures that occurred on the specified node
* bans - this one is debatable. We could theoretically show only bans that apply to the specified node, but bans can apply not just by node name but by node attributes (via rules). That would require evaluating rules, which is probably overkill. I suppose we could omit all bans that explicitly specify a different node, so that we show any bans for the specified node plus all bans that use rules (the idea being they "could" apply).
* failcounts/operations - show only operations in the specified node's history
* fencing-* - another debatable one. Probably what makes the most sense is showing only fencing actions that targeted the specified node. A less likely alternative would be showing fencing actions that the specified node executed against other targets. Another alternative would be to ignore the option for these sections (i.e. always show all fencing operations).

Well, this project just grew a lot bigger ;)

Well, there's no rule that says it has to have a short option. We could use something like --only-node=, which could then have a --only-resource= analog in the future.

re-adding qa_ack+

Fixed upstream as of commit 2917e98. The option is crm_mon --node <node-name-or-tag> (no short option).

> [root@virt-145 ~]# rpm -q pacemaker
> pacemaker-2.0.4-5.el8.x86_64

Check that the new option is documented in the crm_mon man page and help.

> [root@virt-145 ~]# man crm_mon
> PACEMAKER(8)                System Administration Utilities                PACEMAKER(8)
> NAME
>        Pacemaker - Part of the Pacemaker cluster resource manager
> SYNOPSIS
>        crm_mon mode [options]
> DESCRIPTION
>        Provides a summary of cluster's current state.
>        Outputs varying levels of detail in a number of different formats.
> [...]
>    Display Options:
>        -I, --include=SECTION(s)
>               A list of sections to include in the output. See `Output Control` help for more information.
>        -U, --exclude=SECTION(s)
>               A list of sections to exclude from the output. See `Output Control` help for more information.
>        --node=NODE
>               When displaying information about nodes, show only what's related to the given node, or to all
>               nodes tagged with the given tag
> [...]
> [root@virt-145 ~]# crm_mon --help-display
> Usage:
>   crm_mon [OPTION?]
> Provides a summary of cluster's current state.
> Outputs varying levels of detail in a number of different formats.
> Display Options:
>   -I, --include=SECTION(s)     A list of sections to include in the output.
>                                See `Output Control` help for more information.
>   -U, --exclude=SECTION(s)     A list of sections to exclude from the output.
>                                See `Output Control` help for more information.
>   --node=NODE                  When displaying information about nodes, show only what's related to the given
>                                node, or to all nodes tagged with the given tag
>   -n, --group-by-node          Group resources by node
>   -r, --inactive               Display inactive resources
>   -f, --failcounts             Display resource fail counts
>   -o, --operations             Display resource operation history
>   -t, --timing-details         Display resource operation history with timing details
>   -c, --tickets                Display cluster tickets
>   -m, --fence-history=LEVEL    Show fence history:
>                                0=off, 1=failures and pending (default without option),
>                                2=add successes (default without value for option),
>                                3=show full history without reduction to most recent of each flavor
>   -L, --neg-locations          Display negative location constraints [optionally filtered by id prefix]
>   -A, --show-node-attributes   Display node attributes
>   -D, --hide-headers           Hide all headers
>   -R, --show-detail            Show more details (node IDs, individual clone instances)
> [...]

Node tagging is not implemented yet; more information can be found in Bug 1684676.
Display all nodes with their resources. > [root@virt-145 ~]# crm_mon -1 > Cluster Summary: > * Stack: corosync > * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum > * Last updated: Mon Aug 10 12:18:05 2020 > * Last change: Mon Aug 10 12:17:55 2020 by root via cibadmin on virt-145 > * 3 nodes configured > * 12 resource instances configured > Node List: > * Online: [ virt-144 virt-145 ] > * RemoteOnline: [ virt-143 ] > Active Resources: > * fence-virt-143 (stonith:fence_xvm): Started virt-145 > * fence-virt-144 (stonith:fence_xvm): Started virt-145 > * fence-virt-145 (stonith:fence_xvm): Started virt-144 > * dummy2 (ocf::pacemaker:Dummy): Started virt-143 > * dummy4 (ocf::pacemaker:Dummy): Started virt-143 > * dummy5 (ocf::pacemaker:Dummy): Started virt-145 > * Resource Group: dummy-group: > * dummy1 (ocf::pacemaker:Dummy): Started virt-144 > * dummy3 (ocf::pacemaker:Dummy): Started virt-144 > * Clone Set: dummy-clone [dummy]: > * Started: [ virt-143 virt-144 virt-145 ] > * virt-143 (ocf::pacemaker:remote): Started virt-144 Display single node. > [root@virt-145 ~]# crm_mon -1 --node virt-144 > Cluster Summary: > * Stack: corosync > * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum > * Last updated: Mon Aug 10 12:24:48 2020 > * Last change: Mon Aug 10 12:17:55 2020 by root via cibadmin on virt-145 > * 3 nodes configured > * 12 resource instances configured > Node List: > * Online: [ virt-144 ] > Active Resources: > * fence-virt-145 (stonith:fence_xvm): Started virt-144 > * Resource Group: dummy-group: > * dummy1 (ocf::pacemaker:Dummy): Started virt-144 > * dummy3 (ocf::pacemaker:Dummy): Started virt-144 > * Clone Set: dummy-clone [dummy]: > * Started: [ virt-144 ] > * virt-143 (ocf::pacemaker:remote): Started virt-144 Display single node with more details (node IDs, individual clone instances). 
> [root@virt-145 ~]# crm_mon -1 --node virt-144 --show-detail > Cluster Summary: > * Stack: corosync > * Current DC: virt-145 (3) (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum > * Last updated: Mon Aug 10 12:25:08 2020 > * Last change: Mon Aug 10 12:17:55 2020 by root via cibadmin on virt-145 > * 3 nodes configured > * 12 resource instances configured > Node List: > * Online: [ virt-144 (2) ] > Active Resources: > * fence-virt-145 (stonith:fence_xvm): Started virt-144 > * Resource Group: dummy-group: > * dummy1 (ocf::pacemaker:Dummy): Started virt-144 > * dummy3 (ocf::pacemaker:Dummy): Started virt-144 > * Clone Set: dummy-clone [dummy]: > * dummy (ocf::pacemaker:Dummy): Started virt-144 > * virt-143 (ocf::pacemaker:remote): Started virt-144 Display a single node as XML. > [root@virt-145 ~]# crm_mon --node virt-144 --output-as xml > <pacemaker-result api-version="2.2" request="crm_mon --node virt-144 --output-as xml"> > <summary> > <stack type="corosync"/> > <current_dc present="true" version="2.0.4-5.el8-2deceaa3ae" name="virt-145" id="3" with_quorum="true"/> > <last_update time="Mon Aug 10 12:25:45 2020"/> > <last_change time="Mon Aug 10 12:17:55 2020" user="root" client="cibadmin" origin="virt-145"/> > <nodes_configured number="3"/> > <resources_configured number="12" disabled="0" blocked="0"/> > <cluster_options stonith-enabled="true" symmetric-cluster="true" no-quorum-policy="stop" maintenance-mode="false"/> > </summary> > <nodes> > <node name="virt-144" id="2" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="false" resources_running="5" type="member"/> > </nodes> > <resources> > <resource id="fence-virt-145" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="2" cached="true"/> > </resource> > <group 
id="dummy-group" number_resources="2"> > <resource id="dummy1" resource_agent="ocf::pacemaker:Dummy" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="2" cached="true"/> > </resource> > <resource id="dummy3" resource_agent="ocf::pacemaker:Dummy" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="2" cached="true"/> > </resource> > </group> > <clone id="dummy-clone" multi_state="false" unique="false" managed="true" failed="false" failure_ignored="false"> > <resource id="dummy" resource_agent="ocf::pacemaker:Dummy" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="2" cached="true"/> > </resource> > </clone> > <resource id="virt-143" resource_agent="ocf::pacemaker:remote" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="2" cached="true"/> > </resource> > </resources> > <node_history> > <node name="virt-144"> > <resource_history id="fence-virt-143" orphan="false" migration-threshold="1000000"> > <operation_history call="24" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 12:10:57 2020" exec-time="366ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="84" task="stop" last-rc-change="Mon Aug 10 12:16:40 2020" last-run="Mon Aug 10 12:16:40 2020" exec-time="2ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="fence-virt-144" orphan="false" migration-threshold="1000000"> > <operation_history call="19" task="stop" last-rc-change="Mon Aug 10 12:10:57 2020" last-run="Mon Aug 10 12:10:57 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/> > 
</resource_history> > <resource_history id="fence-virt-145" orphan="false" migration-threshold="1000000"> > <operation_history call="88" task="start" last-rc-change="Mon Aug 10 12:16:41 2020" last-run="Mon Aug 10 12:16:41 2020" exec-time="267ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="90" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 12:16:41 2020" exec-time="197ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy1" orphan="false" migration-threshold="1000000"> > <operation_history call="39" task="start" last-rc-change="Mon Aug 10 12:11:54 2020" last-run="Mon Aug 10 12:11:54 2020" exec-time="36ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="41" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:11:54 2020" exec-time="96ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy2" orphan="false" migration-threshold="1000000"> > <operation_history call="81" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:12:45 2020" exec-time="120ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="95" task="stop" last-rc-change="Mon Aug 10 12:16:42 2020" last-run="Mon Aug 10 12:16:42 2020" exec-time="414ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy3" orphan="false" migration-threshold="1000000"> > <operation_history call="51" task="start" last-rc-change="Mon Aug 10 12:11:57 2020" last-run="Mon Aug 10 12:11:57 2020" exec-time="102ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="53" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:11:58 2020" exec-time="78ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy4" orphan="false" migration-threshold="1000000"> > <operation_history call="73" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:12:43 2020" exec-time="84ms" 
queue-time="1ms" rc="0" rc_text="ok"/> > <operation_history call="99" task="stop" last-rc-change="Mon Aug 10 12:16:42 2020" last-run="Mon Aug 10 12:16:42 2020" exec-time="458ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy5" orphan="false" migration-threshold="1000000"> > <operation_history call="65" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:12:01 2020" exec-time="101ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="76" task="stop" last-rc-change="Mon Aug 10 12:12:45 2020" last-run="Mon Aug 10 12:12:45 2020" exec-time="126ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy" orphan="false" migration-threshold="1000000"> > <operation_history call="31" task="start" last-rc-change="Mon Aug 10 12:11:52 2020" last-run="Mon Aug 10 12:11:52 2020" exec-time="36ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="33" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:11:52 2020" exec-time="97ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="virt-143" orphan="false" migration-threshold="1000000"> > <operation_history call="2" task="start" last-rc-change="Mon Aug 10 12:16:41 2020" last-run="Mon Aug 10 12:16:41 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="3" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 12:16:45 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > </node> > </node_history> > <status code="0" message="OK"/> > </pacemaker-result> Try using the command without specifying the node name. > [root@virt-145 ~]# crm_mon -1 --node > crm_mon: Missing argument for --node Try using it with more than one node argument. It takes only the first one. 
> [root@virt-145 ~]# crm_mon -1 --node virt-144 virt-143
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:30:08 2020
>   * Last change:  Mon Aug 10 12:17:55 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
> Node List:
>   * Online: [ virt-144 ]
> Active Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Resource Group: dummy-group:
>     * dummy1 (ocf::pacemaker:Dummy): Started virt-144
>     * dummy3 (ocf::pacemaker:Dummy): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]
>   * virt-143 (ocf::pacemaker:remote): Started virt-144

Try using two `--node` options in one command. The `--node` option accepts only a single node name or tag as its argument, and crm_mon honors only one `--node` option per command: later instances overwrite earlier ones, i.e. `--node n1 --node n2` shows the status for node n2.

> [root@virt-145 ~]# crm_mon -1 --node virt-143 --node virt-144
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:30:15 2020
>   * Last change:  Mon Aug 10 12:17:55 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
> Node List:
>   * Online: [ virt-144 ]
> Active Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Resource Group: dummy-group:
>     * dummy1 (ocf::pacemaker:Dummy): Started virt-144
>     * dummy3 (ocf::pacemaker:Dummy): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]
>   * virt-143 (ocf::pacemaker:remote): Started virt-144

Remove node `virt-144` from the cluster and display its status with updates as they occur. The status shows only "Cluster Summary"; node details are not visible, and "Active Resources" reads "No active resources".
> [root@virt-145 ~]# pcs cluster node remove virt-144
> Destroying cluster on hosts: 'virt-144'...
> virt-144: Successfully destroyed cluster
> Sending updated corosync.conf to nodes...
> virt-145: Succeeded
> virt-145: Corosync configuration reloaded
> [root@virt-143 ~]# crm_mon --node virt-144
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:35:17 2020
>   * Last change:  Mon Aug 10 12:32:09 2020 by hacluster via crm_node on virt-145
>   * 2 nodes configured
>   * 11 resource instances configured
> Active Resources:
>   * No active resources

Add node `virt-144` back to the cluster and restart Pacemaker on it. The status for node `virt-144` is updated with information about the node and its active resources.

> [root@virt-145 ~]# pcs cluster node add virt-144
> No addresses specified for host 'virt-144', using 'virt-144'
> Disabling sbd...
> virt-144: sbd disabled
> Sending 'corosync authkey', 'pacemaker authkey' to 'virt-144'
> virt-144: successful distribution of the file 'corosync authkey'
> virt-144: successful distribution of the file 'pacemaker authkey'
> Sending updated corosync.conf to nodes...
> virt-145: Succeeded
> virt-144: Succeeded
> virt-145: Corosync configuration reloaded
> [root@virt-145 ~]# pcs cluster start --all
> virt-145: Starting Cluster...
> virt-144: Starting Cluster...
> [root@virt-143 ~]# crm_mon --node virt-144
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:52:39 2020
>   * Last change:  Mon Aug 10 12:52:28 2020 by hacluster via crmd on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
> Node List:
>   * Online: [ virt-144 ]
> Active Resources:
>   * fence-virt-143 (stonith:fence_xvm): Started virt-144
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Resource Group: dummy-group:
>     * dummy1 (ocf::pacemaker:Dummy): Started virt-144
>     * dummy3 (ocf::pacemaker:Dummy): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]

Display Pacemaker remote node `virt-143`.

> [root@virt-145 ~]# crm_mon -1 --node virt-143
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:57:18 2020
>   * Last change:  Mon Aug 10 12:52:28 2020 by hacluster via crmd on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
> Node List:
>   * RemoteOnline: [ virt-143 ]
> Active Resources:
>   * dummy2 (ocf::pacemaker:Dummy): Started virt-143
>   * dummy4 (ocf::pacemaker:Dummy): Started virt-143
>   * dummy5 (ocf::pacemaker:Dummy): Started virt-143
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-143 ]

Disable the Pacemaker remote resource `virt-143` and then display information about remote node `virt-143`. The remote node is marked "Offline" and has no active resources.
> [root@virt-145 ~]# pcs resource disable virt-143
> [root@virt-145 ~]# crm_mon -1 --node virt-143
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 12:58:47 2020
>   * Last change:  Mon Aug 10 12:58:42 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured (1 DISABLED)
> Node List:
>   * RemoteOFFLINE: [ virt-143 ]
> Active Resources:
>   * No active resources

Add the `--inactive` option to see the stopped (disabled) resources on remote node `virt-143`.

> [root@virt-145 ~]# crm_mon -1 --inactive --node virt-143
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 13:00:39 2020
>   * Last change:  Mon Aug 10 12:58:42 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured (1 DISABLED)
> Node List:
>   * RemoteOFFLINE: [ virt-143 ]
> Full List of Resources:
>   * virt-143 (ocf::pacemaker:remote): Stopped (disabled)

Enable remote node `virt-143` again.

> [root@virt-145 ~]# pcs resource enable virt-143

Disable the group and clone resources and then display node `virt-144`. According to "Cluster Summary", 5 resource instances are disabled, and only the started (active) resources are displayed.
> [root@virt-145 ~]# pcs resource disable dummy-group; pcs resource disable dummy-clone
> [root@virt-145 ~]# crm_mon -1 --node virt-144
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 13:04:51 2020
>   * Last change:  Mon Aug 10 13:04:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured (5 DISABLED)
> Node List:
>   * Online: [ virt-144 ]
> Active Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * virt-143 (ocf::pacemaker:remote): Started virt-144

Add the `--inactive` option to see the stopped (disabled) resources on node `virt-144`. Both started and stopped (disabled) resources are displayed under "Full List of Resources".

> [root@virt-145 ~]# crm_mon -1 --inactive --node virt-144
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Mon Aug 10 13:05:22 2020
>   * Last change:  Mon Aug 10 13:04:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured (5 DISABLED)
> Node List:
>   * Online: [ virt-144 ]
> Full List of Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Resource Group: dummy-group:
>     * dummy1 (ocf::pacemaker:Dummy): Stopped (disabled)
>     * dummy3 (ocf::pacemaker:Dummy): Stopped (disabled)
>   * Clone Set: dummy-clone [dummy]:
>     * Stopped (disabled): [ virt-144 ]
>   * virt-143 (ocf::pacemaker:remote): Started virt-144

Display node `virt-144` with the disabled group and clone resources as XML.
> [root@virt-145 ~]# crm_mon --node virt-144 --output-as xml > <pacemaker-result api-version="2.2" request="crm_mon --node virt-144 --output-as xml"> > <summary> > <stack type="corosync"/> > <current_dc present="true" version="2.0.4-5.el8-2deceaa3ae" name="virt-145" id="3" with_quorum="true"/> > <last_update time="Mon Aug 10 13:07:01 2020"/> > <last_change time="Mon Aug 10 13:04:37 2020" user="root" client="cibadmin" origin="virt-145"/> > <nodes_configured number="3"/> > <resources_configured number="12" disabled="5" blocked="0"/> > <cluster_options stonith-enabled="true" symmetric-cluster="true" no-quorum-policy="stop" maintenance-mode="false"/> > </summary> > <nodes> > <node name="virt-144" id="1" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="false" resources_running="2" type="member"/> > </nodes> > <resources> > <resource id="fence-virt-145" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="1" cached="true"/> > </resource> > <group id="dummy-group" number_resources="2"> > <resource id="dummy1" resource_agent="ocf::pacemaker:Dummy" role="Stopped" target_role="Stopped" active="false" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="0"/> > <resource id="dummy3" resource_agent="ocf::pacemaker:Dummy" role="Stopped" target_role="Stopped" active="false" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="0"/> > </group> > <clone id="dummy-clone" multi_state="false" unique="false" managed="true" failed="false" failure_ignored="false" target_role="Stopped"> > <resource id="dummy" resource_agent="ocf::pacemaker:Dummy" role="Stopped" target_role="Stopped" active="false" orphaned="false" blocked="false" managed="true" 
failed="false" failure_ignored="false" nodes_running_on="0"/> > <resource id="dummy" resource_agent="ocf::pacemaker:Dummy" role="Stopped" target_role="Stopped" active="false" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="0"/> > <resource id="dummy" resource_agent="ocf::pacemaker:Dummy" role="Stopped" target_role="Stopped" active="false" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="0"/> > </clone> > <resource id="virt-143" resource_agent="ocf::pacemaker:remote" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"> > <node name="virt-144" id="1" cached="true"/> > </resource> > </resources> > <node_history> > <node name="virt-144"> > <resource_history id="fence-virt-143" orphan="false" migration-threshold="1000000"> > <operation_history call="17" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 12:52:30 2020" exec-time="323ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="65" task="stop" last-rc-change="Mon Aug 10 13:02:15 2020" last-run="Mon Aug 10 13:02:15 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="fence-virt-145" orphan="false" migration-threshold="1000000"> > <operation_history call="15" task="start" last-rc-change="Mon Aug 10 12:52:30 2020" last-run="Mon Aug 10 12:52:30 2020" exec-time="185ms" queue-time="0ms" rc="0" rc_text="ok"/> > <operation_history call="19" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 12:52:30 2020" exec-time="303ms" queue-time="0ms" rc="0" rc_text="ok"/> > </resource_history> > <resource_history id="dummy4" orphan="false" migration-threshold="1000000"> > <operation_history call="62" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:58:43 2020" exec-time="105ms" queue-time="0ms" rc="0" rc_text="ok"/> > 
> <operation_history call="71" task="stop" last-rc-change="Mon Aug 10 13:02:16 2020" last-run="Mon Aug 10 13:02:16 2020" exec-time="337ms" queue-time="0ms" rc="0" rc_text="ok"/>
> </resource_history>
> <resource_history id="dummy1" orphan="false" migration-threshold="1000000">
> <operation_history call="52" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:52:32 2020" exec-time="155ms" queue-time="0ms" rc="0" rc_text="ok"/>
> <operation_history call="86" task="stop" last-rc-change="Mon Aug 10 13:04:36 2020" last-run="Mon Aug 10 13:04:36 2020" exec-time="276ms" queue-time="0ms" rc="0" rc_text="ok"/>
> </resource_history>
> <resource_history id="dummy3" orphan="false" migration-threshold="1000000">
> <operation_history call="57" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:52:32 2020" exec-time="361ms" queue-time="0ms" rc="0" rc_text="ok"/>
> <operation_history call="82" task="stop" last-rc-change="Mon Aug 10 13:04:35 2020" last-run="Mon Aug 10 13:04:35 2020" exec-time="113ms" queue-time="0ms" rc="0" rc_text="ok"/>
> </resource_history>
> <resource_history id="dummy" orphan="false" migration-threshold="1000000">
> <operation_history call="55" task="monitor" interval="10000ms" last-rc-change="Mon Aug 10 12:52:32 2020" exec-time="240ms" queue-time="1ms" rc="0" rc_text="ok"/>
> <operation_history call="90" task="stop" last-rc-change="Mon Aug 10 13:04:37 2020" last-run="Mon Aug 10 13:04:37 2020" exec-time="132ms" queue-time="0ms" rc="0" rc_text="ok"/>
> </resource_history>
> <resource_history id="virt-143" orphan="false" migration-threshold="1000000">
> <operation_history call="2" task="start" last-rc-change="Mon Aug 10 13:02:15 2020" last-run="Mon Aug 10 13:02:15 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/>
> <operation_history call="3" task="monitor" interval="60000ms" last-rc-change="Mon Aug 10 13:02:17 2020" exec-time="0ms" queue-time="0ms" rc="0" rc_text="ok"/>
> </resource_history>
> </node>
> </node_history>
> <status code="0" message="OK"/>
> </pacemaker-result>

Enable the group and clone resources, and enable the cluster services to run on startup on each node in the cluster. This allows nodes to automatically rejoin the cluster after they have been fenced.

> [root@virt-145 ~]# pcs resource enable dummy-group; pcs resource enable dummy-clone
> [root@virt-145 ~]# pcs cluster enable --all
> virt-144: Cluster Enabled
> virt-145: Cluster Enabled

Display the state of node `virt-144`, including fencing history. Since the node has not been fenced yet, no fencing history is shown.

> [root@virt-143 ~]# crm_mon --node virt-144 --fence-history=3
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Thu Aug 13 16:36:58 2020
>   * Last change: Thu Aug 13 16:25:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
>
> Node List:
>   * Online: [ virt-144 ]
>
> Active Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]
>   * dummy5 (ocf::pacemaker:Dummy): Started virt-144
>   * virt-143 (ocf::pacemaker:remote): Started virt-144

Fence node `virt-144` and check how the status of node `virt-144` and its fencing history are updated.

> [root@virt-145 ~]# pcs stonith fence virt-144
> Node: virt-144 fenced

There is now a pending fencing action for node `virt-144`.
> [root@virt-145 ~]# crm_mon --node virt-144 --fence-history=3
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Thu Aug 13 16:38:31 2020
>   * Last change: Thu Aug 13 16:25:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
>
> Node List:
>   * Online: [ virt-144 ]
>
> Active Resources:
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]
>   * dummy5 (ocf::pacemaker:Dummy): Started virt-144
>   * virt-143 (ocf::pacemaker:remote): Started virt-144
>
> Fencing History:
>   * reboot of virt-144 pending: client=stonith_admin.771629, origin=virt-145

Node `virt-144` has been fenced and is offline, and the "Fencing History" section was updated accordingly.

> [root@virt-145 ~]# crm_mon --node virt-144 --fence-history=3
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Thu Aug 13 16:38:53 2020
>   * Last change: Thu Aug 13 16:25:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
>
> Node List:
>   * OFFLINE: [ virt-144 ]
>
> Active Resources:
>   * No active resources
>
> Fencing History:
>   * reboot of virt-144 successful: delegate=virt-145, client=stonith_admin.771629, origin=virt-145, completed='2020-08-13 16:38:34 +02:00'

After approximately 2 minutes, node `virt-144` rejoins the cluster and all its resources are restarted.
> [root@virt-145 ~]# crm_mon --node virt-144 --fence-history=3
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-145 (version 2.0.4-5.el8-2deceaa3ae) - partition with quorum
>   * Last updated: Thu Aug 13 16:40:29 2020
>   * Last change: Thu Aug 13 16:25:37 2020 by root via cibadmin on virt-145
>   * 3 nodes configured
>   * 12 resource instances configured
>
> Node List:
>   * Online: [ virt-144 ]
>
> Active Resources:
>   * fence-virt-143 (stonith:fence_xvm): Started virt-144
>   * fence-virt-145 (stonith:fence_xvm): Started virt-144
>   * Clone Set: dummy-clone [dummy]:
>     * Started: [ virt-144 ]
>   * dummy5 (ocf::pacemaker:Dummy): Started virt-144
>
> Fencing History:
>   * reboot of virt-144 successful: delegate=virt-145, client=stonith_admin.771629, origin=virt-145, completed='2020-08-13 16:38:34 +02:00'

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4804
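As an aside, the per-node operation history shown in the XML output earlier in this verification can be checked programmatically. A minimal sketch, assuming the XML has been captured to a string (the trimmed `SAMPLE` fragment and the `failed_operations` helper below are illustrative, not part of pacemaker):

```python
# Sketch: scan crm_mon XML output and report any operation whose return
# code (rc) is not 0 ("ok"). SAMPLE is a trimmed fragment modeled on the
# node_history output shown above; in practice the XML would come from
# crm_mon's XML output mode.
import xml.etree.ElementTree as ET

SAMPLE = """\
<pacemaker-result api-version="2.0" request="crm_mon">
  <node_history>
    <node name="virt-143">
      <resource_history id="dummy1" orphan="false" migration-threshold="1000000">
        <operation_history call="52" task="monitor" interval="10000ms" rc="0" rc_text="ok"/>
        <operation_history call="86" task="stop" rc="0" rc_text="ok"/>
      </resource_history>
    </node>
  </node_history>
  <status code="0" message="OK"/>
</pacemaker-result>
"""

def failed_operations(xml_text):
    """Return (resource id, call, task) tuples for operations with rc != 0."""
    root = ET.fromstring(xml_text)
    failures = []
    for rsc in root.iter("resource_history"):
        for op in rsc.iter("operation_history"):
            if op.get("rc") != "0":
                failures.append((rsc.get("id"), op.get("call"), op.get("task")))
    return failures

print(failed_operations(SAMPLE))  # [] -> every recorded operation succeeded
```

An empty result means every operation in the history completed cleanly, matching the `rc="0" rc_text="ok"` entries seen in the verification output above.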