Bug 2065812
| Summary: | Show node health states in crm_mon | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Ken Gaillot <kgaillot> |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.6 | CC: | bfrank, cluster-maint, jrehova, msmazova, slevine |
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
| Target Release: | 8.7 | Flags: | pm-rhel: mirror+ |
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | pacemaker-2.1.3-1.el8 | Doc Type: | Enhancement |
| Doc Text: |
Feature: If a cluster has a node health strategy configured, then nodes with at least one health attribute in "yellow" or "red" status will be indicated as such in pcs status output.
Reason: Previously, there was no easy way to tell why resources were not running on a node with degraded health.
Result: Degraded node health can be seen at a glance in pcs status output.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-11-08 09:42:25 UTC | Type: | Feature Request |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Ken Gaillot
2022-03-18 19:22:34 UTC
The message could be more specific, like "Online but node health score is $INTEGER (red):"

Feature merged upstream as of commit 398d8aa
With the final design, nodes with degraded health will be shown in pcs status like:
* Node List:
  * Node node1: online (health is RED)
The indicator will say RED if at least one health attribute is red, otherwise YELLOW if at least one health attribute is yellow, otherwise no health status will be shown.
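The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this report, not Pacemaker's actual implementation: the function names `health_indicator` and `node_line` are hypothetical, and the sketch assumes every health attribute value is one of the literal strings "red", "yellow", or "green".

```python
from typing import Iterable, Optional


def health_indicator(values: Iterable[str]) -> Optional[str]:
    """Summarize one node's health attribute values as "RED", "YELLOW", or None."""
    # Assumption: values are the strings "red", "yellow", or "green".
    seen = {value.strip().lower() for value in values}
    if "red" in seen:        # red takes precedence over everything else
        return "RED"
    if "yellow" in seen:     # yellow only matters when nothing is red
        return "YELLOW"
    return None              # all green (or no attributes): no indicator


def node_line(name: str, values: Iterable[str]) -> Optional[str]:
    """Format a status line for a degraded online node, or None if healthy."""
    label = health_indicator(values)
    if label is None:
        return None
    return f"* Node {name}: online (health is {label})"
```

For example, `node_line("node1", ["green", "red"])` returns `* Node node1: online (health is RED)`, while a node whose attributes are all green yields `None` and would be listed without any health status.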
* 2-node cluster
* dummy fence agent installed on both nodes as /usr/sbin/fence_bz1978010: https://github.com/ClusterLabs/fence-agents/blob/master/agents/dummy/fence_dummy.py
* node-health-strategy: only-green

Testing
=========

> [root@virt-557 ~]# rpm -q pacemaker
> pacemaker-2.1.3-2.el8.x86_64

Pcs status of cluster:

> [root@virt-557 15:24:23 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:24:31 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Online: [ virt-557 virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-557
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-557
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Test of #health-cpu attribute, possibilities:

> [root@virt-557 15:24:32 ~]# attrd_updater --name "#health-cpu" --update "red" --node "virt-557"
> [root@virt-557 15:25:45 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:25:54 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:25:54 ~]# attrd_updater --name "#health-cpu" --update "red" --node "virt-558"
> [root@virt-557 15:26:23 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:26:30 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Node virt-558: online (health is RED)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:27:34 ~]# attrd_updater --name "#health-cpu" --update "green" --node "virt-557"
> [root@virt-557 15:27:52 ~]# attrd_updater --name "#health-cpu" --update "yellow" --node "virt-558"
> [root@virt-557 15:28:20 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:28:35 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-558: online (health is YELLOW)
>   * Online: [ virt-557 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-557
>   * fence-virt-558 (stonith:fence_xvm): Started virt-557
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-557
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:28:35 ~]# attrd_updater --name "#health-cpu" --update "green" --node "virt-558"
> [root@virt-557 15:29:07 ~]# attrd_updater --name "#health-cpu" --update "yellow" --node "virt-557"
> [root@virt-557 15:29:18 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:29:23 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is YELLOW)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Test of #health-iowait attribute, possibilities:

> [root@virt-557 15:29:41 ~]# attrd_updater --name "#health-iowait" --update "red" --node "virt-557"
> [root@virt-557 15:31:18 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:31:22 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:31:23 ~]# attrd_updater --name "#health-iowait" --update "red" --node "virt-558"
> [root@virt-557 15:31:35 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:31:37 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Node virt-558: online (health is RED)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:31:51 ~]# attrd_updater --name "#health-iowait" --update "green" --node "virt-558"
> [root@virt-557 15:32:00 ~]# attrd_updater --name "#health-iowait" --update "yellow" --node "virt-557"
> [root@virt-557 15:32:19 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:32:24 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is YELLOW)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:32:25 ~]# attrd_updater --name "#health-iowait" --update "yellow" --node "virt-558"
> [root@virt-557 15:32:43 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:32:47 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is YELLOW)
>   * Node virt-558: online (health is YELLOW)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Test of #health-healthsmart attribute, possibilities:

> [root@virt-557 15:33:38 ~]# attrd_updater --name "#health-healthsmart" --update "red" --node "virt-557"
> [root@virt-557 15:34:41 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:34:48 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:34:48 ~]# attrd_updater --name "#health-healthsmart" --update "red" --node "virt-558"
> [root@virt-557 15:34:55 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:34:57 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Node virt-558: online (health is RED)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:35:21 ~]# attrd_updater --name "#health-healthsmart" --update "yellow" --node "virt-557"
> [root@virt-557 15:35:40 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:35:45 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is YELLOW)
>   * Online: [ virt-558 ]
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Started virt-558
>   * fence-virt-558 (stonith:fence_xvm): Started virt-558
>   * resource_dummy (ocf::pacemaker:Dummy): Started virt-558
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:35:45 ~]# attrd_updater --name "#health-healthsmart" --update "yellow" --node "virt-558"
> [root@virt-557 15:35:52 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:35:54 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is YELLOW)
>   * Node virt-558: online (health is YELLOW)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-557 15:36:20 ~]# attrd_updater --name "#health-healthsmart" --update "red" --node "virt-557"
> [root@virt-557 15:55:10 ~]# attrd_updater --name "#health-healthsmart" --update "yellow" --node "virt-558"
> [root@virt-557 15:55:21 ~]# pcs status
> Cluster name: STSRHTS20634
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-558 (version 2.1.3-2.el8-da2fd79c89) - partition with quorum
>   * Last updated: Wed Jun 29 15:55:28 2022
>   * Last change: Wed Jun 29 15:17:05 2022 by root via cibadmin on virt-557
>   * 2 nodes configured
>   * 3 resource instances configured
>
> Node List:
>   * Node virt-557: online (health is RED)
>   * Node virt-558: online (health is YELLOW)
>
> Full List of Resources:
>   * fence-virt-557 (stonith:fence_xvm): Stopped
>   * fence-virt-558 (stonith:fence_xvm): Stopped
>   * resource_dummy (ocf::pacemaker:Dummy): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Result: The indicator said RED if at least one health attribute is red, otherwise YELLOW if at least one health attribute is yellow, otherwise no health status is shown. Cluster is acting well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7573