Bug 1880426
| Summary: | crm_mon does not return cluster status when only fencing history is unobtainable | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Simon Foucek <sfoucek> |
| Component: | pacemaker | Assignee: | Chris Lumens <clumens> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.3 | CC: | cluster-maint, kgaillot, msmazova |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 8.4 | Flags: | pm-rhel:
mirror+
|
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | pacemaker-2.0.5-6.el8 | Doc Type: | Bug Fix |
| Doc Text: | Cause: crm_mon (pcs status) exited with an error if the fencer could not be contacted for fencing history. Consequence: No status would be available if the fencer could not be reached, even if the remaining cluster status was known. Fix: crm_mon now displays an error if the fencer cannot be contacted, but proceeds with displaying any known status. Result: Some status information is still viewable even if the fencer cannot be contacted. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-18 15:26:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Fix is merged upstream as of commit 8c51b49

a) XML output test

Before fix:

>[root@virt-492 ~]# rpm -q pacemaker
>pacemaker-2.0.5-2.el8.x86_64
>[root@virt-492 ~]# pkill -KILL pacemaker-fence && crm_mon --output-as=xml
><pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
>  <status code="0" message="OK">
>    <errors>
>      <error>Critical: Unable to get stonith-history</error>
>      <error>Connection to the cluster-daemons terminated</error>
>      <error>Reading stonith-history failed</error>
>    </errors>
>  </status>
></pacemaker-result>
>[root@virt-492 ~]# echo "$?"
>0

Result: Return value and status code are 0, but the output contains no cluster status info.

After fix:

>[root@virt-488 ~]# rpm -q pacemaker
>pacemaker-2.0.5-6.el8.x86_64
>[root@virt-488 ~]# pkill -KILL pacemaker-fence && crm_mon --output-as=xml
><pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
>  <summary>
>    <stack type="corosync"/>
>    <current_dc present="true" version="2.0.5-6.el8-ba59be7122" name="virt-489" id="2" with_quorum="true"/>
>    <last_update time="Thu Feb 11 17:06:32 2021"/>
>    <last_change time="Thu Feb 11 17:01:22 2021" user="root" client="cibadmin" origin="virt-488"/>
>    <nodes_configured number="2"/>
>    <resources_configured number="2" disabled="0" blocked="0"/>
>    <cluster_options stonith-enabled="true" symmetric-cluster="true" no-quorum-policy="stop" maintenance-mode="false" stop-all-resources="false"/>
>  </summary>
>  <nodes>
>    <node name="virt-488" id="1" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="false" resources_running="1" type="member"/>
>    <node name="virt-489" id="2" online="true" standby="false" standby_onfail="false" maintenance="false" pending="false" unclean="false" shutdown="false" expected_up="true" is_dc="true" resources_running="1" type="member"/>
>  </nodes>
>  <resources>
>    <resource id="fence-virt-488" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
>      <node name="virt-488" id="1" cached="true"/>
>    </resource>
>    <resource id="fence-virt-489" resource_agent="stonith:fence_xvm" role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1">
>      <node name="virt-489" id="2" cached="true"/>
>    </resource>
>  </resources>
>  <node_history>
>    <node name="virt-489">
>      <resource_history id="fence-virt-489" orphan="false" migration-threshold="1000000">
>        <operation_history call="10" task="start" rc="0" rc_text="ok" last-rc-change="Thu Feb 11 17:01:22 2021" last-run="Thu Feb 11 17:01:22 2021" exec-time="55ms" queue-time="0ms"/>
>        <operation_history call="12" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Thu Feb 11 17:01:23 2021" exec-time="47ms" queue-time="0ms"/>
>      </resource_history>
>    </node>
>    <node name="virt-488">
>      <resource_history id="fence-virt-488" orphan="false" migration-threshold="1000000" fail-count="1" last-failure="Thu Feb 11 17:06:32 2021">
>        <operation_history call="6" task="start" rc="0" rc_text="ok" last-rc-change="Thu Feb 11 17:01:19 2021" last-run="Thu Feb 11 17:01:19 2021" exec-time="51ms" queue-time="0ms"/>
>        <operation_history call="8" task="monitor" rc="0" rc_text="ok" interval="60000ms" last-rc-change="Thu Feb 11 17:01:19 2021" exec-time="73ms" queue-time="0ms"/>
>        <operation_history call="8" task="monitor" rc="1" rc_text="error" interval="60000ms"/>
>      </resource_history>
>    </node>
>  </node_history>
>  <failures>
>    <failure op_key="fence-virt-488_monitor_60000" node="virt-488" exitstatus="error" exitreason="" exitcode="1" call="8" status="Error"/>
>  </failures>
>  <fence_history status="102"/>
>  <status code="0" message="OK"/>
></pacemaker-result>
>[root@virt-488 ~]# echo "$?"
>0

Result: Return value is 0 and the output contains the normal cluster status XML; the only error it reports is the fencing history error.

b) non-XML output

Before fix:

>[root@virt-492 ~]# rpm -q pacemaker
>pacemaker-2.0.5-2.el8.x86_64
>[root@virt-492 ~]# pkill -KILL pacemaker-fence && crm_mon -1
>Critical: Unable to get stonith-history
>Connection to the cluster-daemons terminated
>Reading stonith-history failed
>[root@virt-492 ~]# echo "$?"
>0

Result: Return value and status code are 0, but the output contains no cluster status info.

After fix:

>[root@virt-488 ~]# rpm -q pacemaker
>pacemaker-2.0.5-6.el8.x86_64
>[root@virt-488 ~]# pkill -KILL pacemaker-fence && crm_mon -1
>Cluster Summary:
>  * Stack: corosync
>  * Current DC: virt-489 (version 2.0.5-6.el8-ba59be7122) - partition with quorum
>  * Last updated: Thu Feb 11 17:08:43 2021
>  * Last change: Thu Feb 11 17:01:22 2021 by root via cibadmin on virt-488
>  * 2 nodes configured
>  * 2 resource instances configured
>
>Node List:
>  * Online: [ virt-488 virt-489 ]
>
>Active Resources:
>  * fence-virt-488 (stonith:fence_xvm): Started virt-488
>  * fence-virt-489 (stonith:fence_xvm): Started virt-489
>
>Failed Resource Actions:
>  * fence-virt-488_monitor_60000 on virt-488 'error' (1): call=23, status='Error', exitreason='', last-rc-change='2021-02-11 17:07:32 +01:00', queued=0ms, exec=2ms
>
>Failed Fencing Actions:
>  * Failed to get fencing history: Not connected
>[root@virt-488 ~]# echo "$?"
>0

Result: Return value is 0 and the output contains the normal cluster status in non-XML form; the only error it reports is the fencing history error.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2021:1782
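
For anyone scripting against the fixed behaviour, the following is a minimal Python sketch of how the fencing history state could be read from the XML output. It relies only on the `<fence_history status="..."/>` element visible in the after-fix output quoted in this report. The function name `fence_history_status` and the trimmed `SAMPLE` document are illustrative, and treating any non-zero value as "history not retrieved" is an assumption based on that single sample, not on a documented contract.

```python
import xml.etree.ElementTree as ET


def fence_history_status(crm_mon_xml: str):
    """Return the integer status attribute of <fence_history>, or None if absent."""
    root = ET.fromstring(crm_mon_xml)
    element = root.find("fence_history")  # direct child of <pacemaker-result>
    if element is None or element.get("status") is None:
        return None
    return int(element.get("status"))


# Trimmed sample modelled on the after-fix output quoted in this report.
SAMPLE = """<pacemaker-result api-version="2.3" request="crm_mon --output-as=xml">
  <summary><stack type="corosync"/></summary>
  <fence_history status="102"/>
  <status code="0" message="OK"/>
</pacemaker-result>"""

status = fence_history_status(SAMPLE)
if status:  # assumption: any non-zero value means the history was not retrieved
    print(f"fencing history unavailable (status={status})")
```

The rest of the document (summary, nodes, resources) can still be read normally in this case, which is the point of the fix.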
Description of problem:
If the return value is 0 even when fencing history is unobtainable, then the rest of the cluster status info should be present in the output. Alternatively, the return value shouldn't be 0 when there's no cluster status info present in the output.

Version-Release number of selected component (if applicable):
pacemaker-2.0.4-6.el8.x86_64

How reproducible:
always

Steps to Reproduce:

a) Kill pacemaker-fenced and then immediately try to print the output as XML.

1. [root@virt-242 ~]# pkill -KILL pacemaker-fence && crm_mon --output-as=xml
<pacemaker-result api-version="2.2" request="crm_mon --output-as=xml">
  <status code="0" message="OK">
    <errors>
      <error>Critical: Unable to get stonith-history</error>
      <error>Connection to the cluster-daemons terminated</error>
      <error>Reading stonith-history failed</error>
    </errors>
  </status>
</pacemaker-result>

Check return value.

2. [root@virt-242 ~]# echo "$?"
0

b) Kill pacemaker-fenced and then immediately try to print the output in non-XML form.

1. [root@virt-242 ~]# pkill -KILL pacemaker-fence && crm_mon -1
Critical: Unable to get stonith-history
Connection to the cluster-daemons terminated
Reading stonith-history failed

Check return value.

2. [root@virt-242 ~]# echo "$?"
0

Actual results:
Return value and status code are 0, but no cluster status info is output.

Expected results:
If the return value is 0 even when fencing history is unobtainable, then the rest of the cluster status info should be present in the output. Alternatively, the return value shouldn't be 0 when there's no cluster status info present in the output.

Additional info:
Discovered when verifying bz1793653.
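
The expected result stated in this report can be expressed as a small consistency check. The Python sketch below flags the reported situation: an exit code of 0 together with XML output that carries no cluster status. The function name `check_crm_mon_result` is hypothetical, and using the presence of a `<summary>` element as the signal for "cluster status is present" is an assumption drawn from the outputs quoted in this bug, not from a documented rule.

```python
import xml.etree.ElementTree as ET


def check_crm_mon_result(returncode: int, crm_mon_xml: str) -> list:
    """Return a list of problems; an empty list means the result is consistent."""
    root = ET.fromstring(crm_mon_xml)
    # Assumption: a <summary> child means cluster status was actually printed.
    has_status = root.find("summary") is not None
    problems = []
    if returncode == 0 and not has_status:
        problems.append("exit code 0 but no cluster status in output")
    return problems


# Output reported in this bug for the unfixed package.
BEFORE_FIX = """<pacemaker-result api-version="2.2" request="crm_mon --output-as=xml">
  <status code="0" message="OK">
    <errors>
      <error>Critical: Unable to get stonith-history</error>
      <error>Connection to the cluster-daemons terminated</error>
      <error>Reading stonith-history failed</error>
    </errors>
  </status>
</pacemaker-result>"""

print(check_crm_mon_result(0, BEFORE_FIX))
```

On a live cluster the two arguments could come from `subprocess.run(["crm_mon", "--output-as=xml"], capture_output=True, text=True)`, passing its `returncode` and `stdout`.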