Bug 1595444

Summary: RFE: ability to clear stonith history
Product: Red Hat Enterprise Linux 7 Reporter: Ken Gaillot <kgaillot>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: high Docs Contact:
Priority: high    
Version: 7.6 CC: abeekhof, aherr, cfeist, cluster-maint, cluster-qe, idevat, kwenning, michele.sandro.emma, mmazoure, naresh.sukhija_ext, nhostako, nwahl, obenes, omular, rbeyel, sanyadav, tojeline
Target Milestone: rc Keywords: FutureFeature
Target Release: 7.7   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.9.168-1.el7 Doc Type: Enhancement
Doc Text:
.The `pcs` commands now support display, cleanup, and synchronization of fencing history

Pacemaker's fence daemon tracks a history of all fence actions taken (pending, successful, and failed). With this release, the `pcs` commands allow users to access the fencing history in the following ways:

* The `pcs status` command shows failed and pending fencing actions
* The `pcs status --full` command shows the entire fencing history
* The `pcs stonith history` command provides options to display and clean up fencing history
* Although fencing history is synchronized automatically, the `pcs stonith history` command now supports an `update` option that allows a user to manually synchronize fencing history should that be necessary
Story Points: ---
Clone Of: 1595422
: 1620190 (view as bug list) Environment:
Last Closed: 2020-03-31 19:09:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1595422    
Bug Blocks: 1461964, 1608369    
Attachments:
Description Flags
proposed fix + tests none

Description Ken Gaillot 2018-06-26 22:12:09 UTC
+++ This bug was initially created as a clone of Bug #1595422 +++

Description of problem: Pacemaker's fence daemon tracks a history of all fence actions taken (pending, successful, and failed), which can be displayed by the stonith_admin --history command, and soon by crm_mon (pcs status). However, there is no way to clear the history, which will be especially relevant if showing fence failures becomes the default in crm_mon as expected.

A possible interface is a new stonith_admin option, e.g. --clear-history. It may be worthwhile to accept an optional argument "failures" or "all" (defaulting to "all"), or perhaps "all" should be the only behavior.

+++

We will need corresponding functionality in pcs.

Comment 4 Tomas Jelinek 2019-07-16 15:50:01 UTC
Pacemaker CLI interface:
* cleanup the history for a specified node / all nodes:
  stonith_admin --history <node>|* --cleanup
* display the history for a specified node / all nodes:
  stonith_admin --history <node>|* --verbose
* update the history for a specified node / all nodes:
  stonith_admin --history <node>|* --broadcast
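
The three invocations above differ only in their final flag, so a wrapper can build them from one table. The following is a minimal sketch, not the pcs implementation: the function name and the action keys ("show", "cleanup", "update") are hypothetical, and only the stonith_admin options listed in this comment are used.

```python
# Hypothetical helper: maps an action name to the stonith_admin flag
# listed in this comment. Action names are illustrative only.
HISTORY_FLAGS = {
    "cleanup": "--cleanup",
    "show": "--verbose",
    "update": "--broadcast",
}


def stonith_history_argv(action, node=None):
    """Build the argument list for one stonith_admin history call.

    node=None means "all nodes", which stonith_admin spells as "*".
    """
    if action not in HISTORY_FLAGS:
        raise ValueError("unknown history action: %r" % (action,))
    target = "*" if node is None else node
    return ["stonith_admin", "--history", target, HISTORY_FLAGS[action]]


print(stonith_history_argv("cleanup"))
print(stonith_history_argv("show", node="lion76"))
```

Returning a list rather than a shell string means the "*" can be handed to subprocess.run() without a shell, so it reaches stonith_admin literally instead of being glob-expanded.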

To find out whether fence history is supported, check crm_mon.rng for the presence of a fence_history element.
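
One way to perform that check is to parse the schema and look for an element definition with that name. This is a sketch under assumptions: the function name is hypothetical, it assumes the schema declares the element as a standard RELAX NG `<element name="fence_history">` node, and it takes the schema text as input rather than guessing where crm_mon.rng is installed.

```python
import xml.etree.ElementTree as ET

# Standard RELAX NG namespace used by .rng schema files.
RNG_NS = "http://relaxng.org/ns/structure/1.0"


def rng_declares_element(rng_text, element_name="fence_history"):
    """Return True if the RELAX NG schema text defines the named element."""
    root = ET.fromstring(rng_text)
    # iter() walks the whole tree, including the root node itself.
    for node in root.iter("{%s}element" % RNG_NS):
        if node.get("name") == element_name:
            return True
    return False
```

A caller would read the installed crm_mon.rng (the location depends on the Pacemaker packaging) and pass its contents to this function.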

Comment 5 Tomas Jelinek 2019-07-16 15:51:09 UTC
Created attachment 1591105 [details]
proposed fix + tests

Backport from RHEL-8.0

New pcs commands introduced:
1) pcs stonith history show
Displays the whole fencing history, optionally filtered by one node name.
2) pcs stonith history cleanup
Cleans up the fencing history, optionally filtered by one node name.
3) pcs stonith history update
Synchronizes the fencing history in the local cluster.

If a cluster is running pacemaker without fencing history support, the pcs commands exit with an error pointing that out.
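
For scripts that drive these commands, the sketch below builds the pcs argument lists and recognizes the "not supported" error. It rests on assumptions: the helper names are hypothetical, the node name is assumed to be a trailing positional argument, a node filter is rejected for update because this comment mentions one only for show and cleanup, and the error text is taken from the verification transcript in this bug rather than from a documented interface.

```python
# Message fragment as it appears in the verification transcript.
UNSUPPORTED_MARKER = "Fence history is not supported"

SUBCOMMANDS = ("show", "cleanup", "update")


def pcs_history_argv(subcommand, node=None):
    """Build the argv for 'pcs stonith history <subcommand> [node]'."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError("unknown subcommand: %r" % (subcommand,))
    argv = ["pcs", "stonith", "history", subcommand]
    if node is not None:
        if subcommand == "update":
            # Assumption: update synchronizes the whole local cluster.
            raise ValueError("update does not take a node filter")
        argv.append(node)
    return argv


def is_unsupported_error(output):
    """Return True if pcs output reports missing fence history support."""
    return UNSUPPORTED_MARKER in output
```

Matching on a message fragment is fragile if the wording changes between pcs versions, so a caller may prefer to check the schema for the fence_history element first.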

Comment 8 Ivan Devat 2019-08-05 11:12:48 UTC
After Fix:

[kid76 ~] $ rpm -q pcs
pcs-0.9.168-1.el7.x86_64

[kid76 ~] $ rpm -q pacemaker
pacemaker-1.1.19-8.el7.x86_64
[kid76 ~] $ pcs stonith history show
Error: Fence history is not supported, please upgrade pacemaker

[kid76 ~] $ rpm -q pacemaker
pacemaker-1.1.20-5.el7.x86_64

[kid76 ~] $ pcs stonith history show

> force a stonith history record, e.g. by turning off the node
[kid76 ~] $ pcs stonith history show
crmd.4619 at kid76 wishes to reboot node lion76 - 1 0
kid76 failed to reboot node lion76 on behalf of crmd.4619 from kid76 at Fri Aug  2 14:28:37 2019

kid76 failed to reboot node lion76 on behalf of crmd.4619 from kid76 at Fri Aug  2 14:29:38 2019
[kid76 ~] $ pcs stonith history cleanup
cleaning up fencing-history for node *
[kid76 ~] $ pcs stonith history show
crmd.4619 at kid76 wishes to reboot node lion76 - 1 0

[kid76 ~] $ pcs stonith history update
gather fencing-history from all nodes
[kid76 ~] $ echo $?
0
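
When post-processing output like the transcript above, the failed-action lines can be split into fields with a regular expression. This is a sketch only: the function and field names are hypothetical, and the pattern is derived solely from the two "failed to" lines shown here, so other line shapes (such as the pending "wishes to" line) are deliberately left unmatched.

```python
import re

# Pattern modeled on the "failed to" lines in the transcript above.
FAILED_LINE = re.compile(
    r"^(?P<delegate>\S+) failed to (?P<action>\S+) node (?P<target>\S+) "
    r"on behalf of (?P<client>\S+) from (?P<origin>\S+) at (?P<when>.+)$"
)


def parse_failed_fence_line(line):
    """Return a dict of fields for a failed fence action, or None."""
    match = FAILED_LINE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


sample = (
    "kid76 failed to reboot node lion76 on behalf of crmd.4619 "
    "from kid76 at Fri Aug  2 14:28:37 2019"
)
print(parse_failed_fence_line(sample))
```

The timestamp is kept as the raw string because its format in other Pacemaker versions or locales is not established by this report.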

Comment 10 Tomas Jelinek 2019-09-03 13:24:00 UTC
*** Bug 1748376 has been marked as a duplicate of this bug. ***

Comment 30 errata-xmlrpc 2020-03-31 19:09:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0996