Bug 1890808

Summary: New etcd alerts need to be added to the monitoring stack
Product: OpenShift Container Platform
Reporter: Naga Ravi Chaitanya Elluri <nelluri>
Component: Etcd
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA
QA Contact: ge liu <geliu>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.6
CC: alegrand, anpicker, erooth, kakkoyun, lcosic, nelluri, pkrupa, skrenger, spasquie, surbania, wking
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard: aos-scalability-46
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: improved etcd alerting
- critical alert when the etcd database quota is 95% full.
- warning alert when there is a sudden surge in etcd writes leading to an increase in the etcd database quota size.
- critical alert when the 99th percentile of WAL fsync duration is greater than 1 second.
Reason: cluster admins should have accurate observability regarding operand health.
Result: alerting will more accurately reflect the actual observed health of etcd.
Story Points: ---
Clone Of:
: 1960465
Environment:
Last Closed: 2021-02-24 15:27:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1960465    

Description Naga Ravi Chaitanya Elluri 2020-10-22 21:35:28 UTC
Description of problem:
We have a couple of new alerts around etcd: https://github.com/etcd-io/etcd/pull/12249, https://github.com/etcd-io/etcd/pull/12266. The cluster monitoring operator needs to be modified to pick them up.

Actual results:
New etcd alerts are missing.

Expected results:
New etcd alerts are present and active.
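
For illustration only, the three conditions listed in the Doc Text can be sketched as a small check. This is not the rule definition shipped by etcd or by the cluster monitoring operator. The `EtcdSample` type, its field names, and the `evaluate_alerts` helper are hypothetical, and the 4-hour projection window used to model a "sudden surge" in writes is an assumption, since the report gives no threshold for that alert.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds taken from the Doc Text of this bug.
QUOTA_FULL_RATIO = 0.95             # critical when the quota is 95% full
WAL_FSYNC_P99_LIMIT_SECONDS = 1.0   # critical when p99 WAL fsync > 1 second

# Assumption (not stated in this report): a write surge is modeled as the
# database being projected to exceed its quota within this window.
GROWTH_HORIZON_SECONDS = 4 * 60 * 60


@dataclass
class EtcdSample:
    """Hypothetical snapshot of the values the alerts depend on."""
    db_size_bytes: float
    quota_bytes: float
    wal_fsync_p99_seconds: float
    db_growth_bytes_per_second: float


def evaluate_alerts(sample: EtcdSample) -> List[Tuple[str, str]]:
    """Return (severity, message) pairs for every condition that fires."""
    alerts: List[Tuple[str, str]] = []

    # Critical: database size has reached 95% of the quota.
    if sample.db_size_bytes / sample.quota_bytes >= QUOTA_FULL_RATIO:
        alerts.append(("critical", "etcd database quota is at least 95% full"))

    # Warning: linear projection of current growth exceeds the quota.
    projected = (sample.db_size_bytes
                 + sample.db_growth_bytes_per_second * GROWTH_HORIZON_SECONDS)
    if projected > sample.quota_bytes:
        alerts.append(("warning", "etcd writes are growing the database quickly"))

    # Critical: 99th percentile WAL fsync duration above 1 second.
    if sample.wal_fsync_p99_seconds > WAL_FSYNC_P99_LIMIT_SECONDS:
        alerts.append(("critical", "p99 WAL fsync duration is above 1 second"))

    return alerts


if __name__ == "__main__":
    sample = EtcdSample(
        db_size_bytes=7.7e9,
        quota_bytes=8e9,
        wal_fsync_p99_seconds=0.01,
        db_growth_bytes_per_second=0.0,
    )
    for severity, message in evaluate_alerts(sample):
        print(f"{severity}: {message}")
```

In a real cluster these conditions would be Prometheus alerting rules evaluated against etcd metrics; the sketch only makes the thresholds from the Doc Text concrete.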

Comment 8 errata-xmlrpc 2021-02-24 15:27:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633