Bug 1811834 - Alerts on DELETECOLLECTION latency spam and are not useful
Summary: Alerts on DELETECOLLECTION latency spam and are not useful
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Pawel Krupa
QA Contact: hongyan li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-09 21:23 UTC by Steve Kuznetsov
Modified: 2020-07-13 17:19 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-13 17:19:21 UTC
Target Upstream Version:
Embargoed:



Links
System: Github
ID: openshift cluster-monitoring-operator pull 722
Private: 0
Priority: None
Status: closed
Summary: Bug 1811834: Sync jsonnet dependencies
Last Updated: 2021-02-09 13:30:46 UTC

Description Steve Kuznetsov 2020-03-09 21:23:58 UTC
Description of problem:

Alerts for latency on DELETECOLLECTION are too sensitive. The latency of a DELETECOLLECTION call scales with the number of items being deleted, so a long latency is not necessarily a problem unless the size of the deletion set is known. A 40ms response when deleting 100 things should not fire anything. These alerts fire often enough that ops teams will become fatigued.
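
To illustrate the point about latency scaling with the deletion set, here is a minimal Python sketch that compares a fixed latency threshold with a per-item check. It is not the alert rule itself and not the fix that shipped. The threshold value, the function names, and the assumption that the number of deleted items is known are all hypothetical; the alert quoted in this report does not carry an item count.

```python
# Hypothetical threshold for illustration only; not the value used by the
# real KubeAPILatencyHigh rule.
LATENCY_THRESHOLD_SECONDS = 0.03


def naive_should_alert(latency_seconds: float) -> bool:
    """Fire whenever the whole request is slower than a fixed threshold."""
    return latency_seconds > LATENCY_THRESHOLD_SECONDS


def size_aware_should_alert(latency_seconds: float, items_deleted: int) -> bool:
    """Fire only when the latency per deleted item exceeds the threshold.

    Assumes the caller knows items_deleted, which is an assumption made
    for this sketch.
    """
    if items_deleted < 1:
        raise ValueError("items_deleted must be at least 1")
    return latency_seconds / items_deleted > LATENCY_THRESHOLD_SECONDS


if __name__ == "__main__":
    # The example from the description: 40 ms to delete 100 items.
    print(naive_should_alert(0.040))            # True
    print(size_aware_should_alert(0.040, 100))  # False
```

With the fixed threshold the 40ms request fires, while the per-item view shows 0.4ms per deleted item and stays quiet. When the item count is not available, a simpler option is to leave size-dependent verbs such as DELETECOLLECTION out of the latency alert altogether.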

Version:
$ KUBECONFIG=~/.kube/build01 oc get clusterversion version
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-03-04-222846   True        False         4d5h    Cluster version is 4.3.0-0.nightly-2020-03-04-222846


Additional info:

See our alerts channel for more spam:
https://coreos.slack.com/archives/CV1UZU53R/p1583786540029600
https://coreos.slack.com/archives/CB48XQ4KZ/p1583520575237600

Comment 1 Stefan Schimanski 2020-03-10 09:41:27 UTC
For reference, this is the alert:

[FIRING:1] KubeAPILatencyHigh apiserver (apiserver https events.k8s.io default openshift-monitoring/k8s events namespace kubernetes warning DELETECOLLECTION v1beta1)
The API server has an abnormal latency of 0.057229939575753015 seconds for DELETECOLLECTION events.

Comment 11 errata-xmlrpc 2020-07-13 17:19:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

