Description Steve Kuznetsov
2020-03-09 21:23:58 UTC
Description of problem:
Alerts for latency on DELETECOLLECTION are overly sensitive. The latency of a DELETECOLLECTION call scales with the number of items being deleted, so a long latency is not necessarily a problem unless the size of the deletion set is known. A 40ms response when deleting 100 things should not fire anything. The alert fires often enough that ops teams will get alert fatigue.
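To illustrate the scaling argument, here is a minimal sketch of judging a DELETECOLLECTION call by its per-item latency instead of its total latency. It assumes the number of deleted items is known to whoever evaluates the alert. The helper names and the 10ms per-item threshold are hypothetical and are not taken from the actual alerting rule.

```python
def per_item_latency_seconds(total_latency_seconds: float, items_deleted: int) -> float:
    """Average latency per deleted item (hypothetical helper)."""
    if items_deleted <= 0:
        raise ValueError("items_deleted must be positive")
    return total_latency_seconds / items_deleted


def should_alert(
    total_latency_seconds: float,
    items_deleted: int,
    per_item_threshold_seconds: float = 0.01,  # assumed threshold, for illustration only
) -> bool:
    """Fire only when the latency per deleted item exceeds the threshold."""
    per_item = per_item_latency_seconds(total_latency_seconds, items_deleted)
    return per_item > per_item_threshold_seconds


# The example from this report: a 40ms response for deleting 100 items
# works out to 0.4ms per item, well under the assumed threshold.
print(should_alert(0.040, 100))
```

With these assumptions, the 40ms/100-item call does not alert, while a call that takes 5 seconds for the same 100 items (50ms per item) would.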
Version:
$ KUBECONFIG=~/.kube/build01 oc get clusterversion version
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-03-04-222846   True        False         4d5h    Cluster version is 4.3.0-0.nightly-2020-03-04-222846
Additional info:
See our alerts channel for more spam:
https://coreos.slack.com/archives/CV1UZU53R/p1583786540029600
https://coreos.slack.com/archives/CB48XQ4KZ/p1583520575237600
Comment 1 Stefan Schimanski
2020-03-10 09:41:27 UTC
For reference, this is the alert:
[FIRING:1] KubeAPILatencyHigh apiserver (apiserver https events.k8s.io default openshift-monitoring/k8s events namespace kubernetes warning DELETECOLLECTION v1beta1)
The API server has an abnormal latency of 0.057229939575753015 seconds for DELETECOLLECTION events.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:2409