Bug 1949519 - Alertmanager triggers KubeAPILatencyHigh after RHOCP upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Arunprasad Rajkumar
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-14 13:20 UTC by Dhruv Gautam
Modified: 2021-06-09 17:06 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-09 17:06:30 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-monitoring-operator pull 1170 | 0 | None | open | Bug 1949519: Add DELETECOLLECTION to long-running apiserver verbs | 2021-05-19 10:11:04 UTC
Red Hat Product Errata | RHSA-2021:2150 | 0 | None | None | None | 2021-06-09 17:06:45 UTC

Description Dhruv Gautam 2021-04-14 13:20:36 UTC
Description of problem:
After upgrading to RHOCP 3.11.404, KubeAPILatencyHigh alerts are triggered. A sample alert is shown below:

Labels
alertname = KubeAPILatencyHigh
cluster = abc.example.com
endpoint = https
job = apiserver
namespace = default
prometheus = openshift-monitoring/k8s
resource = controlplanes
scope = namespace
service = kubernetes
severity = critical
verb = DELETECOLLECTION
Annotations
message = The API server has an abnormal latency of 18812.224137931036 seconds for DELETECOLLECTION controlplanes.

Checked the etcd, API server, and controller logs and found them to be clean.

Version-Release number of selected component (if applicable):
3.11.404

How reproducible:
NA

Steps to Reproduce:
1.
2.
3.

Actual results:
KubeAPILatencyHigh alerts are triggered.

Expected results:
RHOCP cluster should not trigger KubeAPILatencyHigh alerts.

Additional info:

Comment 1 Simon Pasquier 2021-04-14 13:25:05 UTC
Did the alert clear out after some time?

Comment 2 Dhruv Gautam 2021-04-14 20:59:49 UTC
The alert is not getting cleared.

Regards
Dhruv Gautam

Comment 8 Junqi Zhao 2021-05-26 02:42:32 UTC
Tested with ose-cluster-monitoring-operator/images/v3.11.445; DELETECOLLECTION is now excluded from the list of verbs taken into account by the KubeAPILatencyHigh alert.
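
To illustrate the verification above, here is a minimal Python sketch of how a verb exclusion list can keep a series out of an alert. It is not the shipped rule: the verb lists and the helper names (verb_excluded, alert_considers) are assumptions made for illustration, and only the addition of DELETECOLLECTION comes from this bug. The regex imitates a Prometheus-style negative matcher on the verb label.

```python
import re

# Assumed exclusion lists, for illustration only; the exact verbs in the
# shipped rule are not stated in this bug report.
LONG_RUNNING_BEFORE = ["LIST", "WATCH", "WATCHLIST", "PROXY", "CONNECT"]
LONG_RUNNING_AFTER = LONG_RUNNING_BEFORE + ["DELETECOLLECTION"]


def verb_excluded(verb, long_running):
    """Return True if the verb matches the anchored exclusion pattern."""
    pattern = re.compile("^(?:" + "|".join(map(re.escape, long_running)) + ")$")
    return pattern.match(verb) is not None


def alert_considers(sample, long_running):
    """Return True if the alert would still evaluate this series."""
    return not verb_excluded(sample["verb"], long_running)


# Labels taken from the sample alert in the description.
sample = {
    "alertname": "KubeAPILatencyHigh",
    "verb": "DELETECOLLECTION",
    "resource": "controlplanes",
}

print(alert_considers(sample, LONG_RUNNING_BEFORE))  # True
print(alert_considers(sample, LONG_RUNNING_AFTER))   # False
```

With the extended list, the DELETECOLLECTION series from the sample alert is filtered out before any latency threshold is applied, which matches the behaviour reported in this comment.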

Comment 11 errata-xmlrpc 2021-06-09 17:06:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 3.11.452 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2150

