Bug 1949519

Summary: Alertmanager triggers KubeAPILatencyHigh after RHOCP upgrade
Product: OpenShift Container Platform
Reporter: Dhruv Gautam <dgautam>
Component: Monitoring
Assignee: Arunprasad Rajkumar <arajkuma>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.11.0
CC: alegrand, anpicker, dgrisonn, erooth, kakkoyun, lcosic, pkrupa, spasquie
Target Milestone: ---
Keywords: EasyFix
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-09 17:06:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dhruv Gautam 2021-04-14 13:20:36 UTC
Description of problem:
After upgrading to RHOCP 3.11.404, KubeAPILatencyHigh alerts are triggered. A sample alert is shown below:

Labels
alertname = KubeAPILatencyHigh
cluster = abc.example.com
endpoint = https
job = apiserver
namespace = default
prometheus = openshift-monitoring/k8s
resource = controlplanes
scope = namespace
service = kubernetes
severity = critical
verb = DELETECOLLECTION
Annotations
message = The API server has an abnormal latency of 18812.224137931036 seconds for DELETECOLLECTION controlplanes.

Checked the etcd, API server, and controller logs and found them to be clean.

Version-Release number of selected component (if applicable):
3.11.404

How reproducible:
NA

Steps to Reproduce:
1.
2.
3.

Actual results:
KubeAPILatencyHigh alerts are triggered.

Expected results:
RHOCP cluster should not trigger KubeAPILatencyHigh alerts.

Additional info:

Comment 1 Simon Pasquier 2021-04-14 13:25:05 UTC
Did the alert clear out after some time?

Comment 2 Dhruv Gautam 2021-04-14 20:59:49 UTC
The alert is not getting cleared.

Regards
Dhruv Gautam

Comment 8 Junqi Zhao 2021-05-26 02:42:32 UTC
Tested with ose-cluster-monitoring-operator/images/v3.11.445. DELETECOLLECTION is now excluded from the list of verbs taken into account by the KubeAPILatencyHigh alert.
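
For illustration only, a minimal Python sketch of what excluding a verb from a latency check looks like. This is not the actual alert rule shipped by the operator: the only detail taken from this bug is that DELETECOLLECTION is excluded. The other excluded verbs, the 4-second threshold, the sample data, and the names EXCLUDED_VERBS, LATENCY_THRESHOLD_SECONDS and firing_samples are assumptions.

```python
import re

# Assumed exclusion list: only DELETECOLLECTION is confirmed by this bug.
# The remaining verbs are placeholders for illustration.
EXCLUDED_VERBS = re.compile(
    r"^(?:LIST|WATCH|WATCHLIST|PROXY|CONNECT|DELETECOLLECTION)$"
)

# Assumed threshold in seconds (hypothetical value).
LATENCY_THRESHOLD_SECONDS = 4.0


def firing_samples(samples):
    """Return samples whose verb is not excluded and whose latency exceeds the threshold."""
    return [
        s
        for s in samples
        if not EXCLUDED_VERBS.match(s["verb"])
        and s["latency_seconds"] > LATENCY_THRESHOLD_SECONDS
    ]


# Hypothetical samples; the first mirrors the alert reported in the description.
samples = [
    {"verb": "DELETECOLLECTION", "resource": "controlplanes",
     "latency_seconds": 18812.224137931036},
    {"verb": "GET", "resource": "pods", "latency_seconds": 0.2},
    {"verb": "POST", "resource": "pods", "latency_seconds": 6.5},
]

for s in firing_samples(samples):
    print(s["verb"], s["resource"], s["latency_seconds"])
```

With the exclusion in place, the DELETECOLLECTION sample from the description no longer trips the check, while other slow verbs still would.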

Comment 11 errata-xmlrpc 2021-06-09 17:06:30 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 3.11.452 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2150