
Bug 1893854

Summary: Add an alert for requests rejected by the apiserver
Product: OpenShift Container Platform
Reporter: Abu Kashem <akashem>
Component: kube-apiserver
Assignee: Abu Kashem <akashem>
Status: CLOSED NOTABUG
QA Contact: Ke Wang <kewang>
Severity: medium
Docs Contact:
Priority: low
Version: 4.5
CC: aos-bugs, kewang, mfojtik, wlewis, xxia
Target Milestone: ---
Flags: mfojtik: needinfo?
Target Release: 4.6.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: LifecycleStale
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1893850
: 1893855 (view as bug list)
Environment:
Last Closed: 2022-02-25 15:30:23 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
Bug Depends On: 1893850    
Bug Blocks: 1893855    

Description Abu Kashem 2020-11-02 19:10:28 UTC
+++ This bug was initially created as a clone of Bug #1893850 +++

The current alert 'KubeAPIErrorsHigh' does not take into account requests rejected by the apiserver. The alert checks the 'apiserver_request_total' metric for 5xx errors. When the apiserver rejects a request, it does not record the 'apiserver_request_total' metric; instead, it records the 'apiserver_request_terminations_total' metric.

As a result, 'KubeAPIErrorsHigh' is not aware of any requests rejected by the apiserver. We need to add an alert that inspects the 'apiserver_request_terminations_total' metric and fires if requests are being rejected.

Note:
- 'KubeAPIErrorsHigh' has been removed in 4.6 and replaced with 'KubeAPIErrorBudgetBurn'.
- The alert should be added to the kubernetes-mixin first and then imported into OpenShift. The new alert would go here: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/alerts/kube_apiserver.libsonnet#L19
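
As a rough illustration of the alert requested above, the following Python sketch builds a Prometheus alerting rule as a plain dictionary. Only the two metric names come from this bug; the alert name 'KubeAPIRequestsRejected', the 5m window, the 5% threshold, the 'for' duration, and the severity are hypothetical placeholders, not values agreed on here or taken from the mixin. Because rejected requests are not counted in 'apiserver_request_total', the denominator adds the two rates together to approximate all incoming requests.

```python
import json


def rejected_requests_rule(window="5m", threshold=0.05, for_duration="5m"):
    """Build a hypothetical alerting rule for requests rejected by the apiserver.

    The alert name, window, threshold, and severity are illustrative
    assumptions; only the metric names are taken from the bug description.
    """
    # Rate of requests the apiserver rejected (terminated).
    rejected = f"sum(rate(apiserver_request_terminations_total[{window}]))"
    # Rate of requests that were served and therefore recorded normally.
    served = f"sum(rate(apiserver_request_total[{window}]))"
    # Rejected requests are not part of apiserver_request_total, so the
    # total is approximated as served + rejected.
    expr = f"{rejected} / ({served} + {rejected}) > {threshold}"
    return {
        "alert": "KubeAPIRequestsRejected",  # hypothetical name
        "expr": expr,
        "for": for_duration,
        "labels": {"severity": "warning"},  # assumed severity
        "annotations": {
            "summary": "The apiserver is rejecting a high share of incoming requests.",
        },
    }


if __name__ == "__main__":
    # Print the rule as JSON; it would still need translating into the
    # mixin's libsonnet format before being added there.
    print(json.dumps(rejected_requests_rule(), indent=2))
```

A ratio is used rather than a raw rate so that the threshold scales with cluster traffic; an absolute rate threshold would be an equally plausible choice.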

Comment 1 Michal Fojtik 2020-12-02 19:46:25 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.