Bug 1910140 - fix the api dashboard with changes in upstream kube 1.20
Summary: fix the api dashboard with changes in upstream kube 1.20
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Abu Kashem
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-22 19:51 UTC by Abu Kashem
Modified: 2021-02-24 15:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:48:02 UTC
Target Upstream Version:
Embargoed:


Attachments
topk(20, histogram_quantile... displays more than 20 (579.08 KB, video/webm)
2021-01-18 11:56 UTC, Xingxing Xia
50 occurrences of {{ (1.16 MB, video/webm)
2021-01-18 11:57 UTC, Xingxing Xia
Rate of TLS Handshake Error has no legend name due to using empty by() (12.97 KB, image/png)
2021-01-18 12:01 UTC, Xingxing Xia
Request Rate by Resource and Verb screenshot: topk(20, sum(rate(apiserver_request_total{apiserver=\"$apiserver\"}[$period])) by(resource,verb)) (141.32 KB, image/png)
2021-02-07 10:21 UTC, Xingxing Xia


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-apiserver-operator pull 1024 0 None closed Bug 1910140: fix the api dashboard with changes in upstream kube 1.20 2021-02-18 01:43:20 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:48:56 UTC

Description Abu Kashem 2020-12-22 19:51:07 UTC
Description of problem:

- adjust graph titles 
- update the dashboard with changes in kube 1.20
- add a couple of new metrics to the dashboard

Comment 2 Xingxing Xia 2021-01-18 11:54:15 UTC
I read the PR and saw that it updated some metrics and added others. Testing on 4.7.0-0.nightly-2021-01-17-211555, I hit some more issues:
1. Some panels define "topk(20, ..." but display more than 20 series; see attachment.
2. The {{flowSchema}}:{{priorityLevel}} legend names no longer appear, but other {{ legend names remain, such as {{resource}}-{{verb}}, {{component}}-{{resource}}, and {{group}}:{{kind}}. There are as many as 50 occurrences of {{, more than in the bug 1911173#c1 attachment, so bug 1911173 needs a separate fix and is not a DUP of this bug; see attachment.
3. Rate of TLS Handshake Error has no legend; see attachment.
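
For readers unfamiliar with legend templates, here is a minimal sketch of how a {{label}} template is expected to be filled in from a series' labels. The render_legend helper and the sample label values are hypothetical, written only for illustration; this is not the console's or Grafana's actual implementation. If the raw {{...}} text shows up in the UI, the substitution step presumably did not run.

```python
import re


def render_legend(template: str, labels: dict) -> str:
    """Replace each {{name}} placeholder with the series' label value."""
    def substitute(match):
        # Assumption for this sketch: a missing label renders as an empty string.
        return labels.get(match.group(1), "")
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Hypothetical label set for one apiserver_request_total series.
series_labels = {"resource": "pods", "verb": "GET"}
rendered = render_legend("{{resource}}-{{verb}}", series_labels)

# A series aggregated with an empty by() carries no labels at all,
# so there is nothing to build a legend name from.
unlabeled = render_legend("{{resource}}-{{verb}}", {})
```

With labels present, the legend becomes a readable name; with an empty label set, only the literal separator is left, which matches the "legend bullet without a name" symptom in item 3.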

Comment 3 Xingxing Xia 2021-01-18 11:56:22 UTC
Created attachment 1748437 [details]
topk(20, histogram_quantile... displays more than 20

Comment 4 Xingxing Xia 2021-01-18 11:57:50 UTC
Created attachment 1748439 [details]
50 occurrences of {{

Comment 5 Xingxing Xia 2021-01-18 12:01:12 UTC
Created attachment 1748441 [details]
Rate of TLS Handshake Error has no legend name due to using empty by()

Comment 6 Abu Kashem 2021-02-03 22:48:38 UTC
xxia

> Some places defined "topk(20, ...", but displayed more than 20, see attachment.
This is a known issue: topk does not seem to work well with histograms. We can look at it in 4.8. Do you want to open a separate BZ for this targeting 4.8?

> It has no {{flowSchema}}:{{priorityLevel}} legend names now, but still has other {{ like {{resource}}-{{verb}}, {{component}}-{{resource}}, {{group}}:{{kind}}, the {{ occurrences are as many as 50, more than bug 1911173#c1 attachment, so 1911173 needs separate fix, it is not DUP of this bug, see attachment.

Yeah, it seems to work fine in grafana, so probably a console dashboard issue.

> 3. Rate of TLS Handshake Error has no legend, see attachment.
It does not have any label.

Comment 7 Xingxing Xia 2021-02-04 03:30:07 UTC
Abu Kashem, OK, let me move this bug to VERIFIED.
> This is a known issue, topk ... Do you want to open a separate BZ for this targeting 4.8?
I'll contact Monitoring QE about it.

> it seems to work fine in grafana, so probably a console dashboard issue
I'm not sure how to check this in Grafana, given that bug 1911182 is not fixed. I'm asking Monitoring QE how to manually provision a dashboard, and will move bug 1911173 to Management Console once confirmed.

> it does not have any label
I know. But it shows a small square legend bullet without a legend name, which is ** ugly **

Comment 8 Xingxing Xia 2021-02-07 10:21:27 UTC
Created attachment 1755448 [details]
Request Rate by Resource and Verb screenshot: topk(20, sum(rate(apiserver_request_total{apiserver=\"$apiserver\"}[$period])) by(resource,verb))

>> This is a known issue: topk does not seem to work well with histograms. We can look at it in 4.8. Do you want to open a separate BZ for this targeting 4.8?
> I'll contact Monitoring QE about it.
I contacted them. They pointed out that when hovering over the graph, it already correctly displays 20 series; only the legend does not.
I checked that this happens in the Prometheus UI too.
I also checked topk with sum instead of a histogram, and it happens there too; see attachment.
So the legend may, by design, show every series returned over the time range.
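
The observation above is consistent with how a range query is evaluated: the expression, including topk, is computed independently at each step, so the set of series that ever reach the top k across the whole range can be larger than k. A minimal sketch of that effect, using made-up series names and values and a hypothetical topk helper (not Prometheus code):

```python
def topk(k, sample):
    """Return the k series names with the highest value at one evaluation step."""
    ranked = sorted(sample.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:k]}


# Hypothetical request rates for three series at three evaluation steps.
steps = [
    {"pods-GET": 9.0, "nodes-LIST": 5.0, "secrets-WATCH": 1.0},
    {"pods-GET": 2.0, "nodes-LIST": 6.0, "secrets-WATCH": 7.0},
    {"pods-GET": 8.0, "nodes-LIST": 1.0, "secrets-WATCH": 4.0},
]

k = 2
# At any single step (what the hover tooltip shows) there are exactly k series.
per_step = [topk(k, sample) for sample in steps]
# The legend lists every series that appeared at any step in the range.
legend = set().union(*per_step)
```

Here each step yields two series, yet the legend ends up with all three, the same pattern as a topk(20, ...) panel listing more than 20 legend entries.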

Comment 11 errata-xmlrpc 2021-02-24 15:48:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

