Bug 1801154 - Port changes "throw away unused high cardinality apiserver duration buckets"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.4.0
Assignee: Lili Cosic
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-10 11:36 UTC by Lili Cosic
Modified: 2020-05-13 21:57 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-13 21:57:18 UTC
Target Upstream Version:



Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift cluster-kube-apiserver-operator pull 752	None	closed	Bug 1801154: manifests/: Throw away unused high cardinality apiserver duration buckets	2020-09-21 19:12:55 UTC
Github	openshift cluster-openshift-apiserver-operator pull 309	None	closed	Bug 1801154: manifests/: Throw away unused high cardinality apiserver duration buckets	2020-09-21 19:12:52 UTC
Red Hat Product Errata	RHBA-2020:0581	None	None	None	2020-05-13 21:57:20 UTC

Description Lili Cosic 2020-02-10 11:36:14 UTC
Description of problem:
These changes did not land in OpenShift correctly during the feature phase: they were never applied to the apiserver ServiceMonitors.


How reproducible:
Check whether the apiserver ServiceMonitors contain a drop rule with the following regex:
    regex: 'apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)'


Expected results:
The unused high cardinality apiserver duration buckets are dropped.

Additional info:

Upstream PR: https://github.com/coreos/kube-prometheus/pull/387/files

Comment 2 Junqi Zhao 2020-02-11 04:31:23 UTC
Tested with 4.4.0-0.nightly-2020-02-10-215022; the fix is in.
# oc -n openshift-apiserver get servicemonitor/openshift-apiserver -oyaml | grep apiserver_request_duration_seconds_bucket -A3 -B1
    - action: drop
      regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
      sourceLabels:
      - __name__
      - le
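
For anyone reading the rule above: Prometheus relabeling joins the values of the listed source labels with a separator (";" by default) and applies the action when the regex matches the whole joined string. The following is a minimal Python sketch of that matching for this one rule, not the Prometheus implementation; the helper name is_dropped is hypothetical.

```python
import re

# Regex copied verbatim from the ServiceMonitor output above.
DROP_REGEX = (
    r"apiserver_request_duration_seconds_bucket;"
    r"(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5"
    r"|3|3.5|4.5|6|7|8|9|15|25|30|50)"
)


def is_dropped(labels, source_labels=("__name__", "le"), separator=";"):
    """Return True if the drop rule would discard a series with these labels.

    Hypothetical helper: it joins the source label values with the separator
    and requires the regex to match the entire joined string, mirroring the
    anchored matching that Prometheus relabeling uses.
    """
    joined = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(DROP_REGEX, joined) is not None


if __name__ == "__main__":
    name = "apiserver_request_duration_seconds_bucket"
    for le in ("0.15", "0.5", "5", "50", "+Inf"):
        print(le, is_dropped({"__name__": name, "le": le}))
```

Because the match is anchored, le="5" and le="0.5" are kept even though "50" and "0.45" are in the drop list. The dots in the regex are unescaped, so they match any character, which is harmless for the bucket values that exist here.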

Querying Prometheus also confirms that the unused high cardinality apiserver duration buckets are dropped:
count(apiserver_request_duration_seconds_bucket{namespace="openshift-apiserver"}) by (namespace,le)
Element 	Value
{le="0.05",namespace="openshift-apiserver"}	139
{le="0.5",namespace="openshift-apiserver"}	139
{le="10",namespace="openshift-apiserver"}	139
{le="20",namespace="openshift-apiserver"}	139
{le="4",namespace="openshift-apiserver"}	139
{le="40",namespace="openshift-apiserver"}	139
{le="60",namespace="openshift-apiserver"}	139
{le="+Inf",namespace="openshift-apiserver"}	139
{le="0.1",namespace="openshift-apiserver"}	139
{le="0.2",namespace="openshift-apiserver"}	139
{le="1",namespace="openshift-apiserver"}	139
{le="2",namespace="openshift-apiserver"}	139
{le="5",namespace="openshift-apiserver"}	139
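
The query result above can be cross-checked against the drop list mechanically. This is a small Python sketch, with both lists typed in by hand from this comment (the variable names are illustrative only): none of the le values still present should appear among the dropped ones.

```python
# Bucket bounds listed in the drop regex, copied from the ServiceMonitor rule.
DROPPED = (
    "0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5"
    "|3|3.5|4.5|6|7|8|9|15|25|30|50"
).split("|")

# le values still returned by the count(...) by (namespace, le) query above.
OBSERVED = [
    "0.05", "0.5", "10", "20", "4", "40", "60", "+Inf",
    "0.1", "0.2", "1", "2", "5",
]

# Any value in both lists would mean a bucket escaped the drop rule.
leaked = sorted(set(DROPPED) & set(OBSERVED))

if __name__ == "__main__":
    print("dropped bounds:", len(DROPPED))
    print("remaining bounds:", len(OBSERVED))
    print("leaked bounds:", leaked)
```

An empty leaked list is consistent with the rule being applied; it does not by itself prove the dropped buckets were ever scraped.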

Comment 3 Lili Cosic 2020-04-20 10:03:46 UTC
Already in release notes, no need for docs.

Comment 5 errata-xmlrpc 2020-05-13 21:57:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

