Bug 1827744 - Image registry metrics should use summaries instead of histograms
Summary: Image registry metrics should use summaries instead of histograms
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Oleg Bulatov
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Duplicates: 1825341
Depends On: 1827743
Blocks:
Reported: 2020-04-24 17:11 UTC by Adam Kaplan
Modified: 2020-05-28 05:44 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The registry used repository names in metric labels.
Consequence: Prometheus had trouble handling the large number of metrics this produced.
Fix: Repository names were removed from the labels.
Result: Fewer metrics are generated.
Clone Of: 1827743
Environment:
Last Closed: 2020-05-28 05:44:13 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift image-registry pull 237 0 None closed [release-3.11] Bug 1827744: Reduce cardinality of metrics, use summary instead of histograms 2020-06-03 03:59:00 UTC
Red Hat Product Errata RHBA-2020:2215 0 None None None 2020-05-28 05:44:20 UTC

Description Adam Kaplan 2020-04-24 17:11:00 UTC
+++ This bug was initially created as a clone of Bug #1827743 +++

Description of problem:

Image registry metrics use Prometheus histograms, which by default can have high cardinality. These should be replaced by Summaries, which have lower cardinality and better performance.
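
As a rough illustration of the cardinality difference, the sketch below counts the time series one metric exposes per label combination. It assumes the histograms used the Prometheus Go client's default buckets (prometheus.DefBuckets, 11 boundaries) and that the summaries expose the quantiles 0.5, 0.9 and 0.99. The function names are hypothetical and are not part of the registry code.

```python
# Default bucket boundaries of the Prometheus Go client (prometheus.DefBuckets).
# Assumption: the registry's histograms used these defaults.
DEF_BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]


def histogram_series(label_combinations, buckets=DEF_BUCKETS):
    """Series exposed by one histogram metric.

    Each label combination yields one _bucket series per boundary,
    one more for le="+Inf", plus _sum and _count.
    """
    return label_combinations * (len(buckets) + 1 + 2)


def summary_series(label_combinations, quantiles=(0.5, 0.9, 0.99)):
    """Series exposed by one summary metric.

    Each label combination yields one series per quantile, plus _sum and _count.
    """
    return label_combinations * (len(quantiles) + 2)


if __name__ == "__main__":
    # Compare a single label combination with a hypothetical 1000 combinations
    # (for example, one per repository name used as a label value).
    for combos in (1, 1000):
        print(combos, histogram_series(combos), summary_series(combos))
```

Under these assumptions a histogram exposes 14 series per label combination and a summary exposes 5, and either count is multiplied by every additional label value such as a repository name.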

Version-Release number of selected component (if applicable): 3.11.z


How reproducible: Always


Additional info:

Fix for 4.x - https://github.com/openshift/image-registry/pull/138

--- Additional comment from Adam Kaplan on 2020-04-24 17:10:31 UTC ---

Marking VERIFIED - this was completed prior to the 4.1 release.

Comment 1 Oleg Bulatov 2020-05-05 13:18:18 UTC
*** Bug 1825341 has been marked as a duplicate of this bug. ***

Comment 5 Wenjing Zheng 2020-05-22 10:55:28 UTC
Verified on v3.11.219; no record is followed by "NaN":
$ curl https://docker-registry-default.apps.0522-spc.qe.rhcloud.com/extensions/v2/metrics -H "Authorization: Bearer test" -k | grep imageregistry_http
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9024  100  9024    0     0   7146      0  0:00:01  0:00:01 --:--:--  7150
# HELP imageregistry_http_in_flight_requests A gauge of requests currently being served by the registry.
# TYPE imageregistry_http_in_flight_requests gauge
imageregistry_http_in_flight_requests 1
# HELP imageregistry_http_request_duration_seconds A histogram of latencies for requests to the registry.
# TYPE imageregistry_http_request_duration_seconds summary
imageregistry_http_request_duration_seconds{method="get",quantile="0.5"} 0.003248255
imageregistry_http_request_duration_seconds{method="get",quantile="0.9"} 0.005595875
imageregistry_http_request_duration_seconds{method="get",quantile="0.99"} 0.005595875
imageregistry_http_request_duration_seconds_sum{method="get"} 0.015543287
imageregistry_http_request_duration_seconds_count{method="get"} 3
# HELP imageregistry_http_request_size_bytes A histogram of sizes of requests to the registry.
# TYPE imageregistry_http_request_size_bytes summary
imageregistry_http_request_size_bytes{quantile="0.5"} 139
imageregistry_http_request_size_bytes{quantile="0.9"} 139
imageregistry_http_request_size_bytes{quantile="0.99"} 139
imageregistry_http_request_size_bytes_sum 417
imageregistry_http_request_size_bytes_count 3
# HELP imageregistry_http_requests_total A counter for requests to the registry.
# TYPE imageregistry_http_requests_total counter
imageregistry_http_requests_total{code="200",method="get"} 3
# HELP imageregistry_http_response_size_bytes A histogram of response sizes for requests to the registry.
# TYPE imageregistry_http_response_size_bytes summary
imageregistry_http_response_size_bytes{quantile="0.5"} 7227
imageregistry_http_response_size_bytes{quantile="0.9"} 9017
imageregistry_http_response_size_bytes{quantile="0.99"} 9017
imageregistry_http_response_size_bytes_sum 25280
imageregistry_http_response_size_bytes_count 3
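
The same check can be scripted instead of read off the output by eye. The following is a minimal sketch, assuming the metrics body is available as text (for example, saved from the curl command above without the progress meter) and that samples carry no trailing timestamp. The helper names parse_samples and nan_series are hypothetical.

```python
import math


def parse_samples(text):
    """Return (series, value) pairs from Prometheus text exposition output.

    Assumption: each sample line is "<series> <value>" with no timestamp.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        samples.append((series, float(value)))
    return samples


def nan_series(text):
    """Return the series whose sample value is NaN."""
    return [series for series, value in parse_samples(text) if math.isnan(value)]


# A few lines copied from the verification output above.
SAMPLE = """\
# TYPE imageregistry_http_request_duration_seconds summary
imageregistry_http_request_duration_seconds{method="get",quantile="0.5"} 0.003248255
imageregistry_http_request_duration_seconds{method="get",quantile="0.9"} 0.005595875
imageregistry_http_request_duration_seconds{method="get",quantile="0.99"} 0.005595875
imageregistry_http_request_duration_seconds_sum{method="get"} 0.015543287
imageregistry_http_request_duration_seconds_count{method="get"} 3
"""

if __name__ == "__main__":
    print(nan_series(SAMPLE))
```

An empty list means no sample in the checked text reports NaN.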

Comment 7 errata-xmlrpc 2020-05-28 05:44:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2215

