
Bug 1827744

Summary: Image registry metrics should use summaries instead of histograms
Product: OpenShift Container Platform
Reporter: Adam Kaplan <adam.kaplan>
Component: Image Registry
Assignee: Oleg Bulatov <obulatov>
Status: CLOSED ERRATA
QA Contact: Wenjing Zheng <wzheng>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.11.0
CC: aos-bugs, obulatov, wzheng
Target Milestone: ---
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The registry used repository names in metric labels.
Consequence: Prometheus had trouble handling the large number of resulting metrics.
Fix: Repository names were removed from the labels.
Result: Fewer metrics are generated.
Story Points: ---
Clone Of: 1827743
Environment:
Last Closed: 2020-05-28 05:44:13 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1827743
Bug Blocks:

Description Adam Kaplan 2020-04-24 17:11:00 UTC
+++ This bug was initially created as a clone of Bug #1827743 +++

Description of problem:

Image registry metrics use Prometheus histograms, which can have high cardinality with the default buckets because every bucket is exported as a separate time series. These should be replaced by summaries, which have lower cardinality and better performance.
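
To make the cardinality argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the histograms used the Go Prometheus client's default buckets (`prometheus.DefBuckets`, 11 boundaries) and that the replacement summaries export three quantiles; neither assumption is confirmed by this report for the pre-fix code, and the function names are hypothetical.

```python
# Default bucket boundaries of the Go Prometheus client (prometheus.DefBuckets).
# Assumption: the registry's histograms relied on these defaults.
DEFAULT_BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]

# Assumption: the summaries export the 0.5, 0.9 and 0.99 quantiles.
SUMMARY_QUANTILES = [0.5, 0.9, 0.99]


def histogram_series(label_combinations, buckets=DEFAULT_BUCKETS):
    """Time series exported by one histogram metric.

    Per label combination: one series per bucket, one for the implicit
    +Inf bucket, plus the _sum and _count series.
    """
    per_combination = len(buckets) + 1 + 2
    return label_combinations * per_combination


def summary_series(label_combinations, quantiles=SUMMARY_QUANTILES):
    """Time series exported by one summary metric.

    Per label combination: one series per quantile, plus _sum and _count.
    """
    per_combination = len(quantiles) + 2
    return label_combinations * per_combination


if __name__ == "__main__":
    # Hypothetical numbers of distinct label combinations.
    for combos in (1, 100, 10_000):
        print(combos, histogram_series(combos), summary_series(combos))
```

Under these assumptions, each label combination costs 14 series as a histogram and 5 as a summary, so the saving grows linearly with the number of label combinations.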

Version-Release number of selected component (if applicable): 3.11.z


How reproducible: Always


Additional info:

Fix for 4.x - https://github.com/openshift/image-registry/pull/138

--- Additional comment from Adam Kaplan on 2020-04-24 17:10:31 UTC ---

Marking VERIFIED - this was completed prior to the 4.1 release.

Comment 1 Oleg Bulatov 2020-05-05 13:18:18 UTC
*** Bug 1825341 has been marked as a duplicate of this bug. ***

Comment 5 Wenjing Zheng 2020-05-22 10:55:28 UTC
Verified on v3.11.219; no record is followed by "NaN":
$ curl https://docker-registry-default.apps.0522-spc.qe.rhcloud.com/extensions/v2/metrics -H "Authorization: Bearer test" -k | grep imageregistry_http
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9024  100  9024    0     0   7146      0  0:00:01  0:00:01 --:--:--  7150
# HELP imageregistry_http_in_flight_requests A gauge of requests currently being served by the registry.
# TYPE imageregistry_http_in_flight_requests gauge
imageregistry_http_in_flight_requests 1
# HELP imageregistry_http_request_duration_seconds A histogram of latencies for requests to the registry.
# TYPE imageregistry_http_request_duration_seconds summary
imageregistry_http_request_duration_seconds{method="get",quantile="0.5"} 0.003248255
imageregistry_http_request_duration_seconds{method="get",quantile="0.9"} 0.005595875
imageregistry_http_request_duration_seconds{method="get",quantile="0.99"} 0.005595875
imageregistry_http_request_duration_seconds_sum{method="get"} 0.015543287
imageregistry_http_request_duration_seconds_count{method="get"} 3
# HELP imageregistry_http_request_size_bytes A histogram of sizes of requests to the registry.
# TYPE imageregistry_http_request_size_bytes summary
imageregistry_http_request_size_bytes{quantile="0.5"} 139
imageregistry_http_request_size_bytes{quantile="0.9"} 139
imageregistry_http_request_size_bytes{quantile="0.99"} 139
imageregistry_http_request_size_bytes_sum 417
imageregistry_http_request_size_bytes_count 3
# HELP imageregistry_http_requests_total A counter for requests to the registry.
# TYPE imageregistry_http_requests_total counter
imageregistry_http_requests_total{code="200",method="get"} 3
# HELP imageregistry_http_response_size_bytes A histogram of response sizes for requests to the registry.
# TYPE imageregistry_http_response_size_bytes summary
imageregistry_http_response_size_bytes{quantile="0.5"} 7227
imageregistry_http_response_size_bytes{quantile="0.9"} 9017
imageregistry_http_response_size_bytes{quantile="0.99"} 9017
imageregistry_http_response_size_bytes_sum 25280
imageregistry_http_response_size_bytes_count 3
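
The manual check above (no "NaN" values in the summary output) could be automated along these lines. This is a minimal sketch, not part of the original verification: the parser handles only the simple `name{labels} value` lines seen in this output, the sample text is an excerpt copied from it, and the helper names are hypothetical.

```python
import math
import re

# Excerpt of the metrics output shown above.
SAMPLE = '''\
# TYPE imageregistry_http_request_duration_seconds summary
imageregistry_http_request_duration_seconds{method="get",quantile="0.5"} 0.003248255
imageregistry_http_request_duration_seconds{method="get",quantile="0.9"} 0.005595875
imageregistry_http_request_duration_seconds{method="get",quantile="0.99"} 0.005595875
imageregistry_http_request_duration_seconds_sum{method="get"} 0.015543287
imageregistry_http_request_duration_seconds_count{method="get"} 3
# TYPE imageregistry_http_request_size_bytes summary
imageregistry_http_request_size_bytes{quantile="0.5"} 139
imageregistry_http_request_size_bytes{quantile="0.9"} 139
imageregistry_http_request_size_bytes{quantile="0.99"} 139
imageregistry_http_request_size_bytes_sum 417
imageregistry_http_request_size_bytes_count 3
'''

# Simplified pattern: metric name, optional {labels}, then the value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything that is not a sample line
        samples.append(
            (match["name"], match["labels"] or "", float(match["value"]))
        )
    return samples


def nan_samples(samples):
    """Samples whose value parsed as NaN."""
    return [s for s in samples if math.isnan(s[2])]


def mean(samples, metric):
    """Average observation, computed from the metric's _sum and _count."""
    total = sum(v for n, _, v in samples if n == metric + "_sum")
    count = sum(v for n, _, v in samples if n == metric + "_count")
    return total / count if count else float("nan")


if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    print("NaN samples:", nan_samples(parsed))
    print("mean request size:", mean(parsed, "imageregistry_http_request_size_bytes"))
```

The `_sum` and `_count` series also give the average directly, which is a quick sanity check on the quantiles: here the mean request size is 417 / 3 = 139 bytes, matching all three reported quantiles.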

Comment 7 errata-xmlrpc 2020-05-28 05:44:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2215