Bug 1768483
| Summary: | [marketplace] Default OpSrc metrics cardinality is too great | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Alexander Greene <agreene> |
| Component: | OLM | Assignee: | Alexander Greene <agreene> |
| OLM sub component: | OperatorHub | QA Contact: | Fan Jia <jfan> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | unspecified | | |
| Version: | 4.3.0 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-01-23 11:10:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

*** Bug 1768482 has been marked as a duplicate of this bug. ***

test env: 4.3.0-0.nightly-2019-11-07-172437
test result: the metrics of the marketplace-operator do not include "app_registry:community_operators:2xx_response"

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-1 -- curl -k -H "Authorization: Bearer $token" http://10.128.0.26:8383/metrics | grep app_

# HELP app_registry_request_duration_seconds A histogram of AppRegistry request latencies.
# TYPE app_registry_request_duration_seconds histogram
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.005"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.01"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.025"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.05"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.1"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.25"} 0
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="0.5"} 4
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="1"} 6
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="2.5"} 6
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="5"} 6
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="10"} 6
app_registry_request_duration_seconds_bucket{opsrc="certified-operators",le="+Inf"} 6
app_registry_request_duration_seconds_sum{opsrc="certified-operators"} 2.8043666599999995
app_registry_request_duration_seconds_count{opsrc="certified-operators"} 6
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.005"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.01"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.025"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.05"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.1"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.25"} 0
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="0.5"} 1
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="1"} 6
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="2.5"} 6
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="5"} 6
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="10"} 6
app_registry_request_duration_seconds_bucket{opsrc="community-operators",le="+Inf"} 6
app_registry_request_duration_seconds_sum{opsrc="community-operators"} 3.6850265260000006
app_registry_request_duration_seconds_count{opsrc="community-operators"} 6
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.005"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.01"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.025"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.05"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.1"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.25"} 0
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="0.5"} 2
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="1"} 7
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="2.5"} 7
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="5"} 7
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="10"} 7
app_registry_request_duration_seconds_bucket{opsrc="non-default-opsrc",le="+Inf"} 7
app_registry_request_duration_seconds_sum{opsrc="non-default-opsrc"} 3.648026931
app_registry_request_duration_seconds_count{opsrc="non-default-opsrc"} 7
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.005"} 0
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.01"} 0
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.025"} 0
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.05"} 0
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.1"} 0
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.25"} 1
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="0.5"} 6
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="1"} 6
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="2.5"} 6
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="5"} 6
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="10"} 6
app_registry_request_duration_seconds_bucket{opsrc="redhat-operators",le="+Inf"} 6
app_registry_request_duration_seconds_sum{opsrc="redhat-operators"} 1.765426227
app_registry_request_duration_seconds_count{opsrc="redhat-operators"} 6
# HELP app_registry_request_total A counter that stores the results of reaching out to an AppRegistry.
# TYPE app_registry_request_total counter
app_registry_request_total{code="200",method="get",opsrc="certified-operators"} 6
app_registry_request_total{code="200",method="get",opsrc="community-operators"} 6
app_registry_request_total{code="200",method="get",opsrc="non-default-opsrc"} 7
app_registry_request_total{code="200",method="get",opsrc="redhat-operators"} 6

My apologies. I should not have written that the metrics are available at the marketplace metrics endpoint. The time series are in fact available in the console UI by clicking on the Monitoring -> Metrics tab and searching for `app_registry:community_operators:2xx_response`.

Note: the `app_registry:community_operators:xxx_response` series will only be available if the reported value is greater than or equal to 1.

test env: 4.3.0-0.nightly-2019-11-10-165138
test result: the metrics for the default OpSrcs have been added and are visible in the console UI under Monitoring -> Metrics:
"app_registry:community_operators:2xx_response"
"app_registry:redhat_operators:2xx_response"
"app_registry:certify_operators:2xx_response"

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:0062

Description of problem:
The marketplace operator is configured to expose metrics when it attempts to connect to a default AppRegistry. Given that each HTTP response code can create a new time series, the telemeter team has requested that we limit the potential number of time series by using a recording rule and exposing that metric to telemeter instead.

Version-Release number of selected component (if applicable):
4.3.x

How reproducible:
Always

Steps to Reproduce:
1. Create an OpenShift 4.3.x cluster
2. Visit the /metrics endpoint on the marketplace-operator

Actual results:
Each time series tracks a single response code, and the series are not grouped by a recording rule. Example:
app_registry_request_total{code="200",opsrc="community-operators"} 1

Expected results:
app_registry_request_total{code="200",opsrc="community-operators"} 1
app_registry:community_operators:1xx_response 0
app_registry:community_operators:2xx_response 1
app_registry:community_operators:3xx_response 0
app_registry:community_operators:4xx_response 0
app_registry:community_operators:5xx_response 0

Additional info:
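
To illustrate why grouping by response class bounds the cardinality, here is a minimal Python sketch of the aggregation a recording rule of this kind would perform. It is not the operator's implementation or the actual rule definition: the sample values, the `raw_samples` dictionary, and the `record_by_class` helper are all hypothetical, and only the metric names follow the ones quoted in this report.

```python
from collections import defaultdict

# Hypothetical samples of app_registry_request_total, keyed by
# (opsrc, HTTP status code). The values are illustrative only and
# were not taken from a real cluster.
raw_samples = {
    ("community-operators", "200"): 4,
    ("community-operators", "204"): 1,
    ("community-operators", "404"): 2,
    ("community-operators", "503"): 1,
}


def record_by_class(samples):
    """Collapse per-code counters into one series per response class (1xx..5xx)."""
    recorded = defaultdict(int)
    for (opsrc, code), value in samples.items():
        # Prometheus metric names cannot contain '-', so the opsrc name is
        # written with underscores, as in the series names in this report.
        name = f"app_registry:{opsrc.replace('-', '_')}:{code[0]}xx_response"
        recorded[name] += value
    return dict(recorded)


recorded = record_by_class(raw_samples)
for name, value in sorted(recorded.items()):
    print(name, value)
```

In this sketch four raw series collapse into three, and however many distinct status codes an AppRegistry returns, each OpSrc can contribute at most five grouped series. The sketch only emits a class once at least one sample falls into it; whether zero-valued classes are reported in practice depends on how the real rule is written.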