Bug 1737156
| Summary: | Report metrics on installed operators | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Evan Cordell <ecordell> | |
| Component: | OLM | Assignee: | Evan Cordell <ecordell> | |
| OLM sub component: | OLM | QA Contact: | Bruno Andrade <bandrade> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | medium | |||
| Priority: | medium | CC: | bandrade, chezhang, chuo, jfan, scolange | |
| Version: | 4.1.z | |||
| Target Milestone: | --- | |||
| Target Release: | 4.1.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1743808 (view as bug list) | Environment: | ||
| Last Closed: | 2019-09-10 15:59:27 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1743808 | |||
| Bug Blocks: | 1737164 | |||
Description
Evan Cordell
2019-08-02 21:44:52 UTC
Hi Evan,

Could you give more details about this bug, such as a basic description and the steps to reproduce? It would also be good to fill in the Priority and Severity fields.

Bruno, please help verify this bug once it is in ON_QA status.

Basic info: a synchronization metric for the subscription object was added to the Prometheus metrics. For more details, you can ask the reporter for help. Thanks!

Jian, I'll keep an eye on it.

Evan, from my understanding, the verification steps would be:
1) Check that the metrics are available. To validate this, I should follow [1] and search for '{__name__="subscription_sync_total"}' in the Prometheus UI.
2) Check that the metric count increases: upgrade an operator and check subscription_sync_total.
Can you please validate that?
[1] https://docs.openshift.com/container-platform/4.1/telemetry/showing-data-collected-by-telemetry.html
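
The two checks above could also be scripted against the raw text served by the catalog operator's /metrics endpoint. The following is only a minimal sketch: it assumes the scrape has already been saved as a string (for example from curl), and the helper names `parse_samples`, `metric_available`, and `syncs_increased` are hypothetical, not part of OLM or of any Prometheus client library. Comparing the summed counter across two scrapes is also an assumption about what "count increasing" means here, chosen because an upgrade may add a series with a new `installed` label.

```python
import re
from typing import Dict, Tuple

Labels = Tuple[Tuple[str, str], ...]

# One sample line in the Prometheus text format: name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str, metric: str) -> Dict[Labels, float]:
    """Return {sorted label pairs: value} for every series of `metric`."""
    samples: Dict[Labels, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = tuple(sorted(LABEL_RE.findall(match.group("labels") or "")))
        samples[labels] = float(match.group("value"))
    return samples


def metric_available(text: str, metric: str = "subscription_sync_total") -> bool:
    """Step 1: the metric is exposed with at least one series."""
    return bool(parse_samples(text, metric))


def syncs_increased(before: str, after: str,
                    metric: str = "subscription_sync_total") -> bool:
    """Step 2 (assumed reading): the summed counter grew between two scrapes."""
    total_before = sum(parse_samples(before, metric).values())
    total_after = sum(parse_samples(after, metric).values())
    return total_after > total_before
```

One way to use it would be to save one scrape before the operator upgrade and one after, then call `metric_available(after)` and `syncs_increased(before, after)`. A counter reset caused by a pod restart would make the second check unreliable, so the sketch is only meaningful while the catalog operator pod stays up between scrapes.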

Hi Bruno,

Yes, this is correct. Thanks!

Verification Failed

Steps used to validate:
1) Create a subscription for etcd operator on default project
2) Check subscription_sync_total metrics on catalog operator

    oc get pods -n openshift-operator-lifecycle-manager
    NAME                                READY   STATUS    RESTARTS   AGE
    catalog-operator-58cfd7cc84-ktdfh   1/1     Running   0          29m
    olm-operator-8999bd5fd-2wc4p        1/1     Running   0          29m
    olm-operators-8vbmm                 1/1     Running   0          27m
    packageserver-67f857985f-gkgrq      1/1     Running   0          26m
    packageserver-67f857985f-tqwd6      1/1     Running   0          26m

    oc port-forward catalog-operator-58cfd7cc84-ktdfh 8081 -n openshift-operator-lifecycle-manager
    Forwarding from 127.0.0.1:8081 -> 8081
    Forwarding from [::1]:8081 -> 8081
    Handling connection for 8081

    curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://localhost:8081/metrics | grep subs
    # HELP subscription_count Number of subscriptions
    # TYPE subscription_count gauge
    subscription_count 2.0

As shown above, 'subscription_sync_total' appears neither in the Prometheus UI nor in the Catalog Operator metrics.

Cluster Details:

Cluster Version:

    oc get clusterversion -o json|jq ".items[0].status.history[0].version"
    "4.1.0-0.nightly-2019-08-26-164941"

OLM Version:

    oc exec catalog-operator-58cfd7cc84-ktdfh -n openshift-operator-lifecycle-manager -- olm -version
    OLM version: 0.9.0
    git commit: afc7402

LGTM, marking as VERIFIED

Steps used to validate:
1) Create a subscription for etcd operator on default project
2) Check subscription_sync_total metrics on catalog operator

    oc get pods -n openshift-operator-lifecycle-manager
    NAME                                READY   STATUS    RESTARTS   AGE
    catalog-operator-5f68dfb696-kk6vr   1/1     Running   1          20m
    olm-operator-588cb66f54-m8h59       1/1     Running   1          20m
    olm-operators-ph5nk                 1/1     Running   0          17m
    packageserver-54f9598d56-2zqpw      1/1     Running   0          16m
    packageserver-54f9598d56-rvclf      1/1     Running   0          16m

    oc port-forward catalog-operator-5f68dfb696-kk6vr 8081 -n openshift-operator-lifecycle-manager
    Forwarding from 127.0.0.1:8081 -> 8081
    Forwarding from [::1]:8081 -> 8081
    Handling connection for 8081

    curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://localhost:8081/metrics | grep subs
    # HELP subscription_count Number of subscriptions
    # TYPE subscription_count gauge
    subscription_count 2.0
    # HELP subscription_sync_total Monotonic count of subscription syncs
    # TYPE subscription_sync_total counter
    subscription_sync_total{installed="",name="etcd"} 2.0
    subscription_sync_total{installed="etcdoperator.v0.9.4",name="etcd"} 2.0
    subscription_sync_total{installed="packageserver.v0.9.0",name="packageserver"} 2.0

3) Query for {__name__="subscription_sync_total"} on Prometheus UI and checked that all metrics are shown: http://pics.osci.redhat.com/5chjce.png

Cluster Details:

Cluster Version:

    oc get clusterversion -o json|jq ".items[0].status.history[0].version"
    "4.1.0-0.nightly-2019-08-27-070548"

OLM Version:

    oc exec catalog-operator-5f68dfb696-kk6vr -n openshift-operator-lifecycle-manager -- olm -version
    OLM version: 0.9.0
    git commit: b28fc94

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2594