Bug 1663856

Summary: OLM metrics are not updated
Product: OpenShift Container Platform
Reporter: Yadan Pei <yapei>
Component: OLM
Assignee: Evan Cordell <ecordell>
Status: CLOSED ERRATA
QA Contact: Yadan Pei <yapei>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.1.0
CC: jpeeler, yapei
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-04 10:41:28 UTC
Type: Bug

Description Yadan Pei 2019-01-07 08:40:01 UTC
Description of problem:
When resource counts change, some OLM metrics are not updated.

Version-Release number of selected component (if applicable):
$ olm -version
OLM version: 0.8.0
git commit: 81104ff
$ oc get pods olm-operator-856c56ddbd-xqjfw -n openshift-operator-lifecycle-manager -o yaml | grep -i image
image: registry.svc.ci.openshift.org/openshift/origin-v4.0-2019-01-06-163602@sha256:0a861cd2e46964467afc01497485ad40e6348008292ff98213da93812fc9e25c


How reproducible:
Always

Steps to Reproduce:
1. Check OLM metrics data
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://<olm_pod_ip>:8080/metrics
catalog_source_count 0
csv_count 1
csv_upgrade_count 0
install_plan_count 0
subscription_count 0
2. Create etcd subscription in openshift-operator-lifecycle-manager
[yapei@dhcp-140-3 test-scripts]$ oc get csv -n openshift-operator-lifecycle-manager
NAME                   DISPLAY          VERSION   REPLACES              PHASE
etcdoperator.v0.9.2    etcd             0.9.2     etcdoperator.v0.9.0   Succeeded
packageserver.v0.8.0   Package Server   0.8.0                           Succeeded
[yapei@dhcp-140-3 test-scripts]$ oc get subscription -n openshift-operator-lifecycle-manager
NAME            PACKAGE         SOURCE          CHANNEL
etcd-97hl2      etcd            rh-operators    alpha
packageserver   packageserver   olm-operators   alpha
[yapei@dhcp-140-3 test-scripts]$ oc get installplan -n openshift-operator-lifecycle-manager
NAME                                 CSV                    SOURCE          APPROVAL    APPROVED
install-etcdoperator.v0.9.2-pwvkj    etcdoperator.v0.9.2    rh-operators    Automatic   false
install-packageserver.v0.8.0-tk7j7   packageserver.v0.8.0   olm-operators   Automatic   false
3. Check OLM metrics again
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://<olm_pod_ip>:8080/metrics
catalog_source_count 0
csv_count 2
csv_upgrade_count 0
install_plan_count 0
subscription_count 0


Actual results:
3. Only csv_count increases. install_plan_count and subscription_count stay at 0 even though install plans and subscriptions exist.

Expected results:
3. install_plan_count and subscription_count should match the actual number of install plans and subscriptions.
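
The check in steps 1 and 3 can be scripted. The following is a minimal sketch, not part of OLM: `metric_value` is a hypothetical helper that reads Prometheus text-format output on stdin and prints the value of one metric. It assumes the metric carries no labels, as in the output quoted above.

```shell
# Hypothetical helper: print the value of an unlabeled metric read from stdin.
# HELP/TYPE comment lines never match because their first field is '#'.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Sample taken from step 3 of this report.
sample='catalog_source_count 0
csv_count 2
csv_upgrade_count 0
install_plan_count 0
subscription_count 0'

# Prints 0, although two subscriptions exist in the namespace.
printf '%s\n' "$sample" | metric_value subscription_count
```

Against a live cluster, the output of the curl command from step 1 would be piped into `metric_value` in place of the sample text.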

Additional info:

Comment 2 Jeff Peeler 2019-01-16 13:38:18 UTC
PR was merged yesterday.

Comment 3 Jian Zhang 2019-01-16 14:23:53 UTC
Changing status to ON_QA since the PR is already merged.

Comment 4 Yadan Pei 2019-01-22 05:30:07 UTC
1. Get the catalog operator pod and OLM operator pod IPs
$ oc get pods <catalog_operator_pod> -n openshift-operator-lifecycle-manager -o yaml | grep -i ip
$ oc get pods <olm_operator_pod> -n openshift-operator-lifecycle-manager -o yaml | grep -i ip

2. Check metrics
2.1 Check catalog operator pod metrics
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" http://<catalog_operator_pod_ip>:8081/metrics | grep -e catalog_source_count -e install_plan_count -e subscription_count 
catalog_source_count 5.0  
install_plan_count 31.0 
subscription_count 2.0 

2.2 Check olm operator pod metrics
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" http://<olm_operator_pod_ip>:8081/metrics | grep -e csv_count -e csv_upgrade_count
csv_count 42.0 
csv_upgrade_count 0.0

3. Create a new subscription in 'test' namespace via web console

4. Check the actual resource counts on the command line (wc -l also counts the header line, so subtract 1)
$ oc get installplan --all-namespaces | wc -l        # 39 - 1 = 38 install plans
39
$ oc get sub --all-namespaces | wc -l       # 4 - 1 = 3 subscriptions
4
$ oc get csv --all-namespaces | wc -l     # 85 - 1 = 84 CSVs
85

5. Check metrics again
5.1 Check catalog operator pod metrics
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" http://<catalog_operator_pod_ip>:8081/metrics | grep -e catalog_source_count -e install_plan_count -e subscription_count
catalog_source_count 5.0
install_plan_count 38.0
subscription_count 3.0

5.2 Check olm operator pod metrics
$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" http://<olm_operator_pod_ip>:8081/metrics | grep -e csv_count -e csv_upgrade_count
csv_count 84.0
csv_upgrade_count 0.0
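
The manual comparison in steps 4 and 5 (subtract the header line from the `wc -l` result, then compare it with a metric printed as a float such as `38.0`) could be automated roughly as follows. This is a sketch with hypothetical helper names (`count_rows`, `matches_count`); it assumes the `oc get` output contains exactly one header line.

```shell
# Hypothetical helper: count data rows on stdin, ignoring the single header line.
count_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Hypothetical helper: succeed if a metric value such as "38.0"
# is numerically equal to an integer count such as "38".
matches_count() {
    awk -v m="$1" -v c="$2" 'BEGIN { exit !(m + 0 == c + 0) }'
}

# Sample table shaped like the `oc get subscription` output in this report.
table='NAME            PACKAGE         SOURCE          CHANNEL
etcd-97hl2      etcd            rh-operators    alpha
packageserver   packageserver   olm-operators   alpha'

rows=$(printf '%s\n' "$table" | count_rows)
if matches_count "2.0" "$rows"; then
    echo "metric matches: $rows"
else
    echo "metric differs from actual count: $rows"
fi
```

With the values from this verification, `matches_count 38.0 38`, `matches_count 3.0 3` and `matches_count 84.0 84` all succeed. Passing `--no-headers` to `oc get` should make the header subtraction unnecessary.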


Verified on:

sh-4.2$ olm -version
OLM version: 0.8.1
git commit: a36ed09

Comment 7 errata-xmlrpc 2019-06-04 10:41:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758