Bug 1952576
Summary: | csv_succeeded metric not present in olm-operator for all successful CSVs | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Arjun Naik <anaik> |
Component: | OLM | Assignee: | tflannag |
OLM sub component: | OLM | QA Contact: | xzha |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | anbhatta, aos-bugs, bluddy, davegord, dsover, krizza, tflannag |
Version: | 4.7 | Keywords: | Triaged |
Target Milestone: | --- | ||
Target Release: | 4.10.0 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
The value of the "csv_succeeded" metric was lost whenever the OLM Operator pod restarted, because the metric was only emitted when a CSV's status subresource changed.
Consequence:
The "csv_succeeded" metric was not always present for successfully installed CSVs.
Fix:
Emit the "csv_succeeded" metric at the beginning of the OLM Operator's startup logic.
Result:
The "csv_succeeded" metric is re-emitted when the OLM Operator starts, so its value is retained across pod restarts.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2022-03-10 16:03:07 UTC | Type: | Bug |
Bug Blocks: | 2072995 |
Description
Arjun Naik
2021-04-22 15:06:51 UTC
Also verified that the missing CSVs are not "Copied":

oc get csv -n openshift-splunk-forwarder-operator splunk-forwarder-operator.v0.1.217-a5cba25 -o json | jq -r '.status.reason'
InstallSucceeded
> oc get pods olm-operator-d97f6b57-bx4pl -o json | jq -r '.status.startTime'
> 2021-04-22T06:33:18Z
> oc get csv -n openshift-splunk-forwarder-operator splunk-forwarder-operator.v0.1.217-a5cba25 -o json | jq -r '.status.lastUpdateTime'
> 2021-04-21T09:46:14Z

The CSV was last updated before the olm-operator pod started, so this could be due to the fact that the metric is only updated when a CSV's status changes.

*** Bug 1964716 has been marked as a duplicate of this bug. ***

[root@preserve-olm-agent-test ~]# oc version
Client Version: 4.10.0-0.nightly-2022-01-11-065245
Server Version: 4.10.0-0.nightly-2022-01-11-065245
Kubernetes Version: v1.22.1+6859754
[root@preserve-olm-agent-test ~]# oc exec catalog-operator-67f5bfd4f9-2g79c -- olm --version
OLM version: 0.19.0
git commit: 79c782526c3c1c2da88f63b34707b23fb04f7da5

1. Install some operators.

[root@preserve-olm-agent-test ~]# oc get csv -A | grep -v elasticsearch | grep -v nginx-ingress | grep -v must-gather
NAMESPACE                              NAME                       DISPLAY                     VERSION    REPLACES                PHASE
default                                ditto-operator.v0.3.1      Eclipse Ditto               0.3.1      ditto-operator.v0.2.0   Succeeded
openshift-logging                      cluster-logging.5.3.2-26   Red Hat OpenShift Logging   5.3.2-26                           Succeeded
openshift-operator-lifecycle-manager   packageserver              Package Server              0.19.0                             Succeeded
test-1                                 anzo-operator.v2.0.101     Anzo Operator               2.0.0                              Succeeded
test-3                                 cockroachdb.v5.0.4         CockroachDB Helm Operator   5.0.4      cockroachdb.v5.0.3      Succeeded

2. Port-forward to the olm-operator pod and curl the metrics endpoint.

[root@preserve-olm-agent-test ~]# oc get pod
NAME                                      READY   STATUS      RESTARTS   AGE
catalog-operator-67f5bfd4f9-2g79c         1/1     Running     0          128m
collect-profiles-27366135--1-t7rh9        0/1     Completed   0          42m
collect-profiles-27366150--1-27jsg        0/1     Completed   0          27m
collect-profiles-27366165--1-f9n4d        0/1     Completed   0          12m
olm-operator-68bfb9479b-j72b5             1/1     Running     0          128m
package-server-manager-66b87fbcc9-95qt5   1/1     Running     0          128m
packageserver-7566c94648-ldktx            1/1     Running     0          122m
packageserver-7566c94648-sbmn2            1/1     Running     0          122m
[root@preserve-olm-agent-test ~]# oc port-forward olm-operator-68bfb9479b-j72b5 8443
[root@preserve-olm-agent-test ~]# curl -k https://localhost:8443/metrics | grep csv
# HELP csv_count Number of CSVs successfully registered
# TYPE csv_count gauge
csv_count 8
# HELP csv_succeeded Successful CSV install
# TYPE csv_succeeded gauge
csv_succeeded{name="anzo-operator.v2.0.101",namespace="test-1",version="2.0.0"} 1
csv_succeeded{name="cluster-logging.5.3.2-26",namespace="openshift-logging",version="5.3.2-26"} 1
csv_succeeded{name="cockroachdb.v5.0.4",namespace="test-3",version="5.0.4"} 1
csv_succeeded{name="ditto-operator.v0.3.1",namespace="default",version="0.3.1"} 1
csv_succeeded{name="elasticsearch-operator.5.3.2-26",namespace="openshift-operators-redhat",version="5.3.2-26"} 1
csv_succeeded{name="must-gather-operator.v1.1.2",namespace="global-load-balancer-operator",version="1.1.2"} 1
csv_succeeded{name="nginx-ingress-operator.v0.4.0",namespace="openshift-operators",version="0.4.0"} 1
csv_succeeded{name="packageserver",namespace="openshift-operator-lifecycle-manager",version="0.19.0"} 1
# HELP csv_upgrade_count Monotonic count of CSV upgrades
# TYPE csv_upgrade_count counter
csv_upgrade_count 0
[root@preserve-olm-agent-test ~]#

LGTM, verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:0056
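
The manual check used in the verification comment (compare the Succeeded CSVs against the csv_succeeded series) could be scripted roughly as follows. This is only a sketch and not part of the original report: the function names metric_csvs and missing_csvs and the file names expected.txt and metrics.txt are made up for illustration, the sed pattern assumes the name, namespace, version label order seen in the metrics output in this bug, and the jq filter assumes that copied CSVs carry "Copied" in .status.reason.

```shell
#!/bin/sh
# Sketch only: helper names and file names are illustrative assumptions.

# Print "namespace/name" for every csv_succeeded series in a metrics dump ($1).
# Assumes the label order name, namespace, version seen in this bug's output.
metric_csvs() {
    sed -n 's|^csv_succeeded{name="\([^"]*\)",namespace="\([^"]*\)".*|\2/\1|p' "$1" | sort
}

# Print every line of the expected list ($1, one "namespace/name" per line)
# that has no matching csv_succeeded series in the metrics dump ($2).
missing_csvs() {
    expected=$1
    metrics=$2
    observed=$(mktemp)
    metric_csvs "$metrics" > "$observed"
    # -v invert, -x whole-line match, -F fixed strings: keep unmatched lines.
    grep -vxF -f "$observed" "$expected" || true
    rm -f "$observed"
}

# Possible usage against a cluster, with the port-forward from step 2 running:
#   oc get csv -A -o json \
#     | jq -r '.items[]
#         | select(.status.phase == "Succeeded" and .status.reason != "Copied")
#         | "\(.metadata.namespace)/\(.metadata.name)"' > expected.txt
#   curl -sk https://localhost:8443/metrics > metrics.txt
#   missing_csvs expected.txt metrics.txt
```

Passing -s to curl keeps the progress meter out of the captured output, so every metric line starts at the beginning of a line and the sed pattern can match it. Empty output from missing_csvs means every expected CSV has a csv_succeeded series.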