Bug 1724784 - Unable to identify age of cluster in relation to current version via PromQL
Summary: Unable to identify age of cluster in relation to current version via PromQL
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.z
Assignee: Abhinav Dahiya
QA Contact: Junqi Zhao
URL:
Whiteboard: 4.1.6
Depends On: 1723945
Blocks:
Reported: 2019-06-27 19:13 UTC by Clayton Coleman
Modified: 2019-07-23 18:12 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1723945
Environment:
Last Closed: 2019-07-23 18:12:12 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHBA-2019:1766
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2019-07-23 18:12:19 UTC

Description Clayton Coleman 2019-06-27 19:13:40 UTC
+++ This bug was initially created as a clone of Bug #1723945 +++

Slated for backport because this is a key metric for us and the current version labels don't allow us to answer important queries about upgrades. Low risk, given that the change is metrics-only and has been manually verified.

----

While building dashboards for understanding the state of upgrades, it is currently not possible to build the query:

"how old are the clusters of a given current version"

because no single series exposes the current version together with the timestamp of the initial install.

Repurpose the "cluster_version{type="cluster"}" series to:

1. have version and image set to the same value as type="current" (what the operator is currently trying to apply)
2. have from_version set to the same version as type="initial"
3. keep the date as the initial
4. if the cluster has never completed a sync, set from_version empty (so we can exclude clusters that have never completed)
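
The four rules above can be sketched as a small function. This is only an illustration in Python, not the operator's actual code (the real change is in the linked PR). The HistoryEntry type, its field names, and the newest-first ordering of the history are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HistoryEntry:
    """Hypothetical stand-in for one entry of the cluster version history."""
    version: str
    image: str
    started: float   # Unix seconds when this update began
    completed: bool  # True once a sync of this version has finished


def cluster_series(history: List[HistoryEntry]) -> Tuple[Dict[str, str], float]:
    """Build labels and value for cluster_version{type="cluster"}.

    Assumes history is ordered newest first, so history[0] is what the
    operator is currently applying and history[-1] is the initial install.
    """
    if not history:
        raise ValueError("history must contain at least one entry")
    current, initial = history[0], history[-1]
    # Rule 4: from_version stays empty until some sync has completed.
    ever_completed = any(entry.completed for entry in history)
    labels = {
        "type": "cluster",
        "version": current.version,  # rule 1
        "image": current.image,      # rule 1
        "from_version": initial.version if ever_completed else "",  # rules 2 and 4
    }
    # Rule 3: the sample value keeps the date of the initial install.
    return labels, initial.started
```

With this shape, a cluster installed at 4.1.4 and now applying 4.1.6 would report version="4.1.6" and from_version="4.1.4", while still carrying the install timestamp.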

This also now allows the query "show all successfully installed clusters by their version" and means most people should use type="cluster" instead of type="current" (since the date on type="current" is not that useful).
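
As a rough illustration of the age query this enables, the sketch below filters samples the way a dashboard query might. It assumes each sample's value is the install time in Unix seconds, and the Sample type and cluster_ages function are hypothetical names for the example. The equivalent PromQL would presumably look something like time() - cluster_version{type="cluster",version="4.1.6",from_version!=""}, where 4.1.6 is just an example version.

```python
from typing import Dict, List, Tuple

# (labels, value) pair; the value is assumed to be the install time in Unix seconds.
Sample = Tuple[Dict[str, str], float]


def cluster_ages(samples: List[Sample], version: str, now: float) -> List[float]:
    """Return the age in seconds of every cluster currently at `version`.

    Clusters that never completed a sync (empty from_version) are excluded.
    """
    ages = []
    for labels, installed_at in samples:
        if labels.get("type") != "cluster":
            continue
        if labels.get("version") != version:
            continue
        if labels.get("from_version", "") == "":
            continue  # never completed an install, so skip it
        ages.append(now - installed_at)
    return ages
```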

Because this loses the ability to query "which images were clusters installed with", add a new "cluster_version{type="initial"}" series which has:

1. version and image set to the oldest entry in the history.
2. date set to the install time
3. from_version set empty
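
A matching sketch for the type="initial" series, again only illustrative. The history is modelled here as a newest-first list of plain dicts with "version", "image", and "started" keys, all of which are assumptions for the example rather than the operator's real data model.

```python
from typing import Dict, List, Tuple, Union

Entry = Dict[str, Union[str, float]]


def initial_series(history: List[Entry]) -> Tuple[Dict[str, str], float]:
    """Build labels and value for cluster_version{type="initial"}.

    Assumes history is ordered newest first, so the last element is the
    oldest entry, i.e. the original install.
    """
    if not history:
        raise ValueError("history must contain at least one entry")
    oldest = history[-1]
    labels = {
        "type": "initial",
        "version": str(oldest["version"]),  # rule 1: oldest entry in the history
        "image": str(oldest["image"]),      # rule 1
        "from_version": "",                 # rule 3: always empty
    }
    # Rule 2: the sample value is the install time.
    return labels, float(oldest["started"])
```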

Hopefully this is the last major change to the cluster_version metric, given that our current queries have so far been able to triage the state of upgrades across given versions.

Verification will be manual by querying telemetry.

--- Additional comment from Abhinav Dahiya on 2019-06-25 16:16:59 EDT ---

PR https://github.com/openshift/cluster-version-operator/pull/212

Comment 4 errata-xmlrpc 2019-07-23 18:12:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1766

