While building dashboards for understanding the state of upgrades, it is currently not possible to build the query "how old are the clusters of a given current version", because no single series carries the current version together with the timestamp of the initial install.

Repurpose the "cluster_version{type="cluster"}" series to:

1. have version and image set to the same value as type="current" (what the operator is currently trying to apply)
2. have from_version set to the same version as type="initial"
3. keep the date as the initial install time
4. if the cluster has never completed a sync, set from_version empty (so we can exclude clusters that have never completed)

This also now allows the query "show all successfully installed clusters by their version", and means most people should use type="cluster" instead of type="current" (since the current date is not that useful).

Because this loses queryability of "which images were clusters installed with", add a new "cluster_version{type="initial"}" series which has:

1. version and image set to the oldest entry in the history
2. date set to the install time
3. from_version set empty

Hopefully this is the last major change to the cluster_version metric, given that our current queries have so far been able to triage the state of upgrades across given versions.

Verification will be manual by querying telemetry.
PR https://github.com/openshift/cluster-version-operator/pull/212
Verified manually:

max_over_time(cluster_version{type="cluster",from_version!=""}[2d])

{_id="b020b111-fe78-4353-acb0-a35da607ca01",endpoint="metrics",from_version="0.0.1-2019-06-26-201348",image="registry.svc.ci.openshift.org/ci-op-74jsslv3/release@sha256:9467af6b15821a223253d5f34cae519d9781416f23009475f4acb274549c6171",instance="10.0.149.248:9099",job="cluster-version-operator",namespace="openshift-cluster-version",pod="cluster-version-operator-c45df89d-tnlgp",prometheus="openshift-monitoring/k8s",service="cluster-version-operator",type="cluster",version="0.0.1-2019-06-26-201348"} 1561580894
{_id="b57f1806-62ae-4730-997b-227e54944a64",endpoint="metrics",from_version="0.0.1-2019-06-25-200446",image="registry.svc.ci.openshift.org/ci-op-7zdb1my6/release@sha256:dea359a2c87418917ce49d1be3ec3f82531d3a423c0fa1f94d471c8632bf3abc",instance="10.0.128.253:9099",job="cluster-version-operator",namespace="openshift-cluster-version",pod="cluster-version-operator-576988ccd4-brbxq",prometheus="openshift-monitoring/k8s",service="cluster-version-operator",type="cluster",version="0.0.1-2019-06-25-200446"}
Under no circumstances can an engineer EVER verify their own bugzilla. Do not do this again.
If you want to take over verification of this, please do.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922