1731232 – Cloud provider type is not reported via metrics or telemetry, which prevents useful debug of clusters

Bug 1731232 - Cloud provider type is not reported via metrics or telemetry, which prevents useful debug of clusters

Summary: Cloud provider type is not reported via metrics or telemetry, which prevents ...

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Sub Component:
Version:	4.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Clayton Coleman
QA Contact:	Xingxing Xia
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1732945 1733105
TreeView+	depends on / blocked

Reported:	2019-07-18 17:22 UTC by Clayton Coleman
Modified:	2020-07-13 14:52 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1732945 1733105 (view as bug list)
Environment:
Last Closed:	2020-07-13 14:52:16 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-kube-apiserver-operator pull 526	0	None	closed	Bug 1731232: Report infrastructure provider metric from the operator	2020-07-13 14:49:03 UTC

Description Clayton Coleman 2019-07-18 17:22:09 UTC

The core platform type is a key metric for debugging and triaging issues.  It is currently not available on cluster, nor reported via telemetry.

After discussion with apiserver team we will add

cluster_infrastructure_provider{type="AWS",region="us-east-1"} 1

cluster_infrastructure_provider{type="Azure",region=""} 1

cluster_infrastructure_provider{type="GCP",region="us-central"} 1

cluster_infrastructure_provider{type="OpenStack",region=""} 1

cluster_infrastructure_provider{type="BareMetal",region=""} 1

cluster_infrastructure_provider{type="",region=""} 1

reporting from the kube-apiserver and then add that telemetry, allowing cluster type to be unambiguously determined from metrics and alerting.

Comment 1 Clayton Coleman 2019-07-18 17:27:51 UTC

Intended to be back ported to 4.1.

Comment 2 Xingxing Xia 2019-07-25 11:13:43 UTC

Verified in 4.2.0-0.nightly-2019-07-24-000310 AWS env, query cluster_infrastructure_provider in Prometheus UI, got 1 record:
cluster_infrastructure_provider{endpoint="https",instance="10.128.0.63:8443",job="metrics",namespace="openshift-kube-apiserver-operator",pod="kube-apiserver-operator-84989775cf-dbzpg",region="us-east-2",service="metrics",type="AWS"}	1

It shows type and region well from kas operator. I think can move bug to VERIFIED.

Comment 4 Xingxing Xia 2019-07-31 07:21:44 UTC

Today checked Azure env, the metrics work well too:
cluster_infrastructure_provider{endpoint="https",instance="10.130.0.11:8443",job="metrics",namespace="openshift-kube-apiserver-operator",pod="kube-apiserver-operator-65ff4f566c-8lh2c",service="metrics",type="Azure"}	1

Note You need to log in before you can comment on or make changes to this bug.