Bug 1731232

Summary: Cloud provider type is not reported via metrics or telemetry, which prevents useful debug of clusters
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: kube-apiserver
Assignee: Clayton Coleman <ccoleman>
Status: CLOSED CURRENTRELEASE
QA Contact: Xingxing Xia <xxia>
Severity: high
Priority: unspecified
Version: 4.2.0
CC: aos-bugs, mfojtik, nstielau, vlaad, yinzhou
Last Closed: 2020-07-13 14:52:16 UTC
Type: Bug
Bug Blocks: 1732945, 1733105

Description Clayton Coleman 2019-07-18 17:22:09 UTC
The core platform type is a key metric for debugging and triaging issues. It is currently neither available on the cluster nor reported via telemetry.

After discussion with the apiserver team, we will add

cluster_infrastructure_provider{type="AWS",region="us-east-1"} 1

cluster_infrastructure_provider{type="Azure",region=""} 1

cluster_infrastructure_provider{type="GCP",region="us-central"} 1

cluster_infrastructure_provider{type="OpenStack",region=""} 1

cluster_infrastructure_provider{type="BareMetal",region=""} 1

cluster_infrastructure_provider{type="",region=""} 1

reporting from the kube-apiserver, and then add that metric to telemetry, allowing the cluster type to be determined unambiguously from metrics and alerting.
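For illustration only, the proposed series can be sketched as a small Python helper that renders one sample in the Prometheus text exposition format. The function names `escape_label_value` and `render_provider_metric` are hypothetical and are not taken from the actual fix; only the metric name and the `type` and `region` labels come from the description above.

```python
def escape_label_value(value: str) -> str:
    # The Prometheus text format escapes backslash, double quote,
    # and newline inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_provider_metric(platform_type: str, region: str = "") -> str:
    # Hypothetical helper: builds one sample line for the proposed metric.
    # An unknown platform or region is reported as an empty string, and
    # the value is always 1 because the labels carry the information.
    labels = (
        f'type="{escape_label_value(platform_type)}",'
        f'region="{escape_label_value(region)}"'
    )
    return f"cluster_infrastructure_provider{{{labels}}} 1"


if __name__ == "__main__":
    for platform_type, region in [("AWS", "us-east-1"), ("Azure", ""), ("", "")]:
        print(render_provider_metric(platform_type, region))
```

Keeping the value constant at 1 and putting the platform in labels follows the usual "info metric" pattern, which lets the series be joined against other metrics by label.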

Comment 1 Clayton Coleman 2019-07-18 17:27:51 UTC
Intended to be backported to 4.1.

Comment 2 Xingxing Xia 2019-07-25 11:13:43 UTC
Verified in a 4.2.0-0.nightly-2019-07-24-000310 AWS environment. Querying cluster_infrastructure_provider in the Prometheus UI returned 1 record:
cluster_infrastructure_provider{endpoint="https",instance="10.128.0.63:8443",job="metrics",namespace="openshift-kube-apiserver-operator",pod="kube-apiserver-operator-84989775cf-dbzpg",region="us-east-2",service="metrics",type="AWS"}	1

The kube-apiserver operator reports the type and region correctly. I think the bug can be moved to VERIFIED.

Comment 4 Xingxing Xia 2019-07-31 07:21:44 UTC
Checked an Azure environment today; the metric works there too:
cluster_infrastructure_provider{endpoint="https",instance="10.130.0.11:8443",job="metrics",namespace="openshift-kube-apiserver-operator",pod="kube-apiserver-operator-65ff4f566c-8lh2c",service="metrics",type="Azure"}	1
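As a rough sketch of how records like the two above could be checked in a script, the following Python function extracts the `type` and `region` labels from one record as copied from the Prometheus UI. The name `provider_labels` is hypothetical, and treating a missing `region` label as an empty string (as in the Azure record) is an assumption of this example rather than documented behavior.

```python
import re

# Matches name="value" pairs, allowing backslash-escaped characters in values.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def provider_labels(sample: str) -> dict:
    # Hypothetical helper: split 'metric{labels} value' into its parts.
    name, _, rest = sample.partition("{")
    if name.strip() != "cluster_infrastructure_provider":
        raise ValueError(f"unexpected metric name: {name.strip()!r}")
    label_text = rest.rpartition("}")[0]
    labels = dict(LABEL_RE.findall(label_text))
    # Assumption: an absent label is treated the same as an empty one.
    return {"type": labels.get("type", ""), "region": labels.get("region", "")}


if __name__ == "__main__":
    azure = (
        'cluster_infrastructure_provider{endpoint="https",'
        'instance="10.130.0.11:8443",job="metrics",'
        'namespace="openshift-kube-apiserver-operator",'
        'pod="kube-apiserver-operator-65ff4f566c-8lh2c",'
        'service="metrics",type="Azure"}\t1'
    )
    print(provider_labels(azure))
```

The parser does not unescape label values, which is sufficient for simple platform and region names.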