Bug 1769779

Summary: library-go operator /metrics endpoint doesn't contain metrics
Product: OpenShift Container Platform
Reporter: David Eads <deads>
Component: kube-apiserver
Assignee: Luis Sanchez <sanchezl>
Status: CLOSED ERRATA
QA Contact: zhou ying <yinzhou>
Severity: unspecified
Docs Contact:
Priority: urgent
Version: 4.3.0
CC: adam.kaplan, aos-bugs, mfojtik, xxia
Target Milestone: ---
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-01-23 11:11:28 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1731228

Description David Eads 2019-11-07 12:55:52 UTC
This bug was initially created as a copy of Bug #1764313

I am copying this bug because: 

Metrics for the kube-apiserver include critical metrics that report pieces of cluster configuration and state back through telemetry. This is also the generic mechanism used by many other operators.

Found by service catalog and console, who have each tried to shim around the problem.

The /metrics endpoint should have an e2e test to validate that it responds with Prometheus metrics. The key metric is console_url.

The console operator is not currently reporting console_url on master (4.3).
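
As a rough illustration of what such an e2e check could assert, the sketch below tests whether a saved /metrics response contains at least one sample for a given metric name. The helper name has_metric and the file path in the usage note are hypothetical and not part of any existing test suite. It assumes the response is in the Prometheus text exposition format.

```shell
# has_metric NAME FILE   (hypothetical helper, for illustration only)
# Succeeds if FILE contains at least one sample line for metric NAME,
# with or without labels. "# HELP" and "# TYPE" comment lines do not count,
# because the pattern is anchored to the start of the line.
has_metric() {
  grep -Eq "^$1[{ ]" "$2"
}
```

For example, after saving the endpoint output to /tmp/metrics, `has_metric console_url /tmp/metrics || echo "console_url missing"` would flag the failure described in this bug.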

Comment 1 Michal Fojtik 2019-11-21 11:28:57 UTC
https://github.com/openshift/library-go/pull/584

Comment 3 Michal Fojtik 2019-11-21 12:22:04 UTC
*** Bug 1766518 has been marked as a duplicate of this bug. ***

Comment 4 Michal Fojtik 2019-11-21 13:56:33 UTC
*** Bug 1770381 has been marked as a duplicate of this bug. ***

Comment 5 Xingxing Xia 2019-11-27 00:51:42 UTC
Ying Zhou, please help verify this bug, thanks.

Comment 6 zhou ying 2019-11-27 06:50:10 UTC
Confirmed with the latest payload 4.3.0-0.nightly-2019-11-26-171052; the issue has been fixed:

1. oc edit FeatureGate cluster # add below and save
...
spec:
  featureSet: "TechPreviewNoUpgrade"
2. token=`oc sa get-token cluster-monitoring-operator -n openshift-monitoring`
   [root@dhcp-140-138 ~]# oc get ep -n openshift-kube-apiserver-operator 
NAME      ENDPOINTS         AGE
metrics   10.128.0.8:8443   5h8m
[root@dhcp-140-138 ~]# oc get po -n openshift-kube-apiserver-operator
NAME                                      READY   STATUS    RESTARTS   AGE
kube-apiserver-operator-b76fc9c88-cv7lk   1/1     Running   2          5h8m

3. [root@dhcp-140-138 ~]# oc exec kube-apiserver-operator-b76fc9c88-cv7lk -- curl -k -H "Authorization: Bearer $token" https://10.128.0.8:8443/metrics > /tmp/metrics
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 96849    0 96849    0     0   663k      0 --:--:-- --:--:-- --:--:--  666k
[root@dhcp-140-138 ~]# cat  /tmp/metrics |grep featu
# HELP cluster_feature_set Reports the feature set the cluster is configured to expose. name corresponds to the featureSet field of the cluster. The value is 1 if a cloud provider is supported.
# TYPE cluster_feature_set gauge
cluster_feature_set{name="TechPreviewNoUpgrade"} 0
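
To check the reported value instead of eyeballing the grep output, a small helper along these lines could print the value of each sample for a metric. This is a sketch only: metric_value is a hypothetical name, and it assumes sample lines carry no trailing timestamp, so the value is the last field.

```shell
# metric_value NAME FILE   (hypothetical helper, for illustration only)
# Prints the value of every sample line for metric NAME in FILE.
# Assumes no trailing timestamp, so the value is the last field on the line.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                    # skip HELP/TYPE comment lines
    {
      split($1, parts, "{")          # strip any label set from the name
      if (parts[1] == name) print $NF
    }
  ' "$2"
}
```

With the output saved above, `metric_value cluster_feature_set /tmp/metrics` would print the gauge value for each reported feature set.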

Comment 8 errata-xmlrpc 2020-01-23 11:11:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062