1887639 – [https_proxy] query console-operator metrics report certificate problem via oc exec

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Management Console
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	Jakub Hadvig
QA Contact:	Yadan Pei
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-10-13 03:56 UTC by Yadan Pei
Modified:	2021-01-21 17:29 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-01-21 17:29:32 UTC
Target Upstream Version:
Embargoed:

Description Yadan Pei 2020-10-13 03:56:02 UTC

Description of problem:
Based on our testing, on a cluster configured with https_proxy and IPv6, querying the metrics exposed by the console operator via the `oc exec` command reports a certificate problem.

Version-Release number of selected component (if applicable):
4.6.0-rc.2

How reproducible:
Always

Steps to Reproduce:
1. Query the metrics exposed by the console operator with `oc exec`:

# oc get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE                             NOMINATED NODE   READINESS GATES
console-operator-757f85b94b-dj2s4   1/1     Running   0          24h   10.128.0.16   wsun1012-brdkd-control-plane-1   <none>           <none>
# export token=$(oc serviceaccounts get-token prometheus-k8s -n openshift-monitoring)

# oc exec console-operator-757f85b94b-dj2s4 -- curl -k -H "Authorization: Bearer $token" https://10.128.0.16:8443/metrics | grep console_url
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (60) SSL certificate problem: self signed certificate in certificate chain
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
command terminated with exit code 60

# oc get proxy cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2020-10-12T02:59:21Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
        f:noProxy: {}
        f:trustedCA:
          .: {}
          f:name: {}
      f:status:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
        f:noProxy: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-10-12T02:59:22Z"
  name: cluster
  resourceVersion: "513"
  selfLink: /apis/config.openshift.io/v1/proxies/cluster
  uid: 4525c2e1-1350-4aec-a5d4-8f31ac5e9af8
spec:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.77.163:3128
  httpsProxy: https://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.77.163:3130
  noProxy: test.no-proxy.com
  trustedCA:
    name: user-ca-bundle
status:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.77.163:3128
  httpsProxy: https://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.77.163:3130
  noProxy: .cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.wsun1012.qe.devcluster.openshift.com,etcd-0.wsun1012.qe.devcluster.openshift.com,etcd-1.wsun1012.qe.devcluster.openshift.com,etcd-2.wsun1012.qe.devcluster.openshift.com,localhost,test.no-proxy.com

2. View the metrics in the console under Monitoring -> Metrics -> console_url; a correct result record is returned:
console_url	https	10.128.0.16:8443	metrics	openshift-console-operator	console-operator-757f85b94b-dj2s4	openshift-monitoring/k8s	metrics	https://console-openshift-console.apps.wsun1012.qe.devcluster.openshift.com	1

3. Run `oc exec` to query metrics via the Prometheus endpoint; results are returned:
# oc project openshift-monitoring
Now using project "openshift-monitoring" on server "https://api.wsun1012.qe.devcluster.openshift.com:6443".

# oc exec prometheus-k8s-0  -c prometheus -- curl -k -H "Authorization: Bearer $token" https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query\?query\=ALERTS\%7Balertname\%3D\%22PodDisruptionBudgetAtLimit\%22\%7D
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    63  100    63    0     0   1235      0 --:--:-- --:--:-- --:--:--  1260{"status":"success","data":{"resultType":"vector","result":[]}}



Actual results:
1. The curl command reports an error:
curl: (60) SSL certificate problem: self signed certificate in certificate chain

Expected results:
1. Correct metrics data should be returned.

Additional info:
1. For comparison, on a normal cluster (without https_proxy), the same query command returns successfully:

# oc project
Using project "openshift-console-operator" on server "https://api.qe-ui46-1013.qe.devcluster.openshift.com:6443".

# oc get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
console-operator-6d7f7d464d-d48hd   1/1     Running   0          148m   10.129.0.15   ip-10-0-178-119.us-east-2.compute.internal   <none>           <none>

# export token=$(oc serviceaccounts get-token prometheus-k8s -n openshift-monitoring)

# oc exec console-operator-6d7f7d464d-d48hd -- curl -k -H "Authorization: Bearer $token" https://10.129.0.15:8443/metrics | grep console_url
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP console_url [ALPHA] URL of the console exposed on the cluster
# TYPE console_url gauge
console_url{url="https://console-openshift-console.apps.qe-ui46-1013.qe.devcluster.openshift.com"} 1

# oc get proxy cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2020-10-13T00:35:24Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:trustedCA:
          .: {}
          f:name: {}
      f:status: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-10-13T00:35:24Z"
  name: cluster
  resourceVersion: "526"
  selfLink: /apis/config.openshift.io/v1/proxies/cluster
  uid: bbfe7319-3938-44e6-9654-bd7c0bf426a7
spec:
  trustedCA:
    name: ""
status: {}

Comment 4 Jakub Hadvig 2020-12-23 16:11:57 UTC

We did not have time to fix this issue this sprint. We will reevaluate it and try to fix it in the next sprint.

Comment 5 Samuel Padgett 2021-01-21 17:29:32 UTC

Hi, Ya Dan. What is being tested here?

The metrics endpoint is using a service serving certificate, so I would expect the curl command to fail unless you pass the correct CA bundle. This is expected and doesn't indicate that there's a problem. This endpoint is unrelated to the cluster proxy settings (although the proxy settings could change the default CA bundle when exec'ing into the pod). Do you see any alert or error indicating that the metrics aren't being scraped?

Based on the information in the description, I'm closing as NOTABUG. I believe this is expected. If you see any bad effects other than the curl command failing, let us know.
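
As a rough illustration of the point in Comment 5 about passing the correct CA bundle, the sketch below assembles a curl command that verifies the metrics endpoint against the service CA instead of using `-k`. Several things here are assumptions rather than details from this bug: the CA path (where OpenShift typically mounts the service CA into pods), the helper name `metrics_curl_cmd`, and the `--noproxy "*"` flag, which is included only in case proxy variables inherited by the pod interfere with the request. Verification would also need a URL that uses the service DNS name covered by the serving certificate rather than the pod IP.

```shell
# Sketch only: the CA path below is an assumption, not taken from this bug.
# It is where OpenShift usually mounts the service CA bundle inside a pod.
SERVICE_CA=${SERVICE_CA:-/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt}

# Hypothetical helper: print the curl command line to run inside the pod.
#   $1 = bearer token
#   $2 = metrics URL (should use the service DNS name, not the pod IP)
metrics_curl_cmd() {
  # --cacert verifies the server against the service CA instead of skipping checks.
  # --noproxy "*" makes curl ignore any proxy environment variables (assumption:
  # the pod may inherit them from the cluster proxy configuration).
  printf 'curl --silent --noproxy "*" --cacert %s -H "Authorization: Bearer %s" %s\n' \
    "$SERVICE_CA" "$1" "$2"
}
```

It might be used as `oc exec <pod> -- sh -c "$(metrics_curl_cmd "$token" "$METRICS_URL")"`, where `METRICS_URL` is a placeholder for the service URL of the metrics endpoint.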
