Bug 1679922

Summary: "x509: certificate is valid for localhost" for openshift-controller-manager-operator and kube-controller-manager-operator targets
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Master
Assignee: Michal Fojtik <mfojtik>
Status: CLOSED DUPLICATE
QA Contact: Xingxing Xia <xxia>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, jokerman, minden, mloibl, mmccomas, surbania
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-02-25 00:54:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Junqi Zhao 2019-02-22 08:57:22 UTC
Description of problem:
Cloned from https://jira.coreos.com/browse/MON-587
Prometheus reports "x509: certificate is valid for localhost" for the openshift-controller-manager-operator and kube-controller-manager-operator targets. The metrics can still be fetched directly from each endpoint with curl -k (which skips certificate verification); the error appears only in /targets, see below

                "discoveredLabels": {
                    "__address__": "10.128.0.22:8443",
                    "__meta_kubernetes_endpoint_address_target_kind": "Pod",
                    "__meta_kubernetes_endpoint_address_target_name": "openshift-controller-manager-operator-76f66d9b65-45vgn",
                    "__meta_kubernetes_endpoint_port_name": "https",
                    "__meta_kubernetes_endpoint_port_protocol": "TCP",
                    "__meta_kubernetes_endpoint_ready": "true",
                    "__meta_kubernetes_endpoints_name": "metrics",
                    "__meta_kubernetes_namespace": "openshift-controller-manager-operator",
                    "__meta_kubernetes_pod_annotation_k8s_v1_cni_cncf_io_networks_status": "[{\n    \"name\": \"openshift-sdn\",\n    \"ips\": [\n        \"10.128.0.22\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]",
                    "__meta_kubernetes_pod_container_name": "operator",
                    "__meta_kubernetes_pod_container_port_name": "metrics",
                    "__meta_kubernetes_pod_container_port_number": "8443",
                    "__meta_kubernetes_pod_container_port_protocol": "TCP",
                    "__meta_kubernetes_pod_controller_kind": "ReplicaSet",
                    "__meta_kubernetes_pod_controller_name": "openshift-controller-manager-operator-76f66d9b65",
                    "__meta_kubernetes_pod_host_ip": "10.0.140.52",
                    "__meta_kubernetes_pod_ip": "10.128.0.22",
                    "__meta_kubernetes_pod_label_app": "openshift-controller-manager-operator",
                    "__meta_kubernetes_pod_label_pod_template_hash": "76f66d9b65",
                    "__meta_kubernetes_pod_name": "openshift-controller-manager-operator-76f66d9b65-45vgn",
                    "__meta_kubernetes_pod_node_name": "ip-10-0-140-52.us-east-2.compute.internal",
                    "__meta_kubernetes_pod_phase": "Running",
                    "__meta_kubernetes_pod_ready": "true",
                    "__meta_kubernetes_pod_uid": "3c501b7a-365c-11e9-9033-0287d15921aa",
                    "__meta_kubernetes_service_annotation_service_alpha_openshift_io_serving_cert_secret_name": "openshift-controller-manager-operator-serving-cert",
                    "__meta_kubernetes_service_annotation_service_alpha_openshift_io_serving_cert_signed_by": "openshift-service-serving-signer@1550809822",
                    "__meta_kubernetes_service_label_app": "openshift-controller-manager-operator",
                    "__meta_kubernetes_service_name": "metrics",
                    "__metrics_path__": "/metrics",
                    "__scheme__": "https",
                    "job": "openshift-controller-manager-operator/openshift-controller-manager-operator/0"
                },                
                "health": "down",
                "labels": {
                    "endpoint": "https",
                    "instance": "10.128.0.22:8443",
                    "job": "metrics",
                    "namespace": "openshift-controller-manager-operator",
                    "pod": "openshift-controller-manager-operator-76f66d9b65-45vgn",
                    "service": "metrics"
                },
                "lastError": "Get https://10.128.0.22:8443/metrics: x509: certificate is valid for localhost, not metrics.openshift-controller-manager-operator.svc",
                "lastScrape": "2019-02-22T07:52:51.626332327Z",
                "scrapeUrl": "https://10.128.0.22:8443/metrics"
****************************************************************************************************************
                "discoveredLabels": {
                    "__address__": "10.128.0.6:8443",
                    "__meta_kubernetes_endpoint_address_target_kind": "Pod",
                    "__meta_kubernetes_endpoint_address_target_name": "kube-controller-manager-operator-54f59bbf4f-29dnr",
                    "__meta_kubernetes_endpoint_port_name": "https",
                    "__meta_kubernetes_endpoint_port_protocol": "TCP",
                    "__meta_kubernetes_endpoint_ready": "true",
                    "__meta_kubernetes_endpoints_name": "metrics",
                    "__meta_kubernetes_namespace": "openshift-kube-controller-manager-operator",
                    "__meta_kubernetes_pod_annotation_k8s_v1_cni_cncf_io_networks_status": "[{\n    \"name\": \"openshift-sdn\",\n    \"ips\": [\n        \"10.128.0.6\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]",
                    "__meta_kubernetes_pod_container_name": "operator",
                    "__meta_kubernetes_pod_container_port_name": "metrics",
                    "__meta_kubernetes_pod_container_port_number": "8443",
                    "__meta_kubernetes_pod_container_port_protocol": "TCP",
                    "__meta_kubernetes_pod_controller_kind": "ReplicaSet",
                    "__meta_kubernetes_pod_controller_name": "kube-controller-manager-operator-54f59bbf4f",
                    "__meta_kubernetes_pod_host_ip": "10.0.140.52",
                    "__meta_kubernetes_pod_ip": "10.128.0.6",
                    "__meta_kubernetes_pod_label_app": "kube-controller-manager-operator",
                    "__meta_kubernetes_pod_label_pod_template_hash": "54f59bbf4f",
                    "__meta_kubernetes_pod_name": "kube-controller-manager-operator-54f59bbf4f-29dnr",
                    "__meta_kubernetes_pod_node_name": "ip-10-0-140-52.us-east-2.compute.internal",
                    "__meta_kubernetes_pod_phase": "Running",
                    "__meta_kubernetes_pod_ready": "true",
                    "__meta_kubernetes_pod_uid": "4f98f87b-365b-11e9-8b9a-0289e5c3d940",
                    "__meta_kubernetes_service_annotation_service_alpha_openshift_io_serving_cert_secret_name": "kube-controller-manager-operator-serving-cert",
                    "__meta_kubernetes_service_annotation_service_alpha_openshift_io_serving_cert_signed_by": "openshift-service-serving-signer@1550809822",
                    "__meta_kubernetes_service_label_app": "kube-controller-manager-operator",
                    "__meta_kubernetes_service_name": "metrics",
                    "__metrics_path__": "/metrics",
                    "__scheme__": "https",
                    "job": "openshift-kube-controller-manager-operator/kube-controller-manager-operator/0"
                },
                "health": "down",
                "labels": {
                    "endpoint": "https",
                    "instance": "10.128.0.6:8443",
                    "job": "metrics",
                    "namespace": "openshift-kube-controller-manager-operator",
                    "pod": "kube-controller-manager-operator-54f59bbf4f-29dnr",
                    "service": "metrics"
                },
                "lastError": "Get https://10.128.0.6:8443/metrics: x509: certificate is valid for localhost, not metrics.openshift-kube-controller-manager-operator.svc",
                "lastScrape": "2019-02-22T07:52:46.648616909Z",
                "scrapeUrl": "https://10.128.0.6:8443/metrics"

$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://10.128.0.22:8443/metrics
# HELP ConfigObserver_adds Total number of adds handled by workqueue: ConfigObserver
# TYPE ConfigObserver_adds counter
ConfigObserver_adds 721
# HELP ConfigObserver_depth Current depth of workqueue: ConfigObserver
# TYPE ConfigObserver_depth gauge
ConfigObserver_depth 1
# HELP ConfigObserver_queue_latency How long an item stays in workqueueConfigObserver before being requested.
# TYPE ConfigObserver_queue_latency summary
ConfigObserver_queue_latency{quantile="0.5"} 1.8688621e+07
ConfigObserver_queue_latency{quantile="0.9"} 1.9818535e+07
ConfigObserver_queue_latency{quantile="0.99"} 1.9923432e+07
ConfigObserver_queue_latency_sum 1.3578328876e+10
ConfigObserver_queue_latency_count 720
# HELP ConfigObserver_retries Total number of retries handled by workqueue: ConfigObserver
# TYPE ConfigObserver_retries counter
ConfigObserver_retries 0
# HELP ConfigObserver_work_duration How long processing an item from workqueueConfigObserver takes.
# TYPE ConfigObserver_work_duration summary

$ curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://10.128.0.6:8443/metrics
# HELP BackingResourceController_adds Total number of adds handled by workqueue: BackingResourceController
# TYPE BackingResourceController_adds counter
BackingResourceController_adds 391
# HELP BackingResourceController_depth Current depth of workqueue: BackingResourceController
# TYPE BackingResourceController_depth gauge
BackingResourceController_depth 0
# HELP BackingResourceController_queue_latency How long an item stays in workqueueBackingResourceController before being requested.
# TYPE BackingResourceController_queue_latency summary
BackingResourceController_queue_latency{quantile="0.5"} 4
BackingResourceController_queue_latency{quantile="0.9"} 335286
BackingResourceController_queue_latency{quantile="0.99"} 335286
BackingResourceController_queue_latency_sum 4.2015996e+07
BackingResourceController_queue_latency_count 391
# HELP BackingResourceController_retries Total number of retries handled by workqueue: BackingResourceController
# TYPE BackingResourceController_retries counter
BackingResourceController_retries 0
# HELP BackingResourceController_work_duration How long processing an item from workqueueBackingResourceController takes.
# TYPE BackingResourceController_work_duration summary
BackingResourceController_work_duration{quantile="0.5"} 187615
BackingResourceController_work_duration{quantile="0.9"} 790087


Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-21-215247   True        False         4h4m    Cluster version is 4.0.0-0.nightly-2019-02-21-215247


How reproducible:
Frequently

Steps to Reproduce:
1. Check all targets via REST API
curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://{prometheus_ip}:9091/api/v1/targets | python -mjson.tool
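The check in step 1 can be narrowed to just the failing targets. The following is a minimal Python sketch, assuming the response uses the usual `data.activeTargets` layout of the Prometheus targets API and that the JSON has already been loaded into a dict. The function name `failing_x509_targets` and the healthy sample target are hypothetical; the failing entry is trimmed from the output in the description.

```python
def failing_x509_targets(payload):
    """Return (scrapeUrl, lastError) pairs for active targets that are
    down and whose last error mentions x509."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("lastError", ""))
        for t in targets
        if t.get("health") == "down" and "x509" in t.get("lastError", "")
    ]


# Sample payload: the first entry is trimmed from the report above,
# the second is a hypothetical healthy target added for contrast.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "scrapeUrl": "https://10.128.0.22:8443/metrics",
                "health": "down",
                "lastError": (
                    "Get https://10.128.0.22:8443/metrics: x509: certificate "
                    "is valid for localhost, not "
                    "metrics.openshift-controller-manager-operator.svc"
                ),
            },
            {
                "scrapeUrl": "https://10.128.0.99:8443/metrics",  # hypothetical
                "health": "up",
                "lastError": "",
            },
        ]
    },
}

for url, err in failing_x509_targets(sample):
    print(url, "->", err)
```

In practice the dict would come from `json.load` on the saved output of the curl command in step 1.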

Actual results:


Expected results:


Additional info:

Comment 1 Frederic Branczyk 2019-02-22 10:02:10 UTC
This is not a bug in monitoring; the openshift-controller-manager-operator and kube-controller-manager-operator targets are not presenting the expected certificates. Moving to the Master component.
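
To illustrate the mismatch, here is a minimal Python sketch, assuming a simplified name-matching rule (exact match, or a single leading wildcard label) rather than the full rules a TLS library applies. The function names and the second name list are hypothetical; only `localhost` and the expected service name are taken from the `lastError` messages in the description.

```python
def name_matches(pattern, hostname):
    """Simplified DNS-name match: exact, or '*.' covering one left-most label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        # Drop the left-most label of the hostname and compare the remainder.
        _, _, rest = hostname.partition(".")
        return rest != "" and rest == pattern[2:]
    return pattern == hostname


def cert_covers(cert_names, hostname):
    """True if any name on the certificate matches the expected hostname."""
    return any(name_matches(name, hostname) for name in cert_names)


expected = "metrics.openshift-controller-manager-operator.svc"

# Names as reported in lastError: the certificate only covers localhost.
print(cert_covers(["localhost"], expected))  # False

# Hypothetical certificate that carries the service DNS names instead.
print(cert_covers([expected, expected + ".cluster.local"], expected))  # True
```

The first call mirrors the failure Prometheus reports; the second shows the kind of name list that would satisfy the check.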

Comment 3 Junqi Zhao 2019-02-25 00:54:41 UTC

*** This bug has been marked as a duplicate of bug 1678929 ***