Description of problem:
Creating a second ingresscontroller causes Prometheus to try to scrape the routers using the wrong certificate.

Version-Release number of selected component (if applicable):
OpenShift 4.1

How reproducible:
Always

Steps to Reproduce:
1. Create a second ingresscontroller in the namespace openshift-ingress-operator
2. Check the openshift-ingress/<router>/0 jobs in Grafana

Actual results:
Both jobs scrape all the router pods in the openshift-ingress namespace, causing TLS errors when querying the endpoints of the other router's pods. In other words, openshift-ingress/router-default/0 shows the router-default pods as UP and the router-custom pods as DOWN, while openshift-ingress/router-custom/0 shows the router-custom pods as UP and the router-default pods as DOWN.

Expected results:
Each job only scrapes the pods of the ingresscontroller it is monitoring.

Additional info:
My understanding is there should be a keep relabel rule with a regex based on __meta_kubernetes_endpoints_name. These are the scrape jobs generated; the openshift-ingress/router-custom/0 job is identical to the router-default one below except for job_name and server_name (router-internal-dmz.openshift-ingress.svc instead of router-internal-default.openshift-ingress.svc):

- job_name: openshift-ingress/router-default/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - openshift-ingress
  scrape_interval: 30s
  metrics_path: /metrics
  scheme: https
  tls_config:
    insecure_skip_verify: false
    ca_file: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
    server_name: router-internal-default.openshift-ingress.svc
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: metrics
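To illustrate what I mean, here is a sketch of the kind of rule I would expect in each job. This is only an assumption on my part, not a proposed patch; the rule shape and the assumption that the endpoints object shares its service's name (router-internal-default for the default ingresscontroller) are mine:

<---snip--->
  relabel_configs:
  # Hypothetical extra rule: keep only targets discovered from this job's own
  # endpoints object, so router-default/0 never scrapes the other router's pods.
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoints_name
    regex: router-internal-default
<---snip--->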
This issue has been fixed in 4.2.0 by PR https://github.com/openshift/cluster-ingress-operator/pull/242. Checked with 4.2.0-0.nightly-2019-06-30-221852; the lines below were added to the job (taking router-default as an example):
<---snip--->
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_label_ingresscontroller_operator_openshift_io_owning_ingresscontroller]
      separator: ;
      regex: default
      replacement: $1
      action: keep
<---snip--->
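Presumably the job for the second ingresscontroller carries the same keep rule with its own name in the regex. A sketch only, assuming the custom ingresscontroller from the original report is named dmz (inferred from its server_name, router-internal-dmz.openshift-ingress.svc):
<---snip--->
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_label_ingresscontroller_operator_openshift_io_owning_ingresscontroller]
      separator: ;
      regex: dmz    # assumed name of the custom ingresscontroller from the report
      replacement: $1
      action: keep
<---snip--->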
The bug has been verified according to Comment 1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922