Description of problem:
Creating a second ingresscontroller causes Prometheus to try to scrape the routers using the wrong certificate.

Version-Release number of selected component (if applicable):
OpenShift 4.1

How reproducible:
Always

Steps to Reproduce:
1. Create a second ingresscontroller in the namespace openshift-ingress-operator
2. Check the openshift-ingress/<router>/0 jobs in Grafana

Actual results:
Both jobs scrape all the router pods in the openshift-ingress namespace, causing TLS errors when querying the endpoints of the other router's pods. In other words, openshift-ingress/router-default/0 shows the router-default pods as UP and the router-custom pods as DOWN, while openshift-ingress/router-custom/0 shows the router-custom pods as UP and the router-default pods as DOWN.

Expected results:
Each job only scrapes the pods of the ingresscontroller it is monitoring.

Additional info:
My understanding is there should be a keep relabel rule with a regex based on __meta_kubernetes_endpoints_name. These are the scrape jobs generated; the openshift-ingress/router-custom/0 job is identical to the router-default one below except for job_name and server_name (router-internal-dmz.openshift-ingress.svc instead of router-internal-default.openshift-ingress.svc):

- job_name: openshift-ingress/router-default/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - openshift-ingress
  scrape_interval: 30s
  metrics_path: /metrics
  scheme: https
  tls_config:
    insecure_skip_verify: false
    ca_file: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
    server_name: router-internal-default.openshift-ingress.svc
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: metrics
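To illustrate what I mean, here is a sketch of the kind of rule I would expect in each job. This is only an assumption on my part, not a proposed patch; the rule shape and the assumption that the endpoints object shares its service's name (router-internal-default for the default ingresscontroller) are mine:

<---snip--->
  relabel_configs:
  # Hypothetical extra rule: keep only targets discovered from this job's own
  # endpoints object, so router-default/0 never scrapes the other router's pods.
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoints_name
    regex: router-internal-default
<---snip--->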
This issue has been fixed in 4.2.0 by PR https://github.com/openshift/cluster-ingress-operator/pull/242. Checked with 4.2.0-0.nightly-2019-06-30-221852; the lines below were added to the job (taking router-default as an example):
<---snip--->
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_label_ingresscontroller_operator_openshift_io_owning_ingresscontroller]
      separator: ;
      regex: default
      replacement: $1
      action: keep
<---snip--->
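Presumably the job for the second ingresscontroller carries the same keep rule with its own name in the regex. A sketch only, assuming the custom ingresscontroller from the original report is named dmz (inferred from its server_name, router-internal-dmz.openshift-ingress.svc):
<---snip--->
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_label_ingresscontroller_operator_openshift_io_owning_ingresscontroller]
      separator: ;
      regex: dmz    # assumed name of the custom ingresscontroller from the report
      replacement: $1
      action: keep
<---snip--->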
The bug has been verified according to Comment 1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922