Bug 1687640 - router metrics with monitoring integration does not work
Summary: router metrics with monitoring integration does not work
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Mace
QA Contact: Hongan Li
Depends On:
Reported: 2019-03-12 02:46 UTC by Hongan Li
Modified: 2022-08-04 22:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The kube client's REST scheme/mapper is cached before the ServiceMonitor CRD is registered.
Consequence: Creating the ServiceMonitor for router metrics fails because the client returns a GVK error.
Fix: The kube client is regenerated when a NoMatch error is returned for ServiceMonitor.
Result: Router metrics are reported correctly.
Clone Of:
Last Closed: 2019-06-04 10:45:33 UTC
Target Upstream Version:


System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758  0        None      None    None     2019-06-04 10:45:41 UTC

Description Hongan Li 2019-03-12 02:46:29 UTC
Description of problem:
Router metrics with monitoring integration do not work.
There are no servicemonitor resources in the openshift-ingress namespace.
No router metrics appear in the Prometheus UI.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-03-06-074438   True        False         19h     Cluster version is 4.0.0-0.nightly-2019-03-06-074438

How reproducible:

Steps to Reproduce:
1. Install a 4.0 cluster on AWS.
2. Run: oc get servicemonitor -n openshift-ingress
3. Log in to the Prometheus UI and go to the Status > Targets page to check the router targets.
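
Step 3 can also be checked without the UI. The following is a minimal sketch, assuming the JSON body of the Prometheus HTTP API endpoint /api/v1/targets has already been fetched and parsed, and assuming the router targets carry a `namespace` label equal to openshift-ingress (typical for ServiceMonitor-based scraping, but not confirmed in this report). The function name and the sample payload are hypothetical.

```python
def router_targets(targets_response, namespace="openshift-ingress"):
    """Return (scrapeUrl, health) pairs for active targets in the given namespace.

    `targets_response` is the parsed JSON body of GET /api/v1/targets.
    """
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl"), t.get("health"))
        for t in active
        if t.get("labels", {}).get("namespace") == namespace
    ]


# Hypothetical payload for illustration only; addresses and labels are made up.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"namespace": "openshift-ingress", "job": "router-internal-default"},
                "scrapeUrl": "https://10.128.0.5:1936/metrics",
                "health": "up",
            },
            {
                "labels": {"namespace": "openshift-monitoring", "job": "prometheus-k8s"},
                "scrapeUrl": "https://10.128.0.9:9091/metrics",
                "health": "up",
            },
        ]
    },
}
```

An empty result from `router_targets` would correspond to the "Actual results" reported here: no router targets are being scraped.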

Actual results:
There are no servicemonitor resources in the openshift-ingress namespace.
No router metrics appear in the Prometheus UI.

Expected results:
The user should be able to view the router metrics in the Prometheus UI.

Additional info:
It worked after https://github.com/openshift/cluster-ingress-operator/pull/108 merged, but it does not work in the latest build.

Comment 2 Ravi Sankar 2019-03-19 22:33:37 UTC
Fixed by https://github.com/openshift/cluster-ingress-operator/pull/166
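
The Doc Text describes the fix as regenerating the kube client when a NoMatch error is returned for ServiceMonitor. The following is a simplified, standalone sketch of that retry pattern, not the code from the pull request. All class and method names are hypothetical stand-ins, and the "cluster" is an in-memory object rather than a real API server.

```python
class NoMatchError(Exception):
    """Raised when the client's cached mapping has no entry for a kind."""


class Cluster:
    """Hypothetical stand-in for the API server and its registered kinds."""

    def __init__(self):
        self.kinds = {"Deployment", "Service"}
        self.objects = []


class CachedClient:
    """Client that snapshots the known kinds once, at construction time."""

    def __init__(self, cluster):
        self.cluster = cluster
        # This snapshot goes stale if a CRD is registered after construction.
        self.known_kinds = set(cluster.kinds)

    def create(self, kind, name):
        if kind not in self.known_kinds:
            raise NoMatchError(f"no matches for kind {kind!r}")
        self.cluster.objects.append((kind, name))


class Reconciler:
    """Creates the ServiceMonitor, rebuilding the client on a NoMatch error."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.client = CachedClient(cluster)

    def ensure_service_monitor(self, name):
        try:
            self.client.create("ServiceMonitor", name)
        except NoMatchError:
            # Rebuild the client so it sees kinds registered after startup,
            # then retry once. A second NoMatchError propagates to the caller.
            self.client = CachedClient(self.cluster)
            self.client.create("ServiceMonitor", name)
```

In this sketch, a reconciler built before the ServiceMonitor kind is registered still succeeds once the kind appears, because the stale client is replaced on the first failure. If the kind is never registered, the error is raised to the caller as before.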

Comment 3 Hongan Li 2019-03-25 02:19:42 UTC
Verified with 4.0.0-0.nightly-2019-03-23-222829; the issue has been fixed. The router metrics are shown in the Prometheus UI.

$ oc get servicemonitors.monitoring.coreos.com/router-default -n openshift-ingress -o yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: 2019-03-25T00:29:31Z
  generation: 1
  name: router-default
  namespace: openshift-ingress
  ownerReferences:
  - apiVersion: apps/v1
    controller: true
    kind: Deployment
    name: router-default
    uid: 06c11402-4e95-11e9-b378-125857bf72fa
  resourceVersion: "10253"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/openshift-ingress/servicemonitors/router-default
  uid: 0e856e06-4e95-11e9-b378-125857bf72fa
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    path: /metrics
    port: metrics
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      serverName: router-internal-default.openshift-ingress.svc
  namespaceSelector:
    matchNames:
    - openshift-ingress
  selector: {}

Comment 6 errata-xmlrpc 2019-06-04 10:45:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

