Bug 1986061 - cluster network operator deploys a service monitor which is never picked up by cluster monitoring operator
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.9.0
Assignee: Christoph Stäbler
QA Contact: zhaozhanqi
Duplicates: 2027290
Depends On:
Reported: 2021-07-26 15:35 UTC by Simon Pasquier
Modified: 2021-11-29 13:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: CNO deploys a service monitor for the network-check-source service without the annotations and RBAC needed for Prometheus to discover it.
Consequence: The service and its metrics never show up in Prometheus.
Fix: Add the correct annotations to the namespace of the network-check-source service and add the missing RBAC.
Result: Prometheus scrapes the metrics of the network-check-source service.
Clone Of:
Last Closed: 2021-10-18 17:41:12 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1190 | 0 | None | None | None | 2021-08-27 08:35:52 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:42:29 UTC

Description Simon Pasquier 2021-07-26 15:35:45 UTC
Description of problem:
The cluster network operator deploys a service monitor named "network-check-source" [1] which is never picked up by the cluster monitoring operator. The service monitor lives in the "openshift-network-diagnostics" namespace [2], which doesn't have the 'openshift.io/cluster-monitoring="true"' label.
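
To make the condition concrete, here is a minimal Python sketch of the label check described above. It is a simplified model, not the cluster monitoring operator's actual code: the function name and the sample label sets are hypothetical, and only the label key and value come from this report.

```python
# Simplified model of the namespace selection described in this bug.
# Assumption: a namespace is considered for cluster monitoring only when it
# carries the label openshift.io/cluster-monitoring with the value "true".
MONITORING_LABEL = "openshift.io/cluster-monitoring"


def picked_up_by_cluster_monitoring(namespace_labels):
    """Return True if the namespace labels include the monitoring label set to "true"."""
    return namespace_labels.get(MONITORING_LABEL) == "true"


# Hypothetical label sets, for illustration only.
labels_without_monitoring = {"name": "openshift-network-diagnostics"}
labels_with_monitoring = {
    "name": "openshift-network-diagnostics",
    MONITORING_LABEL: "true",
}

print(picked_up_by_cluster_monitoring(labels_without_monitoring))
print(picked_up_by_cluster_monitoring(labels_with_monitoring))
```

Under this model, a service monitor in a namespace without the label is ignored no matter how the service monitor itself is defined.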

Version-Release number of selected component (if applicable):
4.7 (probably)

How reproducible:

Steps to Reproduce:
1. Spin up a 4.7 cluster.
2. Query the "pod_network_connectivity_check_count" metric [3] in Prometheus.

Actual results:
No result returned.

Expected results:
Some result returned.
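
The check in step 2 can also be scripted against the Prometheus HTTP API. The sketch below only builds the instant-query URL and inspects a response body; it makes no network call. The base URL and helper names are hypothetical, and a real cluster would additionally need a route to Prometheus and a bearer token, which are not covered here.

```python
import json
from urllib.parse import urlencode

METRIC = "pod_network_connectivity_check_count"


def build_query_url(base_url, metric=METRIC):
    """Build an instant-query URL for the Prometheus /api/v1/query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': metric})}"


def has_samples(response_body):
    """Return True if a Prometheus query response contains at least one series."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    return len(payload.get("data", {}).get("result", [])) > 0


# Hypothetical base URL; an empty vector result mirrors "No result returned".
url = build_query_url("https://prometheus.example.invalid")
empty_response = '{"status": "success", "data": {"resultType": "vector", "result": []}}'

print(url)
print(has_samples(empty_response))
```

An empty "result" list corresponds to the actual results reported here; a non-empty list corresponds to the expected results.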

Additional info:

[1] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/bindata/network-diagnostics/network-check-source.yaml#L80-L101
[2] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/bindata/network-diagnostics/000-ns.yaml#L1-L7
[3] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/pkg/cmd/checkendpoints/controller/metrics.go#L22-L25

Comment 1 Simon Pasquier 2021-07-26 15:55:36 UTC
Additional information: I noticed this because the service monitor starts to show up in the user-workload Prometheus once user-workload monitoring is enabled.

Comment 5 zhaozhanqi 2021-09-06 08:56:27 UTC
Verified this bug on 4.9.0-0.nightly-2021-09-05-204238

Checked "pod_network_connectivity_check_count" in the console under Observe -> Metrics.

Data can be found.

Comment 8 errata-xmlrpc 2021-10-18 17:41:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.


Comment 9 Simon Pasquier 2021-11-29 10:56:30 UTC
*** Bug 2027290 has been marked as a duplicate of this bug. ***
