Bug 1986061 - cluster network operator deploys a service monitor which is never picked up by cluster monitoring operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: cstabler
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-26 15:35 UTC by Simon Pasquier
Modified: 2021-10-18 17:42 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: CNO deploys a service monitor for the network-check-source service without the correct annotations and RBAC needed for Prometheus to discover it.
Consequence: The service and its metrics never show up in Prometheus.
Fix: Add the correct annotations to the namespace of the network-check-source service and add the missing RBAC.
Result: Metrics of the network-check-source service are scraped by Prometheus.
Clone Of:
Environment:
Last Closed: 2021-10-18 17:41:12 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1190 | 0 | None | None | None | 2021-08-27 08:35:52 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:42:29 UTC

Description Simon Pasquier 2021-07-26 15:35:45 UTC
Description of problem:
The cluster network operator deploys a service monitor named "network-check-source" [1] that is never picked up by the cluster monitoring operator. The service monitor lives in the "openshift-network-diagnostics" namespace [2], which does not have the 'openshift.io/cluster-monitoring="true"' label.
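
To make the selection rule concrete, here is a minimal sketch of the namespace check described above. It is a simplified model, not the cluster monitoring operator's code: the function name is hypothetical, and it assumes that pickup depends only on the namespace carrying the label with the value "true".

```python
from typing import Mapping

# Label named in this report; everything else below is illustrative.
CLUSTER_MONITORING_LABEL = "openshift.io/cluster-monitoring"


def picked_up_by_cluster_monitoring(namespace_labels: Mapping[str, str]) -> bool:
    """Return True if the namespace carries the label with the value "true"."""
    return namespace_labels.get(CLUSTER_MONITORING_LABEL) == "true"


# Namespace as described in this report: the label is missing.
labels_without = {}
# Hypothetical labels once the label has been added to the namespace.
labels_with = {CLUSTER_MONITORING_LABEL: "true"}

print(picked_up_by_cluster_monitoring(labels_without))  # False
print(picked_up_by_cluster_monitoring(labels_with))     # True
```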

Version-Release number of selected component (if applicable):
4.7 (probably)

How reproducible:
Always

Steps to Reproduce:
1. Spin up a 4.7 cluster.
2. Query the "pod_network_connectivity_check_count" metric [3] in Prometheus.

Actual results:
No result returned.

Expected results:
Some result returned.
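
The check in step 2 can also be scripted against the Prometheus HTTP API instant-query endpoint (/api/v1/query). The sketch below only builds the query URL and inspects a response body; the host name and helper names are hypothetical, and authentication (a bearer token is typically required on OpenShift) is left out.

```python
import json
from urllib.parse import urlencode

METRIC = "pod_network_connectivity_check_count"


def build_query_url(base_url: str, metric: str = METRIC) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': metric})}"


def has_samples(response_body: str) -> bool:
    """Return True if an instant-query response contains at least one series."""
    payload = json.loads(response_body)
    return payload.get("status") == "success" and bool(payload["data"]["result"])


# Hypothetical host, used for illustration only.
print(build_query_url("https://prometheus.example.com"))

# Shape of an empty instant-query result, matching "No result returned."
empty = '{"status": "success", "data": {"resultType": "vector", "result": []}}'
print(has_samples(empty))  # False
```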

Additional info:

[1] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/bindata/network-diagnostics/network-check-source.yaml#L80-L101
[2] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/bindata/network-diagnostics/000-ns.yaml#L1-L7
[3] https://github.com/openshift/cluster-network-operator/blob/7498fab6bdc2b872c2f09e9a35eb19794a4c0fd7/pkg/cmd/checkendpoints/controller/metrics.go#L22-L25

Comment 1 Simon Pasquier 2021-07-26 15:55:36 UTC
Additional information: I noticed this because the service monitor starts to show up in the user-workload Prometheus once user-workload monitoring is enabled.

Comment 5 zhaozhanqi 2021-09-06 08:56:27 UTC
Verified this bug on 4.9.0-0.nightly-2021-09-05-204238.

Checked "pod_network_connectivity_check_count" in the console under Observe -> Metrics. Data is returned.

Comment 8 errata-xmlrpc 2021-10-18 17:41:12 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

