Bug 2002868 - Node exporter not able to scrape OVS metrics
Summary: Node exporter not able to scrape OVS metrics
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.11.z
Assignee: Dan Williams
QA Contact: Anurag Saxena
URL:
Whiteboard: perfscale-ovn
Depends On:
Blocks:
Reported: 2021-09-09 21:43 UTC by Mohit Sheth
Modified: 2022-08-10 10:37 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:37:25 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1393 | 0 | None | Merged | Bug 2002868: ovnkube: export OVS metrics along with OVN metrics | 2022-07-11 20:36:24 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:37:45 UTC

Description Mohit Sheth 2021-09-09 21:43:00 UTC
Description of problem:
Earlier we were able to scrape OVS metrics such as CPU and memory utilization from Prometheus. With the recent change, OVS runs on the host itself and these metrics are no longer exported.

Expected results: Able to get the OVS metrics to Prometheus

Comment 1 Joe Talerico 2021-09-10 15:13:28 UTC
Since in previous versions we had OVS metrics, we should consider this a regression.

Comment 2 Dan Williams 2021-09-10 18:10:51 UTC
Note that this is not specific to ovnkube, but applies to SDN too.

Comment 4 Dan Williams 2021-12-18 16:27:21 UTC
Part of the ovn-kubernetes solution is https://github.com/ovn-org/ovn-kubernetes/pull/2723 . We still need to work out with upstream whether we enable the metrics as part of ovnkube-node (which we already expose with kube-rbac-proxy) or as part of another container running the standalone upstream metrics executable.

It could be a different daemonset that both SDN and OVN can use, but that's a lot more work/book-keeping (different image too), and the code isn't huge, so it can be duplicated for SDN as well.

Comment 9 Dan Williams 2022-07-11 20:37:18 UTC
This was enabled for ovn-kubernetes by https://github.com/openshift/cluster-network-operator/pull/1393 and the core functionality was present in ovnkube since the beginning of 4.11.

Comment 10 Dan Williams 2022-07-11 20:38:34 UTC
To be clear, this has been enabled & available in 4.11 ovnkube/CNO since late April 2022. Just missed the tie between PR and bug.

Comment 13 Weibin Liang 2022-07-20 15:08:18 UTC
Tested and verified in 4.11.0-rc.4

sh-4.4# curl 127.0.0.1:29105/metrics | grep -E 'process_virtual_memory_bytes|process_cpu_seconds_total|process_virtual_memory_max_bytes|process_resident_memory_bytes'
# HELP ovs_db_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_db_process_cpu_seconds_total counter
ovs_db_process_cpu_seconds_total 1.63
# HELP ovs_db_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_db_process_resident_memory_bytes gauge
ovs_db_process_resident_memory_bytes 3.2751616e+07
# HELP ovs_db_process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE ovs_db_process_virtual_memory_bytes gauge
ovs_db_process_virtual_memory_bytes 9.6464896e+07
# HELP ovs_db_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE ovs_db_process_virtual_memory_max_bytes gauge
ovs_db_process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP ovs_vswitchd_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_vswitchd_process_cpu_seconds_total counter
ovs_vswitchd_process_cpu_seconds_total 25.24
# HELP ovs_vswitchd_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_vswitchd_process_resident_memory_bytes gauge
ovs_vswitchd_process_resident_memory_bytes 1.95637248e+08
# HELP ovs_vswitchd_process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE ovs_vswitchd_process_virtual_memory_bytes gauge
ovs_vswitchd_process_virtual_memory_bytes 7.06281472e+08
# HELP ovs_vswitchd_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE ovs_vswitchd_process_virtual_memory_max_bytes gauge
ovs_vswitchd_process_virtual_memory_max_bytes 1.8446744073709552e+19
sh-4.4# exit
exit
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.4   True        False         24m     Cluster version is 4.11.0-rc.4
[weliang@weliang ~]$
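
For anyone who wants to script this check rather than eyeball the grep output, here is a minimal Python sketch. It assumes the endpoint returns plain Prometheus text exposition format without labels on these series, as in the output above. The function name parse_ovs_process_metrics and the SAMPLE text are hypothetical and for illustration only; the metric names are taken from the transcript.

```python
import re

# Sample text in Prometheus exposition format, copied from the output above.
SAMPLE = """\
# HELP ovs_db_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_db_process_cpu_seconds_total counter
ovs_db_process_cpu_seconds_total 1.63
# HELP ovs_vswitchd_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_vswitchd_process_resident_memory_bytes gauge
ovs_vswitchd_process_resident_memory_bytes 1.95637248e+08
"""

# Metric names seen in the verification output (ovsdb-server and ovs-vswitchd).
WANTED = re.compile(
    r"^ovs_(db|vswitchd)_process_"
    r"(cpu_seconds_total|resident_memory_bytes"
    r"|virtual_memory_bytes|virtual_memory_max_bytes)$"
)


def parse_ovs_process_metrics(text):
    """Return {metric_name: value} for the OVS process metrics in `text`.

    Hypothetical helper: handles only unlabeled samples of the form
    `name value`, which is what the transcript shows.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, value = parts[0], parts[1]
        if WANTED.match(name):
            metrics[name] = float(value)
    return metrics


if __name__ == "__main__":
    for name, value in sorted(parse_ovs_process_metrics(SAMPLE).items()):
        print(name, value)
```

To run it against a live node, the text could be fetched from 127.0.0.1:29105/metrics (the address used in the curl command above) and passed to the function. An empty result would indicate that the OVS process metrics are not being exported.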

Comment 15 errata-xmlrpc 2022-08-10 10:37:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

