
Bug 2002868

Summary:	Node exporter not able to scrape OVS metrics
Product:	OpenShift Container Platform	Reporter:	Mohit Sheth <msheth>
Component:	Networking	Assignee:	Dan Williams <dcbw>
Networking sub component:	ovn-kubernetes	QA Contact:	Anurag saxena <anusaxen>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	spasquie, trozet, weliang, zzhao
Version:	4.9
Target Milestone:	---
Target Release:	4.11.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:	perfscale-ovn
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-08-10 10:37:25 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Mohit Sheth 2021-09-09 21:43:00 UTC

Description of problem:
Previously we were able to scrape OVS metrics such as CPU and memory utilization from Prometheus. With the recent change, OVS runs on the host itself and the metrics are no longer exported.

Expected results: Able to get the OVS metrics to Prometheus

Comment 1 Joe Talerico 2021-09-10 15:13:28 UTC

Since in previous versions we had OVS metrics, we should consider this a regression.

Comment 2 Dan Williams 2021-09-10 18:10:51 UTC

Note that this is not specific to ovnkube, but applies to SDN too.

Comment 4 Dan Williams 2021-12-18 16:27:21 UTC

Part of the ovn-kubernetes solution is https://github.com/ovn-org/ovn-kubernetes/pull/2723 . We still need to work out with upstream whether we enable the metrics as part of ovnkube-node (which we already expose with kube-rbac-proxy) or whether it's done as part of another container with the standalone upstream metrics executable.

It could be a different daemonset that both SDN and OVN can use, but that's a lot more work/book-keeping (different image too) and the code isn't huge, so can be duplicated for SDN as well.

Comment 9 Dan Williams 2022-07-11 20:37:18 UTC

This was enabled for ovn-kubernetes by https://github.com/openshift/cluster-network-operator/pull/1393 and the core functionality was present in ovnkube since the beginning of 4.11.

Comment 10 Dan Williams 2022-07-11 20:38:34 UTC

To be clear, this has been enabled & available in 4.11 ovnkube/CNO since late April 2022. Just missed the tie between PR and bug.

Comment 13 Weibin Liang 2022-07-20 15:08:18 UTC

Tested and verified in 4.11.0-rc.4

sh-4.4# curl 127.0.0.1:29105/metrics | grep -E 'process_virtual_memory_bytes|process_cpu_seconds_total|process_virtual_memory_max_bytes|process_resident_memory_bytes'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 31380    0 31380    0     0   747k      0 --:--:-- --:--:-- --:--:--  766k
# HELP ovs_db_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_db_process_cpu_seconds_total counter
ovs_db_process_cpu_seconds_total 1.63
# HELP ovs_db_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_db_process_resident_memory_bytes gauge
ovs_db_process_resident_memory_bytes 3.2751616e+07
# HELP ovs_db_process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE ovs_db_process_virtual_memory_bytes gauge
ovs_db_process_virtual_memory_bytes 9.6464896e+07
# HELP ovs_db_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE ovs_db_process_virtual_memory_max_bytes gauge
ovs_db_process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP ovs_vswitchd_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_vswitchd_process_cpu_seconds_total counter
ovs_vswitchd_process_cpu_seconds_total 25.24
# HELP ovs_vswitchd_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_vswitchd_process_resident_memory_bytes gauge
ovs_vswitchd_process_resident_memory_bytes 1.95637248e+08
# HELP ovs_vswitchd_process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE ovs_vswitchd_process_virtual_memory_bytes gauge
ovs_vswitchd_process_virtual_memory_bytes 7.06281472e+08
# HELP ovs_vswitchd_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE ovs_vswitchd_process_virtual_memory_max_bytes gauge
ovs_vswitchd_process_virtual_memory_max_bytes 1.8446744073709552e+19
sh-4.4# exit
exit
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.4   True        False         24m     Cluster version is 4.11.0-rc.4
[weliang@weliang ~]$
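
For anyone repeating the check above, the following is a minimal sketch of a filter that keeps only the sample lines (not the # HELP / # TYPE comments) for the OVS per-process CPU and memory metrics. The function name ovs_process_metrics is hypothetical and not part of any OpenShift or OVN tooling; the metric names and the 127.0.0.1:29105 endpoint are taken from the verification output in this comment and may differ on other releases.

```shell
#!/bin/sh
# Hypothetical helper (not part of any shipped tool): read Prometheus text
# exposition output on stdin and print only the sample lines for the
# ovs_db_* and ovs_vswitchd_* process CPU/memory metrics.
ovs_process_metrics() {
    # Anchor at line start so "# HELP" and "# TYPE" comment lines are dropped;
    # the trailing space ensures the full metric name is matched.
    grep -E '^ovs_(db|vswitchd)_process_(cpu_seconds_total|resident_memory_bytes|virtual_memory_bytes|virtual_memory_max_bytes) '
}

# Example invocation on a node (assumes the endpoint used in this comment).
# curl's -s flag silences the progress meter so it cannot interleave with
# the metric lines on the terminal:
#   curl -s 127.0.0.1:29105/metrics | ovs_process_metrics
```

Run against the output shown above, this should print one line per metric for each of the two processes, for example the ovs_db_process_cpu_seconds_total and ovs_vswitchd_process_cpu_seconds_total samples.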

Comment 15 errata-xmlrpc 2022-08-10 10:37:25 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069