Bug 2002868
Summary: | Node exporter not able to scrape OVS metrics | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Mohit Sheth <msheth> |
Component: | Networking | Assignee: | Dan Williams <dcbw> |
Networking sub component: | ovn-kubernetes | QA Contact: | Anurag saxena <anusaxen> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | spasquie, trozet, weliang, zzhao |
Version: | 4.9 | ||
Target Milestone: | --- | ||
Target Release: | 4.11.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | perfscale-ovn | ||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-08-10 10:37:25 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Mohit Sheth
2021-09-09 21:43:00 UTC
Since in previous versions we had OVS metrics, we should consider this a regression.

Note that this is not specific to ovnkube; it applies to SDN too.

Part of the ovn-kubernetes solution is https://github.com/ovn-org/ovn-kubernetes/pull/2723 . We still need to work out with upstream whether we enable the metrics as part of ovnkube-node (which we already expose with kube-rbac-proxy) or as part of another container running the standalone upstream metrics executable. It could be a separate daemonset that both SDN and OVN can use, but that is a lot more work and bookkeeping (a different image too), and the code isn't large, so it can be duplicated for SDN as well.

This was enabled for ovn-kubernetes by https://github.com/openshift/cluster-network-operator/pull/1393 , and the core functionality has been present in ovnkube since the beginning of 4.11.

To be clear, this has been enabled and available in 4.11 ovnkube/CNO since late April 2022. We just missed the tie between the PR and the bug.

Tested and verified in 4.11.0-rc.4:

    sh-4.4# curl 127.0.0.1:29105/metrics | grep -E 'process_virtual_memory_bytes|process_cpu_seconds_total|process_virtual_memory_max_bytes|process_resident_memory_bytes'
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 31380    0 31380    0     0   747k      0 --:--:-- --:--:-- --:--:--  766k
    # HELP ovs_db_process_cpu_seconds_total Total user and system CPU time spent in seconds.
    # TYPE ovs_db_process_cpu_seconds_total counter
    ovs_db_process_cpu_seconds_total 1.63
    # HELP ovs_db_process_resident_memory_bytes Resident memory size in bytes.
    # TYPE ovs_db_process_resident_memory_bytes gauge
    ovs_db_process_resident_memory_bytes 3.2751616e+07
    # HELP ovs_db_process_virtual_memory_bytes Virtual memory size in bytes.
    # TYPE ovs_db_process_virtual_memory_bytes gauge
    ovs_db_process_virtual_memory_bytes 9.6464896e+07
    # HELP ovs_db_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
    # TYPE ovs_db_process_virtual_memory_max_bytes gauge
    ovs_db_process_virtual_memory_max_bytes 1.8446744073709552e+19
    # HELP ovs_vswitchd_process_cpu_seconds_total Total user and system CPU time spent in seconds.
    # TYPE ovs_vswitchd_process_cpu_seconds_total counter
    ovs_vswitchd_process_cpu_seconds_total 25.24
    # HELP ovs_vswitchd_process_resident_memory_bytes Resident memory size in bytes.
    # TYPE ovs_vswitchd_process_resident_memory_bytes gauge
    ovs_vswitchd_process_resident_memory_bytes 1.95637248e+08
    # HELP ovs_vswitchd_process_virtual_memory_bytes Virtual memory size in bytes.
    # TYPE ovs_vswitchd_process_virtual_memory_bytes gauge
    ovs_vswitchd_process_virtual_memory_bytes 7.06281472e+08
    # HELP ovs_vswitchd_process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
    # TYPE ovs_vswitchd_process_virtual_memory_max_bytes gauge
    ovs_vswitchd_process_virtual_memory_max_bytes 1.8446744073709552e+19
    sh-4.4# exit
    exit
    [weliang@weliang ~]$ oc get clusterversion
    NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.11.0-rc.4   True        False         24m     Cluster version is 4.11.0-rc.4
    [weliang@weliang ~]$

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
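For anyone repeating the verification step, the curl-and-grep check could be scripted roughly as follows. This is only a sketch, not part of the fix: the function name `filter_samples` and the `SAMPLE` text are illustrative assumptions, the last sample line is a made-up placeholder rather than a real OVS metric, and the parser assumes label-free sample lines like the ones in the verified output.

```python
import re

# The same alternation used with `grep -E` in the verification comment.
PROCESS_METRICS = (
    "process_virtual_memory_bytes|process_cpu_seconds_total|"
    "process_virtual_memory_max_bytes|process_resident_memory_bytes"
)

# Excerpt shaped like the verified output; the last line is a hypothetical
# placeholder metric added only to show that non-matching samples are dropped.
SAMPLE = """\
# HELP ovs_db_process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE ovs_db_process_cpu_seconds_total counter
ovs_db_process_cpu_seconds_total 1.63
# HELP ovs_vswitchd_process_resident_memory_bytes Resident memory size in bytes.
# TYPE ovs_vswitchd_process_resident_memory_bytes gauge
ovs_vswitchd_process_resident_memory_bytes 1.95637248e+08
example_unrelated_metric 1
"""


def filter_samples(exposition_text, name_pattern=PROCESS_METRICS):
    """Return {metric_name: value} for sample lines whose name matches the pattern.

    Assumes label-free samples of the form "<name> <value>"; comment lines
    (# HELP / # TYPE) and lines that do not parse are skipped.
    """
    wanted = re.compile(name_pattern)
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2 or not wanted.search(parts[0]):
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # not a numeric sample; ignore it
    return samples


if __name__ == "__main__":
    for name, value in sorted(filter_samples(SAMPLE).items()):
        print(name, value)
```

To run it against a live node, the text passed to `filter_samples` would come from the same endpoint used in the verification (127.0.0.1:29105/metrics), fetched with whatever HTTP client is available in the debug shell.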