Bug 1974967
| Summary: | Prometheus Memory Usage 50-100% higher on 4.8+ OVN when under load |
|---|---|
| Product: | OpenShift Container Platform |
| Component: | Networking |
| Networking sub component: | ovn-kubernetes |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | unspecified |
| Version: | 4.8 |
| Target Release: | 4.9.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Type: | Bug |
| Reporter: | Keith <kwhitley> |
| Assignee: | Antonio Ojea <aojeagar> |
| QA Contact: | Kedar Kulkarni <kkulkarn> |
| CC: | aconstan, anpicker, aojeagar, aos-bugs, astoycos, erooth, jfajersk, jlema, nelluri, vrutkovs, zzhao |
| Doc Type: | Bug Fix |
| Last Closed: | 2021-10-18 17:35:57 UTC |

Doc Text:

- Cause: The service controller's metrics suffered a cardinality explosion because latency was tracked with a separate label value for each service created.
- Consequence: High memory usage on the OVN master pods.
- Fix: Reduce the metric's cardinality by removing the per-service label.
- Result: Lower memory usage on the OVN master pods.
Description

Keith, 2021-06-22 20:47:37 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1925061 is likely unrelated. We know about the memory usage increase during updates; it is connected to series churn when all containers restart. This issue seems specific to the underlying network provider. One straightforward theory is that with OVN, Prometheus ingests more series than with SDN. Is there a way to get an OVN/SDN pair of clusters, either 4.8 or 4.9, so we can investigate a bit?

I would agree about OVN; however, I don't see this issue on 4.7 OVN, so it might be a relatively recent change there. We have our regular workloads running today, so we should have 4.8/4.9 clusters up. They're almost done installing now, but the workloads that reproduce this issue take a bit longer to run. I'll get the clusters into the state they were in so we can see what the differences are.

From the monitoring perspective, serviceMonitor/openshift-ovn-kubernetes/monitor-ovn-master-metrics is the offender here. One metric in particular causes the brunt of the resource usage: ovnkube_master_sync_service_latency_seconds_bucket. This metric carries a label called `name` whose value contains a namespace/service identifier, e.g. name="cluster-density-374ea166-191f-46ba-8626-5f7859567ab3-1/deployment-1pod-1-1". The scaling test that exposed this creates many such namespaces, and in turn we see a cardinality explosion for this metric (see the attached screenshots). This dramatically increases Prometheus' resource usage and slows down the exporter considerably. Identifiers that can grow without constraint should not be used as label values. I suspect the main issue is that the ovnkube_master_sync_service_latency_seconds_bucket series, once created, never go away when the respective namespace/pod is deleted. If I'm not mistaken, the data is exported here: https://github.com/ovn-org/ovn-kubernetes/blob/master/go-controller/pkg/ovn/controller/services/services_controller.go. I'm not sure how valuable this metric is, but either the series for deleted namespaces and pods must also be deleted, or it is worth considering whether the `name` label is needed at all.

This is my fault; I didn't fully understand the implications of labels on Prometheus metrics. We can have just a single global metric, with no need for per-service granularity: https://github.com/ovn-org/ovn-kubernetes/pull/2279. This fix made it downstream in https://github.com/openshift/ovn-kubernetes/pull/600.

Hi, I tested OVN and SDN 4.8/4.9 side by side, with exactly the same kind of workload at 50-node scale. Based on my observations, Prometheus memory usage for OVN 4.9 improved over OVN 4.8 by roughly ~16%. Just to note: between OVN and SDN on 4.9, Prometheus used around ~12.6 GB of memory with SDN (average across both replicas) versus ~22 GB with OVN (average across both replicas). Since the merged fix was supposed to improve OVN, and that is what I observed, I am marking this as Verified. @kwhitley, please open a new BZ if you think this issue needs further improvement. Thanks, KK.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
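
For context on the fix discussed above (dropping the per-service `name` label from the sync-latency histogram), the following is a minimal, illustrative client_golang sketch of the before/after metric shape. The variable names, the `recordSyncLatency` helper, and the use of default buckets are assumptions made for this example; they are not taken from the actual ovn-kubernetes code.

```go
package metrics

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Before (the problematic shape): a histogram vector keyed by a "name" label
// whose value is "<namespace>/<service>". Every service that is ever synced
// adds a full set of _bucket/_sum/_count series, and nothing deletes them
// when the service goes away, so cardinality grows without bound.
var syncServiceLatencyPerService = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Namespace: "ovnkube",
		Subsystem: "master",
		Name:      "sync_service_latency_seconds",
		Help:      "Per-service sync latency (high-cardinality shape, illustrative only).",
	},
	[]string{"name"}, // e.g. name="cluster-density-.../deployment-1pod-1-1"
)

// After (the direction of the fix): one global histogram with no per-service
// label, so the number of exported series is fixed by the bucket count no
// matter how many services exist in the cluster.
var syncServiceLatency = prometheus.NewHistogram(
	prometheus.HistogramOpts{
		Namespace: "ovnkube",
		Subsystem: "master",
		Name:      "sync_service_latency_seconds",
		Help:      "Service sync latency aggregated across all services (illustrative only).",
	},
)

// recordSyncLatency is a hypothetical helper showing how a controller would
// observe one sync duration against the aggregated histogram. In a real
// program only one of the two metrics above would be registered (for example
// with prometheus.MustRegister), since they share the same fully qualified name.
func recordSyncLatency(start time.Time) {
	syncServiceLatency.Observe(time.Since(start).Seconds())
}
```

With the labeled vector, every distinct `name` value multiplies the stored series by the number of buckets plus the `_sum` and `_count` series, which is why a scale test that creates thousands of namespaces and services inflates both the exporter and Prometheus; the unlabeled histogram keeps the series count constant regardless of cluster size.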