Description of problem:
The container_memory_failures_total metric is among the top 10 highest-cardinality metrics, yet it is not used in any alerting rule or dashboard. Storing it in Prometheus increases memory usage for no benefit.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Always

Steps to Reproduce:
1. Open the Prometheus UI, go to the Status > TSDB Status page, and look at the "Top 10 series count by metric names" section.

Actual results:
container_memory_failures_total is listed.

Expected results:
container_memory_failures_total isn't present.

Additional info:
N/A
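High-cardinality metrics like this one are typically dropped at scrape time so the samples never reach the TSDB. A minimal sketch of such a rule, using Prometheus's standard metric_relabel_configs mechanism (the job name is illustrative, not the actual cluster-monitoring-operator change):

```yaml
# Sketch: drop container_memory_failures_total before ingestion.
# "kubelet-cadvisor" is an assumed job name for illustration.
scrape_configs:
  - job_name: kubelet-cadvisor
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: container_memory_failures_total
        action: drop
```

Because metric_relabel_configs runs after the scrape but before storage, dropped series consume no TSDB memory, which is exactly the saving this report asks for.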
Checked with 4.8.0-0.nightly-2021-05-05-030749: no container_memory_failures_total metric now.

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/label/__name__/values' | jq | grep container_memory_failures_total

no result
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438