1757551 – Metering fails to import pod cpu/memory usage metrics due to many-to-many matching error

Bug 1757551 - Metering fails to import pod cpu/memory usage metrics due to many-to-many matching error

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Metering Operator
Sub Component:
Version:	4.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	4.2.z
Assignee:	Brett Tofel
QA Contact:	Peter Ruan
Docs Contact:
URL:
Whiteboard:
Depends On:	1757547
Blocks:

Reported:	2019-10-01 19:31 UTC by Chance Zibolski
Modified:	2020-02-24 16:53 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1757547
Environment:
Last Closed:	2020-02-24 16:52:45 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	operator-framework operator-metering pull 1014	0	'None'	closed	[release-4.2]: bug 1757551: charts/openshift-metering: Update cpu/memory usage promqueries to handle many to many matchi...	2020-05-05 00:03:32 UTC
Red Hat Product Errata	RHBA-2020:0460	0	None	None	None	2020-02-24 16:52:59 UTC

Description Chance Zibolski 2019-10-01 19:31:27 UTC

+++ This bug was initially created as a clone of Bug #1757547 +++

Description of problem: The following error sometimes occurs when metering is importing CPU or memory usage metrics in the pod-usage-cpu-cores or pod-usage-memory-bytes ReportDataSources.

time="2019-10-01T19:16:51Z" level=error msg="error collecting metrics" app=metering chunkSize=5m0s component=PrometheusImporter endTime="2019-10-01 19:16:39.238555324 +0000 UTC" error="failed to perform Prometheus query: execution: many-to-many matching not allowed: matching labels must be unique on one side" logID=OBju7Ykcm2 namespace=metering-chancez2 reportDataSource=pod-usage-cpu-cores startTime="2019-09-23 23:12:00 +0000 UTC" stepSize=1m0s tableName=hive.metering.datasource_metering_chancez2_pod_usage_cpu_cores
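The log line above uses logfmt-style key=value pairs. As a rough triage aid, the sketch below splits such a line into fields so that the failing reportDataSource and the error text can be read off directly. The helper name parse_logfmt is hypothetical, and relying on shlex for the quoted values is an assumption that holds for this line (no embedded quotes); it is not a general logfmt parser.

```python
import shlex


def parse_logfmt(line):
    """Split a logfmt-style line into a dict of key -> value (sketch only)."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


# Shortened copy of the log line from this report.
sample = (
    'time="2019-10-01T19:16:51Z" level=error msg="error collecting metrics" '
    'component=PrometheusImporter '
    'error="failed to perform Prometheus query: execution: many-to-many matching '
    'not allowed: matching labels must be unique on one side" '
    'reportDataSource=pod-usage-cpu-cores stepSize=1m0s'
)

fields = parse_logfmt(sample)
print(fields["reportDataSource"])
print("many-to-many matching not allowed" in fields["error"])
```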


Version-Release number of selected component (if applicable): 4.3.0


How reproducible: This seems to depend heavily on the metrics present in Prometheus. I have not yet determined which part of the query triggers it, but the error comes from a group_left join from the container usage metrics to the kube_pod_info metrics.
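To illustrate the class of failure, here is a simplified Python model of a group_left join. It assumes, hypothetically, that the "one" side (standing in for kube_pod_info) contains two series with the same matching labels; the report does not confirm that this is the actual trigger. The function group_left_join, the label names, and the sample values are all made up for illustration and are not the metering query or Prometheus code.

```python
from collections import defaultdict


def group_left_join(many_side, one_side, on, include=()):
    """Simplified model of `many * on(...) group_left(...) one`.

    Each series is a (labels_dict, value) pair. The join is rejected when
    two series on the "one" side share the same values for the `on` labels.
    """
    index = defaultdict(list)
    for labels, value in one_side:
        key = tuple(labels.get(name, "") for name in on)
        index[key].append((labels, value))

    for key, rows in index.items():
        if len(rows) > 1:
            raise ValueError(
                "many-to-many matching not allowed: "
                f"{len(rows)} series on the one side share {dict(zip(on, key))}"
            )

    joined = []
    for labels, value in many_side:
        key = tuple(labels.get(name, "") for name in on)
        if key in index:
            one_labels, one_value = index[key][0]
            merged = dict(labels)
            for name in include:  # labels copied over from the one side
                merged[name] = one_labels.get(name, "")
            joined.append((merged, value * one_value))
    return joined


# Hypothetical sample data: two containers in one pod.
usage = [
    ({"namespace": "demo", "pod": "web-1", "container": "app"}, 0.25),
    ({"namespace": "demo", "pod": "web-1", "container": "sidecar"}, 0.05),
]
pod_info_ok = [({"namespace": "demo", "pod": "web-1", "node": "node-a"}, 1)]
# Same pod reported twice with a differing non-matching label.
pod_info_dup = pod_info_ok + [
    ({"namespace": "demo", "pod": "web-1", "node": "node-b"}, 1)
]

ok = group_left_join(usage, pod_info_ok, on=("namespace", "pod"), include=("node",))
print(len(ok))

try:
    group_left_join(usage, pod_info_dup, on=("namespace", "pod"), include=("node",))
except ValueError as err:
    print(err)
```

With unique matching labels on the one side, every container series finds exactly one partner. Once a second series with the same namespace and pod appears, the join is ambiguous and is rejected, which is why the failure would depend on what data happens to be in Prometheus.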


Steps to Reproduce: Unknown

Actual results: The Prometheus query fails, so no metrics are imported.


Expected results: Prometheus queries in ReportDataSources do not error when doing group_left, and metrics are imported.
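One general way to keep a group_left join from failing is to collapse the "one" side to a single series per matching label set before joining, comparable to wrapping it in an aggregation such as max by (...) in PromQL. The sketch below models that idea in Python. This report does not say how the queries were actually changed, so treat this as an assumption about the general technique, with a hypothetical helper max_by and made-up sample data.

```python
def max_by(series, by):
    """Collapse series to one sample per `by` label set, keeping the maximum.

    Labels outside `by` are dropped, as an aggregation would do, so this only
    fits cases where those extra labels are not needed after the join.
    """
    collapsed = {}
    for labels, value in series:
        key = tuple(labels.get(name, "") for name in by)
        if key not in collapsed or value > collapsed[key]:
            collapsed[key] = value
    return [(dict(zip(by, key)), value) for key, value in collapsed.items()]


# Hypothetical duplicate series for the same pod.
pod_info = [
    ({"namespace": "demo", "pod": "web-1", "node": "node-a"}, 1),
    ({"namespace": "demo", "pod": "web-1", "node": "node-b"}, 1),
]

unique_pod_info = max_by(pod_info, by=("namespace", "pod"))
print(unique_pod_info)
```

The trade-off is that labels which differ between the duplicates (node in this sample) are lost, so a real query that needs them would have to resolve the duplication some other way.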


Additional info:

Comment 1 Chance Zibolski 2019-10-01 19:32:51 UTC

The above description should say the affected version is 4.2.0; I forgot to modify that when cloning.

Comment 3 Peter Ruan 2019-11-14 18:49:48 UTC

Moving it back to POST because the bug was dropped from the errata. Please move it back to ON_QA once https://github.com/operator-framework/operator-metering/pull/1016 is merged. Thanks!

Comment 6 tflannag 2019-12-17 14:23:03 UTC

PR #1016 was merged, so moving back to ON_QA.

Comment 7 Peter Ruan 2020-02-04 23:17:28 UTC

Not seeing it with latest nightly.  
pruan@desktop ~/workspace/testcases/aws_billing $ oc logs  reporting-operator-b4bbd7c87-qk275 -c reporting-operator  | grep 'many-to-many matching not allowed'
pruan@desktop ~/workspace/testcases/aws_billing $

Comment 9 errata-xmlrpc 2020-02-24 16:52:45 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0460
