Bug 1888549

Summary: Prometheus: err="query processing would load too many samples into memory in query execution"
Product: OpenShift Container Platform
Reporter: Oscar Casal Sanchez <ocasalsa>
Component: Monitoring
Assignee: Sergiusz Urbaniak <surbania>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: low
Priority: unspecified
Version: 4.5
CC: alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, spasquie, surbania
Last Closed: 2020-10-15 07:49:16 UTC
Type: Bug

Description Oscar Casal Sanchez 2020-10-15 07:42:33 UTC
[Description of problem]

The Prometheus pod logs show rule evaluation failures reporting that query processing would load too many samples into memory:

~~~
2020-10-12T08:18:00.703972307Z level=warn ts=2020-10-12T08:18:00.703Z caller=manager.go:525 component="rule manager" group=kube-apiserver-availability.rules msg="Evaluating rule failed" rule="record: apiserver_request:availability30d\nexpr: 1 - ((sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n  - sum(increase(apiserver_request_duration_seconds_bucket{le=\"1\",verb=~\"POST|PUT|PATCH|DELETE\"}[30d])))\n  + (sum(increase(apiserver_request_duration_seconds_count{verb=~\"LIST|GET\"}[30d]))\n  - (sum(increase(apiserver_request_duration_seconds_bucket{le=\"0.1\",scope=~\"resource|\",verb=~\"LIST|GET\"}[30d]))\n  + sum(increase(apiserver_request_duration_seconds_bucket{le=\"0.5\",scope=\"namespace\",verb=~\"LIST|GET\"}[30d]))\n  + sum(increase(apiserver_request_duration_seconds_bucket{le=\"5\",scope=\"cluster\",verb=~\"LIST|GET\"}[30d]))))\n  + sum(code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))) / sum(code:apiserver_request_total:increase30d)\nlabels:\n  verb: all\n" err="query processing would load too many samples into memory in query execution"
2020-10-12T08:21:00.851730908Z level=warn ts=2020-10-12T08:21:00.851Z caller=manager.go:525 component="rule manager" group=kube-apiserver-availability.rules msg="Evaluating rule failed" rule="record: apiserver_request:availability30d\nexpr: 1 - ((sum(increase(apiserver_request_duration_seconds_count{verb=~\"POST|PUT|PATCH|DELETE\"}[30d]))\n  - sum(increase(apiserver_request_duration_seconds_bucket{le=\"1\",verb=~\"POST|PUT|PATCH|DELETE\"}[30d])))\n  + (sum(increase(apiserver_request_duration_seconds_count{verb=~\"LIST|GET\"}[30d]))\n  - (sum(increase(apiserver_request_duration_seconds_bucket{le=\"0.1\",scope=~\"resource|\",verb=~\"LIST|GET\"}[30d]))\n  + sum(increase(apiserver_request_duration_seconds_bucket{le=\"0.5\",scope=\"namespace\",verb=~\"LIST|GET\"}[30d]))\n  + sum(increase(apiserver_request_duration_seconds_bucket{le=\"5\",scope=\"cluster\",verb=~\"LIST|GET\"}[30d]))))\n  + sum(code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))) / sum(code:apiserver_request_total:increase30d)\nlabels:\n  verb: all\n" err="query processing would load too many samples into memory in query execution"
~~~
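The `rule=` field in these lines is hard to read because the rule text is escaped onto a single line. A minimal sketch for splitting such a line into fields and unescaping the rule follows, assuming the lines are logfmt-style `key=value` pairs with optional double quotes, as in the excerpt above. The `parse_logfmt` helper and the `sample` line (the first log line with its expression shortened) are illustrative and not part of any Prometheus tooling.

```python
import re

# key=value pairs where the value is either a double-quoted string
# (with backslash escapes) or a bare token without whitespace.
LOGFMT_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line: str) -> dict:
    """Parse a logfmt-style line into a dict, unescaping quoted values."""
    fields = {}
    for key, raw in LOGFMT_PAIR.findall(line):
        if raw.startswith('"'):
            # Strip the quotes, then undo backslash escapes such as \" and \n.
            raw = re.sub(
                r'\\(.)',
                lambda m: {"n": "\n", "t": "\t"}.get(m.group(1), m.group(1)),
                raw[1:-1],
            )
        fields[key] = raw
    return fields


# Shortened version of the first log line above (expression truncated
# to a single term for illustration).
sample = (
    '2020-10-12T08:18:00.703972307Z level=warn ts=2020-10-12T08:18:00.703Z '
    'caller=manager.go:525 component="rule manager" '
    'group=kube-apiserver-availability.rules msg="Evaluating rule failed" '
    r'rule="record: apiserver_request:availability30d\nexpr: '
    r'sum(code:apiserver_request_total:increase30d{code=~\"5..\"} or vector(0))\n" '
    'err="query processing would load too many samples into memory in query execution"'
)

fields = parse_logfmt(sample)
print(fields["group"])
print(fields["rule"])
print(fields["err"])
```

Printing `fields["rule"]` shows the record name and expression on separate lines, which makes it easier to compare against the rule definitions shipped with the cluster.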

The error shows that the failing query comes from a recording rule, apiserver_request:availability30d in the kube-apiserver-availability.rules group, that ships with the default Prometheus configuration.
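For context, this error is what Prometheus returns when a query exceeds its sample limit (the `--query.max-samples` flag). A back-of-the-envelope estimate of how few series a `[30d]` range selector can cover before hitting that limit is sketched below. The limit value (the upstream default of 50,000,000) and the 30-second scrape interval are assumptions; neither is stated in this report, and the cluster may be configured differently.

```python
# Assumptions (not stated in this report): the sample limit is at the
# upstream default and targets are scraped every 30 seconds.
MAX_SAMPLES = 50_000_000
SCRAPE_INTERVAL_S = 30
RANGE_S = 30 * 24 * 3600  # the [30d] window used by the rule


def samples_per_series(range_s: int, interval_s: int) -> int:
    """Samples one series contributes over the range window."""
    return range_s // interval_s


def max_series_within_limit(max_samples: int, range_s: int, interval_s: int) -> int:
    """Largest number of series a single range selector can load."""
    return max_samples // samples_per_series(range_s, interval_s)


print(samples_per_series(RANGE_S, SCRAPE_INTERVAL_S))
print(max_series_within_limit(MAX_SAMPLES, RANGE_S, SCRAPE_INTERVAL_S))
```

Under these assumptions each series contributes 86,400 samples, so a single selector matching more than a few hundred `apiserver_request_duration_seconds_bucket` series could already exceed the limit. This is only an estimate of the order of magnitude, not a diagnosis of this cluster.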

[Version-Release number of selected component (if applicable)]



How reproducible:


Steps to Reproduce:
1. Install OCP 4.5
2. Enable techPreviewUserWorkload
3. Check the Prometheus pod logs after some time
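For step 3, a small sketch that scans collected log lines for this error and tallies it per rule group and recording rule may help confirm the reproduction. It assumes the log lines have already been saved or piped in as text. The `count_failed_rules` helper is hypothetical, and `sample_lines` uses shortened versions of the two log lines from the description plus one made-up unrelated line.

```python
import re
from collections import Counter

# Error text copied from the log excerpt in this report.
TOO_MANY_SAMPLES = "query processing would load too many samples into memory"

# Pull the rule group and the recording rule name out of a matching line.
GROUP_RE = re.compile(r'\bgroup=(\S+)')
RECORD_RE = re.compile(r'rule="record: ([^\\"]+)')


def count_failed_rules(lines):
    """Count 'too many samples' failures per (group, record) pair."""
    counts = Counter()
    for line in lines:
        if TOO_MANY_SAMPLES not in line:
            continue
        group = GROUP_RE.search(line)
        record = RECORD_RE.search(line)
        key = (
            group.group(1) if group else "?",
            record.group(1) if record else "?",
        )
        counts[key] += 1
    return counts


# Shortened log lines (expression elided); the last line is a made-up
# unrelated entry to show that non-matching lines are skipped.
sample_lines = [
    r'2020-10-12T08:18:00.703972307Z level=warn group=kube-apiserver-availability.rules '
    r'msg="Evaluating rule failed" rule="record: apiserver_request:availability30d\nexpr: 1\n" '
    r'err="query processing would load too many samples into memory in query execution"',
    r'2020-10-12T08:21:00.851730908Z level=warn group=kube-apiserver-availability.rules '
    r'msg="Evaluating rule failed" rule="record: apiserver_request:availability30d\nexpr: 1\n" '
    r'err="query processing would load too many samples into memory in query execution"',
    'level=info msg="unrelated line"',
]

for (group, record), n in count_failed_rules(sample_lines).most_common():
    print(f"{n}\t{group}\t{record}")
```

Any iterable of strings works as input, so the same function can be fed an open log file instead of the sample list.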

[Actual results]
The error shown above appears in the Prometheus pod logs.


[Expected results]
No such errors are logged, and the query executes without issues.

Comment 1 Simon Pasquier 2020-10-15 07:49:16 UTC

*** This bug has been marked as a duplicate of bug 1872786 ***