Bug 1881044 - Limit the maximum CSR reports
Summary: Limit the maximum CSR reports
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Insights Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Ricardo Lüders
QA Contact: Pavel Šimovec
Docs Contact: Marc Muehlfeld
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-21 12:36 UTC by Ricardo Lüders
Modified: 2020-10-27 16:43 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A missing limit on CSR reports caused an enormous amount of unnecessary report data to be collected. Consequence: An unnecessarily large number of reports was collected. Fix: Limit the report data to 5000 reports. Result: The number of data reports collected is limited to 5000.
Clone Of:
Environment:
Last Closed: 2020-10-27 16:43:33 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift insights-operator pull 184 | 0 | None | closed | Bug 1881044: Limit the maximum number of CSR | 2021-01-05 12:40:13 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:43:54 UTC

Description Ricardo Lüders 2020-09-21 12:36:51 UTC
Description of problem:

To avoid memory overload, the Insights Operator has to limit the number of CSR reports it collects.
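
As a rough illustration of the kind of cap being asked for (not the operator's actual implementation, which lives in the Go code of the linked pull request), the sketch below truncates a stream of items at a fixed maximum. The helper name `cap_items` is hypothetical; the default of 5000 is taken from the limit stated in the Doc Text.

```shell
# cap_items [MAX]: pass through at most MAX lines from stdin (hypothetical helper).
# MAX defaults to 5000, the limit mentioned in this bug's Doc Text.
cap_items() {
  # awk reads all input, so the producer never receives SIGPIPE.
  awk -v max="${1:-5000}" 'NR <= max'
}
```

For example, `seq 1 5320 | cap_items | wc -l` should report 5000 lines, while any input shorter than the cap passes through unchanged.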

Version-Release number of selected component (if applicable):


How reproducible:

Find a cluster that has a lot of CSRs and give it a try. Maybe?

Steps to Reproduce:
1.
2.
3.

Actual results:

Memory problems when a large number of CSRs is found


Expected results:

No crash?

Additional info:

Comment 3 Pavel Šimovec 2020-09-30 11:23:31 UTC
I had to check in the source code what the upper limit is.

I created a certificate (server.crt) and a script:

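#!/usr/bin/env bash
# Create 9555 CertificateSigningRequest objects, one manifest file per object.
# Encode the CSR once instead of on every iteration.
request=$(base64 < server.csr | tr -d '\n')
for (( c=1; c<=9555; c++ ))
do
  cat > "wow$c" <<EOF
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: service$c.default
spec:
  groups:
  - system:authenticated
  request: $request
  usages:
  - digital signature
  - key encipherment
  - server auth
EOF
  # Print progress, then create the object in the background.
  echo "$c" && kubectl create -f "wow$c" &
  sleep 0.08
done
# Let the remaining background kubectl calls finish.
wait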


Then I added a check to TestArchiveContains for the pattern ^config/certificatesigningrequests/.*json$
When I ran the test while the script above was still running, I got this output:
main_test.go:369: 5320 csr files match pattern `^config/certificatesigningrequests/.*json$`

5320 is more than the hardcoded upper limit of 5000.
The limit does NOT work in 4.6.0-0.ci-2020-09-30-031307.
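
The same count can be approximated outside the Go test. The sketch below is not the check in main_test.go; it only applies the same pattern to a list of archive paths read from stdin. The helper name `count_csr_entries` is hypothetical.

```shell
# count_csr_entries: read archive paths on stdin and print how many
# match the CSR pattern used in the test (hypothetical helper).
count_csr_entries() {
  # grep -c prints the count; "|| true" keeps a zero count from being
  # treated as a failure, since grep exits 1 when nothing matches.
  grep -Ec '^config/certificatesigningrequests/.*json$' || true
}
```

Assuming the Insights archive has been downloaded as a gzipped tarball (the file name here is a placeholder), `tar -tzf insights-archive.tar.gz | count_csr_entries` would print the number of collected CSR files, which can then be compared against the limit of 5000.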

Comment 4 Pavel Šimovec 2020-09-30 11:44:30 UTC
Additional info:
After I added a few thousand more CSRs, it started to limit the collection of files as expected:
main_test.go:369: 5000 csr files match pattern `^config/certificatesigningrequests/.*json$`

Comment 5 Ricardo Lüders 2020-09-30 15:18:00 UTC
I tried to reproduce the issue without success; in all my tests the number of CSR reports was capped at the limit. Also, I wasn't able to use your script to generate more than 100 resources on my quicklab cluster, which is kind of weird.

Comment 6 Pavel Šimovec 2020-10-01 13:39:45 UTC
The script isn't the best one; today I had issues with it myself and had to increase the timeout to make it work better.

I was able to reproduce it on a cluster from cluster bot, same version as last time:

11:44
main_test.go:369: 5220 csr files match pattern `^config/certificatesigningrequests/.*json$`
11:48
main_test.go:369: 5255 csr files match pattern `^config/certificatesigningrequests/.*json$`
11:53
main_test.go:369: 5000 csr files match pattern `^config/certificatesigningrequests/.*json$`
11:56
main_test.go:369: 5000 csr files match pattern `^config/certificatesigningrequests/.*json$`


The fix should be added in a different PR.
I will verify this BZ, as it doesn't break anything. It works at least partially, so it can only help; the issue I found should be fixed in a different PR/BZ.

Comment 9 errata-xmlrpc 2020-10-27 16:43:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

