Bug 1918287 - [ovirt] ovirt csi driver is flooding RHV with API calls and spam the event UI with new connections
Summary: [ovirt] ovirt csi driver is flooding RHV with API calls and spam the event UI...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Gal Zaidman
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1924623
Reported: 2021-01-20 11:49 UTC by Gal Zaidman
Modified: 2021-02-24 15:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:54:57 UTC
Target Upstream Version:


Attachments
200 line on /var/log/httpd/ssl_request log - 3masters 3workers (24.71 KB, text/plain)
2021-01-20 11:49 UTC, Gal Zaidman


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovirt-csi-driver pull 67 | 0 | None | closed | Bug 1918287: pass ovirtClient to identity and remove redundant call to Test connection | 2021-02-08 17:34:14 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:55:13 UTC

Description Gal Zaidman 2021-01-20 11:49:43 UTC
Created attachment 1749053 [details]
200 line on /var/log/httpd/ssl_request log - 3masters 3workers

Description of problem:

After installing the CSI driver, we see that the RHV events log is flooded with an endless stream of auth calls to the engine: one call from each node every 10 seconds.

This is a problem not only for the logs but also for the engine itself, which can handle only a limited number of requests per second, so this many requests can become a real issue for the engine.

When debugging this, we see that the problem is in the Probe call of our CSI driver, which gets called every 10 seconds. Since the driver runs on each node, that leads to around 1 auth call per second on a small cluster and more on a larger one.
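To make the load estimate concrete, here is a rough back-of-the-envelope sketch. It assumes that every Probe call triggers exactly one auth call and that all nodes probe at the same fixed period; the function name is made up for illustration and is not part of the driver.

```python
def auth_calls_per_second(node_count: int, probe_period_seconds: float) -> float:
    """Aggregate auth calls per second hitting the engine.

    Assumption (illustrative only): each node issues one auth call
    per probe period, and nothing is cached between probes.
    """
    if node_count < 0 or probe_period_seconds <= 0:
        raise ValueError("node_count must be >= 0 and probe_period_seconds > 0")
    return node_count / probe_period_seconds


if __name__ == "__main__":
    # 3 masters + 3 workers (the cluster from the attached log), 10-second probe.
    print(auth_calls_per_second(6, 10))
    # The same cluster if the probe period were 30 seconds instead.
    print(auth_calls_per_second(6, 30))
```

Under these assumptions the rate grows linearly with the node count, so lengthening the probe period only scales the load down; it does not remove the per-probe auth call.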

Comment 1 Benny Zlotnik 2021-01-20 12:21:11 UTC
When the liveness probe was introduced, the period was set to 30 seconds:
https://github.com/oVirt/csi-driver/commit/efb64526378d20fca0039b61aa421b29321e380e#diff-4b761eacd68a8a2fb6f81f7c9f0e4c22b0c707238a37d7325f0bf84d211586adR103

But it looks like that setting was lost in the migration to a second-level operator; it needs to be adjusted in the operator:
https://github.com/openshift/ovirt-csi-driver-operator/tree/master/assets

Comment 2 Gal Zaidman 2021-01-20 12:29:14 UTC
This is not a blocker

Comment 3 Gal Zaidman 2021-01-20 12:29:26 UTC
(In reply to Benny Zlotnik from comment #1)
> When liveness probe was introduced the period was set to 30 seconds
> https://github.com/oVirt/csi-driver/commit/
> efb64526378d20fca0039b61aa421b29321e380e#diff-
> 4b761eacd68a8a2fb6f81f7c9f0e4c22b0c707238a37d7325f0bf84d211586adR103
> 
> But looks like it was lost in the migration to a second level operator, it
> needs to adjusted in the operator:
> https://github.com/openshift/ovirt-csi-driver-operator/tree/master/assets

Can you open a separate bug on that? I want to reserve this bug for a different fix.

Comment 5 Michael Burman 2021-01-26 13:35:06 UTC
Verified on 4.7.0-0.nightly-2021-01-26-044139 with 4.4.4.7-0.1.el8ev

The API call spam is gone from the event log UI.
There is now only one connection event, coming from one of the master VMs (the leader), each time its connection expires (every 10-15 minutes) and it reconnects.
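
The verified behaviour (one login per expiry instead of one per probe) is what you would expect when the probe reuses a single shared connection. The sketch below is only an illustration of that pattern in Python, not the driver's actual Go code: `SharedSession`, `probe`, the `login` callback and the TTL value are all hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, Optional


class SharedSession:
    """Hypothetical long-lived engine session (not the oVirt SDK)."""

    def __init__(
        self,
        login: Callable[[], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._login = login          # performs the real auth call
        self._ttl = ttl_seconds      # assumed session lifetime
        self._clock = clock          # injectable for testing
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self.login_count = 0         # how many auth calls were made

    def token(self) -> str:
        """Return the cached token, logging in again only after expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._login()
            self._expires_at = now + self._ttl
            self.login_count += 1
        return self._token


def probe(session: SharedSession) -> bool:
    """Health check that reuses the shared session instead of opening a new one."""
    return bool(session.token())
```

With a 10-second probe and an assumed 10-minute session lifetime, 60 consecutive probes would cost a single login, and the next login happens only once the session has expired.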

Comment 8 errata-xmlrpc 2021-02-24 15:54:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

