Bug 2009859 - Large number of sessions created by vmware-vsphere-csi-driver-operator during e2e tests
Summary: Large number of sessions created by vmware-vsphere-csi-driver-operator during...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Fabio Bertinatto
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks: 2018496
Reported: 2021-10-01 19:34 UTC by rvanderp
Modified: 2022-03-10 16:17 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:16:28 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-storage-operator pull 221 0 None open Bug 2009859: Install vSphere CSI Driver by default (again) 2021-10-06 11:17:00 UTC
Github openshift vmware-vsphere-csi-driver-operator pull 47 0 None open Bug 2009859: Close connection to vCenter API 2021-10-04 20:36:14 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:16:59 UTC

Description rvanderp 2021-10-01 19:34:37 UTC
Description of problem:
During e2e testing, there has been a recent, significant increase in vSphere sessions. vCenter has a maximum limit of 2000 concurrent sessions. When the vmware-vsphere-csi-driver-operator is running, individual clusters have been observed to sometimes consume a few hundred sessions at once. Normally, clusters consume a few dozen sessions at most. When the operator is disabled, no further session growth is observed and established sessions are eventually closed.

The session growth only occurs during e2e tests and corresponds with the operator sync, which can occur every few seconds and results in a new connection to vCenter [https://github.com/openshift/vmware-vsphere-csi-driver-operator/blob/cb321b1980d02f4e8ded29da8371e0f466454e10/pkg/operator/storageclasscontroller/storageclasscontroller.go#L163]. Clusters with over 250 sessions have been observed.

This behavior results in significant instability for all of vSphere CI as all clusters are prevented from accessing the vCenter API once sessions are exhausted.  

Version-Release number of selected component (if applicable):
- 4.10.0-0.nightly-2021-10-01-013103
- VMware IPI

How reproducible: consistently

Steps to Reproduce:
1. Install 4.10.0-0.nightly-2021-10-01-013103
2. Run e2e tests
3. Check session count in vCenter

Actual results:
A new session is established with every sync.

Expected results:
Session reuse should be investigated, or sessions should be explicitly closed.
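
To illustrate the "explicitly closed" option, here is a minimal sketch in Go. It does not use the real vCenter client: fakeVCenter, leakySync, and closingSync are hypothetical names invented for this example, and the fake only counts open sessions. It also ignores the server-side idle timeout that eventually closes abandoned sessions. The point is only that pairing each login with a deferred logout keeps the open-session count flat no matter how often the sync runs.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeVCenter is a hypothetical stand-in for the vCenter API. It only
// tracks how many sessions are open; it is not a real client.
type fakeVCenter struct {
	mu      sync.Mutex
	open    int
	maxOpen int // concurrent-session limit (2000 in the report)
}

// Login opens a session and fails once the session limit is reached.
func (v *fakeVCenter) Login() error {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.open >= v.maxOpen {
		return errors.New("session limit reached")
	}
	v.open++
	return nil
}

// Logout closes one session.
func (v *fakeVCenter) Logout() {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.open > 0 {
		v.open--
	}
}

// OpenSessions reports the current number of open sessions.
func (v *fakeVCenter) OpenSessions() int {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.open
}

// leakySync mimics the reported behavior: a login on every sync and
// no logout.
func leakySync(v *fakeVCenter) error {
	return v.Login()
}

// closingSync logs in, does its work, and always logs out.
func closingSync(v *fakeVCenter) error {
	if err := v.Login(); err != nil {
		return err
	}
	defer v.Logout()
	// The storage policy and storage class checks would go here.
	return nil
}

func main() {
	leaky := &fakeVCenter{maxOpen: 2000}
	fixed := &fakeVCenter{maxOpen: 2000}
	for i := 0; i < 250; i++ {
		_ = leakySync(leaky)
		_ = closingSync(fixed)
	}
	fmt.Println("leaky:", leaky.OpenSessions(), "closing:", fixed.OpenSessions())
}
```

With 250 simulated syncs, the leaky variant ends with 250 open sessions and the closing variant with 0.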

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Snippet of sync instances:
I1001 19:25:03.197131       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:25:04.003940       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:25:07.898412       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:25:13.909210       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:25:23.634340       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:25:48.719095       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:26:01.911076       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:26:05.427926       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:26:09.292775       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft
I1001 19:26:09.649060       1 vmware.go:307] Found existing profile with same name: openshift-storage-policy-rvanderp-dev-bxhft

Sessions in use by the cluster (user ID test):
govc session.ls | grep test | wc -l
134

Comment 2 Hemant Kumar 2021-10-01 19:38:21 UTC
I think we will have to implement connection caching for both SOAP and REST clients.
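
A rough sketch of what such caching could look like, in Go. Everything here is hypothetical and for illustration only: session, sessionCache, Get, and Invalidate are invented names, and the login is simulated by a counter rather than a call to vCenter. The idea is to keep one live session per client kind ("soap" and "rest" below) and to log in again only when the cached session is missing or has been invalidated.

```go
package main

import (
	"fmt"
	"sync"
)

// session is a hypothetical handle for one authenticated vCenter session.
type session struct {
	kind  string // "soap" or "rest"
	valid bool
}

// sessionCache keeps at most one live session per client kind.
type sessionCache struct {
	mu       sync.Mutex
	sessions map[string]*session
	logins   int // number of simulated logins, for illustration
}

func newSessionCache() *sessionCache {
	return &sessionCache{sessions: make(map[string]*session)}
}

// Get returns the cached session for kind and logs in again only when
// there is no cached session or it has been invalidated.
func (c *sessionCache) Get(kind string) *session {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.sessions[kind]; ok && s.valid {
		return s
	}
	c.logins++ // stands in for a real login call to vCenter
	s := &session{kind: kind, valid: true}
	c.sessions[kind] = s
	return s
}

// Invalidate marks the cached session as unusable, for example after
// an authentication error, so the next Get logs in again.
func (c *sessionCache) Invalidate(kind string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.sessions[kind]; ok {
		s.valid = false
	}
}

func main() {
	cache := newSessionCache()
	for i := 0; i < 100; i++ { // 100 simulated syncs
		cache.Get("soap")
		cache.Get("rest")
	}
	fmt.Println("logins after 100 syncs:", cache.logins)
}
```

In this sketch, 100 syncs result in 2 logins (one per client kind) instead of 200. A real implementation would also need to detect expired sessions on the server side, which the Invalidate hook only hints at.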

Comment 4 rvanderp 2021-10-01 20:10:52 UTC
If you need any help at all testing fixes for this, just let me know.  I'm happy to help.

Comment 18 errata-xmlrpc 2022-03-10 16:16:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

