Bug 2011903 - vsphere-problem-detector: session leak
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Robert Bost
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks: 2033733
Reported: 2021-10-07 16:50 UTC by Robert Bost
Modified: 2022-03-10 16:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:17:59 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift vsphere-problem-detector pull 60 | 0 | None | open | Bug 2011903: Deferred logout after checks are run | 2021-10-07 20:43:44 UTC
Red Hat Issue Tracker | SPLAT-246 | 0 | None | None | None | 2021-10-07 16:50:57 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:18:30 UTC

Description Robert Bost 2021-10-07 16:50:58 UTC
Description of problem:

vSphere Problem Detector may be leaking sessions in vSphere because the client is not logged out between syncs.

Version-Release number of selected component (if applicable): 4.10 and master

How reproducible: Always

Steps to Reproduce:
1. Monitor sessions using `govc session.ls` while vsphere-problem-detector is running

Actual results: 
The session count increases from 1 to 14 in about 17 minutes.

In the output below, I was running a build from https://github.com/openshift/vsphere-problem-detector/pull/58

$ while true; do date; govc session.ls | grep vsphere-prob; sleep 5m ; done
Thu Oct  7 10:20:26 AM MDT 2021
523e179e-4ead-fd7b-b51a-4f51ad4789a0  VSPHERE.LOCAL\rbost    2021-10-07 16:20  12s     x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
Thu Oct  7 10:25:26 AM MDT 2021
5237d10f-eb55-be7b-08f3-199397fec29d  VSPHERE.LOCAL\rbost    2021-10-07 16:23  2m8s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
523e179e-4ead-fd7b-b51a-4f51ad4789a0  VSPHERE.LOCAL\rbost    2021-10-07 16:20  5m12s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528a51b6-142d-9e56-afbd-2e19b384b975  VSPHERE.LOCAL\rbost    2021-10-07 16:24  1m7s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528f6e0d-da37-ffe0-247a-11f9bff5a2de  VSPHERE.LOCAL\rbost    2021-10-07 16:22  3m10s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52f4a6c1-c603-bdf5-1922-0f3129bc0196  VSPHERE.LOCAL\rbost    2021-10-07 16:21  4m11s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52facf01-0f12-d658-077a-2485d25b602e  VSPHERE.LOCAL\rbost    2021-10-07 16:25  7s      x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
Thu Oct  7 10:30:26 AM MDT 2021
5212af06-f896-be51-8088-4c881459226f  VSPHERE.LOCAL\rbost    2021-10-07 16:26  4m5s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
522071e9-0ab3-d843-e02e-d44924644ea2  VSPHERE.LOCAL\rbost    2021-10-07 16:28  2m3s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
5237d10f-eb55-be7b-08f3-199397fec29d  VSPHERE.LOCAL\rbost    2021-10-07 16:23  7m9s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
523e179e-4ead-fd7b-b51a-4f51ad4789a0  VSPHERE.LOCAL\rbost    2021-10-07 16:20  10m13s  x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528a51b6-142d-9e56-afbd-2e19b384b975  VSPHERE.LOCAL\rbost    2021-10-07 16:24  6m8s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528f6e0d-da37-ffe0-247a-11f9bff5a2de  VSPHERE.LOCAL\rbost    2021-10-07 16:22  8m10s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52b62109-7618-33f1-12ab-0eccf6c46c77  VSPHERE.LOCAL\rbost    2021-10-07 16:27  3m4s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52d08b6b-9be2-b4de-c7bc-4aae330242ab  VSPHERE.LOCAL\rbost    2021-10-07 16:29  1m2s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52df60e9-7dd4-666f-711c-99d872e0a85a  VSPHERE.LOCAL\rbost    2021-10-07 16:30  2s      x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52f4a6c1-c603-bdf5-1922-0f3129bc0196  VSPHERE.LOCAL\rbost    2021-10-07 16:21  9m11s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52facf01-0f12-d658-077a-2485d25b602e  VSPHERE.LOCAL\rbost    2021-10-07 16:25  5m7s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
Thu Oct  7 10:35:27 AM MDT 2021
5211905e-4c06-e20e-edc2-6d7debaf3af8  VSPHERE.LOCAL\rbost    2021-10-07 16:31  4m0s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
5212af06-f896-be51-8088-4c881459226f  VSPHERE.LOCAL\rbost    2021-10-07 16:26  9m5s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
522071e9-0ab3-d843-e02e-d44924644ea2  VSPHERE.LOCAL\rbost    2021-10-07 16:28  7m3s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
5234bcde-f49b-f53f-edcf-fcdd0db91de5  VSPHERE.LOCAL\rbost    2021-10-07 16:32  2m59s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
5237d10f-eb55-be7b-08f3-199397fec29d  VSPHERE.LOCAL\rbost    2021-10-07 16:23  12m9s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
523e179e-4ead-fd7b-b51a-4f51ad4789a0  VSPHERE.LOCAL\rbost    2021-10-07 16:20  15m13s  x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528a51b6-142d-9e56-afbd-2e19b384b975  VSPHERE.LOCAL\rbost    2021-10-07 16:24  11m8s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
528f6e0d-da37-ffe0-247a-11f9bff5a2de  VSPHERE.LOCAL\rbost    2021-10-07 16:22  13m10s  x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52b62109-7618-33f1-12ab-0eccf6c46c77  VSPHERE.LOCAL\rbost    2021-10-07 16:27  8m4s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52d08b6b-9be2-b4de-c7bc-4aae330242ab  VSPHERE.LOCAL\rbost    2021-10-07 16:29  6m2s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52db6e7f-d860-dc5a-109c-e2a936be556e  VSPHERE.LOCAL\rbost    2021-10-07 16:34  56s     x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52df60e9-7dd4-666f-711c-99d872e0a85a  VSPHERE.LOCAL\rbost    2021-10-07 16:30  5m1s    x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52ebd12c-0084-4ee6-e6b5-339337f7a3d1  VSPHERE.LOCAL\rbost    2021-10-07 16:33  1m57s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52f4a6c1-c603-bdf5-1922-0f3129bc0196  VSPHERE.LOCAL\rbost    2021-10-07 16:21  14m11s  x.x.x.x  vsphere-problem-detector/v0.0.0-unknown                     
52facf01-0f12-d658-077a-2485d25b602e  VSPHERE.LOCAL\rbost    2021-10-07 16:25  10m7s   x.x.x.x  vsphere-problem-detector/v0.0.0-unknown
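
To make the growth easier to read than the full listings above, the same monitoring can print one count per sample. This is only a minimal sketch: the function names `count_vpd_sessions` and `watch_vpd_sessions` are hypothetical, and it assumes the user agent is the last column of `govc session.ls`, as in the output above.

```shell
#!/bin/sh
# Count sessions whose user agent (assumed to be the last column of
# `govc session.ls`) starts with "vsphere-problem-detector/".
# Reads the session listing on stdin and prints a single number.
count_vpd_sessions() {
    awk '$NF ~ /^vsphere-problem-detector\// { n++ } END { print n + 0 }'
}

# Hypothetical wrapper: print one UTC timestamp and session count per interval.
# Assumes govc is installed and its connection settings are already exported.
watch_vpd_sessions() {
    interval="${1:-60}"
    while true; do
        printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" \
            "$(govc session.ls | count_vpd_sessions)"
        sleep "$interval"
    done
}
```

Running `watch_vpd_sessions 300` would sample every five minutes, like the loop above; a count that keeps rising from sample to sample would point to a leak.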

Comment 6 Wei Duan 2021-10-20 11:20:06 UTC
I ran some tests on vSphere 6.7, where we have admin access to check sessions, using nightly 4.10.0-0.nightly-2021-10-16-173656.

1. In the running cluster, I did not see any live vsphere-problem-detector session:
$ govc session.ls | wc -l
25
$ govc session.ls | grep vsphere-prob | wc -l
0

2. Running the command "while true; do govc session.ls | grep vsphere-prob;done" while restarting the vsphere-problem-detector pod, I still could not see any live vsphere-problem-detector session.

From my side, there seems to be no session leak any more.

@Robert What do you think?


BTW, where did you find this issue? I understand it was not in VMC, since you could list the sessions in the bug description.
"
$ while true; do date; govc session.ls | grep vsphere-prob; sleep 5m ; done
Thu Oct  7 10:20:26 AM MDT 2021
523e179e-4ead-fd7b-b51a-4f51ad4789a0  VSPHERE.LOCAL\rbost    2021-10-07 16:20  12s     x.x.x.x  vsphere-problem-detector/v0.0.0-unknown    
"

Could you help verify this in your environment (no need to use VMC) to double-check? Thanks.

Comment 7 Robert Bost 2021-10-20 15:33:26 UTC
> $ govc session.ls | grep vsphere-prob | wc -l

The `grep vsphere-prob` would only be correct if https://github.com/openshift/vsphere-problem-detector/pull/58 was merged or you had a build containing that change. Otherwise, the user agents will not be set as you would expect. 

> BTW, where do you find this issue? I understand it is not in VMC as you could check the session in the bug description.

I was testing in a vSphere installation in IBM Cloud, where full vSphere admin privileges are given, and I am happy to test there again, however...

This issue may be blocked until https://github.com/openshift/vsphere-problem-detector/pull/58 is merged; otherwise we cannot really isolate which sessions belong to vsphere-problem-detector and which do not.
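
To illustrate the isolation problem, sessions can be grouped by user agent; without the change in pull 58, the detector's sessions would presumably be counted under some other agent name. A rough sketch only: `sessions_by_agent` is a hypothetical helper, and it assumes the agent is the last column, has the form name/version, and that any header row starts with `Key`.

```shell
#!/bin/sh
# Summarise `govc session.ls` output (read on stdin) as "count agent" lines.
# Assumptions: the user agent is the last whitespace-separated column and
# looks like name/version; agents that contain spaces are only partly captured.
# A header row whose first column is "Key" and blank lines are skipped.
sessions_by_agent() {
    awk '$1 != "Key" && NF { split($NF, a, "/"); c[a[1]]++ }
         END { for (k in c) print c[k], k }' | sort -k2
}
```

For example, `govc session.ls | sessions_by_agent` would print one line per agent name with the number of sessions it currently holds.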

Comment 8 Wei Duan 2021-11-03 01:03:04 UTC
@Robert Thanks for the info.

Comment 9 Jan Safranek 2021-11-24 11:14:14 UTC
With regard to https://issues.redhat.com/browse/RFE-2123, I suggest we backport it everywhere we have vsphere-problem-detector installed by default. This is a report from 4.7: https://access.redhat.com/support/cases/#/case/02988444 - they fixed their credentials and it's closed, but it looks related.

Comment 10 Wei Duan 2021-12-14 02:38:29 UTC
Verified pass on 4.10.0-0.nightly-2021-12-06-201335

1. Checked the cluster after it had been running for a while: no sessions from vsphere-problem-detector
$ govc session.ls | grep vsphere-prob | wc -l
0

2. Using the following commands to monitor sessions and restart the vsphere-problem-detector-operator pod, we see a new session created and released immediately
$ while true; do date; govc session.ls | grep vsphere-prob; done 
$ oc -n openshift-cluster-storage-operator delete pod vsphere-problem-detector-operator-8794689bc-kqm7l (restart vsphere-problem-detector-operator pod)  


Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
5227b896-edbb-5049-6544-327c122e5689  VSPHERE.LOCAL\openshift-qe-machineset                              2021-12-14 02:29  0s      10.8.30.2    vsphere-problem-detector/4.10.0-202111222329.p0.gbda2d97.assembly.stream-bda2d97
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC
Tue 14 Dec 2021 02:25:32 AM UTC

Comment 13 errata-xmlrpc 2022-03-10 16:17:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

