Bug 1891332

Summary: Engine has a memory leak
Product: [oVirt] ovirt-engine
Reporter: David Vaanunu <dvaanunu>
Component: General
Assignee: Artur Socha <asocha>
Status: CLOSED CURRENTRELEASE
QA Contact: David Vaanunu <dvaanunu>
Severity: high
Docs Contact:
Priority: high
Version: 4.4.3.5
CC: ahadas, aoconnor, asocha, bugs, dfodor, lrotenbe, michal.skrivanek, mkalinin, mlehrer, mperina, mtessun, tnisan
Target Milestone: ovirt-4.4.3-1
Keywords: Performance
Target Release: 4.4.3.12
Flags: pm-rhel: ovirt-4.4+, pm-rhel: exception+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-4.4.3.12
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-11-27 15:46:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description David Vaanunu 2020-10-25 18:54:57 UTC
Description of problem:

The engine leaks memory while running a test that uses an API flow.
The flow includes:
* Login
* Get VMs (Max. 100), get Cluster, get Host, get template
* Loop 100 times:
  * GetVM_ReportedDevices
  * GetVM_Tags
  * GetVM_Statistics
  * GetVM_AffinityLables
  * GetCluster_AffinityGroups
  * GetHost
  * GetCluster
  * GetTemplate

The flow is based on the inventory plugin:
https://github.com/oVirt/ovirt-ansible-collection/blob/master/plugins/inventory/ovirt.py


Version-Release number of selected component (if applicable):

rel8.3
rhv4.4.3-7


How reproducible:


Steps to Reproduce:
1. Run 20 concurrent users (using JMeter)
2. Keep the test running for 12 hours
3. Connect JConsole to the engine to measure heap memory
4. Create a heap dump file
5. Analyze the dump file with 'Eclipse Memory Analyzer'
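
For step 3, heap usage can also be sampled from inside a JVM instead of through JConsole. The following is only a minimal sketch, not part of the test setup described here: the class name HeapSampler and the one-second sampling interval are assumptions made for illustration. It reads the standard platform MemoryMXBean, which exposes the heap usage figures of the JVM it runs in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not from ovirt-engine): samples the heap of the current JVM.
public class HeapSampler {

    /** Returns the number of heap bytes currently in use by this JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Print three samples, one second apart (interval chosen arbitrarily).
        for (int i = 0; i < 3; i++) {
            System.out.printf("used heap: %d MiB%n", usedHeapBytes() / (1024 * 1024));
            Thread.sleep(1000);
        }
    }
}
```

A steadily rising value across samples taken after garbage collections is a hint of a leak, but only a heap dump analysis can show which objects are retained.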


Actual results:

The engine leaks memory

Expected results:

No memory leak

Additional info:


Link doc with graphs and info from MemoryAnalyzer

Comment 1 Liran Rotenberg 2020-10-27 13:04:15 UTC
Hi, here is some information I gathered, just in case it makes Artur's life easier ;)

The problem was introduced when constructorCache started being used (https://gerrit.ovirt.org/#/c/106355/).
Each time you get into CommandsFactory::getCommandConstructor, the expectedParams is a new class instance, and therefore a new Pair and a new key in the constructorCache map.
It adds up each time, and you end up with a really big map holding multiple copies of the same constructor, which causes the memory usage to grow.

I hope this saves you some time.
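
The mechanism described in this comment can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the actual ovirt-engine code: Pair here is a hypothetical stand-in that compares its members with Objects.equals, and cacheSizeAfter is an invented helper. It shows that a map keyed by a pair containing a freshly allocated array never gets a cache hit, so it grows by one entry per lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ArrayKeyLeakDemo {

    /** Hypothetical Pair-like key (assumption): members compared with Objects.equals. */
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Pair)) {
                return false;
            }
            Pair<?, ?> other = (Pair<?, ?>) o;
            // For arrays, Objects.equals falls back to reference equality.
            return Objects.equals(first, other.first) && Objects.equals(second, other.second);
        }

        @Override
        public int hashCode() {
            // For arrays, this uses the identity hash code, not the contents.
            return Objects.hash(first, second);
        }
    }

    /** Performs the same logical lookup repeatedly and returns the resulting cache size. */
    static int cacheSizeAfter(int lookups) {
        Map<Pair<Class<?>, Class<?>[]>, String> cache = new HashMap<>();
        for (int i = 0; i < lookups; i++) {
            // A new array on every call, with identical contents each time.
            Class<?>[] expectedParams = new Class<?>[] { String.class, Integer.class };
            Pair<Class<?>, Class<?>[]> key = new Pair<>(Object.class, expectedParams);
            cache.computeIfAbsent(key, k -> "constructor");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(cacheSizeAfter(1000));
    }
}
```

Because two distinct arrays are never equal under Objects.equals, every iteration inserts a new entry, and the map holds 1000 entries after 1000 logically identical lookups.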

Comment 2 Artur Socha 2020-10-27 18:21:27 UTC
(In reply to Liran Rotenberg from comment #1)
> Hi, here is some information I gathered, just in case it makes Artur's life
> easier ;)
> 
> The problem was introduced when constructorCache started being used
> (https://gerrit.ovirt.org/#/c/106355/).
> Each time you get into CommandsFactory::getCommandConstructor, the
> expectedParams is a new class instance, and therefore a new Pair and a new
> key in the constructorCache map.
> It adds up each time, and you end up with a really big map holding multiple
> copies of the same constructor, which causes the memory usage to grow.
> 
> I hope this saves you some time.

Liran, thanks for the additional information; it made my life easier indeed :)
I will try to provide a fix for the leak soon.

Comment 3 Arik 2020-10-27 20:42:47 UTC
(In reply to Liran Rotenberg from comment #1)
> Hi, here is some information I gathered, just in case it makes Artur's life
> easier ;)
> 
> The problem was introduced when constructorCache started being used
> (https://gerrit.ovirt.org/#/c/106355/).
> Each time you get into CommandsFactory::getCommandConstructor, the
> expectedParams is a new class instance, and therefore a new Pair and a new
> key in the constructorCache map.

It caught my eye because of the term "class instance" that was used here. Note that the JVM maintains one Class object per type, so comparing Classes as we do should be OK.
But yes, expectedParams is an array, and arrays should not be compared using Objects#equals.
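
To illustrate the point about arrays, here is a minimal sketch of one possible kind of fix. It is an assumption made for illustration and not necessarily the patch that was merged: ConstructorKey and the helper methods are hypothetical names. The key compares the parameter types by content with Arrays.equals and hashes them with Arrays.hashCode, so repeated lookups with equal contents hit the same cache entry.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class ConstructorKeyDemo {

    /** Hypothetical cache key (assumption): compares parameter types by content. */
    static final class ConstructorKey {
        private final Class<?> type;
        private final Class<?>[] paramTypes;

        ConstructorKey(Class<?> type, Class<?>... paramTypes) {
            this.type = type;
            // Defensive copy so later changes to the caller's array cannot alter the key.
            this.paramTypes = paramTypes.clone();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConstructorKey)) {
                return false;
            }
            ConstructorKey other = (ConstructorKey) o;
            return type == other.type && Arrays.equals(paramTypes, other.paramTypes);
        }

        @Override
        public int hashCode() {
            return 31 * type.hashCode() + Arrays.hashCode(paramTypes);
        }
    }

    /** Objects.equals on two distinct arrays compares references only. */
    static boolean equalByObjectsEquals() {
        return Objects.equals(new Class<?>[] { String.class }, new Class<?>[] { String.class });
    }

    /** Arrays.equals compares length and elements. */
    static boolean equalByArraysEquals() {
        return Arrays.equals(new Class<?>[] { String.class }, new Class<?>[] { String.class });
    }

    /** Performs the same logical lookup repeatedly and returns the resulting cache size. */
    static int cacheSizeAfter(int lookups) {
        Map<ConstructorKey, String> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < lookups; i++) {
            Class<?>[] expectedParams = new Class<?>[] { String.class, Integer.class };
            cache.computeIfAbsent(new ConstructorKey(Object.class, expectedParams), k -> "constructor");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(equalByObjectsEquals());
        System.out.println(equalByArraysEquals());
        System.out.println(cacheSizeAfter(1000));
    }
}
```

With a content-based key, the cache stays at a single entry no matter how many times the same constructor is requested. Wrapping the array in an immutable List would be another way to get content-based equality.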

Comment 4 mlehrer 2020-11-01 11:43:55 UTC
Any chance of getting this fix into 4.4.3?

Comment 5 Artur Socha 2020-11-02 08:25:04 UTC
(In reply to mlehrer from comment #4)
> Any chance of getting this fix into 4.4.3?

Technically I can prepare a patch for 4.4.3; the question is whether it would fit into the release schedule. Martin?

Comment 8 Michal Skrivanek 2020-11-02 13:39:49 UTC
Let's backport to 4.4.3.

Comment 9 David Vaanunu 2020-11-05 14:51:07 UTC
Verified version:

[root@rhev-red-01 httpd]# rpm -qa | grep -i rel
rhv-release-4.4.3-13-001.noarch
redhat-release-8.3-1.0.el8.x86_64

[root@rhev-red-01 httpd]# rpm -qa | grep -i ovirt-engine-4
ovirt-engine-4.4.3.10-0.1.el8ev.noarch



Eclipse Memory Analyzer report (zip):
   https://drive.google.com/drive/folders/1AfNoi9gQkMyLEB0U7It_m7jL2LFSe6Ex?usp=sharing


The memory leak is still present (the results look the same).

Also updated the Google doc (attached); it includes results from before and after the fix.

Comment 10 Michal Skrivanek 2020-11-06 11:19:50 UTC
When was a similar test run on earlier versions? Is this a regression from 4.4.2?

Comment 11 David Vaanunu 2020-11-08 08:18:30 UTC
As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1891332#c7,
we succeeded in reproducing it on our system with these versions:
rel8.3
rhv4.4.3-7
ovirt-engine-4.4.3.5

Comment 12 Michal Skrivanek 2020-11-09 13:10:05 UTC
(In reply to David Vaanunu from comment #11)
> As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1891332#c7,
> we succeeded in reproducing it on our system with these versions:
> rel8.3
> rhv4.4.3-7
> ovirt-engine-4.4.3.5

That wasn't the question, but anyway, in an offline discussion it was mentioned that it also happens in 4.4.1, so it is not a 4.4.2/4.4.3 regression.

Postponing for further investigation in 4.4.4, since it doesn't seem to be fixed.

Comment 17 David Vaanunu 2020-11-17 10:49:56 UTC
Verified version:

[root@rhev-red-01 tmp]# rpm -qa | grep -i rel
rhv-release-4.4.3-14-001.noarch
redhat-release-8.3-1.0.el8.x86_64

[root@rhev-red-01 tmp]# rpm -qa | grep -i ovirt-engine-4
ovirt-engine-4.4.3.12-0.1.el8ev.noarch


Results Doc:
https://docs.google.com/document/d/1HLkJTT5Ph2-DobNIQVJJFdDNLQc2wJ3LCBcwtopI-Yc/edit?usp=sharing