Bug 1575562 - Memory leak in ovirt-engine deployed as RHHI
Summary: Memory leak in ovirt-engine deployed as RHHI
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.11
Hardware: x86_64
OS: Linux
Target Milestone: ovirt-4.3.1
: 4.3.0
Assignee: Sahina Bose
Depends On:
Reported: 2018-05-07 10:17 UTC by Mauro Oddi
Modified: 2020-08-03 15:38 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2019-02-11 10:38:13 UTC
oVirt Team: Gluster
Target Upstream Version:
lsvaty: testing_plan_complete-

Attachments
engine.log after ParOldGen gets to 99% (RHV 4.1.11) before engine restart (1.99 MB, application/x-gzip)
2018-05-28 15:30 UTC, Mauro Oddi

Description Mauro Oddi 2018-05-07 10:17:44 UTC
Description of problem:
An RHHI infrastructure (RHV 4.1.11 + Gluster 3.3) starts to show an increasing number of VDSNetworkException ERRORs in engine.log until the hosted_engine glusterfs Storage Domain fails and the engine is restarted.

Analysis has shown no indication of a network issue or network exhaustion. However, the ParOldGen heap area was found to reach 99% after roughly 10 days, with usage increasing by about 9-10% a day.

When usage is close to 99%, the aforementioned exceptions start to show up and the problem recurs.
The customer provided a heap dump for further analysis.
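
As a rough consistency check on the figures above, the following minimal sketch projects how long ParOldGen would take to reach 99% at the reported growth rate. It assumes growth is linear and that old-generation usage starts near 0% after an engine restart; neither assumption is confirmed in this report, and the function name is hypothetical.

```python
def days_until_threshold(current_pct: float,
                         growth_pct_per_day: float,
                         threshold_pct: float = 99.0) -> float:
    """Days until heap usage reaches threshold_pct, assuming linear growth."""
    if growth_pct_per_day <= 0:
        raise ValueError("growth_pct_per_day must be positive")
    # Usage already at or above the threshold means zero days remain.
    remaining = max(threshold_pct - current_pct, 0.0)
    return remaining / growth_pct_per_day


if __name__ == "__main__":
    # Reported growth rate: about 9-10% of ParOldGen per day.
    # Starting point of 0% is an assumption, not a measured value.
    for rate in (9.0, 10.0):
        days = days_until_threshold(0.0, rate)
        print(f"{rate:.0f}%/day -> about {days:.1f} days to reach 99%")
```

Under these assumptions the projection lands at roughly 10 to 11 days, which agrees with the observed "10 days more or less".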

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Yaniv Kaul 2018-05-08 02:49:00 UTC
Any logs?

Comment 12 Yaniv Kaul 2018-05-17 13:47:40 UTC
Ravi, any news?

Comment 13 Ravi Nori 2018-05-17 13:55:35 UTC
From the logs I see that in an 8-hour time frame GlusterServersListVDSCommand and GlusterVolumesListVDSCommand are each executed 9665 times. That is one execution of the commands every three seconds.

Apart from the above issue I don't see anything else in the logs or the thread dump. 

It looks like there is an issue with Gluster integration.
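
The frequency quoted above can be checked with a short sketch like the one below. It counts log lines that mention each command name and derives the mean interval between executions. The plain substring match is an assumption about how engine.log records these commands (if each execution is logged more than once, for example at start and finish, the count would need adjusting), and the helper names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence

# Command names as quoted in this comment.
COMMANDS = ("GlusterServersListVDSCommand", "GlusterVolumesListVDSCommand")


def count_commands(lines: Iterable[str],
                   commands: Sequence[str] = COMMANDS) -> Counter:
    """Count how many lines mention each command name (substring match)."""
    counts = Counter({name: 0 for name in commands})
    for line in lines:
        for name in commands:
            if name in line:
                counts[name] += 1
    return counts


def mean_interval_seconds(window_seconds: float, executions: int) -> float:
    """Average spacing between executions over the observed window."""
    if executions <= 0:
        raise ValueError("executions must be positive")
    return window_seconds / executions


if __name__ == "__main__":
    # Figures from this comment: 9665 executions of each command in 8 hours.
    interval = mean_interval_seconds(8 * 3600, 9665)
    print(f"mean interval: {interval:.2f} s")
```

To apply it to a real log, pass an open file object to `count_commands`, for example `count_commands(open("engine.log", errors="replace"))`.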

Comment 19 Mauro Oddi 2018-05-28 15:30:51 UTC
Created attachment 1443379
engine.log after ParOldGen gets to 99% (RHV 4.1.11) before engine restart

Comment 54 Sandro Bonazzola 2019-01-28 09:40:50 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 56 Yaniv Kaul 2019-01-28 10:26:45 UTC
What's the next step here?

Comment 57 Sahina Bose 2019-01-28 12:19:51 UTC
(In reply to Yaniv Kaul from comment #56)
> What's the next step here?

We have not been able to reproduce the memory leak issue, and there has not been any further information from the customer.

Mauro, can we close this bug?

Comment 61 Sahina Bose 2019-02-11 06:22:04 UTC
Mauro, any update? Should we continue to keep this bug open?

Comment 63 Sahina Bose 2019-02-11 10:38:13 UTC
Closing, as we do not have enough data or understanding of the customer-specific issue to proceed.
