Bug 1514595 - Memory issue on appliance with Amazon provider
Summary: Memory issue on appliance with Amazon provider
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.9.0
Assignee: Ladislav Smola
QA Contact: Matouš Mojžíš
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-17 20:30 UTC by Jerome Marc
Modified: 2018-06-20 13:26 UTC
CC List: 12 users

Fixed In Version: 5.9.0.11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-06 15:17:42 UTC
Category: ---
Cloudforms Team: AWS
Target Upstream Version:
Embargoed:


Attachments
evm.log (4.38 MB, application/x-gzip)
2017-11-17 20:35 UTC, Jerome Marc

Description Jerome Marc 2017-11-17 20:30:28 UTC
Description of problem:
I have isolated each provider in its own zone in my lab deployment. The appliance for the Amazon provider is swapping and requires a restart every few hours.

Version-Release number of selected component (if applicable):
5.9.0.9.20171115202245_7429f75

How reproducible:
Always

Steps to Reproduce:
1. Deploy CloudForms and add Amazon provider (C&U capture is on)
2. 
3.

Actual results:
The appliance starts to consume all memory and goes to swap.

Expected results:
No swap.

Additional info:
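A minimal way to watch for this, sketched below (an assumption on my part, not a tool attached to this BZ): sample /proc/meminfo on the appliance and log available memory and swap usage, so the point where the appliance starts swapping shows up in a log.

#!/usr/bin/env python
# Hypothetical monitoring helper (assumption: run directly on the appliance).
# It only reads /proc/meminfo and prints available memory and swap usage.
import time

def meminfo_kb():
    # Parse /proc/meminfo into {field_name: value_in_kB}.
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            fields[key] = int(rest.split()[0])
    return fields

while True:
    m = meminfo_kb()
    swap_used = m["SwapTotal"] - m["SwapFree"]
    print("%d available_kb=%d swap_used_kb=%d"
          % (int(time.time()), m["MemAvailable"], swap_used))
    time.sleep(300)  # sample every 5 minutes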

Comment 2 Jerome Marc 2017-11-17 20:35:52 UTC
Created attachment 1354332 [details]
evm.log

Comment 12 Greg Blomquist 2017-11-30 22:18:00 UTC
Ladas, can you get https://github.com/ManageIQ/manageiq/pull/16502 into a mergeable state?  Once that's merged, this BZ can be moved to POST.

Comment 13 Ladislav Smola 2017-12-01 11:03:45 UTC
After talking with Adam, https://github.com/ManageIQ/manageiq/pull/16432 is enough for this BZ. We might or might not merge https://github.com/ManageIQ/manageiq/pull/16502 in the future.

Comment 14 Matouš Mojžíš 2018-01-25 16:47:33 UTC
Ladas,
how can I reproduce & verify this issue?
Thanks

Comment 15 Ladislav Smola 2018-01-25 18:55:14 UTC
So part of it was caused by the memory leak, and part of it by the MiqQueue issues. So the verification is just keeping the appliance running for some time (a few days) and checking that the memory is not rising.
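One rough way to do that check (a sketch under assumptions, not a procedure from this BZ: it runs on the appliance and assumes the MIQ worker processes have "MIQ" in their command line) is to log the summed worker RSS once an hour and compare the trend over a few days:

import subprocess
import time

def miq_rss_kb():
    # Sum the RSS (kB) of every process whose command line mentions MIQ.
    out = subprocess.check_output(["ps", "-eo", "rss,args"]).decode()
    total = 0
    for line in out.splitlines()[1:]:          # skip the ps header row
        rss, _, args = line.strip().partition(" ")
        if "MIQ" in args:
            total += int(rss)
    return total

while True:
    # One sample per hour; a series that keeps rising day after day would
    # suggest the leak is still there, a flat series would not.
    print("%d miq_rss_kb=%d" % (int(time.time()), miq_rss_kb()))
    time.sleep(3600)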

Comment 16 Matouš Mojžíš 2018-02-12 10:20:37 UTC
So I had the appliance running for four days and used memory increased by 300MB. I think this should be okay?

Comment 17 Ladislav Smola 2018-02-12 11:05:11 UTC
Can you try for a couple more days? Also make sure you test it with the latest memory leak fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1535720

Comment 18 dmetzger 2018-02-12 13:07:15 UTC
What build were you testing with when you saw the 300MB growth? Also, can you post a full log set from the appliance (or share the IP/creds)?

Please test with 5.9.0.19 or 5.9.0.20

Comment 20 Matouš Mojžíš 2018-02-14 10:37:57 UTC
Memory used is the same as yesterday. Also, I am trying to reproduce it on a region that is not often used. Shall I try it with a busy region?

Comment 22 Matouš Mojžíš 2018-02-22 13:16:14 UTC
So, I ran the appliance with a busy EC2 region.
After the first day memory usage bumped by 100MB, and then it didn't increase anymore after 7 days.
So is it enough for verification?
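(An initial bump that then flattens out reads as warm-up/caching rather than a leak. A tiny sketch with made-up numbers, just to show how the day-over-day deltas would look:)

def deltas_mb(samples):
    # Change in MB between consecutive (epoch_seconds, used_kb) samples.
    return [(t2, round((kb2 - kb1) / 1024.0, 1))
            for (t1, kb1), (t2, kb2) in zip(samples, samples[1:])]

# Hypothetical samples matching the pattern above: ~100MB bump after day 1,
# then essentially flat through day 7.
samples = [(0, 4000000), (86400, 4102400), (7 * 86400, 4103000)]
print(deltas_mb(samples))   # -> [(86400, 100.0), (604800, 0.6)]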

Comment 23 dmetzger 2018-02-22 13:29:40 UTC
That's sufficient to verify the fix.

Comment 24 Matouš Mojžíš 2018-02-22 13:43:12 UTC
Verified in 5.9.0.20. The appliance with a busy EC2 region was running for a week with no sign of memory leaks.

