Description of problem:
I have isolated each provider in its own zone in my lab deployment. The appliance for the Amazon provider is swapping and requires a restart every few hours.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy CloudForms and add an Amazon provider (C&U capture is on).
The appliance starts to consume all available memory and goes into swap.
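To reproduce, the memory and swap growth on the appliance can be tracked with periodic samples. A minimal sketch, assuming a standard Linux appliance exposing /proc/meminfo (the helper name is my own, not part of ManageIQ):

```python
# Sample "used" memory (MemTotal - MemAvailable) and swap use, in MB,
# from /proc/meminfo; run periodically (e.g. from cron) and watch for growth.
def meminfo_mb():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, val = line.split(":")
            info[key] = int(val.split()[0])  # /proc/meminfo values are in kB
    used = (info["MemTotal"] - info["MemAvailable"]) // 1024
    swap = (info["SwapTotal"] - info["SwapFree"]) // 1024
    return used, swap

if __name__ == "__main__":
    used, swap = meminfo_mb()
    print(f"used_mb={used} swap_mb={swap}")
```

On a leaking appliance, used_mb climbs steadily and swap_mb starts rising once physical memory is exhausted.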
Created attachment 1354332
Ladas, can you get https://github.com/ManageIQ/manageiq/pull/16502 into a mergeable state? Once that's merged, this BZ can be moved to POST.
After talking with Adam, https://github.com/ManageIQ/manageiq/pull/16432 is enough for this BZ. We may or may not merge https://github.com/ManageIQ/manageiq/pull/16502 in the future.
How can I reproduce and verify this issue?
Part of it was caused by the memory leak, and part by the MiqQueue issues. To verify, keep the appliance running for some time (a few days) and check that memory usage is not rising.
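The "keep it running and check memory is not rising" verification above can be sketched as a simple trend test over daily samples. The function name, settle period, and growth threshold are hypothetical choices, not values from this BZ:

```python
def leak_suspected(samples_mb, settle=1, threshold_mb=200):
    """Flag a suspected leak: ignore the first `settle` samples
    (caches and workers warming up), then check whether used memory
    grew by more than `threshold_mb` over the rest of the run."""
    tail = samples_mb[settle:]
    if len(tail) < 2:
        return False
    return tail[-1] - tail[0] > threshold_mb

# Daily used-memory samples (MB) over a week, shaped like the runs here:
steady = [4100, 4200, 4210, 4205, 4215, 4210, 4212]   # +100 MB, then flat
leaking = [4100, 4300, 4600, 4900, 5200, 5500, 5800]  # keeps climbing

print(leak_suspected(steady))   # → False
print(leak_suspected(leaking))  # → True
```

Discarding the first sample matters: as seen later in this thread, an initial one-time bump (e.g. 100 MB on day one) is expected and is not a leak.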
I had the appliance running for four days, and used memory increased by 300 MB. Is that acceptable?
Can you try for a couple more days? Also make sure you test with the latest memory-leak fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1535720
What build were you testing with when you saw the 300 MB growth? Also, can you post a full log set from the appliance (or share the IP/credentials)?
Please test with 184.108.40.206 or 220.127.116.11
Memory used is the same as yesterday. Also, I am trying to reproduce it on a region that is not often used. Shall I try it with a busy region?
I ran the appliance with a busy EC2 region.
After the first day, memory usage bumped by 100 MB, and it did not increase any further over the next 7 days.
Is that enough for verification?
That's sufficient to verify the fix.
Verified in 18.104.22.168. An appliance with a busy EC2 region ran for a week with no sign of memory leaks.