Bug 1654571

Summary: container manager providers make cfme appliances run out of memory
Product: Red Hat CloudForms Management Engine
Reporter: Niladri Roy <niroy>
Component: Performance
Assignee: Josh Carter <jocarter>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Sudhir Mallamprabhakara <smallamp>
Severity: medium
Docs Contact: Red Hat CloudForms Documentation <cloudforms-docs>
Priority: unspecified
Version: 5.9.4
CC: dmetzger, izapolsk, obarenbo
Target Milestone: GA
Target Release: cfme-future
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-26 17:17:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: Container Management
Target Upstream Version:
Embargoed:

Description Niladri Roy 2018-11-29 06:32:44 UTC
Description of problem:
When multiple OpenShift providers are added, the appliances' memory usage increases.
The container manager refresh worker keeps exceeding its memory threshold and the appliance exceeds its swap limit, which prevents multiple workers from starting.

[----] W, [2018-11-25T03:18:31.919464 #5016:c3f118]  WARN -- : MIQ(MiqServer#validate_worker) Worker [ManageIQ::Providers::Openshift::ContainerManager::RefreshWorker] with ID: [1000000094029], PID: [13267], GUID: [fe78c350-fe8d-4550-b40f-161ca9227ab5] process memory usage [2389044000] exceeded limit [2147483648], requesting worker to exit
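
To quantify the overage in the warning above, a minimal sketch that extracts the usage and limit values from such a line. The regular expression is an assumption derived only from the single log line quoted in this report, and the names `parse_memory_warning` and `PATTERN` are hypothetical helpers, not part of CloudForms or ManageIQ.

```python
import re

# Pattern for the "exceeded limit" warning logged by MiqServer#validate_worker.
# Assumption: the format is inferred from the one line quoted in this report.
PATTERN = re.compile(
    r"Worker \[(?P<worker>[^\]]+)\] with ID: \[(?P<id>\d+)\], "
    r"PID: \[(?P<pid>\d+)\], GUID: \[(?P<guid>[^\]]+)\] "
    r"process memory usage \[(?P<usage>\d+)\] exceeded limit \[(?P<limit>\d+)\]"
)

GIB = 1024 ** 3  # bytes per GiB


def parse_memory_warning(line):
    """Return a dict describing the warning, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    usage = int(match.group("usage"))
    limit = int(match.group("limit"))
    return {
        "worker": match.group("worker"),
        "pid": int(match.group("pid")),
        "usage_bytes": usage,
        "limit_bytes": limit,
        "over_bytes": usage - limit,
        "usage_gib": usage / GIB,
        "limit_gib": limit / GIB,
    }


# The log line from this report, split across literals for readability.
LINE = (
    "[----] W, [2018-11-25T03:18:31.919464 #5016:c3f118]  WARN -- : "
    "MIQ(MiqServer#validate_worker) Worker "
    "[ManageIQ::Providers::Openshift::ContainerManager::RefreshWorker] "
    "with ID: [1000000094029], PID: [13267], "
    "GUID: [fe78c350-fe8d-4550-b40f-161ca9227ab5] "
    "process memory usage [2389044000] exceeded limit [2147483648], "
    "requesting worker to exit"
)

info = parse_memory_warning(LINE)
print(info["worker"])
print("usage: %.3f GiB, limit: %.3f GiB, over by %d bytes"
      % (info["usage_gib"], info["limit_gib"], info["over_bytes"]))
```

For the line above, the logged limit of 2147483648 bytes is exactly 2 GiB and the worker is over it by 241560352 bytes. Running the same function over a full evm.log would show how often each refresh worker crosses the threshold.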


Version-Release number of selected component (if applicable):
5.9.4.7

How reproducible:
Every time in the customer environment

Steps to Reproduce:
1. Add multiple OpenShift providers (the customer has 4).
2. Wait for several refreshes to occur.

Actual results:
A "worker memory exceeded" notification appears in the CloudForms UI.
The appliances run out of memory.
Memory usage is not distributed between the available appliances.

Expected results:
Workers should not exceed the memory limit (1 GB).
The workload should be distributed among the worker processes of the other appliances.

Additional info:
The customer had 2 worker appliances and recently added one more to their environment, but the load is still not distributed evenly between the appliances.