Bug 1511136

Summary: Ems refresh core worker exceeds memory threshold
Product: Red Hat CloudForms Management Engine
Reporter: Ryan Spagnola <rspagnol>
Component: Appliance
Assignee: Joe Rafaniello <jrafanie>
Status: CLOSED DUPLICATE
QA Contact: Dave Johnson <dajohnso>
Severity: high
Priority: high
Version: 5.7.0
CC: abellott, jhardy, jrafanie, ncarboni, obarenbo, rspagnol
Target Milestone: GA
Target Release: cfme-future
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-02-28 18:25:57 UTC
Type: Bug
Category: Bug
Cloudforms Team: CFME Core

Comment 3 Joe Rafaniello 2018-02-23 18:12:35 UTC
From just the current logs, I would agree with Nick.

The server process grows to over 1 GB (archived logs show it growing to 3.5 GB); see [1] below. New workers would then inherit a large amount of shared memory, which would trip our PSS memory threshold checks.

Therefore, I'd say this is a duplicate of bug 1535720 (memory leak) and bug 1479356 (use USS instead of PSS for checking worker memory). We should ensure both fixes are applied, the memory leak fix being the more crucial, and see whether this still happens. We can open a new bug if the core worker still exceeds its memory threshold after both fixes are applied.
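
To illustrate why the choice between PSS and USS matters here, the following is a minimal sketch (not the CFME implementation) that sums both figures from text in the Linux /proc/<pid>/smaps format. The function name and the sample numbers are hypothetical: the sample models a worker that shares a 1 GB mapping with one other process and holds 200 MB of private memory. PSS charges the worker half of the shared mapping, while USS counts only its private pages.

```python
def pss_and_uss_kb(smaps_text):
    """Return (PSS, USS) in kB summed over all mappings in smaps-style text.

    PSS is the sum of the 'Pss:' lines. USS is taken here as
    Private_Clean + Private_Dirty, the usual definition of unique set size.
    """
    pss = 0
    uss = 0
    for line in smaps_text.splitlines():
        fields = line.split()
        # Counter lines look like 'Pss:   524288 kB'; skip everything else.
        if len(fields) < 3 or fields[2] != "kB":
            continue
        key, value = fields[0], int(fields[1])
        if key == "Pss:":
            pss += value
        elif key in ("Private_Clean:", "Private_Dirty:"):
            uss += value
    return pss, uss


# Hypothetical sample: one 1 GB mapping shared by two processes,
# plus 200 MB of memory private to this worker.
SAMPLE = """\
7f0000000000-7f0040000000 rw-p 00000000 00:00 0
Rss:             1048576 kB
Pss:              524288 kB
Shared_Dirty:    1048576 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
7f0040000000-7f004c800000 rw-p 00000000 00:00 0
Rss:              204800 kB
Pss:              204800 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:    204800 kB
"""

pss_kb, uss_kb = pss_and_uss_kb(SAMPLE)
print(f"PSS: {pss_kb} kB, USS: {uss_kb} kB")
```

In this sample the PSS figure is more than three times the USS figure even though the worker itself allocated only 200 MB, which is the effect described above: the larger the server's footprint, the more memory a PSS-based check attributes to each worker.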

What do you think, Ryan?

[1] 
zgrep "MIQ Server" top_output.log | cut -d " " -f 19 | sort | uniq -c
   1 1.175g
  34 1.176g
  17 1.177g
  13 1.178g
  26 1.179g
   9 1.180g
  38 1.181g
  13 1.182g
  22 1.183g
  16 1.184g
  27 1.185g
  28 1.186g
  15 1.187g
  15 1.188g
  29 1.189g
  24 1.190g
  15 1.191g
  25 1.192g
   3 1.193g
  42 1.194g
  11 1.195g
  15 1.196g
  36 1.197g
  23 1.198g
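
To read a histogram like the one above at a glance, a small sketch such as the following could total the sample counts and report the smallest and largest value. It assumes every row has the `uniq -c` shape of a count followed by a value with top's `g` suffix; the function name is hypothetical and the example feeds it only three rows copied from the listing.

```python
def summarize_uniq_counts(text):
    """Summarize `uniq -c` output whose values look like '1.175g'.

    Returns a dict with the total number of samples and the smallest and
    largest value in GiB, or None if no usable rows are found.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or unexpected lines
        count, value = int(parts[0]), parts[1]
        if not value.endswith("g"):
            continue  # this sketch only handles the 'g' suffix
        rows.append((float(value[:-1]), count))
    if not rows:
        return None
    return {
        "samples": sum(count for _, count in rows),
        "min_g": min(size for size, _ in rows),
        "max_g": max(size for size, _ in rows),
    }


# Three rows taken from the listing above.
example = """\
   1 1.175g
  34 1.176g
  23 1.198g
"""
print(summarize_uniq_counts(example))
```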

Comment 4 Joe Rafaniello 2018-02-28 18:25:57 UTC
Closing as a duplicate for now. The fixes for the BZs above should resolve the problem.

*** This bug has been marked as a duplicate of bug 1535720 ***