Bug 1294678 - [scale] - High Memory and CPU Usage in Chrome and Firefox
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 3.6.1.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.5
Target Release: 3.6.5.3
Assignee: Greg Sheremeta
QA Contact: mlehrer
URL:
Whiteboard:
Duplicates: 1264809 1306261
Depends On:
Blocks: 1213937
 
Reported: 2015-12-29 15:44 UTC by mlehrer
Modified: 2016-08-07 14:19 UTC (History)
CC List: 16 users

Fixed In Version: 3.6.5-4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-21 14:35:58 UTC
oVirt Team: UX
Embargoed:
ecohen: ovirt-3.6.z?
rule-engine: ovirt-4.0.0+
mgoldboi: planning_ack+
ecohen: devel_ack+
pstehlik: testing_ack+


Attachments
Examples of HTML_Images in detached DOM blocked from GC, marked in red (222.31 KB, image/png)
2015-12-29 15:46 UTC, mlehrer
Slight listener and node growth (37.52 KB, image/png)
2015-12-29 15:47 UTC, mlehrer


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 55546 | 0 | master | MERGED | userportal, webadmin: use a custom patched bootstrap.js | 2016-04-05 14:07:56 UTC
oVirt gerrit | 55547 | 0 | master | MERGED | webadmin: fix memory leak in alerts / events footer | 2016-04-06 14:23:47 UTC
oVirt gerrit | 55554 | 0 | ovirt-engine-3.6 | MERGED | userportal, webadmin: use a custom patched bootstrap.js | 2016-04-06 15:15:10 UTC
oVirt gerrit | 55555 | 0 | ovirt-engine-3.6 | MERGED | webadmin: fix memory leak in alerts / events footer | 2016-04-06 16:24:36 UTC
oVirt gerrit | 55788 | 0 | ovirt-engine-3.6.5 | MERGED | userportal, webadmin: use a custom patched bootstrap.js | 2016-04-07 07:13:45 UTC
oVirt gerrit | 55789 | 0 | ovirt-engine-3.6.5 | MERGED | webadmin: fix memory leak in alerts / events footer | 2016-04-07 07:13:56 UTC

Description mlehrer 2015-12-29 15:44:26 UTC
Description of problem:

When browsing a heavily populated environment (in this case 3k VMs), Chrome uses lots of memory and becomes sluggish over time.

More specifically, browsing and expanding the tree and tabs:
a) results in high native memory use, varying between 450-750 MB of RAM for 1 user accessing Admin on a populated scale environment using Chrome, while Firefox varied between 350-450 MB.
b) Multiple HTMLImageElements are found in detached DOM trees (and are blocked from garbage collection; see attached image).
c) Very slight leak of listeners/nodes after 3 forced garbage collections, possibly related to issue (b).

Version-Release number of selected component (if applicable):

Version 3.6.1.1-0.1.el6

How reproducible:
Very


Steps to Reproduce:
1. Create and populate an environment with:
DC 2
SD 12
Hosts 36
VMs 3100

2. Log in and continually navigate the "System" tree by expanding and selecting nodes or tabs.

3. Perform 10-15 expansions/selections, then repeat this process 2-3 times.


Actual results:

Chrome memory for this specific tab grows to 750 MB.
Browsing becomes slow and sluggish over time.
Multiple HTMLImageElements are found in detached DOM trees.
Slight growth of listeners and nodes after multiple forced GCs.

Expected results:

The Chrome browser experience will remain responsive; the memory footprint will peak around 650 MB but return to 350-450 MB, as Firefox does.
Browsing the same exact trees/tabs will not increase memory or result in listener/node growth.
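
The expected result (no listener/node growth when the same trees and tabs are revisited) can be checked mechanically once counts are recorded after each forced GC. A minimal sketch in Python, assuming the counts are copied by hand from the browser's developer tools; the function names and the sample numbers are hypothetical and are not measurements from this bug:

```python
from typing import Sequence


def net_growth(samples: Sequence[int]) -> int:
    """Return last-minus-first for counts taken after forced GCs."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to measure growth")
    return samples[-1] - samples[0]


def looks_like_leak(samples: Sequence[int], tolerance: int = 0) -> bool:
    """True when the count never drops and the net growth exceeds tolerance.

    Each sample is assumed to be taken after a forced garbage collection,
    at the same point of a repeated navigation cycle.
    """
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and net_growth(samples) > tolerance


# Hypothetical listener counts after each of four identical cycles.
leaky = [1200, 1235, 1270, 1310]
steady = [1200, 1215, 1198, 1201]
print(looks_like_leak(leaky))   # True
print(looks_like_leak(steady))  # False
```

A count that fluctuates but returns to its baseline is treated as normal; only growth that survives every forced GC is flagged.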

Additional info:

Comment 1 mlehrer 2015-12-29 15:46:41 UTC
Created attachment 1110261 [details]
Examples of HTML_Images in detached DOM blocked from GC, marked in red

Comment 2 mlehrer 2015-12-29 15:47:53 UTC
Created attachment 1110262 [details]
Slight listener and node growth

Comment 4 Yaniv Kaul 2015-12-30 09:30:37 UTC
Which version of Chrome? 
Does it depend on the number of VMs (specifically item (b) above)?

Comment 5 mlehrer 2015-12-30 12:15:30 UTC
Chrome Version: 47.0.2526.73

The number of VMs matters for the total native memory size of the Chrome tab, but item (b) (retention caused by elements in detached DOM trees) can be reproduced with a mostly empty environment.

For example, a mostly empty environment uses around 332 MB, while a heavily populated 3.6 environment containing 3k VMs was using between 450 MB and 700 MB.

Comment 6 Einav Cohen 2016-01-12 17:40:26 UTC
Alexander - please work with Mordechai to see if we can find the root cause of the high memory usage and whether we can address it.

Also: Any chance that your patch for compressing the GWT RPC responses [1] may assist in somewhat mitigating the problem? Is it worth backporting [1] to ovirt-engine-3.6?

[1] https://gerrit.ovirt.org/#/c/51402/

Comment 7 Alexander Wels 2016-01-13 13:38:19 UTC
No, the compression only applies when using the HTTP port on JBoss directly; any 'normal' installation will have Apache in front of it, which will do the compression for us.

Comment 8 Vojtech Szocs 2016-01-19 20:00:08 UTC
First of all, thanks Mordechai for reporting this issue.

> a) results in high native memory use varied between 450-750mb of ram for 1
> user accessing Admin on a populated scale environment using Chrome, while
> Firefox varied 350-450mb.

Different browsers have different (in-memory) representations of DOM elements, listeners, etc. as well as different ways to optimize their handling.

Since Firefox 26, memory consumption has been improved, with a focus on image-heavy pages [1].

[1] http://www.ghacks.net/2013/10/01/firefox-24-will-ship-serious-memory-consumption-improvements-image-heavy-pages/

This is probably the reason why you see Firefox using less RAM than Chrome when browsing the same (WebAdmin) application.

> b) Multiple (HTMLImageElements) are found in detached doms (and are blocked
> from garbage collection see attached image)

This is the actual problem.

I have a strong feeling we already encountered this in the past. BZ [2], which was reported against Firefox 24 ESR / RHEL, mentioned an issue with HTML image nodes. The end result was a suggestion to use the `layout.imagevisibility.enabled=false` Firefox setting as a workaround.

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1054242
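
For reference, a Firefox preference like the one mentioned above can be set persistently through a `user.js` file in the profile directory, one `user_pref(...)` call per line. A small Python sketch that renders such a line; the helper name `render_user_pref` is hypothetical, and whether this preference still has any effect in current Firefox versions is not confirmed here:

```python
def render_user_pref(name: str, value: object) -> str:
    """Format one preference as a line for a Firefox profile's user.js."""
    if isinstance(value, bool):
        # Booleans are checked first because bool is a subclass of int.
        literal = "true" if value else "false"
    elif isinstance(value, int):
        literal = str(value)
    else:
        # Everything else is written as a quoted, escaped string.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        literal = f'"{escaped}"'
    return f'user_pref("{name}", {literal});'


print(render_user_pref("layout.imagevisibility.enabled", False))
# user_pref("layout.imagevisibility.enabled", false);
```

The same value can also be toggled interactively in `about:config`; the file-based form is only convenient when the workaround has to be applied to several profiles.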

The presence of orphan nodes (nodes that remain in a detached DOM tree) is considered a bug which should be investigated and fixed. The impact/severity might not be that high, though (unless you have a heavily populated environment reflected in the UI, you should be fine).

> c) Very slight leak of Listeners/Nodes after 3 forced garbage collections,
> possibly related to issue (b)

Related to orphan nodes, so indeed related to the point above, I believe.

> # of VMs matters in regards to total native memory size of chrome tab, but
> item b (retainment caused by elements in detached doms) can be reproduced
> with a mostly empty environment.

The more the UI is populated (e.g. more data rows within tabs), the more pronounced the issue of orphan nodes becomes, making RAM usage proportionally bigger.
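
The retention pattern discussed in this comment can be illustrated outside the browser. The following is a minimal Python model, not GWT or browser code: `Node`, `listeners` and `build_and_detach` are hypothetical names, and a closure stored in a long-lived registry is assumed to stand in for whatever keeps the orphan image nodes reachable in WebAdmin (the actual retainer is not identified in this report).

```python
import gc
import weakref
from typing import Callable, Dict, List, Optional


class Node:
    """Tiny stand-in for a DOM node (not a real browser API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def append(self, child: "Node") -> None:
        child.parent = self
        self.children.append(child)

    def remove(self, child: "Node") -> None:
        self.children.remove(child)
        child.parent = None


# Long-lived registry, playing the role of application code that
# outlives the view (for example a global event bus).
listeners: Dict[str, Callable[[], str]] = {}


def build_and_detach(register: bool) -> "weakref.ref[Node]":
    """Attach an image node, optionally register a listener, then detach it."""
    root = Node("root")
    image = Node("img")
    root.append(image)
    if register:
        # The closure captures `image`, so the registry keeps it reachable.
        listeners["refresh"] = lambda: image.name
    root.remove(image)
    return weakref.ref(image)


orphan = build_and_detach(register=True)
gc.collect()
print(orphan() is not None)  # True: detached from the tree but still retained

listeners.clear()            # drop the last strong reference
gc.collect()
print(orphan() is None)      # True: the node can now be collected
```

Removing a node from its parent is not enough on its own; the node is only reclaimed once every other reference to it (here, the registered listener) is released as well.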

Comment 9 Greg Sheremeta 2016-01-19 22:26:13 UTC
> The impact/severity might not be that high

I don't think there's anything wrong with using 700mb in a Chrome tab nowadays, especially for an app as large as this. Gmail consistently hovers at 800-900mb for me.

Yes, there are leaks we could fix (cleaning up the image leaks, for example). But if there are no performance implications from our small leaks, and the app is stable, this is a minor issue to me.

Further, any of the leaks we find are likely to be in GWT itself. So it's pretty unlikely they'd get fixed (because Google isn't working hard on GWT), and thus our time would be completely wasted.

Comment 10 Vojtech Szocs 2016-01-20 15:33:13 UTC
Yeah, that's why I suggested that severity might be lower as the application is still somewhat usable.

It would be interesting to see where those orphan HTML image nodes are originating from; if it's an issue in GWT itself, the chance of fixing that would be much lower, as Greg mentioned.

Comment 14 Red Hat Bugzilla Rules Engine 2016-02-02 06:16:07 UTC
This bug is marked for z-stream, yet the milestone is for a major version, therefore the milestone has been reset.
Please set the correct milestone or drop the z stream flag.

Comment 19 Pavel Stehlik 2016-02-12 07:54:43 UTC
Safari 9.0.3 - 760-850MB

Comment 21 Oved Ourfali 2016-03-16 07:29:42 UTC
*** Bug 1264809 has been marked as a duplicate of this bug. ***

Comment 22 Oved Ourfali 2016-03-16 07:29:55 UTC
*** Bug 1306261 has been marked as a duplicate of this bug. ***

Comment 23 Greg Sheremeta 2016-03-16 14:26:38 UTC
@Fred, your bug was marked a duplicate of this one (bug 1294678), so let's continue here.

Do you encounter the 100% CPU issue regularly? If so, I'll ask you to try a few things for me. First, try applying this patch [ https://gerrit.ovirt.org/#/c/54503/ ] and see if the issue goes away. If it doesn't, apply this patch next [ https://gerrit.ovirt.org/#/c/54310/ ].

@Nir, please try the patches as well.

