Bug 1399489

Summary: [z-stream clone - 3.6.10] memory consumption improvements
Product: Red Hat Enterprise Virtualization Manager
Reporter: rhev-integ
Component: ovirt-engine
Assignee: Greg Sheremeta <gshereme>
Status: CLOSED ERRATA
QA Contact: Pavel Novotny <pnovotny>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.6.7
CC: aperotti, awels, eberman, eedri, gscott, gshereme, gveitmic, lsurette, mgoldboi, mkalinin, mlehrer, mperina, obockows, oourfali, pbrilla, pstehlik, rbalakri, Rhev-m-bugs, rhodain, sapandit, srevivo, usurse, vszocs, ykaul
Target Milestone: ovirt-3.6.10
Keywords: Performance, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1398546
Environment:
Last Closed: 2017-01-17 18:05:49 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: UX
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1398546
Bug Blocks:
Attachments:
    Description: RAM usage chart: 3.6.9 vs. 3.6.10
    Flags: none

Comment 3 rhev-integ 2016-11-29 07:32:23 UTC
Description of problem:
The web admin portal is slow in IE 11. The Chrome browser behaves much better on the same client system, but even there the performance degrades over time.

Version-Release number of selected component (if applicable):
    rhevm-3.6.6.2-0.1.el6

How reproducible:
    100% with IE11
    after a while with other browsers

Steps to Reproduce:
1. Start IE11 and connect to the webadmin
2. Work with that for a while

Actual results:
    Memory usage grows rapidly, by about 1 GB of RAM in 10 minutes. The average CPU load is ~60%, with peaks at 99%, even when the browser is not being used. This is most visible in the Virtual Machine tab.

Expected results:
    The UI stays responsive, with no delay when switching tabs (not counting data loading).

This comment was originally posted by rhodain

Comment 59 rhev-integ 2016-11-29 07:40:17 UTC
*** Bug 1348150 has been marked as a duplicate of this bug. ***

This comment was originally posted by gshereme

Comment 66 rhev-integ 2016-11-29 07:41:20 UTC
I have good news on our progress. After applying several patches to fix memory leaks (and one more not-yet-merged patch), I can say we are now very stable on 4.1 master. I've done several performance tests to verify. The most important one was an endurance test showing that webadmin is still very stable after 6.25 hours of heavy use under webdriver.

See attachment '6 hour webdriver results'.

This shows that we're still leaking some memory over time (graph on right), and while we have more work to do on that, the relatively flat response time line (graph on left) means the app stayed performant over the 6 hour test.

[Ignore the spike in the left graph. That was a one-time webdriver hiccup.]

This comment was originally posted by gshereme

Comment 67 rhev-integ 2016-11-29 07:41:30 UTC
Created attachment 1220554 [details]
6 hour webdriver results

This comment was originally posted by gshereme

Comment 71 rhev-integ 2016-11-29 07:42:05 UTC
(In reply to Greg Sheremeta from comment #62)
> I have good news on our progress. After applying several patches to fix
> memory leaks (and one more not-yet-merged patch), I can say we are now very
> stable on 4.1 master. I've done several performance tests to verify. The
> most important one was an endurance test I did that shows webadmin still
> very stable after 6.25 hours of heavy use under webdriver.

This is great news.

All memory leak fixes should land in 4.0.6 eventually.

Once that happens, we can decide how to proceed regarding 3.6.z.

> 
> See attachment '6 hour webdriver results'.
> 
> This shows that we're still leaking some memory over time (graph on right),
> and while we have more work to do on that, the relatively flat response time
> line (graph on left) means the app stayed performant over the 6 hour test.
> 
> [Ignore the spike in the left graph. That was a one-time webdriver hiccup.]

Given Greg's performance analysis results, I'd conclude this BZ once the "not-yet-merged patch" lands in 4.0.6.

Any further improvements should be done as part of tracker bug 1378935.

This comment was originally posted by vszocs

Comment 72 rhev-integ 2016-11-29 07:42:13 UTC
Attaching another test result that shows the dramatic improvements made by Vojtech's awesome tooltips fix.

See 'tooltip leak fix results.png'

The flat lines are with the tooltips fix on board. The mountainous slope is without it. The degradation was severe, as you can see.

That patch alone shaved 40 minutes off a one-hour run, so the run now takes about 66% less time.

:)
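
For reference, the arithmetic behind that figure can be sketched as follows. This is only an illustration: it assumes the run without the fix took 60 minutes (the comment says "an hour run"), and the function name `time_saved` is made up for the example.

```python
def time_saved(baseline_min, saved_min):
    """Return (new duration in minutes, percent less time, speedup factor)."""
    new_min = baseline_min - saved_min
    pct_less_time = 100.0 * saved_min / baseline_min
    speedup = baseline_min / new_min
    return new_min, pct_less_time, speedup

# Assumption: the run without the tooltips fix took 60 minutes.
new_min, pct, speedup = time_saved(60, 40)
print(f"{new_min} min, {pct:.1f}% less time, {speedup:.1f}x speedup")
# prints: 20 min, 66.7% less time, 3.0x speedup
```

Under that assumption, "66% faster" corresponds to roughly two thirds less run time, or a 3x speedup.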

This comment was originally posted by gshereme

Comment 73 rhev-integ 2016-11-29 07:42:22 UTC
Created attachment 1220952 [details]
tooltip leak fix results

This comment was originally posted by gshereme

Comment 74 rhev-integ 2016-11-29 07:42:30 UTC
*** Bug 1395911 has been marked as a duplicate of this bug. ***

This comment was originally posted by oourfali

Comment 79 Pavel Novotny 2017-01-06 17:28:27 UTC
Verified in 
rhevm-3.6.10.2-0.2.el6.noarch
rhevm-webadmin-portal-3.6.10.2-0.2.el6.noarch

Browser: Firefox 45.6 ESR @ RHEL 6.8

I ran a test that, in a loop, went through the main tabs and then opened and closed the New VM dialog.
I ran 100 passes and for each pass measured the RAM usage with the browser's built-in tool on the about:memory page.
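
The shape of such a test loop can be sketched roughly as below. This is not the tooling actually used for the verification (the comment does not say how the loop was driven); `run_endurance_test`, `fake_pass`, and `fake_read` are hypothetical names, and the stand-in callables only simulate a browser so that the sketch runs on its own.

```python
from typing import Callable, List

def run_endurance_test(
    run_pass: Callable[[], None],
    read_memory_mb: Callable[[], float],
    passes: int = 100,
) -> List[float]:
    """Run `passes` iterations and record memory usage after each one."""
    samples = []
    for _ in range(passes):
        run_pass()                        # e.g. visit the main tabs, open and close a dialog
        samples.append(read_memory_mb())  # e.g. a reading taken from about:memory
    return samples

# Stand-in callables (hypothetical): simulate a leak of 1.5 MB per pass.
state = {"mb": 100.0}

def fake_pass() -> None:
    state["mb"] += 1.5

def fake_read() -> float:
    return state["mb"]

samples = run_endurance_test(fake_pass, fake_read, passes=10)
print(len(samples), samples[0], samples[-1])
```

In a real run, `run_pass` would drive the browser and `read_memory_mb` would return the measured RAM usage; the simulated numbers here are not measurements.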

Results:
3.6.9: RAM usage started around 120 MB and continuously grew up to ~860 MB.
3.6.10: RAM usage started at ~130 MB and continuously grew up to ~500 MB. The peaks in memory usage were less prominent than in 3.6.9.
(chart attached)

Timewise, the test with 100 passes took:
3.6.9: 3h40m
3.6.10: 2h00m

Please see the RAM usage chart attached.
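
As a rough summary of the numbers above, the relative improvement can be worked out as follows. The inputs are the approximate figures reported in this comment; the variable and function names are made up for the sketch.

```python
# Approximate figures from this comment: start/end RAM in MB, duration in minutes.
results = {
    "3.6.9":  {"start_mb": 120, "end_mb": 860, "minutes": 3 * 60 + 40},
    "3.6.10": {"start_mb": 130, "end_mb": 500, "minutes": 2 * 60},
}

def growth_mb(r):
    """RAM growth over the whole test run, in MB."""
    return r["end_mb"] - r["start_mb"]

old, new = results["3.6.9"], results["3.6.10"]
growth_old = growth_mb(old)
growth_new = growth_mb(new)
growth_cut_pct = 100.0 * (growth_old - growth_new) / growth_old
time_cut_pct = 100.0 * (old["minutes"] - new["minutes"]) / old["minutes"]

print(f"RAM growth: {growth_old} MB -> {growth_new} MB ({growth_cut_pct:.0f}% less)")
print(f"Duration: {old['minutes']} min -> {new['minutes']} min ({time_cut_pct:.0f}% less)")
# prints:
# RAM growth: 740 MB -> 370 MB (50% less)
# Duration: 220 min -> 120 min (45% less)
```

Since the reported values are approximate, these percentages are indicative only.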

Comment 80 Pavel Novotny 2017-01-06 17:29:08 UTC
Created attachment 1238056 [details]
RAM usage chart: 3.6.9 vs. 3.6.10

Comment 82 errata-xmlrpc 2017-01-17 18:05:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0108.html