Bug 1368101

Summary: RHV-M Web UI performance degrades over time
Product: Red Hat Enterprise Virtualization Manager Reporter: Roman Hodain <rhodain>
Component: ovirt-engine Assignee: Alexander Wels <awels>
Status: CLOSED ERRATA QA Contact: Pavel Novotny <pnovotny>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.6.7 CC: amarchuk, aperotti, awels, bmcclain, eedri, gscott, gshereme, gveitmic, ivan.makfinsky, jmoon, lsurette, mamorim, mgoldboi, mkalinin, mlehrer, mperina, obockows, oourfali, pbrilla, pnovotny, pstehlik, rbalakri, Rhev-m-bugs, rhodain, sapandit, srevivo, usurse, vszocs, ykaul
Target Milestone: ovirt-4.1.0-alpha Keywords: Performance, ZStream
Target Release: --- Flags: bmcclain: priority_rfe_tracking+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1398546 (view as bug list) Environment:
Last Closed: 2017-04-25 00:47:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: UX RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1293920, 1388462, 1398546, 1417161    
Attachments:
Description                          Flags
6 hour webdriver results             none
tooltip leak fix results             none
RAM usage chart: 4.0.6 vs. 4.1.0     none

Description Roman Hodain 2016-08-18 12:15:55 UTC
Description of problem:
The web admin portal is slow in IE 11. Chrome performs much better on the same client system, but even in Chrome the performance degrades over time.

Version-Release number of selected component (if applicable):
    rhevm-3.6.6.2-0.1.el6

How reproducible:
    100% with IE11
    after a while with other browsers

Steps to Reproduce:
1. Start IE11 and connect to the webadmin portal
2. Work with it for a while

Actual results:
    Memory usage grows rapidly, by about 1 GB of RAM in 10 minutes. The average CPU load is ~60%, with peaks of 99%, even when the browser is not being used. This is most visible in the Virtual Machine tab.

Expected results:
    The UI stays fluid, with no delay when switching tabs (not counting data loading).

Comment 55 Greg Sheremeta 2016-11-09 14:04:59 UTC
*** Bug 1348150 has been marked as a duplicate of this bug. ***

Comment 62 Greg Sheremeta 2016-11-14 22:59:11 UTC
I have good news on our progress. After applying several patches to fix memory leaks (and one more not-yet-merged patch), I can say we are now very stable on 4.1 master. I've done several performance tests to verify. The most important one was an endurance test I did that shows webadmin still very stable after 6.25 hours of heavy use under webdriver.

See attachment '6 hour webdriver results'.

This shows that we're still leaking some memory over time (graph on right), and while we have more work to do on that, the relatively flat response time line (graph on left) means the app stayed performant over the 6 hour test.

[Ignore the spike in the left graph. That was a one-time webdriver hiccup.]

Comment 63 Greg Sheremeta 2016-11-14 23:00:48 UTC
Created attachment 1220554 [details]
6 hour webdriver results

Comment 67 Vojtech Szocs 2016-11-15 16:14:14 UTC
(In reply to Greg Sheremeta from comment #62)
> I have good news on our progress. After applying several patches to fix
> memory leaks (and one more not-yet-merged patch), I can say we are now very
> stable on 4.1 master. I've done several performance tests to verify. The
> most important one was an endurance test I did that shows webadmin still
> very stable after 6.25 hours of heavy use under webdriver.

This is great news.

All memory leak fixes should land in 4.0.6 eventually.

Once that happens, we can decide how to proceed regarding 3.6.z.

> 
> See attachment '6 hour webdriver results'.
> 
> This shows that we're still leaking some memory over time (graph on right),
> and while we have more work to do on that, the relatively flat response time
> line (graph on left) means the app stayed performant over the 6 hour test.
> 
> [Ignore the spike in the left graph. That was a one-time webdriver hiccup.]

Given Greg's performance analysis results, I'd conclude this BZ once the "not-yet-merged patch" lands in 4.0.6.

Any further improvements should be done as part of tracker bug 1378935.

Comment 68 Greg Sheremeta 2016-11-15 20:22:07 UTC
Attaching another test result that shows the dramatic improvements made by Vojtech's awesome tooltips fix.

See 'tooltip leak fix results.png'

The flat lines are with tooltips fix onboard. The mountainous slope is without it. The degradation was severe, as you can see.

That patch alone shaved 40 minutes off of an hour run, so we are now 66% faster.

:)
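
A quick check of the arithmetic in this comment, as a minimal sketch. It assumes "an hour run" means a 60-minute baseline, which the comment implies but does not state exactly; the function names are illustrative only.

```python
def time_saved_percent(baseline_min: float, saved_min: float) -> float:
    """Share of the baseline run time that was eliminated, in percent."""
    return 100.0 * saved_min / baseline_min


def speedup_factor(baseline_min: float, saved_min: float) -> float:
    """How many times faster the shorter run completes the same work."""
    return baseline_min / (baseline_min - saved_min)


baseline = 60.0  # assumption: the unpatched run took one hour
saved = 40.0     # minutes saved, as reported in the comment

print(f"{time_saved_percent(baseline, saved):.1f}% less run time")
print(f"{speedup_factor(baseline, saved):.1f}x speedup")
```

Under that assumption, the "66%" figure reads as the reduction in run time (about two thirds), which corresponds to finishing the same work roughly three times as fast.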

Comment 69 Greg Sheremeta 2016-11-15 20:22:55 UTC
Created attachment 1220952 [details]
tooltip leak fix results

Comment 70 Oved Ourfali 2016-11-23 11:48:34 UTC
*** Bug 1395911 has been marked as a duplicate of this bug. ***

Comment 73 Vojtech Szocs 2016-11-25 16:35:22 UTC
(In reply to Greg Sheremeta from comment #68)
> Attaching another test result that shows the dramatic improvements made by
> Vojtech's awesome tooltips fix.

It's the combination of all memory leak fixes, not only the tooltip one :) but at least we've somewhat stabilized the memory growth, so users don't have to reload WebAdmin UI that often.

Comment 79 Greg Sheremeta 2017-02-08 14:25:20 UTC
Hi Jack,

Since you have both UI and API slowness, that makes me wonder if engine slowness is the cause of both. But in order to test that theory, you would need to upgrade your engine to 3.6.10 as Oved said in Comment 78, so you get our UI improvements.

Is it possible to upgrade to 3.6.10?

Comment 80 Pavel Novotny 2017-02-08 16:13:20 UTC
Verified in 
rhevm-4.1.0.4-0.1.el7.noarch
ovirt-engine-webadmin-portal-4.1.0.4-0.1.el7.noarch

Browser: Firefox 45.7 ESR @ RHEL 6.8

I ran a test which went in a loop through the main tabs and then opened & closed the New VM dialog. 
I used 100 passes and for each one I measured the RAM usage via the browser built-in tool at the about:memory page.
I compared the data against rhevm-4.0.6.3-0.1.el7ev, which already contains
some memory usage improvements (see bug 1398546).
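
The looped test described above could be structured roughly as follows. This is only a sketch of the general shape: the comment does not say what tooling drove the browser or how the about:memory reading was recorded, so `run_pass` and `sample_memory_mb` are hypothetical callables supplied by the tester, and the stand-in values below are placeholders, not measurements.

```python
import itertools
import time
from typing import Callable, List, Tuple


def run_endurance_test(
    run_pass: Callable[[], None],
    sample_memory_mb: Callable[[], float],
    passes: int = 100,
) -> List[Tuple[float, float]]:
    """Run `passes` UI passes; return (elapsed_seconds, memory_mb) per pass."""
    samples = []
    for _ in range(passes):
        started = time.monotonic()
        run_pass()  # e.g. visit each main tab, open and close the New VM dialog
        elapsed = time.monotonic() - started
        samples.append((elapsed, sample_memory_mb()))
    return samples


def average_memory_mb(samples: List[Tuple[float, float]]) -> float:
    """Mean of the per-pass memory readings."""
    return sum(memory for _, memory in samples) / len(samples)


# Stand-ins so the sketch runs without a browser (placeholder values only).
fake_readings = itertools.cycle([100.0, 102.0])
results = run_endurance_test(lambda: None, lambda: next(fake_readings))
print(len(results), average_memory_mb(results))
```

Injecting the two callables keeps the loop independent of any particular browser automation library, so the same harness could be pointed at two engine versions for a side-by-side comparison.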

Results:
4.0.6: Started at ~140 MB and stayed around this value. The average was 160 MB.
4.1.0: Started at ~120 MB or less and stayed around this value. The average was 119 MB.
(chart attached)

Timewise, 100 passes took:
4.0.6: 2h6m
4.1.0: 1h34m (25 % faster)
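
The percentages can be re-derived from the figures above. The sketch below uses hypothetical helper names and takes "faster" to mean the relative reduction in elapsed time; it applies the same formula to the reported RAM averages.

```python
import re


def parse_duration_minutes(text: str) -> int:
    """Convert a duration such as '2h6m' into whole minutes."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(part) if part else 0 for part in match.groups())
    return hours * 60 + minutes


def percent_reduction(old: float, new: float) -> float:
    """Relative drop from `old` to `new`, in percent."""
    return 100.0 * (old - new) / old


old_run = parse_duration_minutes("2h6m")   # 4.0.6: 126 minutes
new_run = parse_duration_minutes("1h34m")  # 4.1.0: 94 minutes

print(f"elapsed time reduced by {percent_reduction(old_run, new_run):.1f}%")
print(f"average RAM reduced by {percent_reduction(160, 119):.1f}%")
```

Both reductions come out at roughly a quarter, consistent with the "25 % faster" figure reported for elapsed time.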

Comment 81 Pavel Novotny 2017-02-08 16:14:21 UTC
Created attachment 1248637 [details]
RAM usage chart: 4.0.6 vs. 4.1.0

Comment 82 Yaniv Kaul 2017-02-08 17:38:18 UTC
(In reply to Pavel Novotny from comment #80)
> Verified in 
> rhevm-4.1.0.4-0.1.el7.noarch
> ovirt-engine-webadmin-portal-4.1.0.4-0.1.el7.noarch
> 
> Browser: Firefox 45.7 ESR @ RHEL 6.8
> 
> I ran a test which went in a loop through the main tabs and then opened &
> closed the New VM dialog. 
> I used 100 passes and for each one I measured the RAM usage via the browser
> built-in tool at the about:memory page.

How many objects (hosts, VMs) did you have in the system?

> I compared the data against rhevm-4.0.6.3-0.1.el7ev, which already contains
> some memory usage improvements (see bug 1398546).
> 
> Results:
> 4.0.6: Started at ~140 MB and stayed around this value. The average was 160
> MB.
> 4.1.0: Started at ~120 MB or less and stayed around this value. The average
> was 119 MB.
> (chart attached)
> 
> Timewise, 100 passes took:
> 4.0.6: 2h6m
> 4.1.0: 1h34m (25 % faster)

Nice. Any measurements of network usage? average latency, etc.?

Comment 83 Pavel Novotny 2017-02-08 17:51:02 UTC
(In reply to Yaniv Kaul from comment #82)
[snip]
> > I ran a test which went in a loop through the main tabs and then opened &
> > closed the New VM dialog. 
> > I used 100 passes and for each one I measured the RAM usage via the browser
> > built-in tool at the about:memory page.
> 
> How many objects (hosts, VMs) did you have in the system?

Relatively small environment:
1 host, 1 NFS storage, 41 VMs (blank, w/o OS installed), 4 of them Up

> 
[snip]
> > Timewise, 100 passes took:
> > 4.0.6: 2h6m
> > 4.1.0: 1h34m (25 % faster)
> 
> Nice. Any measurements of network usage? average latency, etc.?

No, I just measured the RAM usage and overall elapsed time.

Comment 84 Yaniv Kaul 2017-02-08 18:23:45 UTC
(In reply to Pavel Novotny from comment #83)
> (In reply to Yaniv Kaul from comment #82)
> [snip]
> > > I ran a test which went in a loop through the main tabs and then opened &
> > > closed the New VM dialog. 
> > > I used 100 passes and for each one I measured the RAM usage via the browser
> > > built-in tool at the about:memory page.
> > 
> > How many objects (hosts, VMs) did you have in the system?
> 
> Relatively small environment:
> 1 host, 1 NFS storage, 41 VMs (blank, w/o OS installed), 4 of them Up

Quite a small environment - if you have a bigger one to test on, that'd be great (with nested virt you can have many more, I reckon).

> 
> > 
> [snip]
> > > Timewise, 100 passes took:
> > > 4.0.6: 2h6m
> > > 4.1.0: 1h34m (25 % faster)
> > 
> > Nice. Any measurements of network usage? average latency, etc.?
> 
> No, I just measured the RAM usage and overall elapsed time.

OK, thanks.