Bug 1795457
Summary: | RHV-M causing high load on PostgreSQL DB after upgrade to 4.2 | | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | hhaberma |
Component: | ovirt-web-ui | Assignee: | Ben Amsalem <bamsalem> |
Status: | CLOSED ERRATA | QA Contact: | David Vaanunu <dvaanunu> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 4.2.8 | CC: | achareka, bamsalem, dagur, fgarciad, lleistne, michal.skrivanek, mlehrer, nashok, pelauter, sdickers, sgratch |
Target Milestone: | ovirt-4.4.5-1 | Keywords: | Performance |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | ovirt-web-ui-1.6.8-1 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-04-14 11:43:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | UX | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
hhaberma
2020-01-28 02:00:42 UTC
As a first step in dealing with VM Portal generating a lot of REST calls, a few changes have been made: https://github.com/oVirt/ovirt-web-ui/pull/1238

For webadmin, this is most likely bug 1845747, which has been fixed for 4.4 and 4.3.11.

For 4.4.1 there's a partial VM Portal fix too (https://github.com/oVirt/ovirt-web-ui/issues/1240). We haven't measured the difference; it could be significant and solve the problem already, or maybe not. We are targeting further improvement in 4.4.2, so we'll keep the bug open.

Tested version:
rhv-release-4.4.2-4
redhat-release-8.2-25.0
ovirt-engine-4.4.2.3-0.6
ovirt-web-ui-1.6.4-1

Flow:
Open 50 sessions of VM Portal and scroll down. After logging in to VM Portal, 20 VMs are loaded. Each scroll down triggers loading of another 20 VMs. When tested on the older version (4.4.1), all the VMs kept loading until the end of the list was reached.

Results:
hosted-engine (16 cores & 32GB) usage: 95% CPU and 20GB RAM
engine usage: 7% CPU, 5GB memory
PostgreSQL usage: 90%, 6.5GB memory

Just adding to the previous comment 18: the number of VMs requested is reduced by the dev fix, but the main issue is the following SQL query:
select * from getdisksvmguid
which takes about 19s and is executed 200 times from the single API call '/ovirt-engine/api/vms;max=100?follow=graphics_consoles', which comes from VM Portal. Once multiple instances of this query are issued by multiple VM Portal calls, PostgreSQL gets saturated by concurrent API requests from VM Portal.
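To make the request in question concrete, here is a minimal Python sketch of how the listing path is composed and how the query count scales with concurrent sessions. The helper names (`build_vms_url`, `query_executions`) are hypothetical and not part of ovirt-web-ui or the engine. The sketch assumes that `max` is passed as a matrix parameter and `follow` as a query parameter, and that each session issues one such call at the same time; the figure of 200 query executions per call is the one reported above.

```python
from urllib.parse import urlencode

API_ROOT = "/ovirt-engine/api"  # engine REST API root, as seen in the reported call


def build_vms_url(max_results, follow=None):
    """Compose the VM listing path (hypothetical helper).

    `max` goes into the path as a matrix parameter; `follow`, when given,
    is appended as a query parameter with comma-separated link names.
    """
    path = f"{API_ROOT}/vms;max={max_results}"
    if follow:
        # safe="," keeps the comma between followed links unescaped
        path += "?" + urlencode({"follow": ",".join(follow)}, safe=",")
    return path


def query_executions(sessions, per_call=200):
    """Rough count of getdisksvmguid executions if every session issues
    one listing call at once (per_call=200 is the number reported above)."""
    return sessions * per_call


if __name__ == "__main__":
    print(build_vms_url(100, ["graphics_consoles"]))  # request with follow
    print(build_vms_url(100))                         # same listing without follow
    print(query_executions(50))                       # 50 concurrent sessions
```

Under these assumptions, 50 sessions each issuing one listing call would trigger on the order of 10,000 executions of the slow query, which is consistent with the database saturation described above.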
We should eliminate the whole follow=graphics_consoles. Its whole reason for existence is just the SPICE/VNC console selection, and that would be best simplified into no choice: just select one internally (or via system/user settings).

That would take care of both the getdisksvmguid query and an extra API call for each VM.

(In reply to Michal Skrivanek from comment #20)
> we should eliminate the whole follow=graphics_consoles, its whole reason
> for existence is just the spice/vnc console selection and that would be best
> to simplify into no choice, just select one internally (or via system/user
> settings)
>
> that would take care of both the getdisksvmguid query but also an extra API
> call for each VM

I'm re-targeting this for 4.4.4 since we won't have time to complete that for 4.4.3.

Target milestone is set to 4.4.4. Is this still accurate?

This bug is in NEW status for oVirt 4.4.4. We are now in blocker-only phase; please either mark this as a blocker or re-target.

(In reply to Sandro Bonazzola from comment #25)
> This bug is in NEW status for ovirt 4.4.4. We are now in blocker only phase,
> please either mark this as a blocker or please re-target.

Re-targeted.

(In reply to mlehrer from comment #24)
> Target milestone is set to 4.4.4 is this still accurate?

This is still in progress, so it is postponed to 4.4.5.

#Summary
50 users scrolling continually and concurrently create a moderate load, but PostgreSQL CPU utilization, as well as overall CPU utilization, is reduced compared to previous version testing.

#How was this tested
A Puppeteer script simulates 50 users, each with its own browser, logging in to VM Portal and scrolling down the page every few seconds. Users are added every few seconds and continue to scroll down the page.
Backend actions taking over 1 second are collected by Glowroot; resources are monitored by nmon.
Basic Webadmin functionality was checked during the peak 50-user scroll load scenario.
#Env
4215 VMs and 260 hosts
rhv-release-4.4.5-11-001.noarch
ovirt-web-ui-1.6.8-1.el8ev.noarch

#Findings
PostgreSQL CPU utilization is reduced by about 30% compared to the previous test.
Overall engine CPU utilization is reduced, and the average run queue length is reduced by half compared to the previous test.
Webadmin actions may degrade by a few seconds during peak load.

Moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) 4.4.z [ovirt-4.4.5] 0-day security, bug fix, enhance), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1186