1345934 – [scale] - dashboard landing page generate inefficient query - history.cpu_per_socket

Bug 1345934 - [scale] - dashboard landing page generate inefficient query - history.cpu_per_socket

Summary: [scale] - dashboard landing page generate inefficient query - history.cpu_per...

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	ovirt-engine
Classification:	oVirt
Component:	General
Sub Component:
Version:	4.0.0
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	ovirt-4.0.1
Target Release:	4.0.0
Assignee:	Alexander Wels
QA Contact:	Eldad Marciano
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2016-06-13 13:10 UTC by Eldad Marciano
Modified:	2016-08-12 14:11 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-08-12 14:11:52 UTC
oVirt Team:	UX
Embargoed:
Dependent Products:
Flags:	rule-engine: ovirt-4.0.z+ rule-engine: planning_ack+ rule-engine: devel_ack+ pnovotny: testing_ack+

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
oVirt gerrit	59089	0	master	MERGED	dashboard: Optimize VM status query	2016-06-21 19:07:29 UTC
oVirt gerrit	59570	0	ovirt-engine-4.0	MERGED	dashboard: Optimize VM status query	2016-06-22 06:40:52 UTC

Description Eldad Marciano 2016-06-13 13:10:24 UTC

Description of problem:
when browsing to landing page it takes very long time to fetch the dashboard


the query is:
SELECT SUM(COALESCE(history.cpu_per_socket, 0) * COALESCE (history.number_of_sockets, 0)) AS cpu_total_vms, SUM(CASE WHEN samples.vm_status = 1 THEN COALESCE (history.cpu_per_socket, 0) * COALESCE (history.number_of_sockets, 0) ELSE 0 END ) AS cpu_used_vms, CAST(SUM(COALESCE(history.memory_size_mb, 0.00)) AS float) AS mem_total_vms, CAST(SUM(CASE WHEN samples.vm_status = 1 THEN COALESCE(history.memory_size_mb, 0.00) ELSE 0.00 END) AS float) AS mem_used_vms FROM v4_0_statistics_vms_resources_usage_samples samples INNER JOIN v4_0_configuration_history_vms history ON history.history_id = samples.vm_configuration_version INNER JOIN v4_0_latest_configuration_vms a ON a.vm_id = samples.vm_id WHERE history_datetime >= CURRENT_TIMESTAMP - INTERVAL '5 minute' AND history_datetime <= CURRENT_TIMESTAMP AND samples.history_id IN (SELECT MAX(history_id) FROM v4_0_statistics_vms_resources_usage_samples b GROUP BY vm_id )


Version-Release number of selected component (if applicable):
4.0.7

How reproducible:
100%

Steps to Reproduce:
1. large scale env (39 hosts, 3K vms)
2. fetch the dashboard.
3.

Actual results:
very long response time and sometimes the UI get stuck 

Expected results:
fast response time, stable UI

Additional info:

Comment 1 Eldad Marciano 2016-06-13 13:11:01 UTC

I forgot to mention this is the most heavy query and it takes like 132s to resolve it.

Comment 2 Yaniv Kaul 2016-06-13 14:39:23 UTC

Reducing severity - this is taking a lot of time on high scale. I assume it is working in a sane manner on a smaller scale setup.

Comment 3 Alexander Wels 2016-06-13 14:43:25 UTC

I am analyzing the queries now, looks like we might need some indexes on some tables that we are using.

Comment 4 Alexander Wels 2016-06-13 19:40:20 UTC

Got the query down to about 20 seconds. Not sure what else to do to improve it.

Comment 5 Pavel Novotny 2016-07-07 08:23:07 UTC

Moving verification to Scale team, as they have the appropriate environment for this.

Comment 6 Eldad Marciano 2016-08-09 11:45:26 UTC

verified execution time is 2 sec.

Note You need to log in before you can comment on or make changes to this bug.