Created attachment 1581662 [details]
setup logs and ovirt-engine-dwh log

Description of problem:

I upgraded our installation from 4.2.8 to 4.3.4 and I am unable to access the dashboard in UI as there is only “Loading data…” displayed. There are postgresql processes running on the engine taking 100% cpu. One was running for more than 14 hours:

SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC

13783 | 2019-06-15 08:41:52.349769+02 | active | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC…

After about 20 hours, this query completed but currently there are two identical queries (sent from ovirt-engine) running for more than 50 hours consuming lots of cpu cycles:

11414 | 2019-06-16 10:40:09.636273+02 | active | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC

11900 | 2019-06-16 10:40:09.641453+02 | active | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC

In addition to the two long running queries, there are occasionally the following queries running for minutes (but completing), causing additional load on the engine:

pid | query_start | state | left
-------+-------------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
11414 | 2019-06-16 10:40:09.636273+02 | active | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS u
11900 | 2019-06-16 10:40:09.641453+02 | active | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS u
23694 | 2019-06-18 13:05:19.447169+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
28243 | 2019-06-18 13:05:23.279162+02 | active | autovacuum: VACUUM public.vm_disk_samples_history
23682 | 2019-06-18 13:05:23.519138+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
23585 | 2019-06-18 13:05:30.044553+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
28352 | 2019-06-18 13:05:34.304185+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
28351 | 2019-06-18 13:05:37.407647+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
23587 | 2019-06-18 13:05:38.959182+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
11805 | 2019-06-18 13:05:41.09852+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16,
$17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 21823 | 2019-06-18 13:05:46.11394+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11733 | 2019-06-18 13:05:49.545165+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 20276 | 2019-06-18 13:08:22.709253+02 | active | autovacuum: VACUUM public.vm_disks_usage_samples_history 11521 | 2019-06-18 13:09:04.439345+02 | active | select * from deletevmstatic($1, $2) as result 23463 | 2019-06-18 13:09:30.912788+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23688 | 2019-06-18 13:09:44.400232+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23464 | 2019-06-18 13:09:48.975726+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23693 | 2019-06-18 13:10:06.827176+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11586 | 2019-06-18 13:10:14.591432+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 28346 | 2019-06-18 13:10:20.55725+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 29941 | 2019-06-18 13:10:31.23287+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23697 | 2019-06-18 13:10:39.736149+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 19195 | 2019-06-18 13:10:40.45711+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23700 | 2019-06-18 13:10:40.879134+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23460 | 2019-06-18 13:10:45.100639+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11585 | 2019-06-18 13:10:46.859267+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11591 | 2019-06-18 13:10:55.923129+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11524 | 2019-06-18 13:10:59.314155+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, 
$19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23698 | 2019-06-18 13:10:59.856495+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23577 | 2019-06-18 13:11:02.712207+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11745 | 2019-06-18 13:11:04.553137+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 30411 | 2019-06-18 13:11:06.647954+02 | active | autovacuum: VACUUM public.host_interface_samples_history 29932 | 2019-06-18 13:11:07.034129+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23581 | 2019-06-18 13:11:08.123161+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11579 | 2019-06-18 13:11:08.703078+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23576 | 2019-06-18 13:11:08.774152+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11768 | 2019-06-18 13:11:09.003263+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23461 | 2019-06-18 13:11:09.041185+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11806 | 2019-06-18 13:11:11.043209+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 19193 | 2019-06-18 13:11:11.053464+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11739 | 2019-06-18 13:11:12.549026+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23799 | 2019-06-18 13:11:13.073306+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 20020 | 2019-06-18 13:11:13.919161+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11582 | 2019-06-18 13:11:14.43512+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11784 | 2019-06-18 13:11:16.626666+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 10003 | 2019-06-18 13:11:17.70016+02 | active | select 
* from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11792 | 2019-06-18 13:11:18.056021+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11776 | 2019-06-18 13:11:18.240969+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 29935 | 2019-06-18 13:11:18.368681+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 4580 | 2019-06-18 13:11:18.61815+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11526 | 2019-06-18 13:11:19.397513+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 21648 | 2019-06-18 13:11:20.286662+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23702 | 2019-06-18 13:11:21.016036+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11795 | 2019-06-18 13:11:21.565148+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11588 | 2019-06-18 13:11:22.063073+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11808 | 2019-06-18 13:11:22.262166+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23978 | 2019-06-18 13:11:22.560168+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 28344 | 2019-06-18 13:11:22.904197+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11789 | 2019-06-18 13:11:23.345251+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 28354 | 2019-06-18 13:11:24.999153+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11744 | 2019-06-18 13:11:25.433987+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 19194 | 2019-06-18 13:11:25.631639+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 24001 | 2019-06-18 13:11:25.771428+02 | active | select * 
from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11767 | 2019-06-18 13:11:26.367648+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 28355 | 2019-06-18 13:11:26.978236+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 22082 | 2019-06-18 13:11:27.510145+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23962 | 2019-06-18 13:11:28.033392+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 30369 | 2019-06-18 13:11:28.36871+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23562 | 2019-06-18 13:11:28.621243+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23584 | 2019-06-18 13:11:28.857389+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 27655 | 2019-06-18 13:11:28.861159+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 24109 | 2019-06-18 13:11:28.967137+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 29024 | 2019-06-18 13:11:29.664194+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11781 | 2019-06-18 13:11:29.973295+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11742 | 2019-06-18 13:11:30.06306+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23696 | 2019-06-18 13:11:30.246082+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23695 | 2019-06-18 13:11:30.354733+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 29936 | 2019-06-18 13:11:30.526112+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 23701 | 2019-06-18 13:11:30.767991+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, 11785 | 2019-06-18 13:11:30.76935+02 | active | select * from 
insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
23766 | 2019-06-18 13:11:30.818788+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
23706 | 2019-06-18 13:11:30.944849+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
11587 | 2019-06-18 13:11:31.518477+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
30367 | 2019-06-18 13:11:31.527025+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
20023 | 2019-06-18 13:11:31.556656+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
30361 | 2019-06-18 13:11:31.846585+02 | active | select * from insertvmstatic($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28,
...

oVirt Engine (no self hosted engine) is configured with 128 GB memory and 24 cores and running on a dedicated system.

Upgrade from 4.2.8 to 4.3.4 was performed with the following steps:

yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm
yum update "ovirt-*-setup*"
engine-setup
yum update

Version-Release number of selected component (if applicable):

ovirt-engine.noarch 4.3.4.3-1.el7 @ovirt-4.3
ovirt-engine-dbscripts.noarch 4.3.4.3-1.el7 @ovirt-4.3
ovirt-engine-dwh.noarch 4.3.0-1.el7 @ovirt-4.3
ovirt-engine-dwh-setup.noarch 4.3.0-1.el7 @ovirt-4.3

How reproducible:
Occurred once when upgrading from 4.2.8 to 4.3.4

Steps to Reproduce:
1.
2.
3.

Actual results:
High load and inaccessible dashboard in UI.
Expected results: Additional info: ovirt_engine_history=# SELECT id, version, script, checksum, installed_by, started_at, ended_at, state, current, comment FROM schema_version ORDER BY version DESC limit 10; id | version | script | checksum | installed_by | started_at | ended_at | state | current | comment ----+----------+-------------------------------------------------------------------+----------------------------------+----------------------+----------------------------+----------------------------+-----------+---------+------------------------------- 50 | 04030020 | upgrade/04_03_0020_update_rx_tx_rate_percition.sql | c480b070cc69681cf62fb853f3d7139f | ovirt_engine_history | 2019-06-14 18:14:44.960503 | 2019-06-14 18:14:44.991593 | SKIPPED | t | Installed already by 04020040 49 | 04030010 | upgrade/04_03_0010_update_network_name_length.sql | a1a0d75560575cdc60c0bbaad2cda773 | ovirt_engine_history | 2019-06-14 18:14:44.904426 | 2019-06-14 18:14:44.935725 | SKIPPED | f | Installed already by 04020030 48 | 04020040 | upgrade/04_02_0040_update_rx_tx_rate_percition.sql | c480b070cc69681cf62fb853f3d7139f | ovirt_engine_history | 2018-07-07 14:34:42.505446 | 2018-07-07 14:36:31.662577 | INSTALLED | f | 47 | 04020030 | upgrade/04_02_0030_update_network_name_length.sql | a1a0d75560575cdc60c0bbaad2cda773 | ovirt_engine_history | 2018-07-07 14:34:42.438056 | 2018-07-07 14:34:42.482705 | INSTALLED | f | 46 | 04020020 | upgrade/04_02_0020__updated_vm_interface_history_id_to_bigint.sql | 58a8afa29fc720dc87f37b7f9c9e0151 | ovirt_engine_history | 2018-04-18 17:17:04.908667 | 2018-04-18 17:17:39.111339 | INSTALLED | f | 45 | 04020010 | upgrade/04_02_0010_updated_vm_template_name_length.sql | 4b5391f40e8787e3b1033635aafe18a1 | ovirt_engine_history | 2018-01-05 09:56:39.213757 | 2018-01-05 09:56:39.238775 | SKIPPED | f | Installed already by 04010020 44 | 04010020 | upgrade/04_01_0020_updated_vm_template_name_lentgh.sql | 4b5391f40e8787e3b1033635aafe18a1 | ovirt_engine_history | 2017-10-05 13:53:04.225474 | 2017-10-05 13:53:04.269508 | INSTALLED | f | 43 | 04010010 | upgrade/04_01_0010_added_seconds_in_status_to_sample_tables.sql | be7a1b2fc7f03d263b45a613d5bced03 | ovirt_engine_history | 2017-02-03 13:16:18.29672 | 2017-02-03 13:16:18.320728 | SKIPPED | f | Installed already by 04000050 42 | 04000050 | upgrade/04_00_0050_added_seconds_in_status_to_sample_tables.sql | be7a1b2fc7f03d263b45a613d5bced03 | ovirt_engine_history | 2016-10-03 15:13:33.856501 | 2016-10-03 15:13:34.010135 | INSTALLED | f | 41 | 04000040 | upgrade/04_00_0040_drop_all_history_db_foreign_keys.sql | ed8b2c02bea97d0ee21f737614a2d5e3 | ovirt_engine_history | 2016-10-03 15:13:33.763905 | 2016-10-03 15:13:33.839532 | INSTALLED | f | ovirt_engine_history=# select min(history_datetime),max(history_datetime),count(*) from v4_2_statistics_vms_disks_resources_usage_samples; min | max | count ----------------------------+----------------------------+--------- 2019-06-17 04:13:02.661+02 | 2019-06-18 13:27:58.154+02 | 6066162
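For scale, a rough way to see how much data the dashboard query has to walk through is to check the size of the history tables these views read from (a sketch; the table names are the underlying history tables, not the views themselves):

  SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
  FROM pg_class
  WHERE relname IN ('vm_disk_samples_history', 'vm_device_history', 'vm_configuration', 'vm_disk_configuration')
  ORDER BY pg_total_relation_size(oid) DESC;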
The two parallel queries finished after about 65 hours; after that, one of them started again (currently running for about 14 hours):

SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC
Can you please try to do a full vacuum on your databases? This requires engine downtime. The simplest way to do that is probably 'engine-setup --offline' and reply 'Yes' to both questions about vacuum (engine and dwh).

You can also run the vacuum scripts manually - something like:

systemctl stop ovirt-engine
systemctl stop ovirt-engine-dwhd
/usr/share/ovirt-engine/bin/engine-vacuum.sh
/usr/share/ovirt-engine-dwh/bin/dwh-vacuum.sh
systemctl start ovirt-engine
systemctl start ovirt-engine-dwhd
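For reference, this is roughly equivalent to running a full vacuum by hand in psql against each database while the services are stopped (a sketch only; the exact options the vacuum scripts pass may differ):

  -- connect to ovirt_engine_history and to engine in turn, then:
  VACUUM (FULL, VERBOSE, ANALYZE);

VACUUM FULL rewrites the tables and takes exclusive locks, which is why the services need to be down while it runs.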
I ran the full vacuum during engine setup, just before the queries started: 2019-06-14 18:10:03,375+0200 DEBUG otopi.plugins.otopi.dialog.human human.queryString:159 query DWH_VACUUM_FULL 2019-06-14 18:10:03,375+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Perform full vacuum on the oVirt engine history 2019-06-14 18:10:03,375+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND database ovirt_engine_history@localhost? 2019-06-14 18:10:03,376+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND This operation may take a while depending on this setup health and the 2019-06-14 18:10:03,376+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND configuration of the db vacuum process. 2019-06-14 18:10:03,376+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND See https://www.postgresql.org/docs/9.0/static/sql-vacuum.html 2019-06-14 18:10:03,376+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND (Yes, No) [No]: 2019-06-14 18:10:12,267+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:RECEIVE Yes 2019-06-14 18:10:12,269+0200 DEBUG otopi.context context.dumpEnvironment:731 ENVIRONMENT DUMP - BEGIN 2019-06-14 18:10:12,269+0200 DEBUG otopi.context context.dumpEnvironment:741 ENV OVESETUP_DB/dwhVacuumFull=bool:'True' 2019-06-14 18:10:12,270+0200 DEBUG otopi.context context.dumpEnvironment:741 ENV QUESTION/1/DWH_VACUUM_FULL=str:'yes' 2019-06-14 18:10:12,270+0200 DEBUG otopi.context context.dumpEnvironment:745 ENVIRONMENT DUMP - END 2019-06-14 18:10:12,271+0200 DEBUG otopi.context context._executeMethod:127 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_e_database 2019-06-14 18:10:12,274+0200 DEBUG otopi.context context._executeMethod:127 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_s_engine 2019-06-14 18:10:12,274+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND 2019-06-14 18:10:12,275+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND --== OVIRT ENGINE CONFIGURATION ==-- 2019-06-14 18:10:12,275+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND 2019-06-14 18:10:12,277+0200 DEBUG otopi.context context._executeMethod:127 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaa.Plugin._customization 2019-06-14 18:10:12,278+0200 DEBUG otopi.context context._executeMethod:136 otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaa.Plugin._customization condition False 2019-06-14 18:10:12,280+0200 DEBUG otopi.context context._executeMethod:127 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.config.appmode.Plugin._customization_enable 2019-06-14 18:10:12,281+0200 DEBUG otopi.context context._executeMethod:136 otopi.plugins.ovirt_engine_setup.ovirt_engine.config.appmode.Plugin._customization_enable condition False 2019-06-14 18:10:12,283+0200 DEBUG otopi.context context._executeMethod:127 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.db.vacuum.Plugin._customization 2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human human.queryString:159 query ENGINE_VACUUM_FULL 2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Perform full vacuum on the engine database 
engine@localhost?
2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND This operation may take a while depending on this setup health and the
2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND configuration of the db vacuum process.
2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND See https://www.postgresql.org/docs/10/sql-vacuum.html
2019-06-14 18:10:12,284+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND (Yes, No) [No]:
2019-06-14 18:10:15,617+0200 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:RECEIVE Yes

Do you think an additional full vacuum would address this?
(In reply to oliver.albl from comment #3)
> I ran the full vacuum during engine setup, just before the queries started:

Sorry, I didn't notice this; I checked only the last setup log.

> Do you think an additional full vacuum would address this?

No :-(.

Are you sure, in your analysis above, that the mentioned queries indeed ran that long? I am asking because we realized that dwhd does not close connections to the database in some cases, and this can make the last query it ran still appear as running although it is not anymore. See bug 1673808.

If you are not certain, perhaps try to log all queries at the PostgreSQL level, to see which queries actually took a long time. See e.g.: https://stackoverflow.com/questions/12670745/can-i-log-query-execution-time-in-postgresql-8-4 . Thanks :-)
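As a minimal sketch of that kind of logging (assumes superuser access; the 5s threshold is arbitrary):

  ALTER SYSTEM SET log_min_duration_statement = '5s';
  SELECT pg_reload_conf();

Every statement slower than the threshold then gets its duration written to the PostgreSQL server log, which should make it clear whether the dashboard query really runs for hours or only looks stuck in pg_stat_activity.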
Yes, I am sure. The query, which always starts again after its completion, is currently running for more than 35 hours, and the corresponding pid continuously consumes cpu cycles:

ovirt_engine_history=# select pid,query_start,now()-query_start,query from pg_stat_activity where state = 'active' order by query_start;
pid | query_start | ?column? | query
-------+-------------------------------+-----------------------+------------------------------------------------------------------------------------------------------------------------------------
11900 | 2019-06-24 23:57:29.009306+02 | 1 day 10:50:49.001449 | SELECT name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS previous_used, MAX(previous_total) AS previous_total, MAX(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size), 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AVG(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS previous_total_size FROM (SELEC
Here is the "top" result showing PID and CPU usage:

top - 10:54:40 up 10 days, 17 min, 2 users, load average: 3.35, 3.33, 2.97
Tasks: 457 total, 7 running, 449 sleeping, 0 stopped, 1 zombie
%Cpu(s): 21.6 us, 1.0 sy, 0.0 ni, 77.3 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 13186128+total, 17253100 free, 40479864 used, 74128320 buff/cache
KiB Swap: 8257532 total, 8257532 free, 0 used. 69029696 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11900 postgres 20 0 22.0g 15.5g 14.2g R 100.0 12.4 14210:15 postmaster
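To tie the busy PID from top back to its SQL, something like this can be used (a sketch; 11900 is the PID shown above):

  SELECT pid, now() - query_start AS runtime, state, wait_event_type, left(query, 120)
  FROM pg_stat_activity
  WHERE pid = 11900;

An empty wait_event_type means the backend is actually executing rather than waiting on a lock, which matches the 100% CPU seen in top.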
(In reply to oliver.albl from comment #5) > Yes, I am sure. The query, that always starts again after its completion is > currently running for more than 35 hours, the corresponding pid continuously > consumes cpu cycles: > > ovirt_engine_history=# select pid,query_start,now()-query_start,query from > pg_stat_activity where state = 'active' order by query_start; > pid | query_start | ?column? | query > -------+-------------------------------+-----------------------+------------- > ----------------------------------------------------------------------------- > ------------------------------------------ > 11900 | 2019-06-24 23:57:29.009306+02 | 1 day 10:50:49.001449 | SELECT > name, AVG(total) AS total, AVG(used) AS used, MAX(previous_used) AS > previous_used, MAX(previous_total) AS previous_total, MA > X(vm_disk_usage_percent) AS usage_percent FROM (SELECT history_datetime, > vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, > SUM(vm_disk_actual_size_mb) AS used, COALESCE(MAX(previous_actual_size) > , 0) AS previous_used, COALESCE(MAX(previous_total_size), 0) AS > previous_total, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE > (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * > 100) END AS vm_disk_usage_percent FROM > v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN > v4_2_latest_configuration_vms_devices devices ON devices.device_id = > vm_disk_id INNER J > OIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER > JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = > samples.vm_disk_id LEFT OUTER JOIN (SELECT vm_name, AV > G(previous_actual_size) AS previous_actual_size, AVG(previous_total_size) AS > previous_total_size FROM (SELEC Just to make sure - is this the behavior you see (taking a very long time) also if you try to run it manually with psql? If so, we can try further analyzing this using 'explain'. Please run 'explain $query' with the query that you see that takes a lot of time and check/share the output. If you only see it partially (as above), you can (very likely) find the original in one of the files in [1]. Thanks. [1] https://github.com/oVirt/ovirt-engine/tree/master/frontend/webadmin/modules/frontend/src/main/resources/org/ovirt/engine/ui/frontend/server/dashboard/dao
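Note that pg_stat_activity cuts the statement off at track_activity_query_size (1024 bytes by default), which is why only the beginning is visible; the full text has to be taken from the dao files above. A sketch for capturing the plan to a file once you have it (the file name is just an example):

  \o /tmp/dashboard_query_plan.txt
  EXPLAIN <full dashboard query pasted here>;
  \o

Alternatively, raising track_activity_query_size (ALTER SYSTEM SET track_activity_query_size = 16384; plus a PostgreSQL restart) would make pg_stat_activity show the whole statement next time.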
I do not see the complete query to run explain plan, but the used views seem to be quite "complex", e.g. ovirt_engine_history=# \d+ v4_2_latest_configuration_vms_devices View "public.v4_2_latest_configuration_vms_devices" Column | Type | Modifiers | Storage | Description ------------------------------+--------------------------+-----------+----------+------------- history_id | integer | | plain | vm_id | uuid | | plain | device_id | uuid | | plain | type | character varying(30) | | extended | address | character varying(255) | | extended | is_managed | boolean | | plain | is_plugged | boolean | | plain | is_readonly | boolean | | plain | vm_configuration_version | integer | | plain | device_configuration_version | integer | | plain | create_date | timestamp with time zone | | plain | update_date | timestamp with time zone | | plain | View definition: SELECT vm_device_history.history_id, vm_device_history.vm_id, vm_device_history.device_id, vm_device_history.type, vm_device_history.address, vm_device_history.is_managed, vm_device_history.is_plugged, vm_device_history.is_readonly, vm_device_history.vm_configuration_version, vm_device_history.device_configuration_version, vm_device_history.create_date, vm_device_history.update_date FROM vm_device_history WHERE (vm_device_history.history_id IN ( SELECT max(a.history_id) AS max FROM vm_device_history a GROUP BY a.vm_id, a.device_id)) AND vm_device_history.delete_date IS NULL; ovirt_engine_history=# select count(*) from vm_device_history; count --------- 3050662 Some basic or partial plans: # explain SELECT * FROM v4_2_latest_configuration_vms_devices WHERE device_id = 'fdba1dd3-a5d0-4067-8021-b6880ef9d761'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=95993.00..97677.57 rows=38 width=657) -> HashAggregate (cost=95992.57..95994.57 rows=200 width=4) Group Key: max(a.history_id) -> Finalize HashAggregate (cost=95092.57..95492.57 rows=40000 width=36) Group Key: a.vm_id, a.device_id -> Gather (cost=86092.57..94492.57 rows=80000 width=36) Workers Planned: 2 -> Partial HashAggregate (cost=85092.57..85492.57 rows=40000 width=36) Group Key: a.vm_id, a.device_id -> Parallel Seq Scan on vm_device_history a (cost=0.00..75559.61 rows=1271061 width=36) -> Index Scan using vm_device_history_pkey on vm_device_history (cost=0.43..8.41 rows=1 width=657) Index Cond: (history_id = (max(a.history_id))) Filter: ((delete_date IS NULL) AND (device_id = 'fdba1dd3-a5d0-4067-8021-b6880ef9d761'::uuid)) (13 rows) ovirt_engine_history=# explain SELECT history_datetime, vms.vm_name AS name, SUM(vm_disk_size_mb) AS total, SUM(vm_disk_actual_size_mb) AS used, CASE WHEN MAX(vm_disk_size_mb) = 0 THEN 0 ELSE (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) END AS vm_disk_usage_percent FROM v4_2_statistics_vms_disks_resources_usage_samples samples INNER JOIN v4_2_latest_configuration_vms_devices devices ON devices.device_id = vm_disk_id INNER JOIN v4_2_latest_configuration_vms vms ON vms.vm_id = devices.vm_id INNER JOIN v4_2_latest_configuration_vms_disks disks ON disks.vm_disk_id = samples.vm_disk_id GROUP BY 1,2; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- GroupAggregate (cost=167037.85..167079.07 rows=970 width=572) Group Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Sort (cost=167037.85..167040.27 
rows=970 width=532) Sort Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Nested Loop (cost=163041.29..166989.73 rows=970 width=532) -> Nested Loop (cost=163040.73..166710.47 rows=1 width=552) Join Filter: (vm_device_history.history_id = (max(a_2.history_id))) -> Nested Loop (cost=67048.16..70713.40 rows=1 width=556) Join Filter: (vm_disk_configuration.vm_disk_id = vm_device_history.device_id) -> Nested Loop (cost=30743.22..32414.25 rows=1 width=20) -> HashAggregate (cost=30742.79..30744.79 rows=200 width=4) Group Key: max(a.history_id) -> Finalize GroupAggregate (cost=30735.29..30740.29 rows=200 width=20) Group Key: a.vm_disk_id -> Sort (cost=30735.29..30736.29 rows=400 width=20) Sort Key: a.vm_disk_id -> Gather (cost=30676.00..30718.00 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=29676.00..29678.00 rows=200 width=20) Group Key: a.vm_disk_id -> Parallel Seq Scan on vm_disk_configuration a (cost=0.00..26859.00 rows=563400 width=20) -> Index Scan using vm_disk_configuration_pkey on vm_disk_configuration (cost=0.43..8.34 rows=1 width=24) Index Cond: (history_id = (max(a.history_id))) Filter: (delete_date IS NULL) -> Nested Loop (cost=36304.94..38298.21 rows=76 width=536) -> Nested Loop (cost=36304.51..37971.64 rows=1 width=532) -> HashAggregate (cost=36304.08..36306.08 rows=200 width=4) Group Key: max(a_1.history_id) -> Finalize GroupAggregate (cost=36296.58..36301.58 rows=200 width=20) Group Key: a_1.vm_id -> Sort (cost=36296.58..36297.58 rows=400 width=20) Sort Key: a_1.vm_id -> Gather (cost=36237.29..36279.29 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=35237.29..35239.29 rows=200 width=20) Group Key: a_1.vm_id -> Parallel Seq Scan on vm_configuration a_1 (cost=0.00..32972.86 rows=452886 width=20) -> Index Scan using vm_configuration_pkey on vm_configuration (cost=0.43..8.32 rows=1 width=536) Index Cond: (history_id = (max(a_1.history_id))) Filter: (delete_date IS NULL) -> Index Scan using idx_vm_device_history_vm_id_type on vm_device_history (cost=0.43..325.81 rows=76 width=36) Index Cond: (vm_id = vm_configuration.vm_id) Filter: (delete_date IS NULL) -> HashAggregate (cost=95992.57..95994.57 rows=200 width=4) Group Key: max(a_2.history_id) -> Finalize HashAggregate (cost=95092.57..95492.57 rows=40000 width=36) Group Key: a_2.vm_id, a_2.device_id -> Gather (cost=86092.57..94492.57 rows=80000 width=36) Workers Planned: 2 -> Partial HashAggregate (cost=85092.57..85492.57 rows=40000 width=36) Group Key: a_2.vm_id, a_2.device_id -> Parallel Seq Scan on vm_device_history a_2 (cost=0.00..75559.61 rows=1271061 width=36) -> Index Scan using vm_disk_samples_history_vm_disk_id_idx on vm_disk_samples_history (cost=0.56..228.40 rows=5085 width=28) Index Cond: (vm_disk_id = vm_device_history.device_id) (54 rows)
Here is the complete plan: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=335189.54..335189.55 rows=1 width=676) -> Sort (cost=335189.54..335189.55 rows=1 width=676) Sort Key: (avg(prev_avg.used)) DESC -> GroupAggregate (cost=335189.50..335189.53 rows=1 width=676) Group Key: prev_avg.name -> Sort (cost=335189.50..335189.50 rows=1 width=628) Sort Key: prev_avg.name -> Subquery Scan on prev_avg (cost=335189.43..335189.49 rows=1 width=628) -> GroupAggregate (cost=335189.43..335189.48 rows=1 width=636) Group Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Sort (cost=335189.43..335189.43 rows=1 width=596) Sort Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Hash Join (cost=331586.16..335189.42 rows=1 width=596) Hash Cond: (vm_device_history.vm_id = vm_configuration.vm_id) -> Nested Loop (cost=126738.03..130341.21 rows=18 width=32) Join Filter: (vm_device_history.device_id = vm_disk_samples_history.vm_disk_id) -> Nested Loop (cost=30743.66..32567.45 rows=1 width=48) Join Filter: (vm_disk_configuration.vm_disk_id = vm_disk_samples_history.vm_disk_id) -> Nested Loop (cost=30743.22..32414.25 rows=1 width=20) -> HashAggregate (cost=30742.79..30744.79 rows=200 width=4) Group Key: max(a.history_id) -> Finalize GroupAggregate (cost=30735.29..30740.29 rows=200 width=20) Group Key: a.vm_disk_id -> Sort (cost=30735.29..30736.29 rows=400 width=20) Sort Key: a.vm_disk_id -> Gather (cost=30676.00..30718.00 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=29676.00..29678.00 rows=200 width=20) Group Key: a.vm_disk_id -> Parallel Seq Scan on vm_disk_configuration a (cost=0.00..26859.00 rows=563400 width=20) -> Index Scan using vm_disk_configuration_pkey on vm_disk_configuration (cost=0.43..8.34 rows=1 width=24) Index Cond: (history_id = (max(a.history_id))) Filter: (delete_date IS NULL) -> Index Scan using idx_vm_disk_history_datetime_samples on vm_disk_samples_history (cost=0.45..126.00 rows=2177 width=28) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (history_datetime < CURRENT_TIMESTAMP)) -> Nested Loop (cost=95994.36..97678.43 rows=7626 width=32) -> HashAggregate (cost=95993.93..95995.93 rows=200 width=4) Group Key: max(a_1.history_id) -> Finalize HashAggregate (cost=95093.93..95493.93 rows=40000 width=36) Group Key: a_1.vm_id, a_1.device_id -> Gather (cost=86093.93..94493.93 rows=80000 width=36) Workers Planned: 2 -> Partial HashAggregate (cost=85093.93..85493.93 rows=40000 width=36) Group Key: a_1.vm_id, a_1.device_id -> Parallel Seq Scan on vm_device_history a_1 (cost=0.00..75560.82 rows=1271082 width=36) -> Index Scan using vm_device_history_pkey on vm_device_history (cost=0.43..8.41 rows=1 width=36) Index Cond: (history_id = (max(a_1.history_id))) Filter: (delete_date IS NULL) -> Hash (cost=204848.12..204848.12 rows=1 width=596) -> Nested Loop Left Join (cost=203180.94..204848.12 rows=1 width=596) Join Filter: ((previous_sum.vm_name)::text = (vm_configuration.vm_name)::text) -> Nested Loop (cost=36304.51..37971.64 rows=1 width=532) -> HashAggregate (cost=36304.08..36306.08 rows=200 width=4) Group Key: max(a_2.history_id) -> Finalize GroupAggregate (cost=36296.58..36301.58 rows=200 width=20) Group Key: a_2.vm_id -> Sort 
(cost=36296.58..36297.58 rows=400 width=20) Sort Key: a_2.vm_id -> Gather (cost=36237.29..36279.29 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=35237.29..35239.29 rows=200 width=20) Group Key: a_2.vm_id -> Parallel Seq Scan on vm_configuration a_2 (cost=0.00..32972.86 rows=452886 width=20) -> Index Scan using vm_configuration_pkey on vm_configuration (cost=0.43..8.32 rows=1 width=536) Index Cond: (history_id = (max(a_2.history_id))) Filter: (delete_date IS NULL) -> GroupAggregate (cost=166876.43..166876.45 rows=1 width=580) Group Key: previous_sum.vm_name -> Sort (cost=166876.43..166876.43 rows=1 width=532) Sort Key: previous_sum.vm_name -> Subquery Scan on previous_sum (cost=166876.38..166876.42 rows=1 width=532) -> GroupAggregate (cost=166876.38..166876.41 rows=1 width=540) Group Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Sort (cost=166876.38..166876.39 rows=1 width=532) Sort Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Nested Loop (cost=163202.59..166876.37 rows=1 width=532) Join Filter: (vm_device_history_1.history_id = (max(a_5.history_id))) -> Nested Loop (cost=67208.66..70877.94 rows=1 width=536) -> Nested Loop (cost=67048.16..70713.40 rows=1 width=556) Join Filter: (vm_disk_configuration_1.vm_disk_id = vm_device_history_1.device_id) -> Nested Loop (cost=30743.22..32414.25 rows=1 width=20) -> HashAggregate (cost=30742.79..30744.79 rows=200 width=4) Group Key: max(a_3.history_id) -> Finalize GroupAggregate (cost=30735.29..30740.29 rows=200 width=20) Group Key: a_3.vm_disk_id -> Sort (cost=30735.29..30736.29 rows=400 width=20) Sort Key: a_3.vm_disk_id -> Gather (cost=30676.00..30718.00 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=29676.00..29678.00 rows=200 width=20) Group Key: a_3.vm_disk_id -> Parallel Seq Scan on vm_disk_configuration a_3 (cost=0.00..26859.00 rows=563400 width=20) -> Index Scan using vm_disk_configuration_pkey on vm_disk_configuration vm_disk_configuration_1 (cost=0.43..8.34 rows=1 width=24) Index Cond: (history_id = (max(a_3.history_id))) Filter: (delete_date IS NULL) -> Nested Loop (cost=36304.94..38298.21 rows=76 width=536) -> Nested Loop (cost=36304.51..37971.64 rows=1 width=532) -> HashAggregate (cost=36304.08..36306.08 rows=200 width=4) Group Key: max(a_4.history_id) -> Finalize GroupAggregate (cost=36296.58..36301.58 rows=200 width=20) Group Key: a_4.vm_id -> Sort (cost=36296.58..36297.58 rows=400 width=20) Sort Key: a_4.vm_id -> Gather (cost=36237.29..36279.29 rows=400 width=20) Workers Planned: 2 -> Partial HashAggregate (cost=35237.29..35239.29 rows=200 width=20) Group Key: a_4.vm_id -> Parallel Seq Scan on vm_configuration a_4 (cost=0.00..32972.86 rows=452886 width=20) -> Index Scan using vm_configuration_pkey on vm_configuration vm_configuration_1 (cost=0.43..8.32 rows=1 width=536) Index Cond: (history_id = (max(a_4.history_id))) Filter: (delete_date IS NULL) -> Index Scan using idx_vm_device_history_vm_id_type on vm_device_history vm_device_history_1 (cost=0.43..325.81 rows=76 width=36) Index Cond: (vm_id = vm_configuration_1.vm_id) Filter: (delete_date IS NULL) -> Bitmap Heap Scan on vm_disk_samples_history vm_disk_samples_history_1 (cost=160.50..164.53 rows=1 width=28) Recheck Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (vm_disk_id = vm_device_history_1.device_id)) -> BitmapAnd (cost=160.50..160.50 rows=1 width=0) -> Bitmap Index Scan 
on idx_vm_disk_history_datetime_samples (cost=0.00..78.10 rows=2565 width=0) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval))) -> Bitmap Index Scan on vm_disk_samples_history_vm_disk_id_idx (cost=0.00..81.51 rows=5085 width=0) Index Cond: (vm_disk_id = vm_device_history_1.device_id) -> HashAggregate (cost=95993.93..95995.93 rows=200 width=4) Group Key: max(a_5.history_id) -> Finalize HashAggregate (cost=95093.93..95493.93 rows=40000 width=36) Group Key: a_5.vm_id, a_5.device_id -> Gather (cost=86093.93..94493.93 rows=80000 width=36) Workers Planned: 2 -> Partial HashAggregate (cost=85093.93..85493.93 rows=40000 width=36) Group Key: a_5.vm_id, a_5.device_id -> Parallel Seq Scan on vm_device_history a_5 (cost=0.00..75560.82 rows=1271082 width=36) (130 rows)
Roy, can you please have a look? Thanks.
I believe I am seeing something similar. I have upgraded from a hosted engine running RHV 4.2.8.7 to RHV 4.3.4.3. This was a very small environment with very little activity. The upgrade was completed per the upgrade guide, and we ran into zero issues and zero error messages during the process. I did notice the shift from Postgres 9.* to 10. Data warehouse was set up and running before the upgrade.

After the upgrade, when I log into the web UI, I get the error "Could not fetch dashboard data. Please ensure that data warehouse is properly installed and configured" in the dashboard pane. dwhd is running, and there do not seem to be any glaring errors in the logs. I am not 100% sure this is the exact problem, as I do not have postgresql queries taking a long time, nor do I have any CPU consumers.
Lynn, could you please provide logs from RHV Manager using sos logcollector and attach them to the bug?
*** Bug 1723872 has been marked as a duplicate of this bug. ***
The complete explain doesn't show how much time it took to run it - do you have the numbers?

Oliver, can you try to backup/restore the ovirt_engine_history DB and then run this query? I'm trying to rule out bad behaviour of the pg_upgrade tool. We also need to increase the pg logging verbosity to uncover more info while these queries execute.
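One possible way to do the copy with plain pg_dump/pg_restore, as a sketch only (run as the postgres user; on RHV/oVirt 4.3 the PostgreSQL 10 binaries may live under an SCL path, and the copy's name here is arbitrary):

  pg_dump -Fc -f /tmp/ovirt_engine_history.dump ovirt_engine_history
  createdb ovirt_engine_history_copy
  pg_restore -d ovirt_engine_history_copy /tmp/ovirt_engine_history.dump

Running EXPLAIN (or the query itself) against the restored copy would show whether freshly loaded tables behave differently from the pg_upgrade'd ones.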
The query runs for more than 48 hours and starts again as soon as it finishes. Unfortunately, a restore/recovery of ovirt_engine_history is currently not possible.
Please have a look at bug 1673808 which lands in 4.3.5 - it is now resolved and verified and it may very well fix a related issue.
Might be related but the problem I see was introduced when I upgraded our installation from 4.2.8 to 4.3.4, so first seen in 4.3. And my problem is not the vacuum, but the recurring active SQL query running for days every time.
Hi, Should be the same issue. Please try upgrading to 4.3.5.
(In reply to oliver.albl from comment #18)
> Might be related but the problem I see was introduced when I upgraded our
> installation from 4.2.8 to 4.3.4, so first seen in 4.3. And my problem is
> not the vacuum, but the recurring active SQL query running for days every
> time.

Actually it won't solve it, because the query is 'active' - it means it is executing the whole time, just slowly. Oliver, is it right that the query is always active?

Focusing on the move from PG 9.5 -> 10, what would help is to compare the EXPLAIN of this query from both versions. It is possible that the PG 10 planner is abusing parallelism for this case. QE owner and/or Oliver, please help supply an EXPLAIN on postgres 9.5 (hence 4.2) to compare it with what's here. Bear in mind to have the same amount of data.

Oliver, can you try to disable parallelism in pg and rerun the explain? To disable, do this in psql:

ovirt_engine_history=> SET max_parallel_workers_per_gather = 0;
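I.e., in the same psql session, something like (a sketch; <query> stands for the full dashboard query):

  SET max_parallel_workers_per_gather = 0;
  EXPLAIN <query>;
  RESET max_parallel_workers_per_gather;

The setting only applies to the session it was issued in, so the EXPLAIN has to run in that same session.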
ovirt_engine_history=# select * from pg_settings where name = 'max_parallel_workers_per_gather'; name | setting | unit | category | short_desc | extra_desc | context | vartype | source | min_val | max_val | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart ---------------------------------+---------+------+----------------------------------------+------------------------------------------------------------------+------------+---------+---------+---------+---------+---------+----------+----------+-----------+------------+------------+----------------- max_parallel_workers_per_gather | 2 | | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node. | | user | integer | default | 0 | 1024 | | 2 | 2 | | | f (1 row) ovirt_engine_history=# explain SELECT ovirt_engine_history-# name, ovirt_engine_history-# AVG(total) AS total, ovirt_engine_history-# AVG(used) AS used, ovirt_engine_history-# MAX(previous_used) AS previous_used, ovirt_engine_history-# MAX(previous_total) AS previous_total, ovirt_engine_history-# MAX(vm_disk_usage_percent) AS usage_percent ovirt_engine_history-# FROM ovirt_engine_history-# ( ovirt_engine_history(# SELECT ovirt_engine_history(# history_datetime, ovirt_engine_history(# vms.vm_name AS name, ovirt_engine_history(# SUM(vm_disk_size_mb) AS total, ovirt_engine_history(# SUM(vm_disk_actual_size_mb) AS used, ovirt_engine_history(# COALESCE(MAX(previous_actual_size), 0) AS previous_used, ovirt_engine_history(# COALESCE(MAX(previous_total_size), 0) AS previous_total, ovirt_engine_history(# CASE WHEN ovirt_engine_history(# MAX(vm_disk_size_mb) = 0 ovirt_engine_history(# THEN ovirt_engine_history(# 0 ovirt_engine_history(# ELSE ovirt_engine_history(# (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) ovirt_engine_history(# END AS vm_disk_usage_percent ovirt_engine_history(# FROM ovirt_engine_history(# v4_2_statistics_vms_disks_resources_usage_samples samples ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_devices devices ovirt_engine_history(# ON ovirt_engine_history(# devices.device_id = vm_disk_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms vms ovirt_engine_history(# ON ovirt_engine_history(# vms.vm_id = devices.vm_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_disks disks ovirt_engine_history(# ON ovirt_engine_history(# disks.vm_disk_id = samples.vm_disk_id ovirt_engine_history(# LEFT OUTER JOIN ovirt_engine_history(# ( ovirt_engine_history(# SELECT ovirt_engine_history(# vm_name, ovirt_engine_history(# AVG(previous_actual_size) AS previous_actual_size, ovirt_engine_history(# AVG(previous_total_size) AS previous_total_size ovirt_engine_history(# FROM ovirt_engine_history(# ( ovirt_engine_history(# SELECT ovirt_engine_history(# history_datetime, ovirt_engine_history(# vm_name, ovirt_engine_history(# SUM(vm_disk_actual_size_mb) AS previous_actual_size, ovirt_engine_history(# SUM(vm_disk_size_mb) AS previous_total_size ovirt_engine_history(# FROM ovirt_engine_history(# v4_2_statistics_vms_disks_resources_usage_samples samples ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_devices devices ovirt_engine_history(# ON ovirt_engine_history(# devices.device_id = vm_disk_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms vms ovirt_engine_history(# ON ovirt_engine_history(# vms.vm_id = 
devices.vm_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_disks disks ovirt_engine_history(# ON ovirt_engine_history(# disks.vm_disk_id = samples.vm_disk_id ovirt_engine_history(# WHERE ovirt_engine_history(# history_datetime >= (CURRENT_TIMESTAMP - INTERVAL '10 minute') AND ovirt_engine_history(# history_datetime < (CURRENT_TIMESTAMP - INTERVAL '5 minute') ovirt_engine_history(# GROUP BY ovirt_engine_history(# history_datetime, vm_name ovirt_engine_history(# ) AS previous_sum ovirt_engine_history(# GROUP BY vm_name ovirt_engine_history(# ) AS previous_trend ovirt_engine_history(# ON ovirt_engine_history(# previous_trend.vm_name = vms.vm_name ovirt_engine_history(# WHERE ovirt_engine_history(# history_datetime >= (CURRENT_TIMESTAMP - INTERVAL '5 minute') AND ovirt_engine_history(# history_datetime < CURRENT_TIMESTAMP ovirt_engine_history(# GROUP BY ovirt_engine_history(# history_datetime, vms.vm_name ovirt_engine_history(# ) AS prev_avg ovirt_engine_history-# GROUP BY ovirt_engine_history-# name ovirt_engine_history-# ORDER BY ovirt_engine_history-# used DESC ovirt_engine_history-# LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1612829.56..1612829.59 rows=10 width=181) -> Sort (cost=1612829.56..1612830.06 rows=200 width=181) Sort Key: (avg((sum(vm_disk_samples_history.vm_disk_actual_size_mb)))) DESC -> HashAggregate (cost=1612822.24..1612825.24 rows=200 width=181) Group Key: vm_configuration.vm_name -> GroupAggregate (cost=1576756.97..1600385.94 rows=497452 width=141) Group Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Sort (cost=1576756.97..1578000.60 rows=497452 width=101) Sort Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Hash Join (cost=1364206.56..1502482.07 rows=497452 width=101) Hash Cond: (vm_configuration.history_id = "ANY_subquery".max) -> Hash Left Join (cost=1307371.94..1437501.71 rows=994900 width=105) Hash Cond: ((vm_configuration.vm_name)::text = (previous_trend.vm_name)::text) -> Nested Loop (cost=494510.60..622028.26 rows=994900 width=41) -> Hash Join (cost=494510.17..569156.00 rows=85837 width=32) Hash Cond: (vm_disk_configuration.history_id = "ANY_subquery_1".max) -> Nested Loop (cost=436751.70..509991.95 rows=171674 width=36) -> Hash Join (cost=436751.27..493097.59 rows=27776 width=60) Hash Cond: (vm_disk_samples_history.vm_disk_id = vm_device_history.device_id) -> Bitmap Heap Scan on vm_disk_samples_history (cost=581.48..18787.98 rows=5175 width=28) Recheck Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (history_datetime < CURRENT_TIMESTAMP)) -> Bitmap Index Scan on idx_vm_disk_history_datetime_samples (cost=0.00..580.19 rows=5175 width=0) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (history_datetime < CURRENT_TIMESTAMP)) -> Hash (cost=414142.66..414142.66 rows=1139130 width=32) -> Nested Loop (cost=412536.75..414142.66 rows=1139130 width=32) -> HashAggregate (cost=412536.32..412538.32 rows=200 width=4) Group Key: max(a.history_id) -> Finalize GroupAggregate (cost=293857.97..407900.68 rows=370851 width=36) Group Key: a.vm_id, a.device_id -> Gather Merge (cost=293857.97..398629.41 rows=741702 width=36) Workers Planned: 2 -> Partial GroupAggregate 
(cost=292857.95..312018.57 rows=370851 width=36) Group Key: a.vm_id, a.device_id -> Sort (cost=292857.95..296720.98 rows=1545211 width=36) Sort Key: a.vm_id, a.device_id -> Parallel Seq Scan on vm_device_history a (cost=0.00..91763.11 rows=1545211 width=36) -> Index Scan using vm_device_history_pkey on vm_device_history (cost=0.43..8.43 rows=1 width=36) Index Cond: (history_id = (max(a.history_id))) Filter: (delete_date IS NULL) -> Index Scan using vm_disk_configuration_vm_disk_id_idx on vm_disk_configuration (cost=0.43..0.58 rows=3 width=24) Index Cond: (vm_disk_id = vm_device_history.device_id) Filter: (delete_date IS NULL) -> Hash (cost=57755.97..57755.97 rows=200 width=4) -> HashAggregate (cost=57753.97..57755.97 rows=200 width=4) Group Key: "ANY_subquery_1".max -> Subquery Scan on "ANY_subquery_1" (cost=50679.59..56967.93 rows=314417 width=4) -> HashAggregate (cost=50679.59..53823.76 rows=314417 width=20) Group Key: a_1.vm_disk_id -> Seq Scan on vm_disk_configuration a_1 (cost=0.00..42471.06 rows=1641706 width=20) -> Index Scan using vm_configuration_vm_id_idx on vm_configuration (cost=0.43..0.59 rows=3 width=41) Index Cond: (vm_id = vm_device_history.vm_id) Filter: (delete_date IS NULL) -> Hash (cost=812858.85..812858.85 rows=200 width=85) -> Subquery Scan on previous_trend (cost=812853.85..812858.85 rows=200 width=85) -> HashAggregate (cost=812853.85..812856.85 rows=200 width=85) Group Key: vm_configuration_1.vm_name -> GroupAggregate (cost=786100.17..801149.11 rows=668842 width=45) Group Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Sort (cost=786100.17..787772.27 rows=668842 width=37) Sort Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Merge Join (cost=692516.87..703094.34 rows=668842 width=37) Merge Cond: (vm_device_history_1.vm_id = vm_configuration_1.vm_id) -> Sort (cost=601513.44..601801.97 rows=115412 width=32) Sort Key: vm_device_history_1.vm_id -> Hash Join (cost=494708.45..591809.35 rows=115412 width=32) Hash Cond: (vm_disk_configuration_1.history_id = "ANY_subquery_2".max) -> Nested Loop (cost=436949.98..532161.00 rows=230823 width=36) -> Hash Join (cost=436949.55..509445.82 rows=37346 width=60) Hash Cond: (vm_disk_samples_history_1.vm_disk_id = vm_device_history_1.device_id) -> Bitmap Heap Scan on vm_disk_samples_history vm_disk_samples_history_1 (cost=779.76..24679.57 rows=6958 width=28) Recheck Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval))) -> Bitmap Index Scan on idx_vm_disk_history_datetime_samples (cost=0.00..778.02 rows=6958 width=0) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval))) -> Hash (cost=414142.66..414142.66 rows=1139130 width=32) -> Nested Loop (cost=412536.75..414142.66 rows=1139130 width=32) -> HashAggregate (cost=412536.32..412538.32 rows=200 width=4) Group Key: max(a_2.history_id) -> Finalize GroupAggregate (cost=293857.97..407900.68 rows=370851 width=36) Group Key: a_2.vm_id, a_2.device_id -> Gather Merge (cost=293857.97..398629.41 rows=741702 width=36) Workers Planned: 2 -> Partial GroupAggregate (cost=292857.95..312018.57 rows=370851 width=36) Group Key: a_2.vm_id, a_2.device_id -> Sort (cost=292857.95..296720.98 rows=1545211 width=36) Sort Key: a_2.vm_id, a_2.device_id -> Parallel Seq Scan on vm_device_history a_2 (cost=0.00..91763.11 rows=1545211 width=36) -> Index Scan using 
vm_device_history_pkey on vm_device_history vm_device_history_1 (cost=0.43..8.43 rows=1 width=36) Index Cond: (history_id = (max(a_2.history_id))) Filter: (delete_date IS NULL) -> Index Scan using vm_disk_configuration_vm_disk_id_idx on vm_disk_configuration vm_disk_configuration_1 (cost=0.43..0.58 rows=3 width=24) Index Cond: (vm_disk_id = vm_device_history_1.device_id) Filter: (delete_date IS NULL) -> Hash (cost=57755.97..57755.97 rows=200 width=4) -> HashAggregate (cost=57753.97..57755.97 rows=200 width=4) Group Key: "ANY_subquery_2".max -> Subquery Scan on "ANY_subquery_2" (cost=50679.59..56967.93 rows=314417 width=4) -> HashAggregate (cost=50679.59..53823.76 rows=314417 width=20) Group Key: a_3.vm_disk_id -> Seq Scan on vm_disk_configuration a_3 (cost=0.00..42471.06 rows=1641706 width=20) -> Sort (cost=90991.11..91873.95 rows=353134 width=37) Sort Key: vm_configuration_1.vm_id -> Nested Loop (cost=56830.54..58450.07 rows=353134 width=37) -> HashAggregate (cost=56830.11..56832.11 rows=200 width=4) Group Key: max(a_4.history_id) -> HashAggregate (cost=51372.29..53797.99 rows=242570 width=20) Group Key: a_4.vm_id -> Seq Scan on vm_configuration a_4 (cost=0.00..45179.86 rows=1238486 width=20) -> Index Scan using vm_configuration_pkey on vm_configuration vm_configuration_1 (cost=0.43..8.34 rows=1 width=41) Index Cond: (history_id = (max(a_4.history_id))) Filter: (delete_date IS NULL) -> Hash (cost=56832.11..56832.11 rows=200 width=4) -> HashAggregate (cost=56830.11..56832.11 rows=200 width=4) Group Key: "ANY_subquery".max -> Subquery Scan on "ANY_subquery" (cost=51372.29..56223.69 rows=242570 width=4) -> HashAggregate (cost=51372.29..53797.99 rows=242570 width=20) Group Key: a_5.vm_id -> Seq Scan on vm_configuration a_5 (cost=0.00..45179.86 rows=1238486 width=20) (117 rows) ovirt_engine_history=# set max_parallel_workers_per_gather = 0; SET ovirt_engine_history=# select * from pg_settings where name = 'max_parallel_workers_per_gather'; name | setting | unit | category | short_desc | extra_desc | context | vartype | source | min_val | max_val | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart ---------------------------------+---------+------+----------------------------------------+------------------------------------------------------------------+------------+---------+---------+---------+---------+---------+----------+----------+-----------+------------+------------+----------------- max_parallel_workers_per_gather | 0 | | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node. 
| | user | integer | session | 0 | 1024 | | 2 | 2 | | | f (1 row) ovirt_engine_history=# explain SELECT ovirt_engine_history-# name, ovirt_engine_history-# AVG(total) AS total, ovirt_engine_history-# AVG(used) AS used, ovirt_engine_history-# MAX(previous_used) AS previous_used, ovirt_engine_history-# MAX(previous_total) AS previous_total, ovirt_engine_history-# MAX(vm_disk_usage_percent) AS usage_percent ovirt_engine_history-# FROM ovirt_engine_history-# ( ovirt_engine_history(# SELECT ovirt_engine_history(# history_datetime, ovirt_engine_history(# vms.vm_name AS name, ovirt_engine_history(# SUM(vm_disk_size_mb) AS total, ovirt_engine_history(# SUM(vm_disk_actual_size_mb) AS used, ovirt_engine_history(# COALESCE(MAX(previous_actual_size), 0) AS previous_used, ovirt_engine_history(# COALESCE(MAX(previous_total_size), 0) AS previous_total, ovirt_engine_history(# CASE WHEN ovirt_engine_history(# MAX(vm_disk_size_mb) = 0 ovirt_engine_history(# THEN ovirt_engine_history(# 0 ovirt_engine_history(# ELSE ovirt_engine_history(# (CAST(MAX(vm_disk_actual_size_mb) AS DECIMAL)/ MAX(vm_disk_size_mb) * 100) ovirt_engine_history(# END AS vm_disk_usage_percent ovirt_engine_history(# FROM ovirt_engine_history(# v4_2_statistics_vms_disks_resources_usage_samples samples ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_devices devices ovirt_engine_history(# ON ovirt_engine_history(# devices.device_id = vm_disk_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms vms ovirt_engine_history(# ON ovirt_engine_history(# vms.vm_id = devices.vm_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_disks disks ovirt_engine_history(# ON ovirt_engine_history(# disks.vm_disk_id = samples.vm_disk_id ovirt_engine_history(# LEFT OUTER JOIN ovirt_engine_history(# ( ovirt_engine_history(# SELECT ovirt_engine_history(# vm_name, ovirt_engine_history(# AVG(previous_actual_size) AS previous_actual_size, ovirt_engine_history(# AVG(previous_total_size) AS
previous_total_size ovirt_engine_history(# FROM ovirt_engine_history(# ( ovirt_engine_history(# SELECT ovirt_engine_history(# history_datetime, ovirt_engine_history(# vm_name, ovirt_engine_history(# SUM(vm_disk_actual_size_mb) AS previous_actual_size, ovirt_engine_history(# SUM(vm_disk_size_mb) AS previous_total_size ovirt_engine_history(# FROM ovirt_engine_history(# v4_2_statistics_vms_disks_resources_usage_samples samples ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_devices devices ovirt_engine_history(# ON ovirt_engine_history(# devices.device_id = vm_disk_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms vms ovirt_engine_history(# ON ovirt_engine_history(# vms.vm_id = devices.vm_id ovirt_engine_history(# INNER JOIN ovirt_engine_history(# v4_2_latest_configuration_vms_disks disks ovirt_engine_history(# ON ovirt_engine_history(# disks.vm_disk_id = samples.vm_disk_id ovirt_engine_history(# WHERE ovirt_engine_history(# history_datetime >= (CURRENT_TIMESTAMP - INTERVAL '10 minute') AND ovirt_engine_history(# history_datetime < (CURRENT_TIMESTAMP - INTERVAL '5 minute') ovirt_engine_history(# GROUP BY ovirt_engine_history(# history_datetime, vm_name ovirt_engine_history(# ) AS previous_sum ovirt_engine_history(# GROUP BY vm_name ovirt_engine_history(# ) AS previous_trend ovirt_engine_history(# ON ovirt_engine_history(# previous_trend.vm_name = vms.vm_name ovirt_engine_history(# WHERE ovirt_engine_history(# history_datetime >= (CURRENT_TIMESTAMP - INTERVAL '5 minute') AND ovirt_engine_history(# history_datetime < CURRENT_TIMESTAMP ovirt_engine_history(# GROUP BY ovirt_engine_history(# history_datetime, vms.vm_name ovirt_engine_history(# ) AS prev_avg ovirt_engine_history-# GROUP BY ovirt_engine_history-# name ovirt_engine_history-# ORDER BY ovirt_engine_history-# used DESC ovirt_engine_history-# LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=2088866.44..2088866.47 rows=10 width=181) -> Sort (cost=2088866.44..2088866.94 rows=200 width=181) Sort Key: (avg((sum(vm_disk_samples_history.vm_disk_actual_size_mb)))) DESC -> HashAggregate (cost=2088859.12..2088862.12 rows=200 width=181) Group Key: vm_configuration.vm_name -> GroupAggregate (cost=2056863.27..2077826.07 rows=441322 width=141) Group Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Sort (cost=2056863.27..2057966.58 rows=441322 width=101) Sort Key: vm_disk_samples_history.history_datetime, vm_configuration.vm_name -> Hash Left Join (cost=1868814.73..1991350.09 rows=441322 width=101) Hash Cond: ((vm_configuration.vm_name)::text = (previous_trend.vm_name)::text) -> Hash Join (cost=803616.30..924992.98 rows=441322 width=37) Hash Cond: (vm_configuration.history_id = "ANY_subquery".max) -> Nested Loop (cost=746781.69..860931.72 rows=882643 width=41) -> Hash Join (cost=746781.26..814025.05 rows=76152 width=32) Hash Cond: (vm_disk_configuration.history_id = "ANY_subquery_1".max) -> Nested Loop (cost=689022.79..755019.58 rows=152304 width=36) -> Hash Join (cost=689022.36..740031.43 rows=24642 width=60) Hash Cond: (vm_disk_samples_history.vm_disk_id = vm_device_history.device_id) -> Bitmap Heap Scan on vm_disk_samples_history (cost=515.50..16809.83 rows=4591 width=28) Recheck Cond: ((history_datetime >= 
(CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (history_datetime < CURRENT_TIMESTAMP)) -> Bitmap Index Scan on idx_vm_disk_history_datetime_samples (cost=0.00..514.35 rows=4591 width=0) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:05:00'::interval)) AND (history_datetime < CURRENT_TIMESTAMP)) -> Hash (cost=666479.74..666479.74 rows=1139130 width=32) -> Nested Loop (cost=664873.83..666479.74 rows=1139130 width=32) -> HashAggregate (cost=664873.40..664875.40 rows=200 width=4) Group Key: max(a.history_id) -> GroupAggregate (cost=619444.19..660237.76 rows=370851 width=36) Group Key: a.vm_id, a.device_id -> Sort (cost=619444.19..628715.46 rows=3708506 width=36) Sort Key: a.vm_id, a.device_id -> Seq Scan on vm_device_history a (cost=0.00..113396.06 rows=3708506 width=36) -> Index Scan using vm_device_history_pkey on vm_device_history (cost=0.43..8.43 rows=1 width=36) Index Cond: (history_id = (max(a.history_id))) Filter: (delete_date IS NULL) -> Index Scan using vm_disk_configuration_vm_disk_id_idx on vm_disk_configuration (cost=0.43..0.58 rows=3 width=24) Index Cond: (vm_disk_id = vm_device_history.device_id) Filter: (delete_date IS NULL) -> Hash (cost=57755.97..57755.97 rows=200 width=4) -> HashAggregate (cost=57753.97..57755.97 rows=200 width=4) Group Key: "ANY_subquery_1".max -> Subquery Scan on "ANY_subquery_1" (cost=50679.59..56967.93 rows=314417 width=4) -> HashAggregate (cost=50679.59..53823.76 rows=314417 width=20) Group Key: a_1.vm_disk_id -> Seq Scan on vm_disk_configuration a_1 (cost=0.00..42471.06 rows=1641706 width=20) -> Index Scan using vm_configuration_vm_id_idx on vm_configuration (cost=0.43..0.59 rows=3 width=41) Index Cond: (vm_id = vm_device_history.vm_id) Filter: (delete_date IS NULL) -> Hash (cost=56832.11..56832.11 rows=200 width=4) -> HashAggregate (cost=56830.11..56832.11 rows=200 width=4) Group Key: "ANY_subquery".max -> Subquery Scan on "ANY_subquery" (cost=51372.29..56223.69 rows=242570 width=4) -> HashAggregate (cost=51372.29..53797.99 rows=242570 width=20) Group Key: a_2.vm_id -> Seq Scan on vm_configuration a_2 (cost=0.00..45179.86 rows=1238486 width=20) -> Hash (cost=1065195.92..1065195.92 rows=200 width=85) -> Subquery Scan on previous_trend (cost=1065190.92..1065195.92 rows=200 width=85) -> HashAggregate (cost=1065190.92..1065193.92 rows=200 width=85) Group Key: vm_configuration_1.vm_name -> GroupAggregate (cost=1038437.24..1053486.19 rows=668842 width=45) Group Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Sort (cost=1038437.24..1040109.35 rows=668842 width=37) Sort Key: vm_disk_samples_history_1.history_datetime, vm_configuration_1.vm_name -> Merge Join (cost=944853.95..955431.41 rows=668842 width=37) Merge Cond: (vm_device_history_1.vm_id = vm_configuration_1.vm_id) -> Sort (cost=853850.51..854139.04 rows=115412 width=32) Sort Key: vm_device_history_1.vm_id -> Hash Join (cost=747045.53..844146.42 rows=115412 width=32) Hash Cond: (vm_disk_configuration_1.history_id = "ANY_subquery_2".max) -> Nested Loop (cost=689287.05..784498.08 rows=230823 width=36) -> Hash Join (cost=689286.63..761782.90 rows=37346 width=60) Hash Cond: (vm_disk_samples_history_1.vm_disk_id = vm_device_history_1.device_id) -> Bitmap Heap Scan on vm_disk_samples_history vm_disk_samples_history_1 (cost=779.76..24679.57 rows=6958 width=28) Recheck Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval))) -> Bitmap Index Scan on idx_vm_disk_history_datetime_samples 
(cost=0.00..778.02 rows=6958 width=0) Index Cond: ((history_datetime >= (CURRENT_TIMESTAMP - '00:10:00'::interval)) AND (history_datetime < (CURRENT_TIMESTAMP - '00:05:00'::interval))) -> Hash (cost=666479.74..666479.74 rows=1139130 width=32) -> Nested Loop (cost=664873.83..666479.74 rows=1139130 width=32) -> HashAggregate (cost=664873.40..664875.40 rows=200 width=4) Group Key: max(a_3.history_id) -> GroupAggregate (cost=619444.19..660237.76 rows=370851 width=36) Group Key: a_3.vm_id, a_3.device_id -> Sort (cost=619444.19..628715.46 rows=3708506 width=36) Sort Key: a_3.vm_id, a_3.device_id -> Seq Scan on vm_device_history a_3 (cost=0.00..113396.06 rows=3708506 width=36) -> Index Scan using vm_device_history_pkey on vm_device_history vm_device_history_1 (cost=0.43..8.43 rows=1 width=36) Index Cond: (history_id = (max(a_3.history_id))) Filter: (delete_date IS NULL) -> Index Scan using vm_disk_configuration_vm_disk_id_idx on vm_disk_configuration vm_disk_configuration_1 (cost=0.43..0.58 rows=3 width=24) Index Cond: (vm_disk_id = vm_device_history_1.device_id) Filter: (delete_date IS NULL) -> Hash (cost=57755.97..57755.97 rows=200 width=4) -> HashAggregate (cost=57753.97..57755.97 rows=200 width=4) Group Key: "ANY_subquery_2".max -> Subquery Scan on "ANY_subquery_2" (cost=50679.59..56967.93 rows=314417 width=4) -> HashAggregate (cost=50679.59..53823.76 rows=314417 width=20) Group Key: a_4.vm_disk_id -> Seq Scan on vm_disk_configuration a_4 (cost=0.00..42471.06 rows=1641706 width=20) -> Sort (cost=90991.11..91873.95 rows=353134 width=37) Sort Key: vm_configuration_1.vm_id -> Nested Loop (cost=56830.54..58450.07 rows=353134 width=37) -> HashAggregate (cost=56830.11..56832.11 rows=200 width=4) Group Key: max(a_5.history_id) -> HashAggregate (cost=51372.29..53797.99 rows=242570 width=20) Group Key: a_5.vm_id -> Seq Scan on vm_configuration a_5 (cost=0.00..45179.86 rows=1238486 width=20) -> Index Scan using vm_configuration_pkey on vm_configuration vm_configuration_1 (cost=0.43..8.34 rows=1 width=41) Index Cond: (history_id = (max(a_5.history_id))) Filter: (delete_date IS NULL) (109 rows)
After setting all cluster compatibility levels to 4.3 (which was not possible before because >500 VMs were previewing snapshots), I no longer see this query.
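For anyone verifying this on their own setup, a quick check (a sketch only; the 5-minute threshold is arbitrary, and the query text is only visible when run as a superuser or the database owner) is to look for long-running active statements in pg_stat_activity on the DWH database:

SELECT pid,
       now() - query_start AS runtime,
       state,
       left(query, 80) AS query_text
FROM pg_stat_activity
WHERE datname = 'ovirt_engine_history'
  AND state = 'active'
  AND now() - query_start > INTERVAL '5 minute'
ORDER BY runtime DESC;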
Shirly, why would the cluster compatibility level change anything?
Not sure. The latest RHV should have the fixes for the DWH open connections. Perhaps they had not upgraded to the latest before.
Moving to QE to retest with 4.4
Tested on: ovirt-engine-4.4.0-0.33.master.el8ev.noarch

Steps:
1. Upgrade an engine from 4.3 to 4.4

Results: Dashboards displayed as expected; no stuck queries (even a few days after the upgrade).
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.