Description of problem:
After upgrading to 4.3 and updating the cluster, the VM tab is extremely slow until the VMs are restarted.

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.8.9-0.1.el7ev.noarch
ovirt-engine-4.3.10.4-0.1.el7.noarch

How reproducible:
100% so far

Steps to Reproduce:
1. Upgrade from 4.2 to 4.3, with multiple VMs running
2. Upgrade the cluster version from 4.2 to 4.3
3. Observe that the VMs need restarting for the config update to take effect

Actual results:
The VM tab takes around 10 seconds to load (62 VMs). Via the API, the request takes over 1 minute. This gets worse with more VMs.

Expected results:
Performance should be similar to before (~1 sec via GUI or API).

Additional info:
While restarting the VMs is required and does resolve the issue, it's not always practical to restart all VMs immediately. This can mean that GUI performance is degraded for quite some time (until a significant number of VMs have been rebooted).
Summarizing an offline discussion about this:
1. The additional database queries that were added to retrieve the fields that changed in the next-run configuration compared to the current configuration probably contribute significantly to the overhead of the search query.
2. While eliminating those additional database queries seems possible, we would still need to execute an additional database query per VM and parse the OVF on every refresh.
3. In-memory caching has downsides: (a) we would need to keep it in sync; and (b) it increases memory consumption.
So avoiding the computation of the changed fields on refreshes, by retrieving them from the database instead, makes sense.
Regarding the reproducer steps for scale, it seems like we'd need to:
1. Create a 4.2 engine and a host with 120 VMs on that 1 host.
2. Update the engine only to 4.3.
3. Edit the cluster that has the 120 VMs running at 4.2 and set its compatibility version to 4.3.
4. Click the VM tab. Scale will generate a trace of all SQL statements related to the VM tab UI action, along with engine utilization.
5. While the 4.2 VMs are still running (without being rebooted), bring down the 4.3 engine and upgrade it to the fixed-in version, 4.3.11.
6. Click the VM tab again to generate a trace of all SQL statements related to the VM tab UI.

Questions:
- Is the expected result a reduction in the SQL statements issued when viewing the VM tab?
- A reduction in engine CPU utilization for the VM tab view?
- Are there specific queries we shouldn't see in the trace once we've upgraded to 4.3.11?

Do you agree with the above?
Yes, that's correct. As for the questions: most importantly, we should see a reduction in CPU utilization by the engine (both at the database level and at the Java level). With the fix, we should also see a much lower number of database queries on the 'snapshots' table, possibly none at all.
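One way to check a captured trace against these expectations is to count the statements that reference snapshot-related functions or tables. The following is a minimal sketch, assuming the trace is available as a plain list of SQL statement strings; the actual trace format is not specified in this bug, and the function name `count_snapshot_calls` is illustrative.

```python
import re
from collections import Counter

# Matches any identifier that contains "snapshot", e.g. a stored function
# name or the snapshots table itself.
IDENTIFIER_WITH_SNAPSHOT = re.compile(r"\b\w*snapshot\w*\b", re.IGNORECASE)


def count_snapshot_calls(statements):
    """Count statements per snapshot-related identifier (lowercased).

    Only the first matching identifier in each statement is counted.
    """
    counts = Counter()
    for statement in statements:
        match = IDENTIFIER_WITH_SNAPSHOT.search(statement)
        if match:
            counts[match.group(0).lower()] += 1
    return counts


if __name__ == "__main__":
    # Illustrative statements only, not taken from a real trace.
    sample = [
        "select 1",
        "select * from getsnapshotbyvmidandtype(?, ?)",
        "select * from getsnapshotbysnapshotid(?)",
    ]
    for name, count in sorted(count_snapshot_calls(sample).items()):
        print(name, count)
```

Running the same count on the traces taken before and after the upgrade gives a direct comparison of snapshot-related query volume.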
Comparing results across RHV versions:
1. rhv-release-4.2.13-2 (baseline)
2. rhv-release-4.3.10-7 (bad version, as reported in this BZ)
3. rhv-release-4.3.11-4 (fix version)

Environment:
VM count: 180
API command used for this test:
> curl -k -u admin@1 https://rhev-green-01./ovirt-engine/api/vms

1. Baseline cycle on rhv-release-4.2.13-2
   a. API call using the curl command took 0m0.657s
   b. CPU utilization was normal
2. Problematic version, rhv-release-4.3.10-7
   a. API call using the curl command took 0m44.725s
   b. From the GUI (Compute > VMs), loading took around 10 seconds
   c. CPU utilization was 95%, which is very high
   d. Java and postmaster were consuming 75% CPU
3. Fix version, rhv-release-4.3.11-4
   a. API call using the curl command took 1.1 sec
   b. CPU utilization was normal
   c. Java and postmaster were consuming 5% CPU, which is normal

Notes:
1. With the fix, total JDBC time was 260.3 ms across 435 queries; the only snapshot-related query was the call to delete_entity_snapshot_by_command_id.
2. Without the fix, total JDBC time was 10.9 sec across 4215 queries; the snapshot-related queries were getsnapshotbysnapshotid, getsnapshotbyvmidandtype, and getsnapshotsbyvmsnapshotid, with a total execution count of 540.
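For a quick sense of scale, the ratios implied by these measurements can be derived directly. The sketch below only does arithmetic on the numbers reported in this comment; the variable names are illustrative.

```python
VM_COUNT = 180

# API call duration in seconds, per tested release.
api_seconds = {"4.2.13-2": 0.657, "4.3.10-7": 44.725, "4.3.11-4": 1.1}

# JDBC totals reported in the notes (260.3 ms is written as 0.2603 s).
jdbc_without_fix = {"total_seconds": 10.9, "queries": 4215, "snapshot_calls": 540}
jdbc_with_fix = {"total_seconds": 0.2603, "queries": 435}

# How much slower the API call was on the bad version than on the baseline.
slowdown_vs_baseline = api_seconds["4.3.10-7"] / api_seconds["4.2.13-2"]

# How the fixed version compares with the baseline.
fixed_vs_baseline = api_seconds["4.3.11-4"] / api_seconds["4.2.13-2"]

# Snapshot-related calls issued per VM without the fix.
snapshot_calls_per_vm = jdbc_without_fix["snapshot_calls"] / VM_COUNT

# Reduction factors in query count and JDBC time brought by the fix.
query_reduction = jdbc_without_fix["queries"] / jdbc_with_fix["queries"]
jdbc_time_reduction = (
    jdbc_without_fix["total_seconds"] / jdbc_with_fix["total_seconds"]
)

if __name__ == "__main__":
    print(f"API slowdown on 4.3.10-7 vs baseline: {slowdown_vs_baseline:.1f}x")
    print(f"API time on 4.3.11-4 vs baseline: {fixed_vs_baseline:.2f}x")
    print(f"Snapshot calls per VM without the fix: {snapshot_calls_per_vm:.1f}")
    print(f"Query count reduction with the fix: {query_reduction:.2f}x")
    print(f"JDBC time reduction with the fix: {jdbc_time_reduction:.1f}x")
```

On these numbers, the 540 snapshot calls come to 3 per VM, which lines up with the three snapshot functions listed in note 2, and the API call on 4.3.10-7 is roughly 68 times slower than the baseline.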
*** Bug 1877120 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Virtualization Engine security, bug fix 4.3.11), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4112