Created attachment 1079381
Numbers of command executions for 100 calls with 10 vms to /api/vms

Description of problem:
When calling the /api/vms endpoint, the command execution count grows linearly with the number of VMs in the datacenter, and so does the number of database queries. The attached screenshot shows that GetGraphicsDevices is called N times when there are N VMs in the cluster.

Version-Release number of selected component (if applicable):

How reproducible:
To reproduce the numbers in the screenshot, create 10 VMs and query /api/vms one hundred times with a maximum of 10 requests in parallel.

Steps to Reproduce:
1. Create 10 VMs.
2. Run:
   seq 1 100 | parallel -j 10 curl -H "Accept: application/json" -H "Content-type: application/json" -X GET --user admin@internal:engine http://localhost:8080/ovirt-engine/api/vms

Actual results:
SearchVm and GetVmsInit are each called exactly 100 times, as expected. GetGraphicsDevices, however, is called 1000 times. That is 10 times per request, which corresponds to the number of VMs.

Expected results:
Do a bulk query for GetGraphicsDevices and/or hit a cache after the first query.

Additional info:
Note that the average execution time for GetGraphicsDevices went up to 138 ms under this load. When not under stress, the execution time is only about 8 ms on my notebook. So when querying a datacenter with 1000 VMs, a single call takes at least 8 seconds longer than necessary on my notebook (1000 calls x 8 ms).
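To illustrate the difference between the per-VM pattern and the bulk query suggested under "Expected results", here is a minimal sketch in Python against an in-memory SQLite database. It is not the engine's code: the three-column vm_device table is a simplified stand-in, and the function names graphics_per_vm and graphics_bulk are hypothetical. It only shows that both approaches return the same data, with N queries versus one.

```python
import sqlite3

# Simplified stand-in for the engine's vm_device table (assumption: the real
# table has more columns and lives in PostgreSQL, not SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vm_device (vm_id INTEGER, type TEXT, device TEXT)")
conn.executemany(
    "INSERT INTO vm_device VALUES (?, 'graphics', 'spice')",
    [(vm_id,) for vm_id in range(10)],
)


def graphics_per_vm(conn, vm_ids):
    """One query per VM: N queries for N VMs (the pattern seen in the report)."""
    queries = 0
    result = {}
    for vm_id in vm_ids:
        rows = conn.execute(
            "SELECT device FROM vm_device WHERE vm_id = ? AND type = 'graphics'",
            (vm_id,),
        ).fetchall()
        queries += 1
        result[vm_id] = [row[0] for row in rows]
    return result, queries


def graphics_bulk(conn, vm_ids):
    """One query for all VMs; rows are grouped by vm_id in memory afterwards."""
    placeholders = ",".join("?" * len(vm_ids))
    rows = conn.execute(
        "SELECT vm_id, device FROM vm_device "
        f"WHERE type = 'graphics' AND vm_id IN ({placeholders})",
        list(vm_ids),
    ).fetchall()
    result = {vm_id: [] for vm_id in vm_ids}
    for vm_id, device in rows:
        result[vm_id].append(device)
    return result, 1


vm_ids = list(range(10))
per_vm, n_queries = graphics_per_vm(conn, vm_ids)
bulk, one_query = graphics_bulk(conn, vm_ids)
```

With 10 VMs the first function issues 10 queries per request and the second issues one, which is the ratio visible in the attached numbers (1000 GetGraphicsDevices calls for 100 requests).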
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Additional info: the table vm_device is missing the right index to make the query 'GetVmDeviceByVmIdTypeAndDevice' fast. So consider adding

> CREATE UNIQUE INDEX idx_combined_vmid_type_device
>   ON vm_device
>   USING btree
>   (vm_id, type, device);

Note that this might not give much of an additional boost, because the db function also uses the permission_view and I did not check whether the indices there are set up correctly. That might be worth a look too.
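One way to check whether a combined index like this is actually picked up is to compare the query plan before and after creating it. The sketch below does that in Python with SQLite's EXPLAIN QUERY PLAN, purely as an illustration: the engine uses PostgreSQL (where EXPLAIN would be the equivalent), the table here is a simplified three-column stand-in, SQLite has no "USING btree" clause, and the SELECT is an assumed approximation of what GetVmDeviceByVmIdTypeAndDevice filters on.

```python
import sqlite3

# Simplified stand-in table (assumption: the real vm_device has more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vm_device (vm_id INTEGER, type TEXT, device TEXT)")
conn.executemany(
    "INSERT INTO vm_device VALUES (?, 'graphics', 'spice')",
    [(vm_id,) for vm_id in range(100)],
)

# Assumed shape of the lookup: equality on vm_id, type and device.
QUERY = "SELECT * FROM vm_device WHERE vm_id = ? AND type = ? AND device = ?"
PARAMS = (42, "graphics", "spice")


def plan(conn):
    """Return the human-readable detail column of each query plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no index yet: full table scan

conn.execute(
    "CREATE UNIQUE INDEX idx_combined_vmid_type_device "
    "ON vm_device (vm_id, type, device)"
)

plan_after = plan(conn)  # the planner can now search via the index
```

Here plan_before reports a scan of vm_device and plan_after names idx_combined_vmid_type_device. The same before/after comparison on the real database would also show whether permission_view dominates the cost.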
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Verified with:
Engine: 4.0.2.4-0.1.el7ev
Host:
OS Version: RHEL - 7.2 - 9.el7_2.1
Kernel Version: 3.10.0 - 327.22.2.el7.x86_64
KVM Version: 2.3.0 - 31.el7_2.21
LIBVIRT Version: libvirt-1.2.17-13.el7_2.5
VDSM Version: vdsm-4.18.5.1-1.el7ev
SPICE Version: 0.12.4 - 15.el7_2.1

Steps:
1. Create 100 VMs on an engine with the latest version (4.0.2.4).
   Create 100 VMs on an engine without the fix (3.6.8.1).
2. Query the VM status on both engines:
   eval date +%s >> time_to_exe.out ; seq 1 100 | parallel -j 10 curl -H "Accept: application/json" -H "Content-type: application/json" -X GET --user admin@internal:engine <engine FQDN>/ovirt-engine/api/vms >> test.out; eval date +%s >> time_to_exe.out
3. Compare results (time_to_exe.out file):
   without the fix: 113 sec
   with the fix: 68 sec

PASS
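For reference, the two timings above work out as follows. This is only arithmetic on the reported numbers (113 sec and 68 sec for 100 requests against 100 VMs); the variable names are illustrative.

```python
# Wall-clock totals reported in the verification run (100 requests, 100 VMs).
without_fix_s = 113
with_fix_s = 68

saved_s = without_fix_s - with_fix_s                    # seconds saved overall
reduction_pct = round(100 * saved_s / without_fix_s, 1)  # relative reduction
speedup = round(without_fix_s / with_fix_s, 2)           # old time / new time

print(f"saved {saved_s} s, {reduction_pct}% less time, {speedup}x faster")
# prints: saved 45 s, 39.8% less time, 1.66x faster
```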
*** Bug 1194291 has been marked as a duplicate of this bug. ***