Hide Forgot
Description of problem: When clicking on "New VM" in the "Virtual machines" the pop up show only a spinning circle for many minutes. Version-Release number of selected component (if applicable): RHEV-M 3.5.7 How reproducible: 100% Steps to Reproduce: 1. Create 1000 templates 2. Create a new VM Actual results: Then "New VM" window shows the spinning circle for minutes. Expected results: The pop up shows the form immediately. Additional info: Webadmin calls GetVmTemplatesByStoragePoolIdQuery in order to populate the form. It is very expensive operation. It gets the list of templates by GetVmTemplatesByStoragePoolId stored procedure which is very fast, but then it tries to get the related disks for each of the templates and that is very expensive. The query is started for each of the templates. select * from getdisksvmguid('UUID', 'f' NULL, 'f') In this case it will be called 1000 times. It is very expensive as it involves view all_disks_including_snapshots which is complicated on its own. Here is an example of one of the SQL commands: SELECT all_disks_including_snapshots.* FROM all_disks_including_snapshots LEFT JOIN vm_device ON vm_device.device_id = all_disks_including_snapshots.image_group_id AND (NOT 'f' OR is_plugged) WHERE vm_device.vm_id = '03c6b305-9a14-4038-a613-b8adde9e291b' AND ((vm_device.snapshot_id IS NULL AND all_disks_including_snapshots.active IS NOT FALSE) OR vm_device.snapshot_id = all_disks_including_snapshots.vm_snapshot_id) AND (NOT 'f' OR EXISTS (SELECT 1 FROM user_disk_permissions_view WHERE user_id is NULL AND entity_id = all_disks_including_snapshots.disk_id)); The execution may take more than one second. We should load the information about the disks on demand or optimise the SQL.
Created attachment 1127228 [details] explain of the SQL command
we don't need disks before you choose a particular template, right?
No, we don't need it in advance - it is enough to have it only for the selected one.
Product Management has reviewed and declined this request. You may appeal this decision by reopening this request.
this bug still reproduced in 3.6.5.1, the spining ring at least not running forever. the dialog box became available after ~20-30 sec. and seems like getdisksvmguid has been called several time serially while the dialog loading. any chance to execute them in parallel? moreover each one of the execution takes around 5 sec. is getstorage_domains_by_storagepoolid executed by this flow as well? further profiler session required. here we have the pg log while we hit the diablog box for new templates. LOG: duration: 5995.970 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5857.623 ms execute <unnamed>: select * from getstorage_domains_by_storagepoolid($1, $2, $3) LOG: duration: 5905.632 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5607.681 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5619.016 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5926.344 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5957.711 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5897.029 ms execute <unnamed>: select * from getstorage_domains_by_storagepoolid($1, $2, $3) LOG: duration: 5942.826 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5644.935 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5976.697 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5897.564 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 5877.607 ms execute S_31: select * from getstorage_domains_by_storagepoolid($1, $2, $3) LOG: duration: 6021.145 ms execute <unnamed>: select * from getdisksvmguid($1, $2, $3, $4) LOG: duration: 6105.581 ms execute <unnamed>: select * from getstorage_domains_by_storage_pool_id_with_permitted_action($1, $2, $3) LOG: duration: 6477.493 ms execute S_29: select * from getstorage_domains_by_storagepoolid($1, $2, $3) LOG: duration: 5572.327 ms execute S_32: select * from getstorage_domains_by_storagepoolid($1, $2, $3)
(In reply to Eldad Marciano from comment #6) > this bug still reproduced in 3.6.5.1, the spining ring at least not running > forever. > the dialog box became available after ~20-30 sec. How many VMs, disks, templates, storage domains, ... ? > and seems like > getdisksvmguid has been called several time serially while the dialog > loading. > > any chance to execute them in parallel? Why? If it is a high scale env., I'm OK with 20-30 seconds. What is the storage of the DB? Just a plain disk? Move it to something better.
(In reply to Yaniv Kaul from comment #7) > (In reply to Eldad Marciano from comment #6) > > this bug still reproduced in 3.6.5.1, the spining ring at least not running > > forever. > > the dialog box became available after ~20-30 sec. > > How many VMs, disks, templates, storage domains, ... ? overall: vms 10K disks 29.8K templates 400 storage domains 13 for this particular data center (real): vms 3K disks 1 disk per vm 3K templates 400 storage domains 12 > > > and seems like > > getdisksvmguid has been called several time serially while the dialog > > loading. > > > > any chance to execute them in parallel? > > Why? If it is a high scale env., I'm OK with 20-30 seconds. What is the > storage of the DB? Just a plain disk? Move it to something better. Seems like it's inefficent query, we probably can make it faster, by optimize it, without any hardware changes. I need to check with Allon. seems like we already optimize this query.
let's keep it for further investigation for a while. But technically I think it improved enough (no "minutes" anymore) so Roman, I guess you can report that back. What do you think?
(In reply to Michal Skrivanek from comment #9) > let's keep it for further investigation for a while. But technically I think > it improved enough (no "minutes" anymore) so Roman, I guess you can report > that back. What do you think? It was communicated and we are waiting for response.
oVirt 4.0 Alpha has been released, moving to oVirt 4.0 Beta target.
Seems good enough in 3.6.5