Bug 1302752 - [scale] - getdisksvmguid inefficient query, hit the performance
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: Database.Core
Version: 3.6.2
Hardware: x86_64 Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.0.4
Target Release: 4.0.4
Assigned To: Allon Mureinik
QA Contact: Eldad Marciano
Keywords: Performance
Depends On: ovirt_refactor_disk_class_hierarchy
Blocks:
Reported: 2016-01-28 09:32 EST by Eldad Marciano
Modified: 2016-09-26 08:37 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-26 08:37:57 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
amureini: ovirt‑4.0.z?
gklein: blocker?
rule-engine: planning_ack?
rule-engine: devel_ack+
rule-engine: testing_ack+




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 62286 | ovirt-engine-4.0 | MERGED | database: memory_and_disk_images_storage_domain_view | 2016-08-13 07:26 EDT

Description Eldad Marciano 2016-01-28 09:32:43 EST
Description of problem:
engine=> EXPLAIN ANALYZE select * from  getdisksvmguid('fa464094-a656-49dc-8384-465c5308cff7', 't', NULL, 'f');
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Function Scan on getdisksvmguid  (cost=0.00..260.00 rows=1000 width=3170) (actual time=6643.344..6643.345 rows=1 loops=1)
 Total runtime: 6643.390 ms


Version-Release number of selected component (if applicable):
3.6.2

How reproducible:
100%

Steps to Reproduce:
1. Use a loaded engine (500 hosts, 7K VMs with 3 disks each).

Actual results:
The query is slow and drives up CPU usage.

Expected results:
Stable CPU usage and a faster query.

Additional info:
Comment 1 Eldad Marciano 2016-01-28 10:19:32 EST
Removing the "GROUP BY" from all_disks_including_snapshots brings the runtime down from ~6.6 sec to ~1.7 sec.

With that change the same number of rows is returned and the query runs much faster:
"Function Scan on getdisksvmguid  (cost=0.00..260.00 rows=1000 width=3170) (actual time=1735.764..1735.764 rows=1 loops=1)"
"Total runtime: 1735.820 ms"

It still needs improvement, though, because it consumes too much CPU when run in bulk.
Comment 2 Allon Mureinik 2016-01-31 14:13:20 EST
(In reply to Eldad Marciano from comment #1)
> Removing the "GROUP BY" from all_disks_including_snapshots brings the runtime down from ~6.6 sec to ~1.7 sec.
> 
> With that change the same number of rows is returned and the query runs much faster:
> "Function Scan on getdisksvmguid  (cost=0.00..260.00 rows=1000 width=3170) (actual time=1735.764..1735.764 rows=1 loops=1)"
> "Total runtime: 1735.820 ms"
> 
> It still needs improvement, though, because it consumes too much CPU when run in bulk.

Removing the GROUP BY is wrong - it will return multiple entries for templates that have multiple copies.
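
To illustrate the point about multiple copies, here is a minimal sketch using an in-memory SQLite database. The table and column names (images, image_storage_domain_map, and so on) are hypothetical simplifications chosen for the example; they are not the actual oVirt schema or the definition of all_disks_including_snapshots.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real oVirt tables or views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (image_id TEXT PRIMARY KEY, disk_alias TEXT);
    CREATE TABLE image_storage_domain_map (image_id TEXT, storage_domain_id TEXT);
    INSERT INTO images VALUES ('img-1', 'template_disk');
    -- The template disk has copies on two storage domains.
    INSERT INTO image_storage_domain_map VALUES ('img-1', 'sd-a'), ('img-1', 'sd-b');
""")

join = """
    FROM images i
    JOIN image_storage_domain_map m ON m.image_id = i.image_id
"""

# Without GROUP BY: the join yields one row per copy of the image.
rows_without_group_by = conn.execute(
    "SELECT i.image_id, m.storage_domain_id" + join
).fetchall()

# With GROUP BY: one row per image, with the storage domains aggregated.
rows_with_group_by = conn.execute(
    "SELECT i.image_id, group_concat(m.storage_domain_id)" + join
    + " GROUP BY i.image_id"
).fetchall()

print(len(rows_without_group_by))  # 2 rows: the same disk appears twice
print(len(rows_with_group_by))     # 1 row: a single entry for the disk
```

In this toy setup the ungrouped query reports the same disk twice, which is the kind of duplicate entry the GROUP BY guards against.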
Comment 3 Eldad Marciano 2016-04-21 06:20:57 EDT
This bug has a big performance impact.

Any chance of fixing it for the next release?
Comment 4 Allon Mureinik 2016-04-21 08:05:46 EDT
(In reply to Eldad Marciano from comment #3)
> This bug has a big performance impact.
> 
> Any chance of fixing it for the next release?

This mainly depends on the RFE in bug 1142762. Once that's done, we could have a better estimation.
Comment 5 Sandro Bonazzola 2016-05-02 05:53:44 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Comment 6 Yaniv Lavi 2016-05-23 09:16:07 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 7 Yaniv Lavi 2016-05-23 09:21:54 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 8 Allon Mureinik 2016-07-17 06:53:55 EDT
This bug was reported on 3.6, and probably existed way beforehand. This is not a blocker.
Comment 9 Yaniv Kaul 2016-07-17 09:43:23 EDT
(In reply to Allon Mureinik from comment #8)
> This bug was reported on 3.6, and probably existed way beforehand. This is
> not a blocker.

Eldad - do we have the latest on 4.0, just for comparison? (We've changed so much: platform, Postgres, JBoss, Java, disk refactoring in engine... I hope performance is the same as or better than in 3.6!)
Comment 10 Eldad Marciano 2016-07-20 10:10:10 EDT
We are about to start with 4.0 very soon.
Anyway, it seems the SP getdisksvmguid hasn't changed since we discovered this problem, so we will probably face it in 4.0 as well.
We'll update as soon as we have some results.
Comment 12 Eldad Marciano 2016-08-10 15:50:24 EDT
With the patch https://gerrit.ovirt.org/#/c/62044/2 applied,
this SP query runs in ~200 ms.
Comment 13 Allon Mureinik 2016-08-11 08:29:59 EDT
(In reply to Eldad Marciano from comment #0)
> Description of problem:
> engine=> EXPLAIN ANALYZE select * from  getdisksvmguid('fa464094-a656-49dc-8384-465c5308cff7', 't', NULL, 'f');
>                                                         QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------
>  Function Scan on getdisksvmguid  (cost=0.00..260.00 rows=1000 width=3170) (actual time=6643.344..6643.345 rows=1 loops=1)
>  Total runtime: 6643.390 ms

(In reply to Eldad Marciano from comment #12)
> With the patch https://gerrit.ovirt.org/#/c/62044/2 applied,
> this SP query runs in ~200 ms.

So it improves by a factor of ~33, which sounds like a pretty decent improvement to me.
Thanks Eldad!
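
For reference, the factor can be checked from the runtimes quoted in this bug. The sketch below is only arithmetic on those figures; the 200 ms value is the approximate number from comment 12, so the result is a rough estimate, and the speedup helper is a name introduced just for this example.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the 'after' runtime is than the 'before' one."""
    return before_ms / after_ms

original_ms = 6643.390     # Total runtime from the description
no_group_by_ms = 1735.820  # Total runtime from comment 1 (GROUP BY removed)
patched_ms = 200.0         # Approximate runtime from comment 12 ("~200 ms")

print(round(speedup(original_ms, patched_ms), 1))      # 33.2
print(round(speedup(original_ms, no_group_by_ms), 1))  # 3.8
```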
Comment 14 Allon Mureinik 2016-08-11 08:40:12 EDT
Retargeting to 4.0.4 based on this comment.
