Bug 112367 - Persistence is generating overly constrained join conditions which are poorly optimized on PG
Persistence is generating overly constrained join conditions which are poorly...
Status: CLOSED WONTFIX
Product: Red Hat Web Application Framework
Classification: Retired
Component: persistence (Show other bugs)
nightly
All Linux
medium Severity medium
: ---
: ---
Assigned To: ccm-bugs-list
Jon Orris
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2003-12-18 09:29 EST by Daniel Berrange
Modified: 2007-04-18 13:00 EDT (History)
0 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2003-12-18 12:29:04 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Explain plans (3.88 KB, text/plain)
2003-12-18 09:32 EST, Daniel Berrange
no flags Details
Explain plans on PG 7.3 (4.27 KB, text/plain)
2003-12-18 12:08 EST, Daniel Berrange
no flags Details

  None (edit)
Description Daniel Berrange 2003-12-18 09:29:53 EST
Description of problem:
A call to Folder@getIndexItem results int he following query being run:

select ci.item_id as c_1,
       ao.object_type as c_2,
       ao.display_name as c_3,
       ao.default_domain_class as c_4,
       vo.is_deleted as c_5,
       ci.version as c_6,
       ci.name as c_7,
       ci.language as c_8,
       ci.ancestors as c_9
from cms_items ci
     join acs_objects ao on ao.object_id = ci.item_id
     join vc_objects vo on vo.object_id = ci.item_id
     join cms_folders start____cf on start____cf.folder_id = '3202'
where start____cf.index_id = ci.item_id;

Developer support shows this as taking 1608ms, which is clearly far
too long for such a simple query.

Running VACUUM ANALYSE has no effect on reducing execution time.

Here is output of running 'explain analyse' on the query:

 dan_rickshaw=# explain analyse select ci.item_id as c_1,
dan_rickshaw-#        ao.object_type as c_2,
dan_rickshaw-#        ao.display_name as c_3,
dan_rickshaw-#        ao.default_domain_class as c_4,
dan_rickshaw-#        vo.is_deleted as c_5,
dan_rickshaw-#        ci.version as c_6,
dan_rickshaw-#        ci.name as c_7,
dan_rickshaw-#        ci.language as c_8,
dan_rickshaw-#        ci.ancestors as c_9
dan_rickshaw-# from cms_items ci
dan_rickshaw-#      join acs_objects ao on ao.object_id = ci.item_id
dan_rickshaw-#      join vc_objects vo on vo.object_id = ci.item_id
dan_rickshaw-#      join cms_folders start____cf on
start____cf.folder_id = '3202'
dan_rickshaw-# where start____cf.index_id = ci.item_id;
NOTICE:  QUERY PLAN:

Hash Join  (cost=210.19..1822.76 rows=1 width=190) (actual
time=1356.53..1356.53 rows=0 loops=1)
  ->  Hash Join  (cost=207.18..1778.06 rows=8334 width=186) (actual
time=129.51..1339.77 rows=8334 loops=1)
        ->  Merge Join  (cost=0.00..1341.70 rows=8334 width=177)
(actual time=1.71..446.10 rows=8334 loops=1)
              ->  Index Scan using cms_items_pk on cms_items ci 
(cost=0.00..673.52 rows=8334 width=63) (actual time=0.49..133.80
rows=8334 loops=1)
              ->  Index Scan using acs_objects_object_id_p_hhkb1 on
acs_objects ao  (cost=0.00..519.30 rows=9544 width=114) (actual
time=0.47..141.97 rows=9543 loops=1)
        ->  Hash  (cost=186.34..186.34 rows=8334 width=9) (actual
time=126.77..126.77 rows=0 loops=1)
              ->  Seq Scan on vc_objects vo  (cost=0.00..186.34
rows=8334 width=9) (actual time=0.14..98.14 rows=8334 loops=1)
  ->  Hash  (cost=3.01..3.01 rows=1 width=4) (actual time=0.08..0.08
rows=0 loops=1)
        ->  Index Scan using cms_folders_folder_id_p_oglqk on
cms_folders start____cf  (cost=0.00..3.01 rows=1 width=4) (actual
time=0.05..0.06 rows=1 loops=1)
Total runtime: 1357.06 msec


If I rewrite the query to use unconstrainted joins, the query plan
improves dramatically & the execution time is down to 0.85ms.
 
dan_rickshaw=# explain analyse select ci.item_id as c_1,
dan_rickshaw-#        ao.object_type as c_2,
dan_rickshaw-#        ao.display_name as c_3,
dan_rickshaw-#        ao.default_domain_class as c_4,
dan_rickshaw-#        vo.is_deleted as c_5,
dan_rickshaw-#        ci.version as c_6,
dan_rickshaw-#        ci.name as c_7,
dan_rickshaw-#        ci.language as c_8,
dan_rickshaw-#        ci.ancestors as c_9
dan_rickshaw-# from acs_objects ao, vc_objects vo, cms_items ci,
cms_folders cf
dan_rickshaw-#      where cf.folder_id = 3202
dan_rickshaw-#      and ao.object_id = ci.item_id
dan_rickshaw-#      and vo.object_id = ci.item_id
dan_rickshaw-#      and cf.index_id = ci.item_id;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..15.29 rows=1 width=190) (actual
time=0.39..0.39 rows=0 loops=1)
  ->  Nested Loop  (cost=0.00..10.97 rows=1 width=127) (actual
time=0.39..0.39 rows=0 loops=1)
        ->  Nested Loop  (cost=0.00..7.31 rows=1 width=13) (actual
time=0.38..0.38 rows=0 loops=1)
              ->  Index Scan using cms_folders_folder_id_p_oglqk on
cms_folders cf  (cost=0.00..3.01 rows=1 width=4) (actual
time=0.36..0.36 rows=1 loops=1)
              ->  Index Scan using vc_objects_pk on vc_objects vo 
(cost=0.00..4.29 rows=1 width=9) (actual time=0.00..0.00 rows=0 loops=1)
        ->  Index Scan using acs_objects_object_id_p_hhkb1 on
acs_objects ao  (cost=0.00..3.65 rows=1 width=114)
  ->  Index Scan using cms_items_pk on cms_items ci  (cost=0.00..4.31
rows=1 width=63)
Total runtime: 0.85 msec


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:
Takes 1600ms

Expected results:
Takes <10ms

Additional info:

This DB is populated with about 2700 items & a few hundred files & images:

dan_rickshaw=# select count(*) from acs_objects;
 count 
-------
  9544
(1 row)

dan_rickshaw=# select count(*) from vc_objects;
 count 
-------
  8334
(1 row)

dan_rickshaw=# select count(*) from cms_items;
 count 
-------
  8334
(1 row)

dan_rickshaw=# select count(*) from cms_folders;
 count 
-------
   266
(1 row)

dan_rickshaw=# select count(*) from cms_items where type_id is not null;
 count 
-------
  2780
(1 row)

dan_rickshaw=# select count(*) from cms_images;
 count 
-------
    71
(1 row)

dan_rickshaw=# select count(*) from cms_files;
 count 
-------
   392
(1 row)

dan_rickshaw=#
Comment 1 Daniel Berrange 2003-12-18 09:31:40 EST
BZ screws up formatting by wrapping long lines, I'll attach the query
explain plans instead....
Comment 2 Daniel Berrange 2003-12-18 09:32:00 EST
Created attachment 96608 [details]
Explain plans
Comment 3 Rafael H. Schloming 2003-12-18 11:00:40 EST
What version of postgres are you using? I know that older versions of
pg have this problem but more recent versions are supposed to treat
the two different syntaxes as equivalent.
Comment 4 Daniel Berrange 2003-12-18 11:10:43 EST
This is using RHDB 2.1 aka PG 7.2.
I'm going to try out on 7.3 too.

If they are equivalent, why not go for the simpler less constrained
join option anyway? I find the queries much easier to read and
understand when using basic 'from' list with 'and' clauses for each join.
Comment 5 Daniel Berrange 2003-12-18 12:08:15 EST
Have verified on pg 7.3 & performance of both is very similar - and
they have almost identical query plans. See new attachment.
Comment 6 Daniel Berrange 2003-12-18 12:08:36 EST
Created attachment 96612 [details]
Explain plans on PG 7.3
Comment 7 Rafael H. Schloming 2003-12-18 12:25:43 EST
There are a number of reasons to prefer the ANSI syntax:

  1. Although the two syntaxes are supposed to be equivalent in this
case, they are not equivalent in general, for example there is no way
to express outer joins on pg using the syntax in what you are calling
the "less constrained" version of the query.

  2. Even oracle is moving away from the old style syntax since even
using the oracle specific outer join syntax there are certain kinds of
joins you can only express using the ANSI style syntax.

  3. Once you are familiar with the ANSI style join syntax it is much
more readable since you don't need to scan back and forth between the
where clause and the from clause in order to match up join criteria
with joins. This is particularly true of larger queries that use
aliases or queries against unfamiliar schemas since in those cases you
can't use the fact that you have all the table and column names
memorized in order to speed up the matching process.

With that said we can use the old style syntax in those cases where it
is possible (e.g. inner joins), although I don't know if that would be
a complete fix since outer joins may still be very sensetive to join
order on pg 7.2. It's probably best to recommend people upgrade to 7.3
or later if we can.

(As a side note I'm not sure I understand what you mean by the "less
constrained" terminology since both versions of the query include the
exact same join constraints and so I would think that neither version
is any more or less constrained.)
Comment 8 Daniel Berrange 2003-12-18 12:29:04 EST
Yeah, I'd forgotten about outer joins. With that in mind its not
really worth fixing, particularly since we only officially support 7.3
these days.

Note You need to log in before you can comment on or make changes to this bug.