Bug 1336293

Summary: periodic high load on PostgreSQL: SELECT DISTINCT "katello_errata".* FROM "katello_errata"...
Product: Red Hat Satellite Reporter: Jan Hutař <jhutar>
Component: Errata ManagementAssignee: Chris Duryee <cduryee>
Status: CLOSED ERRATA QA Contact: Roman Plevka <rplevka>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.2.0CC: bbuckingham, cwelton, ehelms, jhutar, mmccune, psuriset, riehecky, rplevka
Target Milestone: UnspecifiedKeywords: Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rubygem-katello-3.0.0.44-1,rubygem-katello-3.0.0.53-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-27 11:10:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Hutař 2016-05-16 06:42:41 UTC
Description of problem:
With ~250 content hosts registered and about ~400 applicable erratas in total (only ~3 unique erratas), PostgreSQL gains big load (not sure about trigger) and blocks other actions (for example login). In our case, one run of that query took slightly below 80s on a strong machine.


Version-Release number of selected component (if applicable):
satellite-6.2.0-11.0.el7sat.noarch


How reproducible:
query itself is slow always


Steps to Reproduce:
1. Have 250 content hosts with 400 applicable erratas in total
2. Refresh https://<fqdn>/dashboard (not sure if this is a direct trigger)


Actual results:
It took me >1 minute to load the page


Expected results:
Should be quicker


Additional info:
With logging of long queries set in PostgreSQL config, it says:

==> /var/lib/pgsql/data/pg_log/postgresql-Mon.log <==
LOG:  duration: 77936.817 ms  execute <unnamed>: SELECT  DISTINCT "katello_errata".* FROM "katello_errata" INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facet_errata" "content_facet_errata_katello_errata_join" ON "content_facet_errata_katello_errata_join"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facets" ON "katello_content_facets"."id" = "content_facet_errata_katello_errata_join"."content_facet_id" WHERE "katello_content_facets"."host_id" IN (SELECT "hosts"."id" FROM "hosts"  WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."organization_id" IN (1))  ORDER BY updated desc LIMIT 6

Comment 1 Jan Hutař 2016-05-16 06:43:17 UTC
foreman=# EXPLAIN SELECT  DISTINCT "katello_errata".* FROM "katello_errata" INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facet_errata" "content_facet_errata_katello_errata_join" ON "content_facet_errata_katello_errata_join"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facets" ON "katello_content_facets"."id" = "content_facet_errata_katello_errata_join"."content_facet_id" WHERE "katello_content_facets"."host_id" IN (SELECT "hosts"."id" FROM "hosts"  WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."organization_id" IN (1))  ORDER BY updated desc LIMIT 6;
                                                                                                                                                                                       QUERY PLAN                                            
                                                                                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=50522.02..50529.12 rows=6 width=1269)
   ->  Unique  (cost=50522.02..52115.13 rows=1346 width=1269)
         ->  Sort  (cost=50522.02..50628.22 rows=42483 width=1269)
               Sort Key: katello_errata.updated, katello_errata.id, katello_errata.uuid, katello_errata.errata_id, katello_errata.created_at, katello_errata.updated_at, katello_errata.issued, katello_errata.errata_type, katello_errata.se
verity, katello_errata.title, katello_errata.solution, katello_errata.description, katello_errata.summary, katello_errata.reboot_suggested
               ->  Hash Join  (cost=191.42..209.22 rows=42483 width=1269)
                     Hash Cond: (katello_content_facet_errata.erratum_id = katello_errata.id)
                     ->  Seq Scan on katello_content_facet_errata  (cost=0.00..14.57 rows=357 width=4)
                     ->  Hash  (cost=186.96..186.96 rows=357 width=1273)
                           ->  Hash Join  (cost=162.57..186.96 rows=357 width=1273)
                                 Hash Cond: (content_facet_errata_katello_errata_join.erratum_id = katello_errata.id)
                                 ->  Hash Join  (cost=30.43..49.91 rows=357 width=4)
                                       Hash Cond: (content_facet_errata_katello_errata_join.content_facet_id = katello_content_facets.id)
                                       ->  Seq Scan on katello_content_facet_errata content_facet_errata_katello_errata_join  (cost=0.00..14.57 rows=357 width=8)
                                       ->  Hash  (cost=27.50..27.50 rows=234 width=4)
                                             ->  Hash Join  (cost=11.27..27.50 rows=234 width=4)
                                                   Hash Cond: (hosts.id = katello_content_facets.host_id)
                                                   ->  Seq Scan on hosts  (cost=0.00..12.93 rows=261 width=4)
                                                         Filter: (((type)::text = 'Host::Managed'::text) AND (organization_id = 1))
                                                   ->  Hash  (cost=8.34..8.34 rows=234 width=8)
                                                         ->  Seq Scan on katello_content_facets  (cost=0.00..8.34 rows=234 width=8)
                                 ->  Hash  (cost=123.73..123.73 rows=673 width=1269)
                                       ->  Seq Scan on katello_errata  (cost=0.00..123.73 rows=673 width=1269)
(22 rows)

foreman=# EXPLAIN ANALYZE SELECT  DISTINCT "katello_errata".* FROM "katello_errata" INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facet_errata" "content_facet_errata_katello_errata_join" ON "content_facet_errata_katello_errata_join"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facets" ON "katello_content_facets"."id" = "content_facet_errata_katello_errata_join"."content_facet_id" WHERE "katello_content_facets"."host_id" IN (SELECT "hosts"."id" FROM "hosts"  WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."organization_id" IN (1))  ORDER BY updated desc LIMIT 6;
                                                                                                                                                                                       QUERY PLAN                                            
                                                                                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=50522.02..50529.12 rows=6 width=1269) (actual time=63211.689..80102.616 rows=3 loops=1)
   ->  Unique  (cost=50522.02..52115.13 rows=1346 width=1269) (actual time=63211.689..80102.616 rows=3 loops=1)
         ->  Sort  (cost=50522.02..50628.22 rows=42483 width=1269) (actual time=63211.686..79656.391 rows=42483 loops=1)
               Sort Key: katello_errata.updated, katello_errata.id, katello_errata.uuid, katello_errata.errata_id, katello_errata.created_at, katello_errata.updated_at, katello_errata.issued, katello_errata.errata_type, katello_errata.se
verity, katello_errata.title, katello_errata.solution, katello_errata.description, katello_errata.summary, katello_errata.reboot_suggested
               Sort Method: external merge  Disk: 36304kB
               ->  Hash Join  (cost=191.42..209.22 rows=42483 width=1269) (actual time=1.801..36.698 rows=42483 loops=1)
                     Hash Cond: (katello_content_facet_errata.erratum_id = katello_errata.id)
                     ->  Seq Scan on katello_content_facet_errata  (cost=0.00..14.57 rows=357 width=4) (actual time=0.008..0.211 rows=357 loops=1)
                     ->  Hash  (cost=186.96..186.96 rows=357 width=1273) (actual time=1.786..1.786 rows=357 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 315kB
                           ->  Hash Join  (cost=162.57..186.96 rows=357 width=1273) (actual time=1.136..1.395 rows=357 loops=1)
                                 Hash Cond: (content_facet_errata_katello_errata_join.erratum_id = katello_errata.id)
                                 ->  Hash Join  (cost=30.43..49.91 rows=357 width=4) (actual time=0.323..0.450 rows=357 loops=1)
                                       Hash Cond: (content_facet_errata_katello_errata_join.content_facet_id = katello_content_facets.id)
                                       ->  Seq Scan on katello_content_facet_errata content_facet_errata_katello_errata_join  (cost=0.00..14.57 rows=357 width=8) (actual time=0.003..0.046 rows=357 loops=1)
                                       ->  Hash  (cost=27.50..27.50 rows=234 width=4) (actual time=0.317..0.317 rows=261 loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                             ->  Hash Join  (cost=11.27..27.50 rows=234 width=4) (actual time=0.097..0.280 rows=261 loops=1)
                                                   Hash Cond: (hosts.id = katello_content_facets.host_id)
                                                   ->  Seq Scan on hosts  (cost=0.00..12.93 rows=261 width=4) (actual time=0.005..0.135 rows=261 loops=1)
                                                         Filter: (((type)::text = 'Host::Managed'::text) AND (organization_id = 1))
                                                         Rows Removed by Filter: 1
                                                   ->  Hash  (cost=8.34..8.34 rows=234 width=8) (actual time=0.088..0.088 rows=261 loops=1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 11kB
                                                         ->  Seq Scan on katello_content_facets  (cost=0.00..8.34 rows=234 width=8) (actual time=0.004..0.050 rows=261 loops=1)
                                 ->  Hash  (cost=123.73..123.73 rows=673 width=1269) (actual time=0.806..0.806 rows=650 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 830kB
                                       ->  Seq Scan on katello_errata  (cost=0.00..123.73 rows=673 width=1269) (actual time=0.001..0.166 rows=650 loops=1)
 Total runtime: 80108.762 ms
(29 rows)

Comment 2 Jan Hutař 2016-05-16 06:44:10 UTC
Deleting all content hosts makes the satellite to run without any issues.

Comment 3 Jan Hutař 2016-05-16 08:15:43 UTC
OK, I have deleted all content hosts and reproduced it again by registering 100 of them with 3 same erratas (all for tzdata package) applicable/installable.

Comment 12 Bryan Kearney 2016-06-03 18:16:35 UTC
Moving to POST since upstream bug http://projects.theforeman.org/issues/15253 has been closed

Comment 14 Mike McCune 2016-06-09 22:25:13 UTC
Going to failsQA this as it is causing this error:

http://paste-platops.itos.redhat.com/po0hl7blv/bqluaq

Comment 15 Mike McCune 2016-06-09 22:26:50 UTC
With SNAP 15.1 we get:

* Login
* Content -> Errata
* Page never finishes rendering

Error in logs:

2016-06-09 17:58:33 [app] [I]   Parameters: {"errata_restrict_applicable"=>"true", "organization_id"=>"1", "page"=>"1", "paged"=>"true", "search"=>"", "sort_by"=>"updated", "sort_order"=>"DESC", "api_version"=>"v2"}
2016-06-09 17:58:33 [app] [E] PG::Error: ERROR:  syntax error at or near "ON"
 | LINE 1: SELECT DISTINCT COUNT(DISTINCT ON (katello_errata.updated, k...
 |                                        ^
 | 
 | /opt/rh/rh-ror41/root/usr/share/gems/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:822:in `exec'
 | /opt/rh/rh-ror41/root/usr/share/gems/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:822:in `block in exec_no_cache'

2016-06-09 17:58:33 [app] [I]   Parameters: {"errata_restrict_applicable"=>"true", "organization_id"=>"1", "page"=>"1", "paged"=>"true", "search"=>"", "sort_by"=>"updated", "sort_order"=>"DESC", "api_version"=>"v2"}
2016-06-09 17:58:33 [app] [E] PG::Error: ERROR:  syntax error at or near "ON"
 | LINE 1: SELECT DISTINCT COUNT(DISTINCT ON (katello_errata.updated, k...
 |                                        ^
 |

Comment 16 Mike McCune 2016-06-09 22:26:57 UTC
*** Bug 1344523 has been marked as a duplicate of this bug. ***

Comment 20 Chris Duryee 2016-06-13 17:50:03 UTC
The previous fixes make the page responsive with 250-ish hosts, but there is a different issue with 6K hosts. For example, finding the most recent errata requires creating and sorting a joined table with 84M rows. This is too slow to populate on the dashboard, even if the sorting is disabled.

I am looking into this now to see if this can be slimmed down. Can this bug be tested against maybe 250 or 500 hosts, and a new bug created for 5K hosts?

Comment 21 Chris Duryee 2016-06-13 18:52:39 UTC
ignore comment #20, the updated patch should work very large numbers of hosts.

Comment 23 Brad Buckingham 2016-06-22 18:06:44 UTC
*** Bug 1348599 has been marked as a duplicate of this bug. ***

Comment 25 Roman Plevka 2016-07-19 09:36:44 UTC
VERIFIED
on sat6.2 snap20.2

for 296 content hosts with applicable errata the query takes about 4.2s to execute:

2016-07-19 05:33:51.839 EDT foreman LOG:  duration: 4.198 ms  execute <unnamed>: SELECT COUNT(DISTINCT count_column) FROM (SELECT  DISTINCT "katello_errata"."id" AS count_column FROM "katello_errata" INNER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."erratum_id" = "katello_errata"."id" INNER JOIN "katello_content_facets" ON "katello_content_facets"."id" = "katello_content_facet_errata"."content_facet_id" WHERE "katello_content_facets"."host_id" IN (SELECT "hosts"."id" FROM "hosts"  WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."organization_id" IN (1)) LIMIT 6) subquery_for_count


No limitation noticed on loading the dashboard.
P.S. Thanks jhutar for the setup.

Comment 26 Bryan Kearney 2016-07-27 11:10:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1501