Bug 1486446 - Content view versions list has slow query for package count
Summary: Content view versions list has slow query for package count
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Content Views
Version: 6.2.12
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: 6.8.0
Assignee: Partha Aji
QA Contact: Lai
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-08-29 18:47 UTC by Justin Sherrill
Modified: 2020-10-27 12:57 UTC (History)
10 users (show)

Fixed In Version: tfm-rubygem-katello-3.15.0-0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 12:57:21 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 28427 0 Normal Closed Content view versions list has slow query for package count 2021-01-21 20:22:35 UTC
Red Hat Product Errata RHSA-2020:4366 0 None None None 2020-10-27 12:57:47 UTC

Description Justin Sherrill 2017-08-29 18:47:29 UTC
Description of problem:

The content view versions list queries the # of rpms in each version.  This query seems slow (500ms) on a large database with lots of repos and packages

Version-Release number of selected component (if applicable):
6.2.11

How reproducible:
Always

Steps to Reproduce:
1.  On a large database visit the Content view versions page for a content view
2.  the response is slower than it shoudl be  
3.

Actual results:


Expected results:


Additional info:

foreman=# select count(*) from katello_rpms
foreman-# ;
 count  
--------
 133398
(1 row)

foreman=# select count(*) from katello_repository_rpms;
  count  
---------
 5087946
(1 row)

foreman=# select count(*) from katello_repositories;
 count 
-------
  2348
(1 row)



foreman=# EXPLAIN ANALYZE SELECT COUNT(*) FROM "katello_rpms" INNER JOIN "katello_repository_rpms" ON "katello_repository_rpms"."rpm_id" = "katello_rpms"."id" WHERE "katello_repository_rpms"."repository_id" IN (SELECT "katello_repositories"."id" FROM "katello_repositories"  WHERE "katello_repositories"."content_view_version_id" = 335 AND (environment_id is NULL))
;
                                                                                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------
 Aggregate  (cost=137134.88..137134.89 rows=1 width=0) (actual time=1871.574..1871.575 rows=1 loops=1)
   ->  Hash Join  (cost=9301.51..136914.74 rows=88057 width=0) (actual time=1785.780..1867.048 rows=38549 loops=1)
         Hash Cond: (katello_repository_rpms.rpm_id = katello_rpms.id)
         ->  Hash Semi Join  (cost=95.08..124962.31 rows=88057 width=4) (actual time=1687.052..1709.980 rows=38549 loops=1)
               Hash Cond: (katello_repository_rpms.repository_id = katello_repositories.id)
               ->  Seq Scan on katello_repository_rpms  (cost=0.00..112676.80 rows=4270780 width=8) (actual time=0.004..865.591 rows=5099589 loops=1)
               ->  Hash  (cost=94.85..94.85 rows=18 width=4) (actual time=0.079..0.079 rows=16 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 1kB
                     ->  Index Scan using index_repositories_on_content_view_version_id on katello_repositories  (cost=0.00..94.85 rows=18 width=4) (actua
l time=0.060..0.071 rows=16 loops=1)
                           Index Cond: (content_view_version_id = 335)
                           Filter: (environment_id IS NULL)
                           Rows Removed by Filter: 16
         ->  Hash  (cost=7036.97..7036.97 rows=132197 width=4) (actual time=98.234..98.234 rows=134150 loops=1)
               Buckets: 4096  Batches: 8  Memory Usage: 600kB
               ->  Seq Scan on katello_rpms  (cost=0.00..7036.97 rows=132197 width=4) (actual time=0.010..61.111 rows=134150 loops=1)
 Total runtime: 1871.773 ms
(16 rows)</div><!-- end ngIf: !ie8 && !ie9 --><div class="reply text-right">

Comment 3 Bryan Kearney 2018-09-04 17:46:35 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.

Comment 8 Bryan Kearney 2019-08-05 12:22:47 UTC
The Satellite Team is attempting to provide an accurate backlog of bugzilla requests which we feel will be resolved in the next few releases. We do not believe this bugzilla will meet that criteria, and have plans to close it out in 1 month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Red Hat Technical Support or your account team. If we do not hear from you, we will close this bug out. Thank you.

Comment 9 Bryan Kearney 2019-09-03 18:58:45 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this, please do not reopen. Instead, feel free to contact Red Hat Technical Support. Thank you.

Comment 11 Bryan Kearney 2019-11-04 14:34:14 UTC
The Satellite Team is attempting to provide an accurate backlog of bugzilla requests which we feel will be resolved in the next few releases. We do not believe this bugzilla will meet that criteria, and have plans to close it out in 1 month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Red Hat Technical Support or your account team. If we do not hear from you, we will close this bug out. Thank you.

Comment 21 Bryan Kearney 2019-12-03 16:34:52 UTC
The Satellite Team is attempting to provide an accurate backlog of bugzilla requests which we feel will be resolved in the next few releases. We do not believe this bugzilla will meet that criteria, and have plans to close it out in 1 month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Red Hat Technical Support or your account team. If we do not hear from you, we will close this bug out. Thank you.

Comment 26 Partha Aji 2019-12-04 19:43:12 UTC
The count operations tend to be relatively expensive on large tables. On my 32G satellite on a table I did the following test
"""
CREATE TABLE t_test (id int4);
INSERT INTO t_test SELECT * FROM generate_series(1, 9500000) ORDER BY random();
\timing
select count(*) from t_test

count  
---------
 9500000
(1 row)

Time: 529.089 ms

"""
A count on a table with 1 column and 9.5 million rows took about 500 ms that is without any joins. Indexing did not help any either. I also tried different postgres tuning parameters and it did not make a big difference. So I believe we are hitting the natural limits of postgres here. The main issue brought forward here is that we are running this 1.3s  query 20 times (or whatever the page size) causing the slow down.
The correct resolution for this would be caching the content type counts after creating the content view version instead of individually querying each version. Going to open an upstream redmine for that.

Comment 27 Partha Aji 2019-12-04 19:59:34 UTC
Connecting redmine issue https://projects.theforeman.org/issues/28427 from this bug

Comment 28 Bryan Kearney 2019-12-09 21:04:40 UTC
Upstream bug assigned to paji@redhat.com

Comment 29 Bryan Kearney 2019-12-09 21:04:42 UTC
Upstream bug assigned to paji@redhat.com

Comment 30 Bryan Kearney 2019-12-13 23:04:39 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/28427 has been resolved.

Comment 31 Lai 2020-06-24 18:46:27 UTC
Steps to Retest:
1.  On a large database visit the Content view versions page for a content view

Actual results:


Expected results:


Additional info:

foreman=# select count(*) from katello_rpms;

 count 
-------
 45871
(1 row)

Time: 100.373 ms


foreman=# select count(*) from katello_repository_rpms;
  count  
---------
 1468096
(1 row)

Time: 157.378 ms

 count 
-------
   160
(1 row)

Time: 1.509 ms


foreman=# EXPLAIN ANALYZE SELECT COUNT(*) FROM "katello_rpms" INNER JOIN "katello_repository_rpms" ON "katello_repository_rpms"."rpm_id" = "katello_rpms"."id" WHERE "katello_repository_rpms"."repository_id" IN (SELECT "katello_repositories"."id" FROM "katello_repositories"  WHERE "katello_repositories"."content_view_version_id" = 4 AND (environment_id is NULL))
;

 Aggregate  (cost=8428.89..8428.90 rows=1 width=8) (actual time=64.250..64.250 rows=1 loops=1)
   ->  Hash Join  (cost=3884.52..8314.18 rows=45885 width=0) (actual time=22.121..60.878 rows=45878 loops=1)
         Hash Cond: (katello_repository_rpms.rpm_id = katello_rpms.id)
         ->  Nested Loop  (cost=0.43..4309.62 rows=45885 width=4) (actual time=0.779..28.105 rows=45878 loops=1)
               ->  Seq Scan on katello_repositories  (cost=0.00..7.00 rows=5 width=4) (actual time=0.014..0.066 rows=5 loops=1)
                     Filter: ((environment_id IS NULL) AND (content_view_version_id = 4))
                     Rows Removed by Filter: 155
               ->  Index Scan using index_katello_repository_rpms_on_repository_id on katello_repository_rpms  (cost=0.43..736.09 rows=12443 width=8) (actual tim
e=0.647..4.500 rows=9176 loops=5)
                     Index Cond: (repository_id = katello_repositories.id)
         ->  Hash  (cost=3310.71..3310.71 rows=45871 width=4) (actual time=21.167..21.167 rows=45871 loops=1)
               Buckets: 65536  Batches: 1  Memory Usage: 2125kB
               ->  Index Only Scan using katello_rpms_pkey on katello_rpms  (cost=0.29..3310.71 rows=45871 width=4) (actual time=0.021..13.033 rows=45871 loops=1
)
                     Heap Fetches: 17460
 Planning Time: 8.825 ms
 Execution Time: 64.342 ms
(15 rows)

Time: 73.958 ms


Additional Note: I also ran Partha's test:
CREATE TABLE t_test (id int4);
INSERT INTO t_test SELECT * FROM generate_series(1, 9500000) ORDER BY random();
\timing
select count(*) from t_test

 count  
---------
 9500000
(1 row)

Time: 347.937 ms


Overall this is an improvement in speed.  I also checked the contentview version page and it took about 2 seconds of loading for 30 published version of 45k+ packages.



Verified on 6.8.0_015 having tfm-rubygem-katello-3.16.0-0.14.rc2.el7sat.noarch.

Comment 34 errata-xmlrpc 2020-10-27 12:57:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.8 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366


Note You need to log in before you can comment on or make changes to this bug.