Description of problem: The content view versions list queries the # of rpms in each version. This query seems slow (500ms) on a large database with lots of repos and packages Version-Release number of selected component (if applicable): 6.2.11 How reproducible: Always Steps to Reproduce: 1. On a large database visit the Content view versions page for a content view 2. the response is slower than it shoudl be 3. Actual results: Expected results: Additional info: foreman=# select count(*) from katello_rpms foreman-# ; count -------- 133398 (1 row) foreman=# select count(*) from katello_repository_rpms; count --------- 5087946 (1 row) foreman=# select count(*) from katello_repositories; count ------- 2348 (1 row) foreman=# EXPLAIN ANALYZE SELECT COUNT(*) FROM "katello_rpms" INNER JOIN "katello_repository_rpms" ON "katello_repository_rpms"."rpm_id" = "katello_rpms"."id" WHERE "katello_repository_rpms"."repository_id" IN (SELECT "katello_repositories"."id" FROM "katello_repositories" WHERE "katello_repositories"."content_view_version_id" = 335 AND (environment_id is NULL)) ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------------------- Aggregate (cost=137134.88..137134.89 rows=1 width=0) (actual time=1871.574..1871.575 rows=1 loops=1) -> Hash Join (cost=9301.51..136914.74 rows=88057 width=0) (actual time=1785.780..1867.048 rows=38549 loops=1) Hash Cond: (katello_repository_rpms.rpm_id = katello_rpms.id) -> Hash Semi Join (cost=95.08..124962.31 rows=88057 width=4) (actual time=1687.052..1709.980 rows=38549 loops=1) Hash Cond: (katello_repository_rpms.repository_id = katello_repositories.id) -> Seq Scan on katello_repository_rpms (cost=0.00..112676.80 rows=4270780 width=8) (actual time=0.004..865.591 rows=5099589 loops=1) -> Hash (cost=94.85..94.85 rows=18 width=4) (actual time=0.079..0.079 rows=16 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Index Scan using index_repositories_on_content_view_version_id on katello_repositories (cost=0.00..94.85 rows=18 width=4) (actua l time=0.060..0.071 rows=16 loops=1) Index Cond: (content_view_version_id = 335) Filter: (environment_id IS NULL) Rows Removed by Filter: 16 -> Hash (cost=7036.97..7036.97 rows=132197 width=4) (actual time=98.234..98.234 rows=134150 loops=1) Buckets: 4096 Batches: 8 Memory Usage: 600kB -> Seq Scan on katello_rpms (cost=0.00..7036.97 rows=132197 width=4) (actual time=0.010..61.111 rows=134150 loops=1) Total runtime: 1871.773 ms (16 rows)</div><!-- end ngIf: !ie8 && !ie9 --><div class="reply text-right">
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.
The Satellite Team is attempting to provide an accurate backlog of bugzilla requests which we feel will be resolved in the next few releases. We do not believe this bugzilla will meet that criteria, and have plans to close it out in 1 month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Red Hat Technical Support or your account team. If we do not hear from you, we will close this bug out. Thank you.
Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this, please do not reopen. Instead, feel free to contact Red Hat Technical Support. Thank you.
The count operations tend to be relatively expensive on large tables. On my 32G satellite on a table I did the following test """ CREATE TABLE t_test (id int4); INSERT INTO t_test SELECT * FROM generate_series(1, 9500000) ORDER BY random(); \timing select count(*) from t_test count --------- 9500000 (1 row) Time: 529.089 ms """ A count on a table with 1 column and 9.5 million rows took about 500 ms that is without any joins. Indexing did not help any either. I also tried different postgres tuning parameters and it did not make a big difference. So I believe we are hitting the natural limits of postgres here. The main issue brought forward here is that we are running this 1.3s query 20 times (or whatever the page size) causing the slow down. The correct resolution for this would be caching the content type counts after creating the content view version instead of individually querying each version. Going to open an upstream redmine for that.
Connecting redmine issue https://projects.theforeman.org/issues/28427 from this bug
Upstream bug assigned to paji
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/28427 has been resolved.
Steps to Retest: 1. On a large database visit the Content view versions page for a content view Actual results: Expected results: Additional info: foreman=# select count(*) from katello_rpms; count ------- 45871 (1 row) Time: 100.373 ms foreman=# select count(*) from katello_repository_rpms; count --------- 1468096 (1 row) Time: 157.378 ms count ------- 160 (1 row) Time: 1.509 ms foreman=# EXPLAIN ANALYZE SELECT COUNT(*) FROM "katello_rpms" INNER JOIN "katello_repository_rpms" ON "katello_repository_rpms"."rpm_id" = "katello_rpms"."id" WHERE "katello_repository_rpms"."repository_id" IN (SELECT "katello_repositories"."id" FROM "katello_repositories" WHERE "katello_repositories"."content_view_version_id" = 4 AND (environment_id is NULL)) ; Aggregate (cost=8428.89..8428.90 rows=1 width=8) (actual time=64.250..64.250 rows=1 loops=1) -> Hash Join (cost=3884.52..8314.18 rows=45885 width=0) (actual time=22.121..60.878 rows=45878 loops=1) Hash Cond: (katello_repository_rpms.rpm_id = katello_rpms.id) -> Nested Loop (cost=0.43..4309.62 rows=45885 width=4) (actual time=0.779..28.105 rows=45878 loops=1) -> Seq Scan on katello_repositories (cost=0.00..7.00 rows=5 width=4) (actual time=0.014..0.066 rows=5 loops=1) Filter: ((environment_id IS NULL) AND (content_view_version_id = 4)) Rows Removed by Filter: 155 -> Index Scan using index_katello_repository_rpms_on_repository_id on katello_repository_rpms (cost=0.43..736.09 rows=12443 width=8) (actual tim e=0.647..4.500 rows=9176 loops=5) Index Cond: (repository_id = katello_repositories.id) -> Hash (cost=3310.71..3310.71 rows=45871 width=4) (actual time=21.167..21.167 rows=45871 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 2125kB -> Index Only Scan using katello_rpms_pkey on katello_rpms (cost=0.29..3310.71 rows=45871 width=4) (actual time=0.021..13.033 rows=45871 loops=1 ) Heap Fetches: 17460 Planning Time: 8.825 ms Execution Time: 64.342 ms (15 rows) Time: 73.958 ms Additional Note: I also ran Partha's test: CREATE TABLE t_test (id int4); INSERT INTO t_test SELECT * FROM generate_series(1, 9500000) ORDER BY random(); \timing select count(*) from t_test count --------- 9500000 (1 row) Time: 347.937 ms Overall this is an improvement in speed. I also checked the contentview version page and it took about 2 seconds of loading for 30 published version of 45k+ packages. Verified on 6.8.0_015 having tfm-rubygem-katello-3.16.0-0.14.rc2.el7sat.noarch.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.8 release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:4366