Bug 846199
Summary: | recipes page is slow | ||
---|---|---|---|
Product: | [Retired] Beaker | Reporter: | Raymond Mancy <rmancy> |
Component: | web UI | Assignee: | matt jia <mjia> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 0.9 | CC: | bpeck, dcallagh, dowang, ebaak, mjia, rjoost, stl, tools-bugs |
Target Milestone: | 23.0 | Keywords: | EasyFix, Patch |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | ImplementationQuality | ||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-07-07 23:12:41 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Raymond Mancy
2012-08-07 06:15:39 UTC
I guess the real problem here is that we are referencing util.js many times unnecessarily on the recipe page, due to a widget. Not sure if the point in comment 1 (about util.js referenced many times) is still true. I think we might have fixed that up a long time ago. However, visiting /recipes/ in our production environemnt, or with a production database dump, is still very slow. In some circumstances I also see it fail with a MySQL error indicating /tmp was exhausted. I suspect it's because that page is issuing a query that produces a large temp table when there are many recipes. Needs more investigation. (In reply to Dan Callaghan from comment #5) > Not sure if the point in comment 1 (about util.js referenced many times) is > still true. I think we might have fixed that up a long time ago. Yes, I think we have fixed this already as I have not seen this problem right now. It seems like the slowness is caused by the query used for counting rows. MariaDB [beaker]> SELECT count(*) AS count_1 -> FROM (SELECT recipe.id AS recipe_id, recipe.recipe_set_id AS recipe_recipe_set_id, recipe.distro_tree_id AS recipe_distro_tree_id, recipe.result AS recipe_result, recipe.status AS recipe_status, recipe.start_time AS recipe_start_time, recipe.finish_time AS recipe_finish_time, recipe._host_requires AS recipe__host_requires, recipe._distro_requires AS recipe__distro_requires, recipe.kickstart AS recipe_kickstart, recipe.type AS recipe_type, recipe.ttasks AS recipe_ttasks, recipe.ntasks AS recipe_ntasks, recipe.ptasks AS recipe_ptasks, recipe.wtasks AS recipe_wtasks, recipe.ftasks AS recipe_ftasks, recipe.ktasks AS recipe_ktasks, recipe.whiteboard AS recipe_whiteboard, recipe.ks_meta AS recipe_ks_meta, recipe.kernel_options AS recipe_kernel_options, recipe.kernel_options_post AS recipe_kernel_options_post, recipe.role AS recipe_role, recipe.panic AS recipe_panic, recipe._partitions AS recipe__partitions, recipe.autopick_random AS recipe_autopick_random, recipe.log_server AS recipe_log_server, recipe.virt_status AS recipe_virt_status -> FROM recipe -> WHERE recipe.type = 'machine_recipe' AND (EXISTS (SELECT 1 -> FROM recipe_set -> WHERE recipe_set.id = recipe.recipe_set_id AND (EXISTS (SELECT 1 -> FROM job -> WHERE job.id = recipe_set.job_id AND job.deleted IS NULL AND job.to_delete IS NULL))))) AS anon_1; +---------+ | count_1 | +---------+ | 584783 | +---------+ 1 row in set (9.95 sec) After fixing up this query, I get a better performace. MariaDB [beaker]> SELECT count(*) AS count_1 -> FROM (SELECT recipe.id AS recipe_id, recipe.recipe_set_id AS recipe_recipe_set_id, recipe.distro_tree_id AS recipe_distro_tree_id, recipe.result AS recipe_result, recipe.status AS recipe_status, recipe.start_time AS recipe_start_time, recipe.finish_time AS recipe_finish_time, recipe._host_requires AS recipe__host_requires, recipe._distro_requires AS recipe__distro_requires, recipe.kickstart AS recipe_kickstart, recipe.type AS recipe_type, recipe.ttasks AS recipe_ttasks, recipe.ntasks AS recipe_ntasks, recipe.ptasks AS recipe_ptasks, recipe.wtasks AS recipe_wtasks, recipe.ftasks AS recipe_ftasks, recipe.ktasks AS recipe_ktasks, recipe.whiteboard AS recipe_whiteboard, recipe.ks_meta AS recipe_ks_meta, recipe.kernel_options AS recipe_kernel_options, recipe.kernel_options_post AS recipe_kernel_options_post, recipe.role AS recipe_role, recipe.panic AS recipe_panic, recipe._partitions AS recipe__partitions, recipe.autopick_random AS recipe_autopick_random, recipe.log_server AS recipe_log_server, recipe.virt_status AS recipe_virt_status -> FROM recipe INNER JOIN recipe_set ON recipe_set.id = recipe.recipe_set_id INNER JOIN job ON job.id = recipe_set.job_id -> WHERE recipe.type = 'machine_recipe' AND job.deleted IS NULL AND job.to_delete IS NULL) AS anon_1 -> ; +---------+ | count_1 | +---------+ | 584783 | +---------+ 1 row in set (2.42 sec) Ideally we should not count rows for the tables with large rows. Once we point this page to Backbone, the performance will be much better as we have already implemented “pageable JSON collections” which does not use a query for counting rows. On Gerrit: https://gerrit.beaker-project.org/#/c/4933/ Yeah so there are two parts to the problem here... for very large tables (like activity) we are not showing the total number of results (or pages) in order to avoid the COUNT() entirely. That was bug 1101817. The other problem here looks like the same one we have hit in many places -- SQLAlchemy changed its behaviour to wrap all queries inside a subquery when doing .count(): bug 1297603, bug 1183239, <http://stackoverflow.com/a/14758255/120202>. Actually I guess there is a *third* problem which is that the query was unnecessarily doing the EXISTS stuff, when it could just be JOIN'ing instead. That's what Matt's patch is fixing. Matt, can you please just paste some timing information using a production db dump for (a) original bad COUNT query, (b) your fixed version with JOIN instead of EXISTS, and (c) with the subquery wrapping removed, by using an explicit func.count() as described in http://stackoverflow.com/a/14758255/120202 ? > Matt, can you please just paste some timing information using a production
> db dump for (a) original bad COUNT query, (b) your fixed version with JOIN
> instead of EXISTS, and (c) with the subquery wrapping removed, by using an
> explicit func.count() as described in
> http://stackoverflow.com/a/14758255/120202 ?
a:
MariaDB [beaker]> SELECT count(*) AS count_1
-> FROM (SELECT recipe.id AS recipe_id, recipe.recipe_set_id AS recipe_recipe_set_id, recipe.distro_tree_id AS recipe_distro_tree_id, recipe.result AS recipe_result, recipe.status AS recipe_status, recipe.start_time AS recipe_start_time, recipe.finish_time AS recipe_finish_time, recipe._host_requires AS recipe__host_requires, recipe._distro_requires AS recipe__distro_requires, recipe.kickstart AS recipe_kickstart, recipe.type AS recipe_type, recipe.ttasks AS recipe_ttasks, recipe.ntasks AS recipe_ntasks, recipe.ptasks AS recipe_ptasks, recipe.wtasks AS recipe_wtasks, recipe.ftasks AS recipe_ftasks, recipe.ktasks AS recipe_ktasks, recipe.whiteboard AS recipe_whiteboard, recipe.ks_meta AS recipe_ks_meta, recipe.kernel_options AS recipe_kernel_options, recipe.kernel_options_post AS recipe_kernel_options_post, recipe.role AS recipe_role, recipe.panic AS recipe_panic, recipe._partitions AS recipe__partitions, recipe.autopick_random AS recipe_autopick_random, recipe.log_server AS recipe_log_server, recipe.virt_status AS recipe_virt_status
-> FROM recipe
-> WHERE recipe.type = 'machine_recipe' AND (EXISTS (SELECT 1
-> FROM recipe_set
-> WHERE recipe_set.id = recipe.recipe_set_id AND (EXISTS (SELECT 1
-> FROM job
-> WHERE job.id = recipe_set.job_id AND job.deleted IS NULL AND job.to_delete IS NULL))))) AS anon_1;
+---------+
| count_1 |
+---------+
| 584783 |
+---------+
1 row in set (10.12 sec)
b:
MariaDB [beaker]> SELECT count(*) AS count_1
-> FROM (SELECT recipe.id AS recipe_id, recipe.recipe_set_id AS recipe_recipe_set_id, recipe.distro_tree_id AS recipe_distro_tree_id, recipe.result AS recipe_result, recipe.status AS recipe_status, recipe.start_time AS recipe_start_time, recipe.finish_time AS recipe_finish_time, recipe._host_requires AS recipe__host_requires, recipe._distro_requires AS recipe__distro_requires, recipe.kickstart AS recipe_kickstart, recipe.type AS recipe_type, recipe.ttasks AS recipe_ttasks, recipe.ntasks AS recipe_ntasks, recipe.ptasks AS recipe_ptasks, recipe.wtasks AS recipe_wtasks, recipe.ftasks AS recipe_ftasks, recipe.ktasks AS recipe_ktasks, recipe.whiteboard AS recipe_whiteboard, recipe.ks_meta AS recipe_ks_meta, recipe.kernel_options AS recipe_kernel_options, recipe.kernel_options_post AS recipe_kernel_options_post, recipe.role AS recipe_role, recipe.panic AS recipe_panic, recipe._partitions AS recipe__partitions, recipe.autopick_random AS recipe_autopick_random, recipe.log_server AS recipe_log_server, recipe.virt_status AS recipe_virt_status
-> FROM recipe INNER JOIN recipe_set ON recipe_set.id = recipe.recipe_set_id INNER JOIN job ON job.id = recipe_set.job_id
-> WHERE recipe.type = 'machine_recipe' AND job.deleted IS NULL AND job.to_delete IS NULL) AS anon_1
-> ;
+---------+
| count_1 |
+---------+
| 584783 |
+---------+
1 row in set (2.52 sec)
c: with the subquery wrapping removed on the current version
MariaDB [beaker]> SELECT count(recipe.id) AS count_1
-> FROM recipe
-> WHERE recipe.type = 'machine_recipe' AND (EXISTS (SELECT 1
-> FROM recipe_set
-> WHERE recipe_set.id = recipe.recipe_set_id AND (EXISTS (SELECT 1
-> FROM job
-> WHERE job.id = recipe_set.job_id AND job.deleted IS NULL AND job.to_delete IS NULL))));
+---------+
| count_1 |
+---------+
| 584783 |
+---------+
1 row in set (9.64 sec)
d: with the subquery wrapping removed on my fixed version
MariaDB [beaker]> SELECT count(recipe.id) AS count_1
-> FROM recipe INNER JOIN recipe_set ON recipe_set.id = recipe.recipe_set_id INNER JOIN job ON job.id = recipe_set.job_id
-> WHERE recipe.type = 'machine_recipe' AND job.deleted IS NULL AND job.to_delete IS NULL
-> ;
+---------+
| count_1 |
+---------+
| 584783 |
+---------+
(In reply to matt jia from comment #12) > d: with the subquery wrapping removed on my fixed version > > MariaDB [beaker]> SELECT count(recipe.id) AS count_1 > -> FROM recipe INNER JOIN recipe_set ON recipe_set.id = > recipe.recipe_set_id INNER JOIN job ON job.id = recipe_set.job_id > -> WHERE recipe.type = 'machine_recipe' AND job.deleted IS NULL AND > job.to_delete IS NULL > -> ; > +---------+ > | count_1 | > +---------+ > | 584783 | > +---------+ Opps, the result is missing. MariaDB [beaker]> SELECT count(recipe.id) AS count_1 -> FROM recipe INNER JOIN recipe_set ON recipe_set.id = -> recipe.recipe_set_id INNER JOIN job ON job.id = recipe_set.job_id -> WHERE recipe.type = 'machine_recipe' AND job.deleted IS NULL AND -> job.to_delete IS NULL; +---------+ | count_1 | +---------+ | 584783 | +---------+ 1 row in set (2.43 sec) Okay, so I guess this is showing that the subquery wrapping done by .count() has no negative impact on performance here, so we don't need to bother changing it. Your fix for the JOINs is all we need. Beaker 23.0 has been released. I think I remember the reason why we had the weird nested EXISTS instead of just doing a JOIN... with optimizer_search_depth=6 set on the server (which we had in our previous setup on db01, and which we still have on dbdevel.bne) the query breaks due to excessively large temp table, I guess because it has more than 6 joins and so MySQL picks a bad query plan. Same thing as in bug 590723. |