Bug 2249087
Summary: | Fetching Host's details does not scale wrt Hosts Collections | ||
---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Pavel Moravec <pmoravec> |
Component: | Host Collections | Assignee: | Ian Ballou <iballou> |
Status: | CLOSED MIGRATED | QA Contact: | Satellite QE Team <sat-qe-bz-list> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.13.0 | CC: | iballou, jlenz, jpasqual, jsenkyri, mhulan, rlavi, wpinheir, zhunting |
Target Milestone: | stream | Keywords: | MigratedToJIRA, Performance, Regression, Triaged |
Target Release: | Unused | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2024-06-06 16:32:16 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Pavel Moravec
2023-11-10 16:34:03 UTC
*** Bug 2249089 has been marked as a duplicate of this bug. *** (In reply to Pavel Moravec from comment #0) > Additional info: > Enabling psql debugs, here is the source of slowness: > > 2023-11-10T17:28:47 [I|app|c972f5f9] Started GET "/api/v2/hosts/2520" for > 127.0.0.1 at 2023-11-10 17:28:47 +0100 > 2023-11-10T17:28:47 [I|app|c972f5f9] Processing by > Api::V2::HostsController#show as JSON > 2023-11-10T17:28:47 [I|app|c972f5f9] Parameters: {"apiv"=>"v2", > "id"=>"2520"} > .. > 2023-11-10T17:28:47 [D|sql|c972f5f9] Katello::HostCollection Load (0.6ms) > SELECT "katello_host_collections".* FROM "katello_host_collections" INNER > JOIN "katello_host_collection_hosts" ON "katello_host_collections"."id" = > "katello_host_collection_hosts"."host_collection_id" WHERE > "katello_host_collection_hosts"."host_id" = $1 [["host_id", 2520]] > 2023-11-10T17:28:47 [D|sql|c972f5f9] Host::Managed Load (14.7ms) SELECT > "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON > "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE > "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" > = $2 [["type", "Host::Managed"], ["host_collection_id", 201]] > 2023-11-10T17:28:47 [D|sql|c972f5f9] Host::Managed Load (15.1ms) SELECT > "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON > "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE > "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" > = $2 [["type", "Host::Managed"], ["host_collection_id", 202]] > .. > 2023-11-10T17:29:13 [D|sql|c972f5f9] Host::Managed Load (13.4ms) SELECT > "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON > "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE > "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" > = $2 [["type", "Host::Managed"], ["host_collection_id", 300]] > 2023-11-10T17:29:13 [D|sql|c972f5f9] FactValue Load (0.9ms) SELECT > "fact_values".* FROM "fact_values" WHERE "fact_values"."host_id" = $1 > [["host_id", 2520]] > .. > 2023-11-10T17:29:13 [I|app|c972f5f9] Completed 200 OK in 25707ms (Views: > 23842.3ms | ActiveRecord: 1751.7ms | Allocations: 7109377) > > > The queries (for each Host Collection the Host belongs to) "give me all > (managed) Hosts from the Host Collection" are the culprit; why do we need > them at all..? Btw I tried to identify what code is responsible for those calls - but I failed. Giving me a hint what triggers those psql queries, I can try to come up with a fix. Created redmine issue https://projects.theforeman.org/issues/37346 from this bug We've found what is very likely the issue -- the hosts API returns how many hosts are in the host collections that the host belongs to. This is fine, however the code that finds the host count for the host collection was using `length()` instead of `count()`. Using `length()` causes all of the host records attached to the host collection to get loaded into memory. `count()` simply uses a COUNT SQL query to find the number. Details are in the PR (which we could use help testing): https://github.com/Katello/katello/pull/10960 Oh, me and Adam found the exact same cause yesterday. The length/count/size does not seem to do much difference, I always see N+1 SQL query in the logs. I think this information though does not make sense in this API endpoint. I'd expect user to look at host collection detail to see the amount of hosts. Not a detail of host to find out numbers of hosts attached to all of its collections. I know that would be backwards incompatible change, but should we drop this information from API with proper release notes? The other option is to introduce counter cache column so we don't need to count all hosts every time the API is queried. Marek, do you mind putting what n+1 queries you were seeing on the PR (https://github.com/Katello/katello/pull/10960) if you still have the info? The `SELECT COUNT(*) FROM "hosts"` vs `SELECT "hosts".* FROM "hosts"` should make a big difference in avoiding a huge query. If we find that it doesn't improve performance, I'm okay with just removing the host collection host count information from the API since it's available elsewhere and likely not particularly helpful. For whoever is testing performance (I think potentially Pavel?), it might be helpful to see the performance difference between the PR solution vs removing it entirely from /usr/share/gems/gems/katello-*/app/views/katello/api/v2/hosts/ host_collections.json.rabl. Hello, the PR https://github.com/Katello/katello/pull/10960 does significantly help. Having 5095 Hosts and 100 Host Collections like in #c0, then: - original execution times prior adding Hosts to Host Collections: # for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real real 0m0.440s real 0m0.342s real 0m0.342s real 0m0.280s real 0m0.251s real 0m0.287s real 0m0.249s real 0m0.257s real 0m0.244s real 0m0.249s # - Bad times after adding Hosts to Host Collections: # for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real real 0m27.062s real 0m38.244s real 0m26.237s real 0m43.044s real 1m8.279s real 0m16.226s real 0m52.270s real 0m22.764s real 0m22.230s real 0m24.327s # - Applying the patch and restarting services, reasonable small response times: # for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real real 0m0.887s real 0m0.759s real 0m0.700s real 0m0.783s real 0m0.951s real 0m0.812s real 0m0.768s real 0m0.793s real 0m0.806s real 0m0.814s # .. testing also another patch proposal: sed -i "s/, :total_hosts//1" /usr/share/gems/gems/katello-*/app/views/katello/api/v2/hosts/host_collections.json.rabl gives even better times: # for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real real 0m0.273s real 0m0.272s real 0m0.260s real 0m0.276s real 0m0.249s real 0m0.265s real 0m0.261s real 0m0.261s real 0m0.254s real 0m0.262s # .. and testing both changes altogether (probably redundant, but..): # for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real real 0m0.268s real 0m0.284s real 0m0.272s real 0m0.266s real 0m0.270s real 0m0.257s real 0m0.257s real 0m0.256s real 0m0.264s real 0m0.263s # Here's the N+1 from the log (debug log level + sql logger enabled) 2024-04-12T10:36:46 [D|sql|ad42d851] HostFacets::ReportedDataFacet Load (0.3ms) SELECT "host_facets_reported_data_facets".* FROM "host_facets_reported_data_facets" WHERE "host_facets_reported_data_facets"."host_id" = $1 LIMIT $2 [["host_id", 3963], ["LIMIT", 1]] 2024-04-12T10:36:46 [D|sql|ad42d851] HostFacets::InfrastructureFacet Load (0.3ms) SELECT "host_facets_infrastructure_facets".* FROM "host_facets_infrastructure_facets" WHERE "host_facets_infrastructure_facets"."host_id" = $1 LIMIT $2 [["host_id", 3963], ["LIMIT", 1]] 2024-04-12T10:36:46 [D|sql|ad42d851] (6.7ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 1]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.6ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 2]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.6ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 3]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.6ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 4]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.6ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 5]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.4ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 6]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.3ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 7]] 2024-04-12T10:36:46 [D|sql|ad42d851] (5.4ms) SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2 [["type", "Host::Managed"], ["host_collection_id", 8]] This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there. Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information. To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "SAT-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like: "Bugzilla Bug" = 1234567 In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information. |