This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker .
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2249087 - Fetching Host's details does not scale wrt Hosts Collections
Summary: Fetching Host's details does not scale wrt Hosts Collections
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Host Collections
Version: 6.13.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: stream
Assignee: Ian Ballou
QA Contact: Satellite QE Team
URL:
Whiteboard:
: 2249089 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-11-10 16:34 UTC by Pavel Moravec
Modified: 2024-06-06 16:32 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-06-06 16:32:16 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 37346 0 Normal Closed Fetching Host's details does not scale wrt Hosts Collections 2024-05-15 21:03:44 UTC
Red Hat Issue Tracker   SAT-21268 0 None Migrated None 2024-06-06 16:32:15 UTC

Description Pavel Moravec 2023-11-10 16:34:03 UTC
Description of problem:
Having many Host Collections with many Hosts associated to them, getting a Host details takes tens of seconds.

E.g. having 5k Hosts and 100 Host Collections where five biggest Collections have 4k Hosts each, querying a Host (which is in all the five biggest Collections) takes tens of seconds.

The more Host Collections (and the bigger they are) the Host is assigned to, the longer the query takes.


Version-Release number of selected component (if applicable):
Sat6.13


How reproducible:
100%


Steps to Reproduce:
1. Have thousands of Hosts (e.g. the 5k I had). You can populate your Satellite via a very few real Hosts, by repeatedly running on them something like:

uuid=$(uuidgen)
echo "{\"dmi.system.uuid\": \"${uuid}\"}" > /etc/rhsm/facts/uuid.facts
hostnamectl set-hostname host-${uuid%%-*}.some.domain.com
subscription-manager clean
subscription-manager register --activationkey ak_test --org RedHat

(each iteration of above commands creates one Host)

2. Fetch a Host details few times, like:
# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/2520' > /dev/null; done 2>&1 | grep real
real	0m0.269s
real	0m0.258s
real	0m0.255s
real	0m0.253s
real	0m0.252s
real	0m0.266s
real	0m0.667s
real	0m0.255s
real	0m0.246s
real	0m0.256s
#

3. Create 100 Host Collections, empty so far:

collections=100

( for i in $(seq 1 $collections); do
	echo "host-collection create --organization RedHat --name Host_Collection_${i} --unlimited-hosts"
done ) | time hammer shell

4. Fetch the Host repeatedly again, to ensure it is still same fast.
5. Add many Hosts to the Host Collections, like e.g.:

( for c in $(seq 1 $collections); do
    hostids=$(su - postgres -c "psql foreman -c \"COPY (SELECT id FROM hosts WHERE id > $((10*c))) TO STDOUT\"" | tr '\n' ',' | sed "s/,$//g")
    echo "host-collection add-host --organization RedHat --name Host_Collection_${c} --host-ids ${hostids}"
done ) | time hammer shell

(each Host Collection will differ by 10 Hosts)

6. Fetch the Host repeatedly again; optionally chose different Hosts depending on how many Host Collections they belong to (and how "big" Collections there are).

E.g.:

su - postgres -c "psql foreman -c \"SELECT COUNT(*),host_id FROM katello_host_collection_hosts GROUP BY host_id ORDER BY count DESC LIMIT 5;\""

tells you Host IDs that are associated to the most Collections.


Actual results:
2. and 4. shows pretty ow times within one second.
6. shows tens of seconds like:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/2520' > /dev/null; done 2>&1 | grep real
real	0m24.980s
real	0m35.901s
real	0m26.356s
real	0m23.446s
real	0m26.306s
real	0m26.919s
real	0m36.831s
real	1m10.623s
real	0m35.503s
real	0m19.224s
#

While a Host in just very few Host Collections is OK:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/51' > /dev/null; done 2>&1 | grep real
real	0m4.446s
real	0m1.202s
real	0m1.704s
real	0m0.978s
real	0m1.007s
real	0m1.190s
real	0m1.831s
real	0m0.907s
real	0m1.049s
real	0m2.015s
#


Expected results:
All the times should be within a few seconds.


Additional info:
Enabling psql debugs, here is the source of slowness:

2023-11-10T17:28:47 [I|app|c972f5f9] Started GET "/api/v2/hosts/2520" for 127.0.0.1 at 2023-11-10 17:28:47 +0100
2023-11-10T17:28:47 [I|app|c972f5f9] Processing by Api::V2::HostsController#show as JSON
2023-11-10T17:28:47 [I|app|c972f5f9]   Parameters: {"apiv"=>"v2", "id"=>"2520"}
..
2023-11-10T17:28:47 [D|sql|c972f5f9]   Katello::HostCollection Load (0.6ms)  SELECT "katello_host_collections".* FROM "katello_host_collections" INNER JOIN "katello_host_collection_hosts" ON "katello_host_collections"."id" = "katello_host_collection_hosts"."host_collection_id" WHERE "katello_host_collection_hosts"."host_id" = $1  [["host_id", 2520]]
2023-11-10T17:28:47 [D|sql|c972f5f9]   Host::Managed Load (14.7ms)  SELECT "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 201]]
2023-11-10T17:28:47 [D|sql|c972f5f9]   Host::Managed Load (15.1ms)  SELECT "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 202]]
..
2023-11-10T17:29:13 [D|sql|c972f5f9]   Host::Managed Load (13.4ms)  SELECT "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 300]]
2023-11-10T17:29:13 [D|sql|c972f5f9]   FactValue Load (0.9ms)  SELECT "fact_values".* FROM "fact_values" WHERE "fact_values"."host_id" = $1  [["host_id", 2520]]
..
2023-11-10T17:29:13 [I|app|c972f5f9] Completed 200 OK in 25707ms (Views: 23842.3ms | ActiveRecord: 1751.7ms | Allocations: 7109377)


The queries (for each Host Collection the Host belongs to) "give me all (managed) Hosts from the Host Collection" are the culprit; why do we need them at all..?

Comment 1 Ron Lavi 2023-11-12 18:32:54 UTC
*** Bug 2249089 has been marked as a duplicate of this bug. ***

Comment 2 Pavel Moravec 2024-03-24 11:25:37 UTC
(In reply to Pavel Moravec from comment #0)
> Additional info:
> Enabling psql debugs, here is the source of slowness:
> 
> 2023-11-10T17:28:47 [I|app|c972f5f9] Started GET "/api/v2/hosts/2520" for
> 127.0.0.1 at 2023-11-10 17:28:47 +0100
> 2023-11-10T17:28:47 [I|app|c972f5f9] Processing by
> Api::V2::HostsController#show as JSON
> 2023-11-10T17:28:47 [I|app|c972f5f9]   Parameters: {"apiv"=>"v2",
> "id"=>"2520"}
> ..
> 2023-11-10T17:28:47 [D|sql|c972f5f9]   Katello::HostCollection Load (0.6ms) 
> SELECT "katello_host_collections".* FROM "katello_host_collections" INNER
> JOIN "katello_host_collection_hosts" ON "katello_host_collections"."id" =
> "katello_host_collection_hosts"."host_collection_id" WHERE
> "katello_host_collection_hosts"."host_id" = $1  [["host_id", 2520]]
> 2023-11-10T17:28:47 [D|sql|c972f5f9]   Host::Managed Load (14.7ms)  SELECT
> "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON
> "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE
> "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id"
> = $2  [["type", "Host::Managed"], ["host_collection_id", 201]]
> 2023-11-10T17:28:47 [D|sql|c972f5f9]   Host::Managed Load (15.1ms)  SELECT
> "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON
> "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE
> "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id"
> = $2  [["type", "Host::Managed"], ["host_collection_id", 202]]
> ..
> 2023-11-10T17:29:13 [D|sql|c972f5f9]   Host::Managed Load (13.4ms)  SELECT
> "hosts".* FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON
> "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE
> "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id"
> = $2  [["type", "Host::Managed"], ["host_collection_id", 300]]
> 2023-11-10T17:29:13 [D|sql|c972f5f9]   FactValue Load (0.9ms)  SELECT
> "fact_values".* FROM "fact_values" WHERE "fact_values"."host_id" = $1 
> [["host_id", 2520]]
> ..
> 2023-11-10T17:29:13 [I|app|c972f5f9] Completed 200 OK in 25707ms (Views:
> 23842.3ms | ActiveRecord: 1751.7ms | Allocations: 7109377)
> 
> 
> The queries (for each Host Collection the Host belongs to) "give me all
> (managed) Hosts from the Host Collection" are the culprit; why do we need
> them at all..?

Btw I tried to identify what code is responsible for those calls - but I failed. Giving me a hint what triggers those psql queries, I can try to come up with a fix.

Comment 5 Ian Ballou 2024-04-11 16:04:49 UTC
Created redmine issue https://projects.theforeman.org/issues/37346 from this bug

Comment 6 Ian Ballou 2024-04-11 16:21:16 UTC
We've found what is very likely the issue -- the hosts API returns how many hosts are in the host collections that the host belongs to. This is fine, however the code that finds the host count for the host collection was using `length()` instead of `count()`. Using `length()` causes all of the host records attached to the host collection to get loaded into memory. `count()` simply uses a COUNT SQL query to find the number. Details are in the PR (which we could use help testing): https://github.com/Katello/katello/pull/10960

Comment 7 Marek Hulan 2024-04-12 14:46:17 UTC
Oh, me and Adam found the exact same cause yesterday. The length/count/size does not seem to do much difference, I always see N+1 SQL query in the logs. I think this information though does not make sense in this API endpoint. I'd expect user to look at host collection detail to see the amount of hosts. Not a detail of host to find out numbers of hosts attached to all of its collections. I know that would be backwards incompatible change, but should we drop this information from API with proper release notes? The other option is to introduce counter cache column so we don't need to count all hosts every time the API is queried.

Comment 8 Ian Ballou 2024-04-12 15:06:09 UTC
Marek, do you mind putting what n+1 queries you were seeing on the PR (https://github.com/Katello/katello/pull/10960) if you still have the info? The `SELECT COUNT(*) FROM "hosts"` vs `SELECT "hosts".* FROM "hosts"` should make a big difference in avoiding a huge query.

If we find that it doesn't improve performance, I'm okay with just removing the host collection host count information from the API since it's available elsewhere and likely not particularly helpful.

For whoever is testing performance (I think potentially Pavel?), it might be helpful to see the performance difference between the PR solution vs removing it entirely from /usr/share/gems/gems/katello-*/app/views/katello/api/v2/hosts/
host_collections.json.rabl.

Comment 9 Pavel Moravec 2024-04-12 15:24:45 UTC
Hello, the PR https://github.com/Katello/katello/pull/10960 does significantly help.

Having 5095 Hosts and 100 Host Collections like in #c0, then:

- original execution times prior adding Hosts to Host Collections:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real
real	0m0.440s
real	0m0.342s
real	0m0.342s
real	0m0.280s
real	0m0.251s
real	0m0.287s
real	0m0.249s
real	0m0.257s
real	0m0.244s
real	0m0.249s
#

- Bad times after adding Hosts to Host Collections:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real
real	0m27.062s
real	0m38.244s
real	0m26.237s
real	0m43.044s
real	1m8.279s
real	0m16.226s
real	0m52.270s
real	0m22.764s
real	0m22.230s
real	0m24.327s
#

- Applying the patch and restarting services, reasonable small response times:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real
real	0m0.887s
real	0m0.759s
real	0m0.700s
real	0m0.783s
real	0m0.951s
real	0m0.812s
real	0m0.768s
real	0m0.793s
real	0m0.806s
real	0m0.814s
#

Comment 11 Pavel Moravec 2024-04-12 15:35:21 UTC
.. testing also another patch proposal:

sed -i "s/, :total_hosts//1" /usr/share/gems/gems/katello-*/app/views/katello/api/v2/hosts/host_collections.json.rabl

gives even better times:

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real
real	0m0.273s
real	0m0.272s
real	0m0.260s
real	0m0.276s
real	0m0.249s
real	0m0.265s
real	0m0.261s
real	0m0.261s
real	0m0.254s
real	0m0.262s
#

.. and testing both changes altogether (probably redundant, but..):

# for i in $(seq 1 10); do time curl -ks -u admin:redhat 'https://localhost/api/v2/hosts/4790' > /dev/null; done 2>&1 | grep real
real	0m0.268s
real	0m0.284s
real	0m0.272s
real	0m0.266s
real	0m0.270s
real	0m0.257s
real	0m0.257s
real	0m0.256s
real	0m0.264s
real	0m0.263s
#

Comment 12 Marek Hulan 2024-04-15 07:40:52 UTC
Here's the N+1 from the log (debug log level + sql logger enabled)

2024-04-12T10:36:46 [D|sql|ad42d851]   HostFacets::ReportedDataFacet Load (0.3ms)  SELECT "host_facets_reported_data_facets".* FROM "host_facets_reported_data_facets" WHERE "host_facets_reported_data_facets"."host_id" = $1 LIMIT $2  [["host_id", 3963], ["LIMIT", 1]]
2024-04-12T10:36:46 [D|sql|ad42d851]   HostFacets::InfrastructureFacet Load (0.3ms)  SELECT "host_facets_infrastructure_facets".* FROM "host_facets_infrastructure_facets" WHERE "host_facets_infrastructure_facets"."host_id" = $1 LIMIT $2  [["host_id", 3963], ["LIMIT", 1]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (6.7ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 1]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.6ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 2]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.6ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 3]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.6ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 4]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.6ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 5]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.4ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 6]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.3ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 7]]
2024-04-12T10:36:46 [D|sql|ad42d851]    (5.4ms)  SELECT COUNT(*) FROM "hosts" INNER JOIN "katello_host_collection_hosts" ON "hosts"."id" = "katello_host_collection_hosts"."host_id" WHERE "hosts"."type" = $1 AND "katello_host_collection_hosts"."host_collection_id" = $2  [["type", "Host::Managed"], ["host_collection_id", 8]]

Comment 14 Eric Helms 2024-06-06 16:32:16 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "SAT-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.


Note You need to log in before you can comment on or make changes to this bug.