Description of problem: The Content Hosts page is very slow to load when having 70 ESX hosts registered on Satellite and 1 virtual system. It takes about 2~3 minutes to load list of systems. The Satellite system is bare metal system with Intel Xeon(R) CPU and 46 GB of RAM. Version-Release number of selected component (if applicable): Satellite 6.0.6 How reproducible: Every time. Steps to Reproduce: Try to visit Hosts-->Content Hosts page with some more number of systems(>50). Actual results: The Content Hosts page loads very slowly. Expected results: It should be faster when having system with good configuration. Additional info: Could not see any informative messages in /var/log/foreman/production.log when foreman logging increased.
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.
Some analysis what Sat6 performs during a single content hosts page generation. The main concern: as "content-host list" response does not provide too much details of each content host, why there are queries like "give me everything about the hosts, lifecycle environments, products,.."? Commenting psql queries (that imho cover lot of the inefficiencies): For a use case with 172 content hosts (debug level of production.log to be uploaded), I see these numbers of SELECTs against different tables: 1 users 22 environments 22 hostgroups 22 hosts 22 models 22 operatingsystems 178 katello_content_view_versions 221 katello_activation_keys 221 katello_content_views 222 katello_custom_info 222 katello_host_collections 308 katello_products 401 taxonomies 742 katello_systems 802 katello_environments - all environments queries are: SELECT "environments".* FROM "environments" WHERE "environments"."id" = 1 ORDER BY environments.name LIMIT 1 - is it necessary to run the same query many times? Isnt it better to cache the result? - all hostgroups queries are: SELECT "hostgroups".* FROM "hostgroups" WHERE "hostgroups"."id" = 2 ORDER BY hostgroups.title LIMIT 1 - same question as above - all models queries are: SELECT "models".* FROM "models" WHERE "models"."id" = 2 ORDER BY models.name LIMIT 1 - same question as above - all operatingsystems queries are: SELECT "operatingsystems".* FROM "operatingsystems" WHERE "operatingsystems"."id" = 2 ORDER BY operatingsystems.name LIMIT 1 - same question as above - katello_content_view_versions queries are often repeated (agains possible to cache results) - katello_activation_keys basically queries the table in a row-after-a-row fashion: SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 104 SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 105 SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 106 - same row-after-a-row queries are against katello_custom_info and katello_host_collections tables - all except one query to taxonomies is: SELECT "taxonomies".* FROM "taxonomies" WHERE "taxonomies"."type" IN ('Organization') AND "taxonomies"."id" = 10 ORDER BY title LIMIT 1 - as the content-host list query is run against a given taxonomy (Organization/Location pair), why to even query this table (more than once)? - hundreds of queries like: SELECT "katello_systems".* FROM "katello_systems" INNER JOIN "katello_system_host_collections" ON "katello_systems"."id" = "katello_system_host_collections"."system_id" WHERE "katello_system_host_collections"."host_collection_id" = 287 - why do we (repeatedly!) need to calculate how many content hosts are in some host collection? How is that usefull for content host list? (and again, this is worth to be cached) - similarly, hundreds of queries like: SELECT "katello_environments".* FROM "katello_environments" INNER JOIN "katello_content_view_environments" ON "katello_environments"."id" = "katello_content_view_environments"."environment_id" WHERE "katello_content_view_environments"."content_view_version_id" = 24 - what's the purpose of such queries? if the content host list needs to print just content-view and lifecycle environment name? - checking what psql queries took the most time: Katello::System Load (260.6ms) SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" IN (241, 213, ..) ORDER BY id = 241 DESC, id = 213 DESC, .. (this is necessary(?) though it's time complexity grows more for more systems) These records take a lot of time (e.g. 15 per second only): 2015-03-06 17:18:39 +0200 (DEBUG) Katello::System Load (0.8ms) SELECT "katello_systems".* FROM "katello_systems" INNER JOIN "katello_system_host_collections" ON "katello_systems"."id" = "katello_system_host_collections"."system_id" WHERE "katello_system_host_collections"."host_collection_id" = 258 2015-03-06 17:18:57 +0200 (DEBUG) Host::Managed Load (1.2ms) SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."id" = 15 LIMIT 1 2015-03-06 17:18:57 +0200 (DEBUG) Katello::HostCollection Load (1.1ms) SELECT "katello_host_collections".* FROM "katello_host_collections" INNER JOIN "katello_system_host_collections" ON "katello_host_collections"."id" = "katello_system_host_collections"."host_collection_id" WHERE "katello_system_host_collections"."system_id" = 289 2015-03-06 17:19:24 +0200 (DEBUG) Katello::ActivationKey Load (1.7ms) SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 132 2015-03-06 17:19:44 +0200 (DEBUG) Katello::CustomInfo Load (1.0ms) SELECT "katello_custom_info".* FROM "katello_custom_info" WHERE "katello_custom_info"."informable_id" = 161 AND "katello_custom_info"."informable_type" = 'Katello::System'
Created attachment 999486 [details] relevant part of production.log with debugs (and timestamps) enabled
Created redmine issue http://projects.theforeman.org/issues/10152 from this bug
It appears that there is also an unnecessary where clause when listing systems. Is there any reason to be using an IN clause with ALL of the system ID's rather than omitting the where at all? If it is required can this not be split into multiple IN clauses or use a join or sub-select? Example: <cut> Processing by Katello::Api::V2::SystemsController#index as XML Parameters: {"full_results"=>"true", "per_page"=>"25000", "api_version"=>"v2", "organization_id"=>"3"} Authorized user systems-api(Systems API) ActiveRecord::StatementInvalid: PGError: ERROR: target lists can have at most 1664 entries : SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" IN (... </cut> Is this something that is addressed by the foreman patches in #10152? We plan on using a Satellite 6 install with over 20,000 systems, and it seems like the API was never tested with a large set of content hosts.
Moving to POST since upstream bug http://projects.theforeman.org/issues/10152 has been closed ------------- dustin tsang Applied in changeset commit:katello|20fecb6490a200a7ac8bba577068d7d7b51b3902.
Hi, is there any script which populates satellite6 with content hosts are you willing to share it ? thanks
(In reply to Tazim Kolhar from comment #14) > Hi, > > is there any script which populates satellite6 with content hosts > are you willing to share it ? > > thanks Could you try running in a loop on one content host: subscription-manager subscribe -u admin -p adminPassword subscribe --force subscription-manager clean ? Number of content hosts kept by Satellite should grow, though these will be rather artificial records. But I saw evident slowdown of "content-host list" execution using this fake generating of content hosts. (see little bit relevant bz1159303 that might have similar use cases how to "generate" fake/orhpaned content hosts; some such scenarios were fixed in Sat6.1 one hasnt (and that's ok imho)
(In reply to Pavel Moravec from comment #15) > (In reply to Tazim Kolhar from comment #14) > > Hi, > > > > is there any script which populates satellite6 with content hosts > > are you willing to share it ? > > > > thanks > > Could you try running in a loop on one content host: > > subscription-manager subscribe -u admin -p adminPassword subscribe --force > subscription-manager clean > > ? Actually, if I run a loop, it will get unregistered and re-registered so ultimately there will be only one content host even though i run a loop over one content host For example # subscription-manager register --org="test123" --environment="Library" --autosubscribe --force Username: admin Password: The system has been registered with ID: 119bdf7c-797e-4e8f-a708-a857966a6e1f Installed Product Current Status: Product Name: Red Hat Enterprise Linux Server Status: Subscribed # subscription-manager register --org="test123" --environment="Library" --autosubscribe --force The system with UUID 119bdf7c-797e-4e8f-a708-a857966a6e1f has been unregistered Username: admin Password: The system has been registered with ID: 63ca8c0c-af30-4b77-a5b3-62c97eb0812d Installed Product Current Status: Product Name: Red Hat Enterprise Linux Server Status: Subscribed As you can see if only one content host is used, it gets unregistered first and then re-egistered i hope this makes sense thanks
VERIFIED: # rpm -qa | grep foreman ruby193-rubygem-foreman_docker-1.2.0.14-1.el6_6sat.noarch rubygem-hammer_cli_foreman_docker-0.0.3.6-1.el6_6sat.noarch foreman-1.7.2.24-1.el6_6sat.noarch ruby193-rubygem-foreman-tasks-0.6.12.5-1.el6_6sat.noarch ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el6_6sat.noarch rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el6_6sat.noarch foreman-gce-1.7.2.24-1.el6_6sat.noarch ruby193-rubygem-foreman_discovery-2.0.0.14-1.el6_6sat.noarch ruby193-rubygem-foreman-redhat_access-0.1.0-1.el6_6sat.noarch rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el6_6sat.noarch foreman-libvirt-1.7.2.24-1.el6_6sat.noarch foreman-vmware-1.7.2.24-1.el6_6sat.noarch ruby193-rubygem-foreman_hooks-0.3.7-2.el6_6sat.noarch foreman-proxy-1.7.2.4-1.el6_6sat.noarch tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch rubygem-hammer_cli_foreman-0.1.4.11-1.el6_6sat.noarch tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch foreman-selinux-1.7.2.13-1.el6_6sat.noarch foreman-compute-1.7.2.24-1.el6_6sat.noarch foreman-postgresql-1.7.2.24-1.el6_6sat.noarch rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el6_6sat.noarch tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-proxy-1.0-1.noarch foreman-ovirt-1.7.2.24-1.el6_6sat.noarch ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el6_6sat.noarch foreman-debug-1.7.2.24-1.el6_6sat.noarch steps: 1. run in loop of 75 times: $ subscription-manager register --org="test123" --environment="Library" -u admin -p <password> $ subscription-manager clean 2. The Content Hosts page loads quickly screenshot attached
Created attachment 1028586 [details] content host
Created attachment 1028588 [details] content host > 50
This bug is slated to be released with Satellite 6.1.
This bug was fixed in version 6.1.1 of Satellite which was released on 12 August, 2015.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days