Bug 1191711
| Field | Value |
|---|---|
| Summary | The Content Hosts page is loading very slowly on Satellite 6 |
| Product | Red Hat Satellite |
| Component | WebUI |
| WebUI sub component | Katello |
| Reporter | Amit Upadhye <aupadhye> |
| Assignee | Mike McCune <mmccune> |
| QA Contact | Tazim Kolhar <tkolhar> |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | medium |
| CC | aupadhye, bkearney, chrobert, cwelton, dzr0001, jnikolak, jswensso, ktordeur, ldelouw, mlinden, mmccune, pdwyer, pmoravec, pmutha, tbily, tkolhar |
| Version | 6.0.6 |
| Keywords | Triaged |
| Target Milestone | Unspecified |
| Target Release | Unused |
| Hardware | All |
| OS | Linux |
| URL | http://projects.theforeman.org/issues/10152 |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2015-08-12 13:57:46 UTC |
Description
Amit Upadhye
2015-02-11 19:38:09 UTC
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

Some analysis of what Sat6 performs while generating a single Content Hosts page. The main concern: since the "content-host list" response does not provide many details for each content host, why are there queries like "give me everything about the hosts, lifecycle environments, products, ..."?

Commenting on the psql queries (which imho cover a lot of the inefficiencies):

For a use case with 172 content hosts (debug-level production.log to be uploaded), I see the following numbers of SELECTs against different tables:
1 users
22 environments
22 hostgroups
22 hosts
22 models
22 operatingsystems
178 katello_content_view_versions
221 katello_activation_keys
221 katello_content_views
222 katello_custom_info
222 katello_host_collections
308 katello_products
401 taxonomies
742 katello_systems
802 katello_environments
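A tally like the one above can be reproduced from a debug-level production.log with a short script. This is only a sketch: the regex assumes the standard Rails SQL debug format seen in the excerpts below, and the sample lines here are stand-ins for a real log file.

```ruby
# Tally SELECT queries per table from Rails debug-log lines.
# Lines are expected to contain fragments like:
#   Katello::System Load (0.8ms) SELECT "katello_systems".* FROM "katello_systems" ...
def tally_selects(lines, counts = Hash.new(0))
  lines.each do |line|
    # Capture the first table name after FROM; non-SELECT lines are skipped.
    counts[$1] += 1 if line =~ /SELECT .*? FROM "([^"]+)"/
  end
  counts
end

# Example with two fake log lines; in practice use File.foreach("production.log").
sample = [
  'Katello::System Load (0.8ms) SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" = 1',
  'Katello::System Load (0.7ms) SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" = 2',
]
counts = tally_selects(sample)
counts.sort_by { |_, n| -n }.each { |table, n| puts "#{n} #{table}" }
```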
- All environments queries are: SELECT "environments".* FROM "environments" WHERE "environments"."id" = 1 ORDER BY environments.name LIMIT 1
  - Is it necessary to run the same query many times? Isn't it better to cache the result?
- All hostgroups queries are: SELECT "hostgroups".* FROM "hostgroups" WHERE "hostgroups"."id" = 2 ORDER BY hostgroups.title LIMIT 1
  - Same question as above.
- All models queries are: SELECT "models".* FROM "models" WHERE "models"."id" = 2 ORDER BY models.name LIMIT 1
  - Same question as above.
- All operatingsystems queries are: SELECT "operatingsystems".* FROM "operatingsystems" WHERE "operatingsystems"."id" = 2 ORDER BY operatingsystems.name LIMIT 1
  - Same question as above.
- The katello_content_view_versions queries are often repeated (again, it should be possible to cache the results).
- katello_activation_keys is basically queried in a row-after-row fashion:
  SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 104
  SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 105
  SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 106
- The same row-after-row queries are run against the katello_custom_info and katello_host_collections tables.
- All but one of the taxonomies queries are: SELECT "taxonomies".* FROM "taxonomies" WHERE "taxonomies"."type" IN ('Organization') AND "taxonomies"."id" = 10 ORDER BY title LIMIT 1
  - Since the content-host list query is run within a given taxonomy (Organization/Location pair), why query this table at all, let alone more than once?
- Hundreds of queries like: SELECT "katello_systems".* FROM "katello_systems" INNER JOIN "katello_system_host_collections" ON "katello_systems"."id" = "katello_system_host_collections"."system_id" WHERE "katello_system_host_collections"."host_collection_id" = 287
  - Why do we (repeatedly!) need to calculate how many content hosts are in some host collection? How is that useful for the content host list? (And again, this is worth caching.)
- Similarly, hundreds of queries like: SELECT "katello_environments".* FROM "katello_environments" INNER JOIN "katello_content_view_environments" ON "katello_environments"."id" = "katello_content_view_environments"."environment_id" WHERE "katello_content_view_environments"."content_view_version_id" = 24
  - What is the purpose of such queries, if the content host list only needs to print the content view and lifecycle environment names?
- Checking which psql queries took the most time:
  Katello::System Load (260.6ms) SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" IN (241, 213, ..) ORDER BY id = 241 DESC, id = 213 DESC, ..
  (this one may be necessary, though its time complexity grows with the number of systems)

These queries take a lot of time in aggregate (e.g. only about 15 of them are executed per second):
2015-03-06 17:18:39 +0200 (DEBUG) Katello::System Load (0.8ms) SELECT "katello_systems".* FROM "katello_systems" INNER JOIN "katello_system_host_collections" ON "katello_systems"."id" = "katello_system_host_collections"."system_id" WHERE "katello_system_host_collections"."host_collection_id" = 258
2015-03-06 17:18:57 +0200 (DEBUG) Host::Managed Load (1.2ms) SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" IN ('Host::Managed') AND "hosts"."id" = 15 LIMIT 1
2015-03-06 17:18:57 +0200 (DEBUG) Katello::HostCollection Load (1.1ms) SELECT "katello_host_collections".* FROM "katello_host_collections" INNER JOIN "katello_system_host_collections" ON "katello_host_collections"."id" = "katello_system_host_collections"."host_collection_id" WHERE "katello_system_host_collections"."system_id" = 289
2015-03-06 17:19:24 +0200 (DEBUG) Katello::ActivationKey Load (1.7ms) SELECT "katello_activation_keys".* FROM "katello_activation_keys" INNER JOIN "katello_system_activation_keys" ON "katello_activation_keys"."id" = "katello_system_activation_keys"."activation_key_id" WHERE "katello_system_activation_keys"."system_id" = 132
2015-03-06 17:19:44 +0200 (DEBUG) Katello::CustomInfo Load (1.0ms) SELECT "katello_custom_info".* FROM "katello_custom_info" WHERE "katello_custom_info"."informable_id" = 161 AND "katello_custom_info"."informable_type" = 'Katello::System'
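The row-after-row pattern criticized above is the classic N+1 query problem; in ActiveRecord it is usually addressed by eager loading the associations, or by batching the lookups and memoizing queries that repeat with identical parameters. A minimal pure-Ruby sketch of the idea follows; `run_query` is a hypothetical stand-in for a real database call, not Katello code.

```ruby
# Simulate per-row queries vs. one batched query, plus memoizing
# repeated identical lookups (like the environments query above).
QUERY_LOG = []

# Hypothetical stand-in for a database call; pretends every id in the
# SQL text resolves to one record.
def run_query(sql)
  QUERY_LOG << sql
  sql.scan(/\d+/).map { |id| { id: Integer(id) } }
end

# N+1 style: one query per system, as seen in the log excerpts.
def keys_per_row(system_ids)
  system_ids.flat_map { |id| run_query("SELECT keys WHERE system_id = #{id}") }
end

# Batched style: a single IN query covering all systems.
def keys_batched(system_ids)
  run_query("SELECT keys WHERE system_id IN (#{system_ids.join(', ')})")
end

# Memoized style: identical repeated queries are served from a cache.
CACHE = {}
def cached_query(sql)
  CACHE[sql] ||= run_query(sql)
end

ids = [104, 105, 106]
keys_per_row(ids)                                    # issues 3 queries
keys_batched(ids)                                    # issues 1 query
3.times { cached_query("SELECT env WHERE id = 1") }  # issues 1 query, serves 2 from cache
```

For the huge ORDER BY id = 241 DESC, id = 213 DESC, .. expression, the same batched query could be issued without any ORDER BY and the rows reordered client-side (e.g. `rows.sort_by { |r| ids.index(r[:id]) }`), keeping the SQL size independent of the number of systems. This is a sketch of the general remedy, not the fix actually applied upstream.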
Created attachment 999486 [details]
relevant part of production.log with debugs (and timestamps) enabled
Created redmine issue http://projects.theforeman.org/issues/10152 from this bug.

It appears that there is also an unnecessary WHERE clause when listing systems. Is there any reason to use an IN clause with ALL of the system IDs rather than omitting the WHERE entirely? If it is required, can it not be split into multiple IN clauses, or use a join or sub-select?
Example:
<cut>
Processing by Katello::Api::V2::SystemsController#index as XML
Parameters: {"full_results"=>"true", "per_page"=>"25000", "api_version"=>"v2", "organization_id"=>"3"}
Authorized user systems-api(Systems API)
ActiveRecord::StatementInvalid: PGError: ERROR: target lists can have at most 1664 entries
: SELECT "katello_systems".* FROM "katello_systems" WHERE "katello_systems"."id" IN (...
</cut>
Is this something that is addressed by the foreman patches in #10152?
We plan on using a Satellite 6 install with over 20,000 systems, and it seems like the API was never tested with a large set of content hosts.
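A common client-side workaround for errors like the one above is to fetch a large id set in fixed-size batches and reassemble the caller's order afterwards, so no single statement has to enumerate tens of thousands of ids. A sketch in pure Ruby; `fetch_batch` is a hypothetical stand-in for the real `SELECT ... WHERE id IN (...)` query, and the batch size of 1000 is an assumed safe margin, not a value from this bug.

```ruby
# Fetch a large id set in batches small enough to keep any generated
# IN (...) or per-id ORDER BY expression well under PostgreSQL's limits.
BATCH_SIZE = 1000

# Hypothetical stand-in for: SELECT * FROM katello_systems WHERE id IN (...)
def fetch_batch(batch)
  batch.map { |id| { id: id } }  # pretend each id returns one row
end

def fetch_in_batches(ids, batch_size = BATCH_SIZE)
  ids.each_slice(batch_size).flat_map { |batch| fetch_batch(batch) }
end

all_ids = (1..25_000).to_a          # matches the per_page=25000 request above
rows = fetch_in_batches(all_ids)

# Reassemble the original order client-side, with no ORDER BY in SQL.
by_id = rows.to_h { |r| [r[:id], r] }
ordered = all_ids.map { |id| by_id[id] }
```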
Moving to POST since upstream bug http://projects.theforeman.org/issues/10152 has been closed.

dustin tsang:
Applied in changeset commit:katello|20fecb6490a200a7ac8bba577068d7d7b51b3902.

Tazim Kolhar:
Hi,
is there any script which populates Satellite 6 with content hosts, and are you willing to share it?
thanks
Pavel Moravec (in reply to Tazim Kolhar from comment #14):
Could you try running this in a loop on one content host:

subscription-manager subscribe -u admin -p adminPassword subscribe --force
subscription-manager clean

The number of content hosts kept by Satellite should grow, though these will be rather artificial records. I saw an evident slowdown of "content-host list" execution using this fake generation of content hosts. (See the somewhat related bz1159303, which has similar use cases for "generating" fake/orphaned content hosts; some of those scenarios were fixed in Sat 6.1, one was not, and that's OK imho.)

Tazim Kolhar (in reply to Pavel Moravec from comment #15):
Actually, if I run a loop, the host will get unregistered and re-registered, so ultimately there will be only one content host even though I run the loop on one content host. For example:

# subscription-manager register --org="test123" --environment="Library" --autosubscribe --force
Username: admin
Password:
The system has been registered with ID: 119bdf7c-797e-4e8f-a708-a857966a6e1f
Installed Product Current Status:
Product Name: Red Hat Enterprise Linux Server
Status: Subscribed

# subscription-manager register --org="test123" --environment="Library" --autosubscribe --force
The system with UUID 119bdf7c-797e-4e8f-a708-a857966a6e1f has been unregistered
Username: admin
Password:
The system has been registered with ID: 63ca8c0c-af30-4b77-a5b3-62c97eb0812d
Installed Product Current Status:
Product Name: Red Hat Enterprise Linux Server
Status: Subscribed

As you can see, if only one content host is used, it gets unregistered first and then re-registered. I hope this makes sense. Thanks.

VERIFIED:
# rpm -qa | grep foreman
ruby193-rubygem-foreman_docker-1.2.0.14-1.el6_6sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.6-1.el6_6sat.noarch
foreman-1.7.2.24-1.el6_6sat.noarch
ruby193-rubygem-foreman-tasks-0.6.12.5-1.el6_6sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el6_6sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el6_6sat.noarch
foreman-gce-1.7.2.24-1.el6_6sat.noarch
ruby193-rubygem-foreman_discovery-2.0.0.14-1.el6_6sat.noarch
ruby193-rubygem-foreman-redhat_access-0.1.0-1.el6_6sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.4-1.el6_6sat.noarch
foreman-libvirt-1.7.2.24-1.el6_6sat.noarch
foreman-vmware-1.7.2.24-1.el6_6sat.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el6_6sat.noarch
foreman-proxy-1.7.2.4-1.el6_6sat.noarch
tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-client-1.0-1.noarch
rubygem-hammer_cli_foreman-0.1.4.11-1.el6_6sat.noarch
tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-selinux-1.7.2.13-1.el6_6sat.noarch
foreman-compute-1.7.2.24-1.el6_6sat.noarch
foreman-postgresql-1.7.2.24-1.el6_6sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el6_6sat.noarch
tyan-gt24-08.rhts.eng.bos.redhat.com-foreman-proxy-1.0-1.noarch
foreman-ovirt-1.7.2.24-1.el6_6sat.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.13-1.el6_6sat.noarch
foreman-debug-1.7.2.24-1.el6_6sat.noarch

Steps:
1. Run in a loop 75 times:
   $ subscription-manager register --org="test123" --environment="Library" -u admin -p <password>
   $ subscription-manager clean
2. The Content Hosts page loads quickly; screenshot attached.

Created attachment 1028586 [details]
content host
Created attachment 1028588 [details]
content host > 50
This bug is slated to be released with Satellite 6.1.

This bug was fixed in version 6.1.1 of Satellite, which was released on 12 August 2015.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.