Bug 854438 - Very large number of registered guest systems causes long response times on the host's Details page
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Spacewalk
Classification: Community
Component: WebUI
Version: 1.8
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Tomas Lestach
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks: space28
 
Reported: 2012-09-05 04:02 UTC by Jan Hutař
Modified: 2018-04-20 12:21 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 08:31:31 UTC
Embargoed:


Attachments

Description Jan Hutař 2012-09-05 04:02:40 UTC
Description of problem:
I have a virtualization host (Xen in my case) with ~2300 registered guests (to be honest, it is just one guest registered this many times). Accessing the Details page of their host takes too long.


Version-Release number of selected component (if applicable):
This was reproduced on Satellite, but it is expected to be broken on current Spacewalk as well.


How reproducible:
always


Steps to Reproduce:
1. Have a registered virtualization host with a guest, and register the guest multiple times
2. Make sure the guests appear in the host's Virtualization tab
3. Reload the host's Details page and observe the page load time


Actual results:
For ~2300 guests it took about 50 seconds on a machine with 24 cores at 1.3 GHz and 4 GB RAM.


Expected results:
Page load time should be independent of the number of guests registered.


Additional info:
When I removed some systems and measured again, per tlestach's idea, these were the numbers:
0 guests deleted (i.e. ~2300 guests registered): 68 50 50 (3 attempts to reload the Details page; numbers are seconds)
500 guests deleted: 49 49 49
1000 guests deleted: 50 30 29
1500 guests deleted: 19 20 19
2000 guests deleted: 9 9 9
all guests deleted: 2 1 2
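A quick least-squares fit over the averaged numbers above shows the trend directly; a minimal sketch using PostgreSQL's built-in regr_slope/regr_intercept aggregates (guest counts and per-row averaged seconds are taken from the table above):

```sql
-- Least-squares fit of average Details page load time (s) vs. number of
-- remaining registered guests, from the measurements reported above.
SELECT round(regr_slope(secs, guests)::numeric, 4)     AS seconds_per_guest,
       round(regr_intercept(secs, guests)::numeric, 2) AS base_seconds
  FROM (VALUES (2300, 56.0),   -- 0 deleted:    avg of 68 50 50
               (1800, 49.0),   -- 500 deleted
               (1300, 36.3),   -- 1000 deleted: avg of 50 30 29
               ( 800, 19.3),   -- 1500 deleted
               ( 300,  9.0),   -- 2000 deleted
               (   0,  1.7))   -- all deleted
       AS t(guests, secs);
```

The slope works out to roughly 25 ms per registered guest, i.e. load time grows essentially linearly with guest count rather than being independent of it.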

Comment 1 Jan Pazdziora 2013-03-08 15:14:31 UTC
Can you reproduce this on Spacewalk with PostgreSQL? Can you select from the pg_stat_activity view to see which SELECT it hangs on?
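A minimal sketch of the pg_stat_activity check suggested above (run it in psql while the slow Details page is loading; note the column names changed in PostgreSQL 9.2):

```sql
-- Show statements that have been running for more than a second.
-- On PostgreSQL <= 9.1 there is no "state" column; use
-- current_query instead of query, and procpid instead of pid.
SELECT pid,
       now() - query_start AS runtime,
       query
  FROM pg_stat_activity
 WHERE state = 'active'
   AND now() - query_start > interval '1 second'
 ORDER BY runtime DESC;
```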

Comment 2 Jan Hutař 2013-04-22 11:04:04 UTC
I used a system with 16 CPUs and 24 GB RAM and configured the following in postgresql.conf (restarting the DB afterwards):

log_min_messages = info
log_min_duration_statement = 0

and used a host with 1000 guests registered (actually one guest registered 1000 times). Loading its system details page takes 16 s; loading the system details page of another system without guests takes 2 s.

Comparing how many lines are generated in the PostgreSQL log when refreshing the system details page:

system without guests: 1028
system with 1000 guests: 21138

I'll attach the log for the host with the 1000 guests.

Comment 5 Jan Pazdziora 2017-10-18 07:41:45 UTC
Is this still an active issue?

Comment 6 Jan Hutař 2017-11-28 08:31:31 UTC
Looks like it is OK now. Tested on Satellite with 4 CPUs and 32 GB RAM with:

    satellite-schema-5.8.0.37-1.el6sat.noarch
    spacewalk-java-2.5.14-106.el6sat.noarch

and these are the timings (I noticed the page under "Provisioning" is slow now, so I measured it as well):

1306 guests:
    Detail [s]: 2.1 3.6 3.0 3.5 3.2
    Provisioning [s]: 17.3 18.6 18.2 19.4 20.0

1000 guests:
    Detail [s]: 3.2 2.6 2.6 3.0 2.5
    Provisioning [s]: 13.2 13.6 15.4 15.9 15.2

500 guests:
    Detail [s]: 3.3 3.0 3.1 2.8 3.5
    Provisioning [s]: 6.8 9.7 8.5 9.3 9.6

25 guests (4 without profiles):
    Detail [s]: 2.7 2.8 2.7 2.7 2.5
    Provisioning [s]: 3.8 3.9 3.7 3.1 3.6

Load speed of the "Provisioning" tab page seems very linear in the number of guests. Also, when I had log_min_duration_statement = 500 set in PostgreSQL, it did not log any long query when reloading that page, so IMO this is nothing to report.

I noticed this slow query when displaying the Virtualization -> Details tab, but it is probably OK (given this was for 1306 guests, while you will have at most tens of guests per host):

2017-11-28 02:26:25.806 EST LOG:  duration: 1288.037 ms  execute <unnamed>: SELECT
	  DISTINCT VI.id,
	           VI.host_system_id,
	           VI.virtual_system_id,
	           VI.uuid,
	         COALESCE(VII.name, '(none)') AS NAME,
	         COALESCE(VIS.name, '(unknown)') AS STATE_NAME,
	         COALESCE(VIS.label, 'unknown') AS STATE_LABEL,
	     COALESCE(VII.vcpus, 0) AS VCPUS,
	     COALESCE(VII.memory_size_k, 0) AS MEMORY,
	     COALESCE(((SELECT 1 FROM rhnUserServerPerms USP WHERE USP.user_id = $1 AND USP.server_id = VI.virtual_system_id)), 0 ) as accessible,
	         rhn_channel.user_role_check((select channel_id from rhnServerOverview where server_id = VI.virtual_system_id), $2, 'subscribe') AS subscribable,
	         1 AS selectable
	    FROM rhnVirtualInstance VI
	         LEFT OUTER JOIN rhnVirtualInstanceInfo VII ON VI.id = VII.instance_id
	         LEFT OUTER JOIN rhnVirtualInstanceState VIS ON VII.state = VIS.id
	   WHERE EXISTS (SELECT 1
	                   FROM rhnUserServerPerms USP
	                  WHERE USP.user_id = $3
	                    AND USP.server_id = VI.host_system_id)
	     AND VI.host_system_id = $4
	     AND VI.uuid IS NOT NULL
	ORDER BY NAME, VI.virtual_system_id, VI.uuid DESC
2017-11-28 02:26:25.806 EST DETAIL:  parameters: $1 = '1', $2 = '1', $3 = '1', $4 = '1000021025'
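To see where the 1.3 s in the statement above goes, one could re-run it under EXPLAIN ANALYZE with the parameter values from the DETAIL line substituted in. A minimal sketch (the PREPARE body is the query from the log, abbreviated here; fill in the full statement before running):

```sql
-- Wrap the logged statement in a prepared statement with the same four
-- parameter slots, then EXPLAIN the EXECUTE using the logged values
-- ($1 = $2 = $3 = '1', $4 = '1000021025').
PREPARE guest_list (numeric, numeric, numeric, numeric) AS
  SELECT DISTINCT VI.id /* ... rest of the column list from the log ... */
    FROM rhnVirtualInstance VI
         LEFT OUTER JOIN rhnVirtualInstanceInfo VII ON VI.id = VII.instance_id
         LEFT OUTER JOIN rhnVirtualInstanceState VIS ON VII.state = VIS.id
   WHERE VI.host_system_id = $4
     AND VI.uuid IS NOT NULL;  -- plus the rhnUserServerPerms EXISTS clause

EXPLAIN (ANALYZE, BUFFERS) EXECUTE guest_list (1, 1, 1, 1000021025);
```

The plan output would show whether the time is spent in the DISTINCT sort, the per-row user_role_check() call, or a sequential scan on rhnVirtualInstance.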

