Bug 1078435

Summary: Postgresql SELECT statements take 25 seconds in 'Systems' tab
Product: Spacewalk (Community)
Component: WebUI
Version: 2.1
Reporter: Matt Wilkinson <matthewgwilkinson>
Assignee: Grant Gainey <ggainey>
CC: mmraka
QA Contact: Red Hat Satellite QA List <satqe-list>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Hardware: x86_64
OS: Linux
Type: Bug
Doc Type: Bug Fix
Last Closed: 2017-10-03 18:54:05 UTC
Bug Blocks: 1564160

Description Matt Wilkinson 2014-03-19 17:09:32 UTC
Description of problem:
I recently upgraded my Spacewalk installation from 2.0 to 2.1. Everything works well except for one major issue: whenever a user in the Spacewalk GUI clicks on the 'Systems' tab, the page takes 20 to 25 seconds to respond, regardless of the items-per-page setting. We have about 330 systems registered with our Spacewalk server, and with items per page set to 25, it takes 25 seconds to display the list.

I have investigated the OS thoroughly, including evaluating performance, resource usage, and logs, and I cannot find any obvious issue. However, while monitoring the Postgres query logs, I can see a constant stream of Postgres queries running during the entire time the 'Systems' page is loading.

I can't find an obvious problem with my setup beyond the large number of systems we have and the large number of attributes each system has, e.g. errata, packages, etc.

Version-Release number of selected component (if applicable):
Spacewalk v2.1 running on a RHEL 6.5 x86_64 box with the latest patches.

How reproducible:
Always

Steps to Reproduce:
1. Log in to the Spacewalk GUI
2. Click on the 'Systems' tab

Actual results:
In Google Chrome (my usual browser), the page begins processing and the loading indicator spins in the tab. After about 25 seconds, the page displays the list of systems.

Expected results:
With only ~330 systems, I would expect the page to display much more quickly.

Additional info:
I've tweaked the postgresql.conf file with additional tuning parameters based on the Postgres documentation and Red Hat documentation. They are:

maintenance_work_mem = 352MB
checkpoint_completion_target = 0.7
effective_cache_size = 6GB
work_mem = 32MB
wal_buffers = 16MB
checkpoint_segments = 16
shared_buffers = 2GB
max_connections = 600

The Postgres DB resides on the localhost of the spacewalk server itself, which was the default method of installing Spacewalk with Postgres. 

The server is set up like this:

Physical ESXi 5.5 host running on Cisco UCS blade with Xeon CPUs and lots o' RAM
vSphere 5.5 VM with Hardware v10
8 GB of RAM
4 vCPUs
vmxnet3/pvscsi drivers in use
15GB base OS disk
20GB disk/filesystem for postgres DB mounted at /var/lib/pgsql
NetApp datastore backend

Tomcat6 has been tuned to -Xms512m -Xmx1024m
Java taskomatic has been tuned to -Xms2048m -Xmx2048m

Watching top output during this process, I notice that most of the processing is done by postgres and tomcat6 when the 'Systems' tab is clicked.

iowait does not seem to be an issue
RAM does not seem to be an issue
CPU does not seem to be an issue

Comment 1 Matt Wilkinson 2014-03-19 19:13:38 UTC
I just noticed something interesting. It only takes 2 seconds to load the 'Errata' page and that page has 984 items (total) in it.

The 'Systems' page only has 337 total items....

Comment 2 Matt Wilkinson 2016-01-13 21:08:26 UTC
I've solved this by running the following command nightly:

vacuumdb --analyze <dbname>
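A minimal sketch of scheduling this nightly via a system cron entry (an assumption on my part: the file path and schedule are illustrative, the command is run as the postgres user, and `<dbname>` must be replaced with the actual Spacewalk database name):

```shell
# /etc/cron.d/spacewalk-vacuum -- hypothetical cron entry
# Run VACUUM and update planner statistics nightly at 02:00,
# as the postgres user. Replace <dbname> with the Spacewalk
# database name.
0 2 * * * postgres /usr/bin/vacuumdb --analyze <dbname>
```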

Comment 3 Grant Gainey 2017-10-03 18:54:05 UTC
Since this was opened we've moved from Postgres 8.4 to 9.2 to 9.5. The Postgres optimizer has gotten smarter along the way, as has the default vacuumdb behavior. As noted in #c2, this particular issue is a symptom of the database needing some housekeeping. There isn't any code change to make here; closing as 'currentrelease'.
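One way to confirm that housekeeping is the culprit on a given installation (a hedged diagnostic sketch, assuming psql access to the database; `<dbname>` stands for the Spacewalk database name) is to query `pg_stat_user_tables` for dead-tuple counts and last vacuum/analyze times:

```shell
# Hypothetical diagnostic: list the ten tables with the most dead
# tuples, along with when each was last vacuumed and analyzed.
# Old or NULL timestamps alongside large n_dead_tup values suggest
# stale planner statistics.
psql -d <dbname> -c "
  SELECT relname, n_dead_tup, last_vacuum, last_autovacuum, last_analyze
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 10;"
```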