535240 – (RHQ-1958) make CheckForTimedOutOperationsJob run in a short and predictable amount of time

Summary: make CheckForTimedOutOperationsJob run in a short and predictable amount of time

Keywords:
Status:	CLOSED WONTFIX
Alias:	RHQ-1958
Product:	RHQ Project
Classification:	Other
Component:	Performance
Sub Component:
Version:	unspecified
Hardware:	All
OS:	All
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	RHQ Project Maintainer
QA Contact:
Docs Contact:
URL:	http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks:	rhq-perf

Reported:	2009-04-10 17:34 UTC by Jay Shaughnessy
Modified:	2014-05-05 18:30 UTC
CC List:	3 users
Fixed In Version:	1.4
Clone Of:
Environment:
Last Closed:	2014-05-05 18:30:40 UTC
Embargoed:

Description Jay Shaughnessy 2009-04-10 17:34:00 UTC

While working on RHQ-1950 and talking with Joseph, it became clear that the execution time of our CheckForTimedOutOperationsJob will most likely grow as the RHQ_OPERATION_HISTORY table grows.  This is because we execute ResourceOperationHistory.QUERY_FIND_ALL_IN_STATUS to find all IN_PROGRESS operation histories.  The status field is stored as an unindexed string field, so we're looking at a table scan for each execution of the query.
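
To make the table-scan concern concrete, here is a minimal sketch using the SQLite that ships with Python. The table and column names (`op_history`, `status`) and the helper `query_plan` are hypothetical stand-ins, not the real RHQ schema or the actual QUERY_FIND_ALL_IN_STATUS query, and SQLite's planner is not the one RHQ runs on. It only illustrates that an equality filter on an unindexed string column has to visit every row.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the operation history table.
conn.execute("CREATE TABLE op_history (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO op_history (status) VALUES (?)",
    [("SUCCESS",)] * 1000 + [("INPROGRESS",)] * 3,
)

# No index exists on status, so the planner can only scan the whole table.
find_in_status = "SELECT id FROM op_history WHERE status = ?"
plan_unindexed = query_plan(conn, find_in_status, ("INPROGRESS",))
print(plan_unindexed)
```

The printed plan reports a scan of `op_history`, which is the behavior the job would pay for on every run as the table grows.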

This job currently executes at 1-minute intervals.  This is down from 10 minutes due to the work in RHQ-1950, but the interval is probably unimportant.  What is important is how long the job can run if the op history table grows large.  How large the table must get before this becomes a problem is not currently known and should be determined as the first step of this Jira.  That way we can determine whether we need to do some query/db work to resolve this issue.

Putting an index on the current field probably will not solve the problem, given its character type and the low cardinality of its values.  An alternative suggested by Joe would be to change the field to a numeric one, with enum values corresponding to the progressive states of an op, e.g. (inprocess, cancelled, completed, failed).  Giving this field an ordered index could let us do range scans efficiently.
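
A rough sketch of what the numeric-status idea could look like, again using Python's bundled SQLite rather than the real RHQ database. The `OpStatus` numbering, the `op_history` table, and the index name are all assumptions made for illustration. The only property that matters is that unfinished states sort before finished ones, so a single range predicate finds them.

```python
import sqlite3
from enum import IntEnum


class OpStatus(IntEnum):
    # Hypothetical numbering: states are ordered by progress.
    INPROCESS = 0
    CANCELLED = 1
    COMPLETED = 2
    FAILED = 3


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE op_history (id INTEGER PRIMARY KEY, status INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_op_history_status ON op_history (status)")
conn.executemany(
    "INSERT INTO op_history (status) VALUES (?)",
    [(int(OpStatus.COMPLETED),)] * 500
    + [(int(OpStatus.FAILED),)] * 20
    + [(int(OpStatus.INPROCESS),)] * 3,
)

# Everything below CANCELLED is still running, so one range predicate finds it.
unfinished_sql = "SELECT id FROM op_history WHERE status < ?"
bound = (int(OpStatus.CANCELLED),)
unfinished = conn.execute(unfinished_sql, bound).fetchall()

# Ask the planner how it would run the range query.
plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + unfinished_sql, bound)
)
print(len(unfinished), plan)
```

Whether this beats a plain index on the existing string column in the production database would still need to be measured there.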

I suggest we at least understand the perf of the current query in the 1.2 timeframe so as to know whether a change is required in the short term.

Comment 1 John Mazzitelli 2009-04-15 14:58:33 UTC

The workaround is to periodically delete your operation history items.  Users can go into the Operation History tab and purge old histories.

Comment 2 Charles Crouch 2009-07-07 15:53:31 UTC

Let's see if we can include this in the perf work for 1.3.

Comment 3 Heiko W. Rupp 2009-08-12 20:02:08 UTC

Postgres is very much able to use an index when looking for entries that are relatively rare, like "INPROGRESS", so I think an index can help.
A quick test on a table with >2000 entries (2078x success, 39x failure, 3x inprogress) shows that the index is used for failure and inprogress, while for success the query does a full table scan.
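
The row counts quoted in that test show why the planner treats the values so differently. The short calculation below uses only those three counts and works out what fraction of the table each status matches. The point at which PostgreSQL switches from an index to a table scan is cost-based and was not reported, so none is assumed here.

```python
# Row counts reported in the quick test above.
counts = {"success": 2078, "failure": 39, "inprogress": 3}
total = sum(counts.values())

# Fraction of the table that each status value matches.
selectivity = {status: n / total for status, n in counts.items()}

# Print the rarest values first.
for status, fraction in sorted(selectivity.items(), key=lambda kv: kv[1]):
    print(f"{status:<10} {counts[status]:>5} rows  {fraction:.2%} of table")
```

The inprogress and failure rows together are about 2% of the table, while success matches roughly 98% of it, so an index lookup saves little for the success case.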

Comment 4 Charles Crouch 2009-09-01 14:52:32 UTC

Moving features/improvements to 1.4

Comment 5 Red Hat Bugzilla 2009-11-10 20:50:18 UTC

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1958
This bug relates to RHQ-2302
This bug relates to RHQ-2343

Comment 6 wes hayutin 2010-02-16 17:10:22 UTC

Mass add of keyword FutureFeature to help with tracking.

Comment 7 Jay Shaughnessy 2014-05-05 18:30:40 UTC

This hasn't been a problem.  Closing and we can open a new BZ if and when this is an issue in the future.
