535704 – (RHQ-2372) break up measurement compression / purge jobs into smaller chunks


Summary: break up measurement compression / purge jobs into smaller chunks

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	RHQ-2372
Product:	RHQ Project
Classification:	Other
Component:	Monitoring
Sub Component:
Version:	unspecified
Hardware:	All
OS:	All
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Joseph Marques
QA Contact:	Corey Welton
Docs Contact:
URL:	http://jira.rhq-project.org/browse/RH...
Whiteboard:
Duplicates (1):	RHQ-1355
Depends On:
Blocks:

Reported:	2009-08-20 15:00 UTC by Joseph Marques
Modified:	2010-02-16 21:09 UTC
CC List:	1 user
Fixed In Version:	1.3
Clone Of:
Environment:
Last Closed:
Embargoed:


Description Joseph Marques 2009-08-20 15:00:00 UTC

As long as at least one server is up and talking to the database, the hourly quartz jobs run and compress / purge the last hour of measurement data. When all servers are down, however, these jobs do not run. Then, when one of the servers is started back up, it could have a very large backlog of data to process. To help smooth this out, the compression / purge routines should break the work up into smaller chunks. Each chunk should match the interval of the table being processed (1hr chunks for the _1H table, 6hr chunks for the _6H table, etc).
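To illustrate the idea, here is a minimal sketch of splitting a backlog into interval-aligned chunks. It is not taken from the RHQ code base: the class `ChunkPlanner`, the method `chunks`, and the choice to skip a trailing partial interval are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only. */
final class ChunkPlanner {
    static final long HOUR_MS = 60L * 60L * 1000L;

    /**
     * Splits the range [begin, end) into consecutive windows of intervalMs.
     * The first window starts at begin rounded down to an interval boundary.
     * Only complete windows are returned; a trailing partial window is left
     * for a later run. Each element is {chunkStart, chunkEnd}.
     */
    static List<long[]> chunks(long begin, long end, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        List<long[]> result = new ArrayList<>();
        // Round the start down so every chunk is aligned to the interval.
        long start = begin - Math.floorMod(begin, intervalMs);
        for (long s = start; s + intervalMs <= end; s += intervalMs) {
            result.add(new long[] { s, s + intervalMs });
        }
        return result;
    }
}
```

With a 1-hour interval, a backlog that spans a little over three hours yields three 1-hour chunks; the same call with `6 * HOUR_MS` would produce the 6-hour chunks for the _6H table.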

Comment 1 Joseph Marques 2009-08-20 15:01:07 UTC

rev4933 - perf enhancement for measurement purging (mazz)

Comment 2 Joseph Marques 2009-08-20 15:17:49 UTC

rev4967 - chunk up the measurement compression by intervals;

Comment 3 Joseph Marques 2009-08-21 13:54:06 UTC

rev4971 - pass correct interval parameter to purge methods;

Comment 4 Joseph Marques 2009-08-21 15:09:55 UTC

rev4972 - always purge backwards in time from 'purgeBefore' until the oldest timestamp available in that table;
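A rough sketch of purging backwards in time, assuming the table can be modelled as a sorted set of timestamps. The class `BackwardPurger` and its method are hypothetical; the real change works against database tables, not an in-memory set.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical stand-in for a measurement table, for illustration only. */
final class BackwardPurger {
    /**
     * Deletes every timestamp older than purgeBefore, one interval-sized
     * window at a time, starting with the window that ends at purgeBefore
     * and walking back until the oldest remaining timestamp is no longer
     * older than the window end. Returns the number of windows processed.
     */
    static int purgeBackwards(NavigableSet<Long> table, long purgeBefore, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        int windows = 0;
        long windowEnd = purgeBefore;
        while (!table.isEmpty() && table.first() < windowEnd) {
            long windowStart = windowEnd - intervalMs;
            // Delete rows with windowStart <= timestamp < windowEnd.
            table.subSet(windowStart, true, windowEnd, false).clear();
            windowEnd = windowStart;
            windows++;
        }
        return windows;
    }

    /** Small helper to build a sample table. */
    static NavigableSet<Long> tableOf(long... timestamps) {
        NavigableSet<Long> table = new TreeSet<>();
        for (long ts : timestamps) {
            table.add(ts);
        }
        return table;
    }
}
```

Because the loop stops as soon as nothing older than the current window remains, a short backlog costs only a few windows while a long one is still fully drained.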

Comment 5 Joseph Marques 2009-08-21 16:36:44 UTC

rev4975 - need to use greater-or-equal to catch the timestamps that fall directly on the hour, which is always the case because we round off timestamps for compression purposes;
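The boundary issue can be shown with a small sketch. It assumes a half-open chunk window with an inclusive lower bound, which is one plausible reading of the revision note; the class and method names are hypothetical.

```java
/** Hypothetical illustration of the chunk boundary comparison. */
final class BoundaryCheck {
    static final long HOUR_MS = 3_600_000L;

    /** Rounds a timestamp down to the start of its hour. */
    static long roundDownToHour(long ts) {
        return ts - Math.floorMod(ts, HOUR_MS);
    }

    /** Strict lower bound: misses a timestamp sitting exactly on chunkStart. */
    static boolean inChunkStrict(long ts, long chunkStart, long chunkEnd) {
        return ts > chunkStart && ts < chunkEnd;
    }

    /** Inclusive lower bound: catches a timestamp sitting exactly on chunkStart. */
    static boolean inChunkInclusive(long ts, long chunkStart, long chunkEnd) {
        return ts >= chunkStart && ts < chunkEnd;
    }
}
```

A timestamp rounded down to the hour equals the start of its chunk, so the strict comparison skips it while the greater-or-equal comparison includes it.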

Comment 6 Joseph Marques 2009-08-21 22:48:44 UTC

rev4977 - pass correct interval parameter to purge methods;

Comment 7 Joseph Marques 2009-08-25 01:29:16 UTC

rev4983 - set transaction timeouts for purge / compress chunk methods to 20 mins; 
catch throwable from the calling context so that data purge / compress always attempts to execute all chunks;
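A minimal sketch of the "attempt every chunk" behaviour, assuming each chunk is handed to a callback. `ChunkRunner` and `ChunkTask` are hypothetical names, and the sketch does not model the per-chunk transactions or their timeouts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical driver loop, for illustration only. */
final class ChunkRunner {
    /** Hypothetical callback that processes one chunk of work. */
    interface ChunkTask {
        void run(long chunkStart, long chunkEnd) throws Exception;
    }

    /**
     * Runs the task once per chunk. A failure in one chunk is recorded and
     * the loop moves on, so later chunks are still attempted.
     * Returns the failures in the order they occurred.
     */
    static List<Throwable> runAll(List<long[]> chunks, ChunkTask task) {
        List<Throwable> failures = new ArrayList<>();
        for (long[] chunk : chunks) {
            try {
                task.run(chunk[0], chunk[1]);
            } catch (Throwable t) {
                // Caught in the calling context so one bad chunk cannot
                // stop the remaining chunks from being processed.
                failures.add(t);
            }
        }
        return failures;
    }
}
```

Collecting the failures rather than swallowing them is a design choice of this sketch; it lets the caller log or report them after all chunks have been attempted.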

Comment 8 Joseph Marques 2009-08-25 04:12:33 UTC

rev4984 - increase transaction timeouts for purge / compress chunk methods to 60 mins, which is necessary when servers are coming online and flooding the server with backlogged agent data;

Comment 9 Joseph Marques 2009-08-28 16:44:31 UTC

re-opening because these fixes aren't in trunk yet, only in the perf branch.

Comment 10 Joseph Marques 2009-09-03 07:30:14 UTC

rev5081 - chunk the work for purging and compression; 
always purge backwards in time from 'purgeBefore' until the oldest timestamp available in that table; 
use greater-or-equal-to comparison to catch the timestamps directly on the hour, which is always the case because timestamps are rounded off during compression; 
catch throwable from the calling context so that data purge / compression always attempts to execute all chunks each time it's run; 
modify transaction timeout to 60 minutes, which is necessary when servers are coming online and flooding the server with backlogged data;

Comment 11 Corey Welton 2009-09-10 14:51:31 UTC

QA Closing - Code change

Comment 12 Red Hat Bugzilla 2009-11-10 21:03:07 UTC

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2372

Comment 13 David Lawrence 2009-11-11 17:09:40 UTC

*** Bug 534571 has been marked as a duplicate of this bug. ***

Comment 14 wes hayutin 2010-02-16 21:09:54 UTC

Mass move to component = Monitoring
