Bug 535704 (RHQ-2372) - break up measurement compression / purge jobs into smaller chunks
Summary: break up measurement compression / purge jobs into smaller chunks
Keywords:
Status: CLOSED NEXTRELEASE
Alias: RHQ-2372
Product: RHQ Project
Classification: Other
Component: Monitoring
Version: unspecified
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
: ---
Assignee: Joseph Marques
QA Contact: Corey Welton
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
: RHQ-1355 (view as bug list)
Depends On:
Blocks:
Reported: 2009-08-20 15:00 UTC by Joseph Marques
Modified: 2010-02-16 21:09 UTC (History)

Fixed In Version: 1.3
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Joseph Marques 2009-08-20 15:00:00 UTC
As long as at least one server is up and talking to the database, the hourly Quartz jobs run and compress / purge the last hour of measurement data.  When all servers are down, however, these jobs do not run.  Then, when one of the servers is started back up, it could have a very large backlog of data to process.  To help smooth this out, the compression / purge routines should break the work into smaller chunks.  Each chunk should match the interval of the table being processed (1hr chunks for the _1H table, 6hr chunks for the _6H table, etc).
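
A minimal sketch of the chunking idea, for illustration only. It assumes a backlog expressed as a half-open window of epoch milliseconds and splits it into interval-aligned chunks. The names `ChunkPlanner`, `Chunk`, and `plan` are hypothetical and are not taken from the RHQ code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the RHQ implementation.
final class ChunkPlanner {
    static final long HOUR_MS = 60L * 60L * 1000L;

    /** A half-open time window [start, end) in epoch milliseconds. */
    static final class Chunk {
        final long start;
        final long end;

        Chunk(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Splits the backlog [begin, end) into consecutive chunks of intervalMs.
     * begin is rounded down to an interval boundary, and only complete
     * chunks that finish at or before end are returned.
     */
    static List<Chunk> plan(long begin, long end, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        // Align the first chunk to the interval boundary at or before begin.
        long start = Math.floorDiv(begin, intervalMs) * intervalMs;
        for (; start + intervalMs <= end; start += intervalMs) {
            chunks.add(new Chunk(start, start + intervalMs));
        }
        return chunks;
    }
}
```

With this sketch, a backlog would be planned with a 1-hour interval for the _1H table and a 6-hour interval for the _6H table, and each returned chunk would then be compressed or purged as its own unit of work.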

Comment 1 Joseph Marques 2009-08-20 15:01:07 UTC
rev4933 - perf enhancement for measurement purging (mazz)

Comment 2 Joseph Marques 2009-08-20 15:17:49 UTC
rev4967 - chunk up the measurement compression by intervals; 

Comment 3 Joseph Marques 2009-08-21 13:54:06 UTC
rev4971 - pass correct interval parameter to purge methods; 

Comment 4 Joseph Marques 2009-08-21 15:09:55 UTC
rev4972 - always purge backwards in time from 'purgeBefore' until the oldest timestamp available in that table; 

Comment 5 Joseph Marques 2009-08-21 16:36:44 UTC
rev4975 - need to use greater-or-equal to catch the timestamps directly on the hour, which is guaranteed because we round off timestamps for compression purposes; 
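
A rough sketch of the backward purge with an inclusive lower bound, assuming the table can be modelled as a sorted set of timestamps in epoch milliseconds. `BackwardPurge`, `purgeWindow`, and `purgeBackwards` are hypothetical names; the real code issues database deletes rather than working on an in-memory set.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical in-memory model of a measurement table; not the RHQ implementation.
final class BackwardPurge {

    /** Removes every timestamp t with windowStart <= t < windowEnd and returns the count. */
    static int purgeWindow(TreeSet<Long> table, long windowStart, long windowEnd) {
        // Inclusive lower bound (>=) so a timestamp sitting exactly on the
        // window boundary is removed with this window.
        NavigableSet<Long> window = table.subSet(windowStart, true, windowEnd, false);
        int removed = window.size();
        window.clear(); // clearing the view removes the entries from the backing set
        return removed;
    }

    /**
     * Walks backwards from purgeBefore one interval at a time until nothing
     * older than the current window remains in the table.
     */
    static int purgeBackwards(TreeSet<Long> table, long purgeBefore, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        int removed = 0;
        long windowEnd = purgeBefore;
        while (!table.isEmpty() && table.first() < windowEnd) {
            long windowStart = windowEnd - intervalMs;
            removed += purgeWindow(table, windowStart, windowEnd);
            windowEnd = windowStart;
        }
        return removed;
    }
}
```

With a strictly-greater-than lower bound, a timestamp sitting exactly on a window boundary would fall into neither adjacent window, which is the case the greater-or-equal comparison covers.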

Comment 6 Joseph Marques 2009-08-21 22:48:44 UTC
rev4977 - pass correct interval parameter to purge methods; 

Comment 7 Joseph Marques 2009-08-25 01:29:16 UTC
rev4983 - set transaction timeouts for purge / compress chunk methods to 20 mins; 
catch throwable from the calling context so that data purge / compress always attempts to execute all chunks; 
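
A minimal sketch of the catch-and-continue pattern, assuming each chunk is wrapped as a task that the caller invokes in turn. `ChunkRunner`, `ChunkTask`, and `runAll` are hypothetical names used only for illustration; transaction timeouts are left out because they depend on the container configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical runner; not the RHQ implementation.
final class ChunkRunner {

    /** One unit of purge / compress work for a single chunk. */
    interface ChunkTask {
        void run() throws Exception;
    }

    /**
     * Runs every chunk in order. A failure in one chunk is recorded and the
     * loop moves on, so later chunks are still attempted.
     */
    static List<Throwable> runAll(List<ChunkTask> tasks) {
        List<Throwable> failures = new ArrayList<>();
        for (ChunkTask task : tasks) {
            try {
                task.run();
            } catch (Throwable t) {
                // Caught in the caller, outside the chunk's own unit of work.
                failures.add(t);
            }
        }
        return failures;
    }
}
```

The caller can log the returned failures once all chunks have been attempted; a chunk that failed is simply picked up again on the next scheduled run.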

Comment 8 Joseph Marques 2009-08-25 04:12:33 UTC
rev4984 - increase transaction timeouts for purge / compress chunk methods to 60 mins, which is necessary when servers are coming online and flooding with backlogged agent data; 

Comment 9 Joseph Marques 2009-08-28 16:44:31 UTC
re-opening because these fixes aren't in trunk yet, only in the perf branch.

Comment 10 Joseph Marques 2009-09-03 07:30:14 UTC
rev5081 - chunk the work for purging and compression; 
always purge backwards in time from 'purgeBefore' until the oldest timestamp available in that table; 
use greater-or-equal-to comparison to catch the timestamps directly on the hour, which is always the case because timestamps are rounded off during compression; 
catch throwable from the calling context so that data purge / compression always attempts to execute all chunks each time it's run; 
modify transaction timeout to 60 minutes, which is necessary when servers are coming online and flooding the server with backlogged data; 

Comment 11 Corey Welton 2009-09-10 14:51:31 UTC
QA Closing - Code change

Comment 12 Red Hat Bugzilla 2009-11-10 21:03:07 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2372


Comment 13 David Lawrence 2009-11-11 17:09:40 UTC
*** Bug 534571 has been marked as a duplicate of this bug. ***

Comment 14 wes hayutin 2010-02-16 21:09:54 UTC
Mass move to component = Monitoring

