Bug 534571 (RHQ-1355)

Summary: Investigate performance improvements for table purges
Product: [Other] RHQ Project
Reporter: Charles Crouch <ccrouch>
Component: Performance
Assignee: Joseph Marques <jmarques>
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Version: 1.2
CC: hbrock
Target Milestone: ---
Keywords: Task
Target Release: ---
Hardware: All
OS: All
URL: http://jira.rhq-project.org/browse/RHQ-1355
Doc Type: Bug Fix

Description Charles Crouch 2009-01-13 09:44:00 EST
For example, consider the 1d measurement table:
Our perf environment doesn't have a ridiculous number of resources or metrics, but it has 678k measurement schedules, which equates to almost 250m (678k*365) rows in the 1d table after a year. We need to make sure our current purge algorithm is sufficient.
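
The row estimate can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes every schedule writes exactly one 1d row per day and that nothing is purged, and the function name is hypothetical.

```python
def rows_after(schedules: int, days: int) -> int:
    """Rows accumulated if each schedule writes one 1d row per day and nothing is purged."""
    return schedules * days


estimate = rows_after(678_000, 365)
print(f"{estimate:,} rows")  # 247,470,000 rows, i.e. "almost 250m"
```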
Comment 1 Joseph Marques 2009-01-25 08:50:34 EST
Deletes become more and more inefficient on large tables because they are almost always implemented as single-threaded algorithms. However, selects are often *highly* threaded algorithms. Knowing this, we can improve purges by issuing selects to pull data into memory, immediately followed by delete statements on those same rows. In short, because we can select with greater concurrency, we can delete more in the same amount of time (or delete the same amount in less time). Once the physical blocks that contain the rows to be deleted are in memory from the select statements, the deletes (depending on the size of the db cache) should have almost no cache misses, thus speeding up the overall operation.
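
A minimal sketch of the select-then-delete ordering described here, not the RHQ implementation. It uses Python's sqlite3 with an in-memory database purely so that it runs anywhere; SQLite will not show the parallel-select or cache-warming benefit, which depends on the server database. The table name `measurement_data_1d` and its columns are assumptions made for illustration.

```python
import sqlite3

TABLE = "measurement_data_1d"  # hypothetical table name, for illustration only


def warm_then_delete(conn, cutoff):
    """Read the rows older than cutoff, then delete those same rows.

    Returns (rows_read, rows_deleted).
    """
    with conn:  # commit on success, roll back on error
        # Step 1: the select touches the blocks that hold the doomed rows,
        # so on a server database they should land in the cache.
        warmed = conn.execute(
            f"SELECT schedule_id, time_stamp, value FROM {TABLE} WHERE time_stamp < ?",
            (cutoff,),
        ).fetchall()
        # Step 2: delete exactly the same range immediately afterwards.
        deleted = conn.execute(
            f"DELETE FROM {TABLE} WHERE time_stamp < ?", (cutoff,)
        ).rowcount
    return len(warmed), deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE {TABLE} (schedule_id INTEGER, time_stamp INTEGER, value REAL)"
)
conn.executemany(
    f"INSERT INTO {TABLE} VALUES (?, ?, ?)", [(1, t, 0.5) for t in range(10)]
)
result = warm_then_delete(conn, cutoff=6)
remaining = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()[0]
```

Whether the warm-up pays off in practice would depend on the database's cache size and on how well it parallelizes the select, as the comment notes.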

Oh, and this solution is irrespective of the chunk size (# of rows to be deleted, determined by the deleteRowDataOlderThan timestamp for the respective table being shortened). We could always do a little analytics ahead of time to figure out whether we should chunk the work out or just try to purge the whole lot of data each time the purge job runs. If the database (or all the servers) has been down for a while, this will naturally create a lot of work for the purge job if we try to process all data older than deleteRowDataOlderThan in a single transaction. It might behoove us to figure out how many rows that is or what timeframe that spans (oldest row up until deleteRowDataOlderThan). Then, break up the work into separate transactions that each delete chunks (perhaps 1-hr, perhaps smaller) of data.
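
The chunking idea could be sketched roughly as follows. This is an illustration under assumptions, not the purge job's actual code: the table name `measurement_data_1d`, the `time_stamp` column (epoch milliseconds), and the function `purge_in_chunks` are all hypothetical, and sqlite3 stands in for the real database so the example is runnable.

```python
import sqlite3

TABLE = "measurement_data_1d"  # hypothetical table name
HOUR_MS = 60 * 60 * 1000       # 1-hr chunk, expressed in milliseconds


def purge_in_chunks(conn, delete_older_than, chunk_ms=HOUR_MS):
    """Delete rows with time_stamp < delete_older_than, one window per transaction.

    Returns the number of rows deleted in each chunk.
    """
    # Find the oldest row so we know what timeframe the backlog spans.
    oldest = conn.execute(
        f"SELECT MIN(time_stamp) FROM {TABLE} WHERE time_stamp < ?",
        (delete_older_than,),
    ).fetchone()[0]
    deleted_per_chunk = []
    if oldest is None:  # nothing to purge
        return deleted_per_chunk
    start = oldest
    while start < delete_older_than:
        end = min(start + chunk_ms, delete_older_than)
        with conn:  # each chunk commits in its own transaction
            cur = conn.execute(
                f"DELETE FROM {TABLE} WHERE time_stamp >= ? AND time_stamp < ?",
                (start, end),
            )
            deleted_per_chunk.append(cur.rowcount)
        start = end
    return deleted_per_chunk


conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE {TABLE} (schedule_id INTEGER, time_stamp INTEGER)")
stamps = [0, HOUR_MS // 2, HOUR_MS, 2 * HOUR_MS, 3 * HOUR_MS, 4 * HOUR_MS]
conn.executemany(f"INSERT INTO {TABLE} VALUES (1, ?)", [(t,) for t in stamps])
chunks = purge_in_chunks(conn, delete_older_than=3 * HOUR_MS)
left = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()[0]
```

Here the backlog spans three hours, so the purge runs as three short transactions rather than one long one. The "analytics ahead of time" step could also count the rows first and skip chunking when the backlog is small.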
Comment 2 Joseph Marques 2009-09-04 18:12:49 EDT
Duplicate of the JIRAs RHQ-2372, RHQ-2376, and RHQ-1448, which collectively resolved this by using a chunking solution combined with appropriate indexes to reduce table contention in high-throughput write environments.
Comment 3 Red Hat Bugzilla 2009-11-10 15:31:12 EST
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1355
This bug is related to RHQ-1336
This bug is related to RHQ-1119
This bug was marked DUPLICATE in the database but has no duplicate of bug id.
Comment 4 David Lawrence 2009-11-11 12:09:40 EST

*** This bug has been marked as a duplicate of bug 535704 ***