Red Hat Bugzilla – Bug 534571
Investigate performance improvements for table purges
Last modified: 2015-02-01 18:24:58 EST
For example, consider the 1d measurement table:
Our perf environment doesn't have a ridiculous number of resources or metrics, but it has 678k measurement schedules, which equates to almost 250m (678k*365) rows in the 1d table after a year. We need to make sure our current purge algorithm is sufficient.
Deletes become more and more inefficient on large tables because they are almost always implemented as single-threaded algorithms. However, selects are often *highly* threaded algorithms. Knowing this, we can improve purges by issuing selects to pull data into memory, immediately followed by delete statements on those same rows. In short, because we can select with greater concurrency, we can delete more in the same amount of time (or delete the same amount in less time). Once the physical blocks that contain the rows to be deleted are in memory from the select statements, the deletes (depending on the size of the db cache) should have almost no cache misses, thus speeding up the overall operation.
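A minimal sketch of the select-then-delete ordering described above. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere; SQLite does not parallelize selects, so this shows only the statement order, not the concurrency gain. The function name warm_then_delete, the table name measurement_data_1d and the column time_stamp are assumptions made for illustration, not names taken from this bug.

```python
import sqlite3


def warm_then_delete(conn, table, cutoff):
    """Read the rows that are about to be purged, then delete them.

    Returns the number of rows deleted. `table` is interpolated into the
    SQL text, so it must come from trusted code, never from user input.
    """
    cur = conn.cursor()
    # Step 1: select the doomed rows so the blocks that hold them are
    # pulled into the database cache. The results are read and discarded.
    cur.execute(f"SELECT * FROM {table} WHERE time_stamp < ?", (cutoff,))
    for _ in cur:
        pass
    # Step 2: delete exactly the same rows while their blocks are cached.
    cur.execute(f"DELETE FROM {table} WHERE time_stamp < ?", (cutoff,))
    deleted = cur.rowcount
    conn.commit()
    return deleted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema for the 1d table.
    conn.execute("CREATE TABLE measurement_data_1d (time_stamp INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO measurement_data_1d VALUES (?, ?)",
        [(t, float(t)) for t in range(10)],
    )
    conn.commit()
    print(warm_then_delete(conn, "measurement_data_1d", 4))
```

On a database that does run selects in parallel, step 1 is where the extra concurrency would come from; whether the deletes then avoid cache misses depends on the db cache being large enough to hold the selected blocks.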
Oh, and this solution is irrespective of the chunk size (# of rows to be deleted, determined by the deleteRowDataOlderThan timestamp for the respective table being shortened). We could always do a little analysis ahead of time to figure out whether we should chunk the work out or just try to purge the whole lot of data each time the purge job runs. If the database (or all the servers) have been down for a while, this will naturally create a lot of work for the purge job if we try to process all data older than deleteRowDataOlderThan in a single transaction. It might behoove us to figure out how many rows that is or what timeframe that spans (oldest row up until deleteRowDataOlderThan). Then, break up the work into separate transactions that each delete chunks (perhaps 1-hr, perhaps smaller) of data.
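The chunking idea above could be sketched roughly as follows: find the oldest row, split the span up to deleteRowDataOlderThan into fixed-size windows, and delete each window in its own transaction. As before, sqlite3 is only a runnable stand-in, and the names chunk_windows, purge_in_chunks, measurement_data_1d and time_stamp are hypothetical. The 1-hr default assumes timestamps in milliseconds.

```python
import sqlite3


def chunk_windows(oldest, cutoff, chunk_ms):
    """Yield half-open [start, end) windows that cover oldest..cutoff."""
    start = oldest
    while start < cutoff:
        end = min(start + chunk_ms, cutoff)
        yield start, end
        start = end


def purge_in_chunks(conn, table, cutoff, chunk_ms=3_600_000):
    """Delete rows older than `cutoff`, one window per transaction.

    `cutoff` plays the role of deleteRowDataOlderThan. `table` is
    interpolated into the SQL text, so it must come from trusted code.
    Returns the total number of rows deleted.
    """
    cur = conn.cursor()
    # Find the timeframe to purge: oldest row up until the cutoff.
    cur.execute(f"SELECT MIN(time_stamp) FROM {table} WHERE time_stamp < ?", (cutoff,))
    oldest = cur.fetchone()[0]
    if oldest is None:
        return 0  # nothing to purge
    deleted = 0
    for start, end in chunk_windows(oldest, cutoff, chunk_ms):
        cur.execute(
            f"DELETE FROM {table} WHERE time_stamp >= ? AND time_stamp < ?",
            (start, end),
        )
        deleted += cur.rowcount
        conn.commit()  # each chunk is its own transaction
    return deleted
```

Committing per window keeps each transaction small, so a long outage produces many short deletes rather than one very large one. The window size is a tuning knob; nothing here measures which size is best.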
Duplicate of the JIRAs RHQ-2372, RHQ-2376, and RHQ-1448, which collectively resolved this by using a chunking solution combined with appropriate indexes to reduce table contention in high-throughput write environments.
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1355
This bug is related to RHQ-1336
This bug is related to RHQ-1119
This bug was marked DUPLICATE in the database but has no duplicate of bug id.
*** This bug has been marked as a duplicate of bug 535704 ***