Customer DBA indicates that the following queries seem to contribute to locking when using call time metrics:

INSERT INTO RHQ_CALLTIME_DATA_KEY (id, schedule_id, call_destination)
SELECT RHQ_calltime_data_key_id_seq.nextval, :1, :2
  FROM RHQ_numbers
 WHERE i = 42
   AND NOT EXISTS (SELECT * FROM RHQ_CALLTIME_DATA_KEY
                    WHERE schedule_id = :3 AND call_destination = :4)

DELETE FROM RHQ_CALLTIME_DATA_VALUE
 WHERE key_id = (SELECT id FROM RHQ_CALLTIME_DATA_KEY
                  WHERE schedule_id = :1 AND call_destination = :2)
   AND begin_time = :3

I did some digging into the call-time subsystem and wrote up my thoughts here:

http://www.rhq-project.org/display/RHQ/Improving+CallTimeData+Insertion+Logic

I suggest we implement parts 1 and 2, and then consider whether it's worth the time to try to include any of the others in the next release.
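For reference, the first query above is an insert-if-absent pattern: a key row is created only when no row with that (schedule_id, call_destination) pair already exists. A minimal sketch of the same pattern, using SQLite as a stand-in for the Oracle schema (table name abbreviated, the sequence replaced by AUTOINCREMENT, and the RHQ_numbers single-row trick dropped since SQLite allows a FROM-less SELECT; all names here are illustrative, not from the RHQ codebase):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE calltime_data_key (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    schedule_id INTEGER,
    call_destination TEXT)""")

def insert_key_if_absent(schedule_id, destination):
    # Same shape as the reported query: insert only when no row with
    # this (schedule_id, call_destination) pair already exists.
    conn.execute("""
        INSERT INTO calltime_data_key (schedule_id, call_destination)
        SELECT ?, ?
        WHERE NOT EXISTS (SELECT * FROM calltime_data_key
                           WHERE schedule_id = ? AND call_destination = ?)""",
        (schedule_id, destination, schedule_id, destination))

insert_key_if_absent(1, "/foo")
insert_key_if_absent(1, "/foo")   # second call is a no-op
count = conn.execute("SELECT COUNT(*) FROM calltime_data_key").fetchone()[0]
print(count)  # prints 1
```

Note that without a unique constraint on the pair, two concurrent transactions can both pass the NOT EXISTS check before either commits, which is one reason this pattern interacts with locking.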
I think testing and fixing other performance areas are a priority for JON2.4
commit 7f912881eb1ed141a5519e62fd13522eb97a42d1
Author: Joseph Marques <joseph>
Date:   Sat Jun 19 09:42:55 2010 -0400

    BZ-570360: improve call-time data reporting performance by adjusting transactional boundaries

-----

This maps to "Part 1 - Transactional Boundaries" in the wiki link.
commit a6a707cab05a9ee8cebf49e87b381b507c343285
Author: Joseph Marques <joseph>
Date:   Mon Jun 21 07:32:12 2010 -0400

    BZ-570360: eliminate routine that purges duplicate call-time data values

Is this really needed? Under what circumstances are duplicates generated? Are we presuming this is a misbehaving plugin, or a problem at the comm layer where the same piece of data is delivered twice?

I'm tentatively removing this because it is a big hit to call-time data reporting performance. If, however, evidence is presented that compels us to bring this functionality back, it should be implemented by getting rid of the duplicate data points that share the same key_id and begin_time. These records can be found with:

SELECT key_id, begin_time, count(id)
  FROM rhq_calltime_data_value
 GROUP BY key_id, begin_time
HAVING count(id) > 1

This is functionally equivalent to the CALLTIME_VALUE_DELETE_SUPERCEDED_STATEMENT query, but it takes advantage of the fact that key_id represents the already-computed schedule_id/destination pair, allowing the duplicate search to be performed against a single table.

Taking this solution one step further, an appropriate DELETE statement can be crafted that leverages the above concept but removes all but one of the duplicates (perhaps leaving the record with the smallest id/pk). This purge routine could either be grouped in with the rest of DataPurgeJob, or implemented as its own quartz job that runs more (or less) frequently, depending on need (i.e., how often we anticipate duplicates).

-----

This work originally would have mapped to "Part 2 - Refactoring table maintenance", but upon further reflection it was better to remove the routine altogether.
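The duplicate-detection query and the suggested keep-the-smallest-id delete can be sketched end to end. Below is an illustration using SQLite and made-up sample rows; the DELETE statement is an assumption of how the purge could be written (one of several possibilities), not code taken from RHQ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE rhq_calltime_data_value (
    id INTEGER PRIMARY KEY, key_id INTEGER, begin_time INTEGER, total REAL)""")

# Hypothetical sample data: ids 1 and 2 share the same key_id/begin_time pair.
rows = [(1, 10, 100, 1.0), (2, 10, 100, 2.0), (3, 10, 200, 3.0)]
conn.executemany("INSERT INTO rhq_calltime_data_value VALUES (?, ?, ?, ?)", rows)

# Duplicate detection, exactly as described in the comment above.
dups = conn.execute("""
    SELECT key_id, begin_time, COUNT(id)
      FROM rhq_calltime_data_value
     GROUP BY key_id, begin_time
    HAVING COUNT(id) > 1""").fetchall()
print(dups)  # prints [(10, 100, 2)]

# One possible purge: delete every row that is not the smallest id
# within its key_id/begin_time group.
conn.execute("""
    DELETE FROM rhq_calltime_data_value
     WHERE id NOT IN (SELECT MIN(id) FROM rhq_calltime_data_value
                       GROUP BY key_id, begin_time)""")
remaining = conn.execute(
    "SELECT id FROM rhq_calltime_data_value ORDER BY id").fetchall()
print(remaining)  # prints [(1,), (3,)]
```

Because the grouping runs against a single table, this avoids the join back through the key table that the original superseded-value statement needed.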
Pushing this bug out of dev because we won't be doing any further incremental improvements to the call-time data subsystem. In the future the schema might be rewritten to yield even greater performance, but a separate BZ can be opened to track that once it gets onto a release plan schedule.

FYI, testing for this item is almost a no-op, as no functional changes were made: transactional boundaries were adjusted and an unnecessary routine was bypassed. I've already verified that there are no regressions in the insertion/reporting routine overall, but QA should feel free to test this if they need greater confidence before release.
QA Closing
Mass-closure of verified bugs against JON.