Bug 1049054
| Summary: | Store aggregate metrics in a single (CQL) row | | |
|---|---|---|---|
| Product: | [Other] RHQ Project | Reporter: | John Sanda <jsanda> |
| Component: | Core Server, Performance, Storage Node | Assignee: | Nobody <nobody> |
| Status: | ON_QA | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.9 | CC: | hrupp, jshaughn |
| Target Milestone: | GA | | |
| Target Release: | RHQ 4.13 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Clones: | 1135602 (view as bug list) |
| Type: | Bug | | |
| Bug Blocks: | 1133605, 1126410, 1135602 | | |
Description
John Sanda
2014-01-06 21:32:53 UTC
Bump the target version now that 4.11 is out. Bumping to 4.13 due to time constraints.

The biggest problem here is that we can wind up with partial aggregates. Persisting an aggregate metric requires three separate writes. Since we do not use atomic batches, we can (and have) wound up with partial aggregates, which in turn can lead to difficult bugs.

With respect to the proposed schema changes in the description, I think collections are overkill. They would be more appropriate if the types of aggregates stored changed dynamically, but that has never been the case. Given this, I propose the following new table:

```sql
CREATE TABLE aggregate_metrics (
    schedule_id int,
    bucket text,
    time timestamp,
    avg double,
    max double,
    min double,
    PRIMARY KEY ((schedule_id, bucket), time)
);
```

This table will replace the current one_hour_metrics, six_hour_metrics, and twenty_four_hour_metrics tables. An upgrade task will be needed to migrate data from the old tables to the new table. I want to point out that the upgrade task will be part of the regular server installation/upgrade. Unlike an RDBMS, where migrating data from one table to another can be done with SQL statements, Cassandra requires custom code.

Changes have been pushed to master. The schema changes are the ones described in comment 3. There are still some references in code to the dropped tables; those will get removed as part of the work for bug 1135603 and bug 1135629.

master commit hashes: a7fc965efe, 29a95692f8, bfcdfc8bf, 2295826848, 45f7c26fd97f, da9707cddc4

There have been a few additional commits since the work landed in master.

additional master commit hashes: 0cb3d762b, 6ac05a3d8, 21c7f7be64

MigrateAggregateMetrics.java is the class that performs the data migration into the new aggregate_metrics table. Any errors will result in an exception being thrown, which in turn causes the server install/upgrade to fail. The exception, however, is not thrown until the end of the migration. That is, MigrateAggregateMetrics will go through all measurement schedules, attempting to migrate as much data as possible, before throwing an exception. The schedule ids of data that has been successfully migrated are written to log files stored in the server's data directory. The log file is read before starting the migration from each of the old tables to avoid duplicate work. Schedule ids whose data has already been migrated successfully will be skipped on subsequent runs of the server install/upgrade. Lastly, the progress of the migration of each table is logged every 30 seconds, with a message stating the number of remaining schedules.
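
For illustration, here is a minimal sketch of writes and reads against the aggregate_metrics table proposed above. The bucket name, schedule id, and sample values are assumptions made for the example; they are not taken from the RHQ code.

```sql
-- Sketch only: bucket name, schedule id, and values are assumed for illustration.
-- A complete aggregate is now a single row, so it is persisted with one write
-- instead of three:
INSERT INTO aggregate_metrics (schedule_id, bucket, time, avg, max, min)
VALUES (10001, 'one_hour', '2014-01-06 21:00:00+0000', 3.2, 5.0, 1.1);

-- Reads for one schedule and one bucket stay within a single partition,
-- since the partition key is (schedule_id, bucket):
SELECT time, avg, max, min
FROM aggregate_metrics
WHERE schedule_id = 10001
  AND bucket = 'one_hour'
  AND time >= '2014-01-06 00:00:00+0000'
  AND time < '2014-01-07 00:00:00+0000';
```

Because avg, max, and min live in one row, a failure between writes can no longer leave a partial aggregate behind.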
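
The migration that MigrateAggregateMetrics.java performs has to be driven by custom code, but the per-schedule data movement described above reduces to statements along these lines. The column layout of the old per-bucket tables (separate rows per aggregate type, distinguished by a type column) is an assumption for this sketch; the authoritative layout is in the RHQ schema definitions.

```sql
-- Sketch, assuming the old tables keep avg/max/min as separate rows keyed by
-- a type column. For each schedule id not yet recorded in the migration log:
SELECT time, type, value
FROM one_hour_metrics
WHERE schedule_id = ?;

-- The driver code folds the per-type rows for each timestamp into a single
-- row and issues one write into the new table:
INSERT INTO aggregate_metrics (schedule_id, bucket, time, avg, max, min)
VALUES (?, 'one_hour', ?, ?, ?, ?);
```

On success the schedule id would be appended to the migration log, which is what lets a later install/upgrade run skip schedules that were already migrated.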