Bug 1049054

Summary: Store aggregate metrics in a single (CQL) row
Product: [Other] RHQ Project Reporter: John Sanda <jsanda>
Component: Core Server, Performance, Storage Node    Assignee: Nobody <nobody>
Status: ON_QA --- QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.9    CC: hrupp, jshaughn
Target Milestone: GA   
Target Release: RHQ 4.13   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1135602    Environment:
Last Closed: Type: Bug
Bug Depends On:    
Bug Blocks: 1133605, 1126410, 1135602    

Description John Sanda 2014-01-06 21:32:53 UTC
Description of problem:
This change has the potential to be a big win as it will reduce the number of writes for storing aggregate metrics by a factor of three. The C* schema for aggregate metrics looks like,

CREATE TABLE one_hour_metrics (
    schedule_id int,
    time timestamp,
    type int,
    value double,
    PRIMARY KEY (schedule_id, time, type)
) WITH COMPACT STORAGE;

where the type column identifies the value as one of max, min, or average. Storing an aggregate metric therefore requires three separate write requests. Batch statements could be used to reduce the number of network round trips; however, I found the performance of prepared statements to be better than that of unprepared batches. Inserting an aggregate would look like,

insert into one_hour_metrics (schedule_id, time, type, value) VALUES 
(100, '2014-01-01', 0, 3.14);
insert into one_hour_metrics (schedule_id, time, type, value) VALUES 
(100, '2014-01-01', 1, 3.14);
insert into one_hour_metrics (schedule_id, time, type, value) VALUES 
(100, '2014-01-01', 2, 3.14);
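
To make the write pattern concrete, here is a minimal sketch of those three inserts as prepared statements with the DataStax Java driver (2.x-style API). The contact point, keyspace name, and the type encoding (0 = max, 1 = min, 2 = avg) are placeholders for illustration, not taken from the RHQ code base.

import java.util.Date;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class AggregateInsertSketch {
    public static void main(String[] args) {
        // Contact point and keyspace are placeholders.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("rhq");

        // One statement prepared once, then bound and executed three times,
        // i.e. three writes per aggregate metric.
        PreparedStatement insert = session.prepare(
            "INSERT INTO one_hour_metrics (schedule_id, time, type, value) VALUES (?, ?, ?, ?)");

        int scheduleId = 100;
        Date time = new Date();
        double max = 3.14, min = 3.14, avg = 3.14;

        session.execute(insert.bind(scheduleId, time, 0, max));
        session.execute(insert.bind(scheduleId, time, 1, min));
        session.execute(insert.bind(scheduleId, time, 2, avg));

        cluster.close();
    }
}

Even with prepared statements, each aggregate still costs three writes, which is the overhead the schema below is meant to remove.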

Using CQL collections we can reduce the number of writes for an aggregate metric to one while the on-disk storage remains nearly identical. The schema would look like,

CREATE TABLE one_hour_metrics (
    schedule_id int,
    time timestamp,
    value map<int, double>,
    PRIMARY KEY (schedule_id, time)
);

And inserting an aggregate would look like,

insert into one_hour_metrics (schedule_id, time, value) VALUES
(100, '2014-01-01', {0: 3.14, 1: 3.14, 2: 3.14});

Because the table does not use the WITH COMPACT STORAGE directive, there is an extra column of overhead per CQL row. That overhead should become largely insignificant once compression is re-enabled (see bug 1015628). 

Comment 1 Heiko W. Rupp 2014-05-08 14:42:37 UTC
Bump the target version now that 4.11 is out.

Comment 2 Jay Shaughnessy 2014-07-07 16:55:39 UTC
Bumping to 4.13 due to time constraints

Comment 3 John Sanda 2014-08-28 02:04:29 UTC
The biggest problem here is that we can wind up with partial aggregates. Persisting an aggregate metric requires three separate writes. Since we do not use atomic batches, we can wind up (and have wound up) with partial aggregates, which in turn can lead to difficult bugs.
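
For reference, making the three writes atomic would mean a logged batch, roughly like the sketch below (DataStax Java driver, 2.x-style; illustrative only, and note that logged batches trade atomicity for extra coordinator overhead). The single-row schema proposed next avoids the question entirely: one aggregate becomes one write.

import java.util.Date;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class LoggedBatchSketch {

    // Groups the three per-aggregate inserts into one atomic unit so a failure
    // cannot leave a partial aggregate behind. The type encoding is assumed.
    static void writeAggregate(Session session, PreparedStatement insert,
                               int scheduleId, Date time,
                               double max, double min, double avg) {
        BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
        batch.add(insert.bind(scheduleId, time, 0, max));
        batch.add(insert.bind(scheduleId, time, 1, min));
        batch.add(insert.bind(scheduleId, time, 2, avg));

        // All three rows are applied as a unit via the batch log.
        session.execute(batch);
    }
}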

With respect to the proposed schema changes in the description, I think collections are overkill; they would be more appropriate if the set of aggregate types stored changed dynamically, but that has never been the case. Given this, I propose the following new table,

CREATE TABLE aggregate_metrics (
  schedule_id int,
  bucket text,
  time timestamp,
  avg double,
  max double,
  min double,
  PRIMARY KEY ((schedule_id, bucket), time)
);

This table will replace the current one_hour_metrics, six_hour_metrics, and twenty_four_hour_metrics tables. An upgrade task will be needed to migrate data from the old tables to the new table.
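
For illustration, a write and a bucket-scoped read against the new table might look like the following sketch (DataStax Java driver, 2.x-style; the bucket value "one_hour" is an assumption used for illustration):

import java.util.Date;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class AggregateMetricsSketch {

    // One write per aggregate instead of three.
    static void write(Session session, int scheduleId, Date time,
                      double avg, double max, double min) {
        PreparedStatement insert = session.prepare(
            "INSERT INTO aggregate_metrics (schedule_id, bucket, time, avg, max, min) " +
            "VALUES (?, ?, ?, ?, ?, ?)");
        session.execute(insert.bind(scheduleId, "one_hour", time, avg, max, min));
    }

    // Reads stay within a single partition because (schedule_id, bucket) is the partition key.
    static void read(Session session, int scheduleId, Date start, Date end) {
        PreparedStatement select = session.prepare(
            "SELECT time, avg, max, min FROM aggregate_metrics " +
            "WHERE schedule_id = ? AND bucket = ? AND time >= ? AND time < ?");
        ResultSet rows = session.execute(select.bind(scheduleId, "one_hour", start, end));
        for (Row row : rows) {
            System.out.printf("%s avg=%f max=%f min=%f%n",
                row.getDate("time"), row.getDouble("avg"),
                row.getDouble("max"), row.getDouble("min"));
        }
    }
}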

Comment 4 John Sanda 2014-08-29 18:34:24 UTC
I want to point out that the upgrade task will be part of the regular server installation/upgrade. Unlike an RDBMS, where migrating data from one table to another can be done with SQL statements, Cassandra requires custom code.
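
To give a rough sense of what that custom code involves, the sketch below folds the three old rows per aggregate back into a single row in the new table. It is a simplified illustration assuming the DataStax Java driver, not the actual migration implementation; the type encoding (0 = max, 1 = min, 2 = avg) and the bucket name are placeholders.

import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class MigrationSketch {

    // Copies one schedule's 1-hour data from the old table into aggregate_metrics.
    static void migrateOneHour(Session session, int scheduleId) {
        PreparedStatement select = session.prepare(
            "SELECT time, type, value FROM one_hour_metrics WHERE schedule_id = ?");
        PreparedStatement insert = session.prepare(
            "INSERT INTO aggregate_metrics (schedule_id, bucket, time, avg, max, min) " +
            "VALUES (?, ?, ?, ?, ?, ?)");

        // Group the old rows by timestamp: index 0 = max, 1 = min, 2 = avg (assumed).
        Map<Date, double[]> byTime = new HashMap<Date, double[]>();
        for (Row row : session.execute(select.bind(scheduleId))) {
            double[] agg = byTime.get(row.getDate("time"));
            if (agg == null) {
                agg = new double[3];
                byTime.put(row.getDate("time"), agg);
            }
            agg[row.getInt("type")] = row.getDouble("value");
        }

        // Write one row per aggregate into the new table.
        for (Map.Entry<Date, double[]> entry : byTime.entrySet()) {
            double[] agg = entry.getValue();
            session.execute(insert.bind(scheduleId, "one_hour", entry.getKey(),
                agg[2], agg[0], agg[1])); // avg, max, min
        }
    }
}

In practice the task would also need to iterate over all schedules and all three old tables, and would avoid holding a schedule's entire history in memory at once.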

Comment 5 John Sanda 2014-09-02 02:32:39 UTC
Changes have been pushed to master. The schema changes are the ones described in comment 3. There are still some references in code to the dropped tables. Those will get removed as part of the work for bug 1135603 and bug 1135629.

master commit hashes:
a7fc965efe
29a95692f8
bfcdfc8bf
2295826848
45f7c26fd97f
da9707cddc4

Comment 6 John Sanda 2014-09-02 14:00:41 UTC
There have been a few additional commits since the work landed in master. 

additional master commit hashes:
0cb3d762b
6ac05a3d8
21c7f7be64

Comment 7 John Sanda 2014-09-02 14:09:37 UTC
MigrateAggregateMetrics.java is the class that performs the data migration into the new aggregate_metrics table. Any errors will result in an exception being thrown which in turn cause the server install/upgrade to fail. The exception however is not thrown until the end of the migration. That is, MigrateAggregateMetrics will go through all measurement schedules, attempting to migrate as much data as possible, before throwing an exception. The schedule ids of data that has been successfully migrated are written to log files stored in the server's data directory. The log file is read before starting the migration from each of the old tables to avoid duplicate work. Schedule ids whose data has already been migrated successfully will be skipped on subsequent runs of the server install/upgrade. Lastly, progress of the migration of each table is logged every 30 seconds. A message stating the number of remaining schedules is logged.