Bug 870564

Summary: Peak and Low metric values are incorrectly aggregated from compressed avg. value instead of minvalue and maxvalue
Product: [Other] RHQ Project Reporter: Larry O'Leary <loleary>
Component: MonitoringAssignee: John Sanda <jsanda>
Status: NEW --- QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4CC: hrupp, jsanda
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 870561 Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 870561    

Description Larry O'Leary 2012-10-26 17:16:01 EDT
+++ This bug was initially created as a clone of JBoss ON Bug #870561 +++

Description of problem:
When aggregated metrics values are returned from the compressed tables, the value column is used for low and peak instead of the minvalue and maxvalue columns. This results in integer metric returning decimal values for minimum and maximum. Additionally, the returned value is just wrong because it is taking the averaged (compressed) value over the time period instead of the true min and max values during the same time period.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Review the RHQ Agent's Agent-Server Clock Difference metric graph
2. Set the time interval to previous 8 days
Actual results:
Min and/or max value will be a decimal value instead of integer

Expected results:
Min and max should be an integer

Additional info:
This appears to be a result of the database query that aggregates the data from the compressed tables:

    SELECT timestamp, max(av), max(peak), max(low)
    FROM ((
    SELECT timestamp, avg(value) as av, max(value) as peak, min(value) as low
    FROM ((
    SELECT beginTS as timestamp, value
    FROM (
    SELECT 1341847301298 + (15840000 * i) as beginTS, i
    FROM RHQ_numbers where i < 60) n, RHQ_MEASUREMENT_DATA_NUM_1H d
    WHERE time_stamp BETWEEN beginTS AND (beginTS + 15840000) AND
    d.schedule_id IN (
    SELECT innerSchedule.id
    FROM rhq_measurement_sched innerSchedule
    WHERE innerSchedule.definition = 10003 AND
    innerSchedule.resource_id = 10003
    ) data GROUP BY timestamp)
    SELECT 1341847301298 + (15840000 * i) AS timestamp, NULL AS av, NULL AS peak, NULL AS low
    FROM RHQ_numbers
    WHERE i < 60)
    ) alldata
    GROUP BY timestamp ORDER BY timestamp

Instead of avg(value) as av, max(value) as peak, min(value) as low we should probably be using: avg(value) as av, max(maxvalue) as peak, min(minvalue) as low
Comment 1 John Sanda 2012-10-26 17:29:42 EDT
Here is an example to illustrate everything going on with this query. Suppose we will be querying against the rhq_measurement_data_num_1h table and for the date range specified, we have the following rows that will be included in aggregation/data points query,

      time_stamp     | schedule_id | value | minvalue | maxvalue 
 Mon Jul 09 11:48:35 |       10013 |     2 |        1 |        3
 Mon Jul 09 12:48:35 |       10013 |     5 |        4 |        6
 Mon Jul 09 13:48:35 |       10013 |     3 |        3 |        3
 Fri Jul 20 07:24:35 |       10013 |     5 |        2 |        9
 Fri Jul 20 08:24:35 |       10013 |     5 |        4 |        6
 Fri Jul 20 09:24:35 |       10013 |     3 |        3 |        3

The data points query takes the time range, which is 11 days in this example, and divides it into 60 buckets. Each of the above rows will fall into one of the buckets. A bucket can have zero or more rows in it. The first three rows fall into bucket 0 while the last three rows fall into bucket 59.

The aggregation/data points query then generates the average, max, and min of the value column of the rows in each bucket. We then wind up with the following results,

Bucket[0] - {avg: 3.333, min: 2, max: 5}
Bucket[1..58] - {avg: Double.NaN, min: Double.NaN, max: Double.NaN}
Bucket[59] - {avg: 4.333, min: 3, max: 5}

Note that the minvalue and maxvalue columns are not used all. This query supports the candlestick graphs and provides a mechanism to summarize the data which is helpful when there are a lot of data points. As I found this afternoon, it does not look like we provide an API to allow users to get at the unsummarized data. A bigger potential problem with this query is that if the date range changes slightly, it can yield different results even if that range covers the same data points.
Comment 2 John Sanda 2012-10-26 17:31:30 EDT
I am going to take this bug since I already have tests in the jsanda/metrics-rhq branch that exercise this query. In fact my example in comment 1 comes from one of the tests. I can see about porting those tests over to master.