Bug 870564 - Peak and Low metric values are incorrectly aggregated from compressed avg. value instead of minvalue and maxvalue
Status: NEW
Product: RHQ Project
Classification: Other
Component: Monitoring
Version: 4.4
Hardware: All
OS: All
Priority: unspecified
Severity: high
Assigned To: John Sanda
QA Contact: Mike Foley
Depends On:
Blocks: 870561
Reported: 2012-10-26 17:16 EDT by Larry O'Leary
Modified: 2012-10-26 17:31 EDT (History)
2 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 870561
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments

None
Description Larry O'Leary 2012-10-26 17:16:01 EDT
+++ This bug was initially created as a clone of JBoss ON Bug #870561 +++

Description of problem:
When aggregated metric values are returned from the compressed tables, the value column is used for low and peak instead of the minvalue and maxvalue columns. This results in integer metrics returning decimal values for minimum and maximum. Additionally, the returned values are simply wrong because they are taken from the averaged (compressed) value over the time period instead of the true min and max values during that same period.

Version-Release number of selected component (if applicable):
4.4.0.JON311GA

How reproducible:
Always

Steps to Reproduce:
1. Review the RHQ Agent's Agent-Server Clock Difference metric graph
2. Set the time interval to previous 8 days
 
Actual results:
Min and/or max value will be a decimal value instead of integer

Expected results:
Min and max should be an integer

Additional info:
This appears to be a result of the database query that aggregates the data from the compressed tables:

    SELECT timestamp, max(av), max(peak), max(low)
    FROM ((
        SELECT timestamp, avg(value) AS av, max(value) AS peak, min(value) AS low
        FROM ((
            SELECT beginTS AS timestamp, value
            FROM (
                SELECT 1341847301298 + (15840000 * i) AS beginTS, i
                FROM RHQ_numbers WHERE i < 60) n, RHQ_MEASUREMENT_DATA_NUM_1H d
            WHERE time_stamp BETWEEN beginTS AND (beginTS + 15840000) AND
                d.schedule_id IN (
                    SELECT innerSchedule.id
                    FROM rhq_measurement_sched innerSchedule
                    WHERE innerSchedule.definition = 10003 AND
                        innerSchedule.resource_id = 10003
                )
            )
        ) data GROUP BY timestamp)
        UNION ALL (
            SELECT 1341847301298 + (15840000 * i) AS timestamp, NULL AS av, NULL AS peak, NULL AS low
            FROM RHQ_numbers
            WHERE i < 60)
    ) alldata
    GROUP BY timestamp ORDER BY timestamp


Instead of "avg(value) as av, max(value) as peak, min(value) as low", we should probably be using "avg(value) as av, max(maxvalue) as peak, min(minvalue) as low".
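The difference is easy to reproduce in isolation. The following is a minimal sketch, not the real RHQ schema or query: it uses an in-memory SQLite table with the five columns relevant here and the six sample rows from comment 1 below (timestamps simplified to bucket ids), then runs the buggy and the proposed aggregations side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE rhq_measurement_data_num_1h
                (time_stamp INTEGER, schedule_id INTEGER,
                 value REAL, minvalue REAL, maxvalue REAL)""")
rows = [
    (0, 10013, 2, 1, 3),
    (0, 10013, 5, 4, 6),
    (0, 10013, 3, 3, 3),
    (59, 10013, 5, 2, 9),
    (59, 10013, 5, 4, 6),
    (59, 10013, 3, 3, 3),
]
conn.executemany(
    "INSERT INTO rhq_measurement_data_num_1h VALUES (?,?,?,?,?)", rows)

# Current (buggy) aggregation: peak/low derived from the averaged value column.
buggy = conn.execute("""SELECT time_stamp, avg(value), max(value), min(value)
                        FROM rhq_measurement_data_num_1h
                        GROUP BY time_stamp ORDER BY time_stamp""").fetchall()

# Proposed aggregation: peak/low derived from the maxvalue/minvalue columns.
fixed = conn.execute("""SELECT time_stamp, avg(value), max(maxvalue), min(minvalue)
                        FROM rhq_measurement_data_num_1h
                        GROUP BY time_stamp ORDER BY time_stamp""").fetchall()

print(buggy)  # bucket 0: peak 5, low 2 (taken from already-averaged values)
print(fixed)  # bucket 0: peak 6, low 1 (true max/min across the 1h rows)
```

For bucket 59 the gap is larger still: the buggy query reports peak 5 / low 3, while the true values from the compressed columns are peak 9 / low 2.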
Comment 1 John Sanda 2012-10-26 17:29:42 EDT
Here is an example to illustrate everything going on with this query. Suppose we are querying against the rhq_measurement_data_num_1h table, and for the date range specified we have the following rows that will be included in the aggregation/data points query:


      time_stamp     | schedule_id | value | minvalue | maxvalue 
---------------------+-------------+-------+----------+----------
 Mon Jul 09 11:48:35 |       10013 |     2 |        1 |        3
 Mon Jul 09 12:48:35 |       10013 |     5 |        4 |        6
 Mon Jul 09 13:48:35 |       10013 |     3 |        3 |        3
 Fri Jul 20 07:24:35 |       10013 |     5 |        2 |        9
 Fri Jul 20 08:24:35 |       10013 |     5 |        4 |        6
 Fri Jul 20 09:24:35 |       10013 |     3 |        3 |        3


The data points query takes the time range, which is 11 days in this example, and divides it into 60 buckets. Each of the above rows will fall into one of the buckets. A bucket can have zero or more rows in it. The first three rows fall into bucket 0 while the last three rows fall into bucket 59.
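The bucketing arithmetic implied by the query's "beginTS + (15840000 * i)" terms can be sketched as follows. This is an illustrative reconstruction, not RHQ code; begin_ts and interval match the literals in the SQL above (11 days in milliseconds divided into 60 buckets).

```python
begin_ts = 1341847301298   # start of the requested date range, in ms
interval = 15840000        # (11 days in ms) / 60 buckets = 15,840,000 ms

def bucket_index(time_stamp):
    """Which of the 60 buckets a 1h-table row's timestamp falls into."""
    return (time_stamp - begin_ts) // interval

# A row stamped at the start of the range lands in bucket 0; one stamped
# just before the end of the range lands in bucket 59.
print(bucket_index(begin_ts))                      # 0
print(bucket_index(begin_ts + 60 * interval - 1))  # 59
```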

The aggregation/data points query then generates the average, max, and min of the value column of the rows in each bucket. We then wind up with the following results,

Bucket[0] - {avg: 3.333, min: 2, max: 5}
Bucket[1..58] - {avg: Double.NaN, min: Double.NaN, max: Double.NaN}
Bucket[59] - {avg: 4.333, min: 3, max: 5}

Note that the minvalue and maxvalue columns are not used at all. This query supports the candlestick graphs and provides a mechanism to summarize the data, which is helpful when there are a lot of data points. As I found this afternoon, it does not look like we provide an API that lets users get at the unsummarized data. A bigger potential problem with this query is that if the date range shifts slightly, it can yield different results even when that range covers the same data points.
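The boundary sensitivity mentioned above can be demonstrated with a small sketch (hypothetical timestamps and values, same fixed-width bucketing as the query): shifting the range start regroups identical rows into different buckets, which changes the per-bucket averages.

```python
interval = 15840000  # bucket width in ms, as in the data points query

# Two hypothetical rows: one at the very start of a bucket, one at its end.
stamps_values = [(0, 2), (interval - 1, 5)]

def averages(begin_ts):
    """Per-bucket average of `value` for a given range start."""
    buckets = {}
    for ts, v in stamps_values:
        buckets.setdefault((ts - begin_ts) // interval, []).append(v)
    return {b: sum(vs) / len(vs) for b, vs in buckets.items()}

print(averages(0))               # both rows share bucket 0: {0: 3.5}
print(averages(-interval // 2))  # same rows split across buckets: {0: 2.0, 1: 5.0}
```

The same two data points produce one averaged data point or two, depending only on where the bucket boundaries fall.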
Comment 2 John Sanda 2012-10-26 17:31:30 EDT
I am going to take this bug since I already have tests in the jsanda/metrics-rhq branch that exercise this query. In fact my example in comment 1 comes from one of the tests. I can see about porting those tests over to master.
