+++ This bug was initially created as a clone of Bug #1104885 +++

Description of problem:

This is similar to bug 1015706. The problem manifests itself in the same way; however, it is slightly different. With bug 1015706, aggregate metrics were computed incorrectly, resulting in invalid metrics being stored. In this case the problem is in the handling of query results for metrics that can be completely valid.

When a client requests data, we do not return the raw metrics. We break the specified date range up into 60 discrete intervals, or buckets. The timestamp of each metric determines which bucket it goes into. We then compute the max/min/avg of each bucket, and those aggregated buckets are what gets sent to the client. The bucket code has the same problem: the max gets calculated incorrectly.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from John Sanda on 2014-06-04 17:43:51 EDT ---

I will work on data/steps to reproduce this to make testing and verification easier.
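To make the bucketing scheme easier to follow, here is a minimal Java sketch of what the description outlines. This is illustrative only, not the actual RHQ code; the class and method names are hypothetical.

class BucketSketch {

    static final int NUM_BUCKETS = 60;

    // Hypothetical holder for one aggregated bucket.
    static class Bucket {
        double min = Double.NaN, max = Double.NaN, sum = 0.0;
        int count = 0;

        void add(double value) {
            // Seed min/max from the first value in the bucket rather than
            // from a sentinel such as 0; a bad seed is one way a max can be
            // computed incorrectly, as this report describes.
            if (count == 0) {
                min = value;
                max = value;
            } else {
                min = Math.min(min, value);
                max = Math.max(max, value);
            }
            sum += value;
            count++;
        }

        double avg() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    // timestamps[i] (epoch millis) and values[i] form one metric data point.
    static Bucket[] aggregate(long[] timestamps, double[] values,
                              long beginTime, long endTime) {
        Bucket[] buckets = new Bucket[NUM_BUCKETS];
        for (int i = 0; i < NUM_BUCKETS; i++) {
            buckets[i] = new Bucket();
        }
        // Split the queried date range into 60 equal intervals.
        long interval = (endTime - beginTime) / NUM_BUCKETS;
        for (int i = 0; i < timestamps.length; i++) {
            // The timestamp determines which bucket the metric goes into.
            int index = (int) ((timestamps[i] - beginTime) / interval);
            if (index >= 0 && index < NUM_BUCKETS) {
                buckets[index].add(values[i]);
            }
        }
        return buckets;
    }
}

The point of the sketch is only to show the shape of the computation: one min/max/avg triple per bucket, with bucket membership determined purely by timestamp.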
Here are steps to reproduce this issue. Expected aggregate values for this data are noted after the list.

1. Choose an existing schedule id of an actual resource to use for testing. I will use 100 just as an example.
2. cd to <JON_SERVER_HOME>/rhq-storage/bin
3. Enable the RPC server on the storage node by running ./nodetool -p 7299 enablethrift
4. ./cqlsh -u <storage_username> -p <storage_password>
5. Execute the following commands in cqlsh:

use rhq;
insert into raw_metrics (schedule_id, time, value) values (100, '2014-06-05 14:20:00', 33.0);
insert into raw_metrics (schedule_id, time, value) values (100, '2014-06-05 14:40:00', 35.0);
insert into raw_metrics (schedule_id, time, value) values (100, '2014-06-05 15:20:00', 30.0);
insert into raw_metrics (schedule_id, time, value) values (100, '2014-06-05 15:40:00', 100.0);

6. Wait until the data purge job has run in the 16:00 hour so that we have 1-hour aggregate metrics for the 14:00 and 15:00 hours.
7. Go to the metrics graph in the UI for that schedule and change the date range to two weeks.
8. This should produce the exception.
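For reference when verifying (these expected values are derived from the inserts above, assuming standard min/max/avg aggregation; they are not stated in the original report): the 1-hour aggregate for the 14:00 hour should be min 33.0, max 35.0, avg 34.0 ((33 + 35) / 2), and the aggregate for the 15:00 hour should be min 30.0, max 100.0, avg 65.0 ((30 + 100) / 2). Per the description, it is when these metrics are rolled up into the 60 buckets covering the two-week range that the bucket code miscalculates the max, producing the exception in step 8.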
Fixed bucket aggregation code. release/jon3.2.x commit hash: 447934e99
Moving to ON_QA as available for test in build: http://jon01.mw.lab.eng.bos.redhat.com:8042/dist/release/jon/3.2.2.GA/6-13-2014_0900/
Simulated the same scenario (with the same dates); screenshot attached. No exception in the server or agent logs, and no GUI failure.
Created attachment 909151 [details]
charts
This has been verified and released in Red Hat JBoss Operations Network 3.2 Update 02 (3.2.2), available from the Red Hat Customer Portal [1].

[1]: https://access.redhat.com/jbossnetwork/restricted/softwareDetail.html?softwareId=31783