Bug 1015706

Summary: Graphs fail to render with 'java.lang.IllegalArgumentException: highValue' in log
Product: [Other] RHQ Project Reporter: Elias Ross <genman>
Component: Core UI, Storage Node Assignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.9, 4.10, 4.11 CC: hrupp, jsanda
Target Milestone: ---   
Target Release: RHQ 4.12   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1092756 Environment:
Last Closed: 2014-12-15 11:36:16 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1092756, 1099708    

Description Elias Ross 2013-10-04 21:25:57 UTC
Description of problem:

Some graphs fail to render, and the following exception is thrown.

Caused by: java.lang.IllegalArgumentException: highValue (1.1431E10) is not greater than or equal to value (1.1475666666666666E10).
	at org.rhq.core.domain.measurement.composite.MeasurementDataNumericHighLowComposite.<init>(MeasurementDataNumericHighLowComposite.java:44) [rhq-core-domain-ejb3.jar:4.9.0]
	at org.rhq.server.metrics.MetricsServer.createComposites(MetricsServer.java:297) [rhq-server-metrics-4.9.0.jar:4.9.0]
	at org.rhq.server.metrics.MetricsServer.findDataForResource(MetricsServer.java:171) [rhq-server-metrics-4.9.0.jar:4.9.0]


Version-Release number of selected component (if applicable):

4.9


How reproducible:

Depending on the dataset, this can happen fairly frequently. The panel fails to render at all.

It would be nice if the server caught the exception, or fixed the data so it didn't fail to show anything at all.

I'm guessing the fixed threshold doesn't work for large numbers, since the check compares an absolute difference against it.

            if (highValue < value && Math.abs(highValue - value) > THRESHOLD) {

^^^ might want to use division not subtraction here.
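The division idea above can be sketched as a relative comparison, where the tolerance scales with the magnitude of the operands. This is only an illustration: the class name, method name, and the tolerance value 1e-9 are assumptions and are not taken from the RHQ code. Note that the two values in the reported exception differ by roughly 0.4%, so a relative check with a small tolerance would still flag them; it only avoids false alarms from rounding noise on large values.

```java
final class RelativeToleranceCheck {
    // Hypothetical relative tolerance; the value is an assumption, not an RHQ constant.
    static final double RELATIVE_TOLERANCE = 1e-9;

    /** Returns true when a is below b by more than a tolerance scaled to their magnitude. */
    static boolean isSignificantlyLess(double a, double b) {
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return a < b && (b - a) > RELATIVE_TOLERANCE * scale;
    }

    public static void main(String[] args) {
        // Values from the reported exception: the gap is large even in relative terms.
        System.out.println(isSignificantlyLess(1.1431E10, 1.1475666666666666E10)); // true

        // Same magnitude, but a gap of about 1e-4: above the absolute 0.00001 threshold,
        // yet negligible relative to values around 1e10.
        System.out.println(isSignificantlyLess(1.1431E10, 1.1431E10 + 1e-4)); // false
    }
}
```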

Comment 1 Elias Ross 2013-10-04 21:28:08 UTC
Even if you fix the threshold, I'd be highly annoyed if my graph just failed to come up. Why not just set 'highValue = value' if there is a rendering issue?

Comment 2 John Sanda 2014-04-30 01:38:01 UTC
I am inclined to think that this is a server side issue. There are a few things to consider in trying to figure out the root cause. There could be a bug in the code that computes the aggregate metrics that are being requested by the client. There could be a bug in the code that "bucketizes" the metric data into 60 data points. Maybe there is an error that occurred while computing/storing the aggregate metrics which is not handled.

Elias, are you able to reproduce this fairly easily? If so, I could provide you with a patch that could log relevant details to give a better idea of what is happening.

Comment 3 Elias Ross 2014-04-30 02:27:04 UTC
I used to see this for things like disk usage or other sufficiently large numeric values, for example timestamps in milliseconds.

I'm not sure I can reproduce the problem as I have patched my server to avoid this problem entirely.

I'm not sure what this 'check' is trying to do, but even if it fails, the user should get some data back.

diff --git a/modules/core/domain/src/main/java/org/rhq/core/domain/measurement/composite/MeasurementDataNumericHighLowComposite.java b/modules/
index d84744f..604cf5d 100644
--- a/modules/core/domain/src/main/java/org/rhq/core/domain/measurement/composite/MeasurementDataNumericHighLowComposite.java
+++ b/modules/core/domain/src/main/java/org/rhq/core/domain/measurement/composite/MeasurementDataNumericHighLowComposite.java
@@ -27,8 +27,6 @@
 public class MeasurementDataNumericHighLowComposite implements Serializable {
     private static final long serialVersionUID = 1L;
 
-    private static final double THRESHOLD = 0.00001d;
-
     private long timestamp;
     private double value;
     private double highValue;
@@ -40,14 +38,12 @@ protected MeasurementDataNumericHighLowComposite() {
 
     public MeasurementDataNumericHighLowComposite(long timestamp, double value, double highValue, double lowValue) {
         if (!Double.isNaN(value)) {
-            if (highValue < value && Math.abs(highValue - value) > THRESHOLD) {
-                throw new IllegalArgumentException("highValue (" + highValue
-                    + ") is not greater than or equal to value (" + value + ").");
+            if (highValue < value) {
+                highValue = value;
             }
 
-            if (lowValue > value && Math.abs(lowValue - value) > THRESHOLD) {
-                throw new IllegalArgumentException("lowValue (" + lowValue + ") is not less than or equal to value ("
-                    + value + ").");
+            if (lowValue > value) {
+                lowValue = value;
             }
         }

Comment 4 John Sanda 2014-04-30 14:02:57 UTC
Thanks Elias. Your patch makes sense. I would like to do some testing to determine the root cause and decide what else, if anything, we ought to do.

Comment 5 John Sanda 2014-05-30 20:06:00 UTC
The problem was a bug in a method that calculates aggregate metrics. It could produce an incorrect max because of an if-else statement that should have been two independent if statements. The code looked like this:

if (metric.getMin() < min) {
    min = metric.getMin();
} else if (metric.getMax() > max) {
    max = metric.getMax();
}

This bug only affects 6 hour and 24 hour metrics. It manifests itself for 6 hour data, for example, when both the min and max of the 1 hour data being aggregated fall on the same 1 hour aggregate metric. Two things are needed to address this problem: 1) stop generating invalid aggregate metrics, and 2) handle existing, invalid aggregate metrics.
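To make the failure mode concrete, here is a small self-contained sketch that contrasts the else-if form with two independent checks. The Aggregate class, the method names, and the initial values of min and max are assumptions made for illustration; they are not the actual RHQ types or code.

```java
import java.util.Arrays;
import java.util.List;

final class MinMaxAggregation {
    /** Hypothetical stand-in for a 1 hour aggregate metric; not the RHQ class. */
    static final class Aggregate {
        private final double min;
        private final double max;
        Aggregate(double min, double max) { this.min = min; this.max = max; }
        double getMin() { return min; }
        double getMax() { return max; }
    }

    /** Else-if form: a metric that lowers min is never considered for max. */
    static double[] buggyMinMax(List<Aggregate> metrics) {
        double min = Double.POSITIVE_INFINITY; // assumed initial values
        double max = Double.NEGATIVE_INFINITY;
        for (Aggregate metric : metrics) {
            if (metric.getMin() < min) {
                min = metric.getMin();
            } else if (metric.getMax() > max) {
                max = metric.getMax();
            }
        }
        return new double[] { min, max };
    }

    /** Independent checks: every metric is tested against both bounds. */
    static double[] minMax(List<Aggregate> metrics) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Aggregate metric : metrics) {
            if (metric.getMin() < min) {
                min = metric.getMin();
            }
            if (metric.getMax() > max) {
                max = metric.getMax();
            }
        }
        return new double[] { min, max };
    }

    public static void main(String[] args) {
        // The first metric holds both the overall min (1) and the overall max (10).
        List<Aggregate> data = Arrays.asList(
            new Aggregate(1, 10), new Aggregate(2, 5), new Aggregate(3, 4));
        System.out.println(Arrays.toString(buggyMinMax(data))); // [1.0, 5.0]
        System.out.println(Arrays.toString(minMax(data)));      // [1.0, 10.0]
    }
}
```

With the else-if form, the max of 10 is skipped because the same metric also updated min, which is how an aggregate can end up with a max below its average.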

The changes that were merged into master from the jsanda/metrics-schema branch already take care of calculating the aggregates correctly. For existing data, I have committed a change to master that checks for invalid max values. If we come across one, we log a warning and "adjust" the metric: we set the max to the average and persist the updated value.

commit hash: b1b4eeef16
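
The handling of existing data described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the commit: the Aggregate class, its field names, and the repairInvalidMax method are hypothetical, and persistence is left to the caller.

```java
import java.util.logging.Logger;

final class AggregateRepair {
    private static final Logger LOG = Logger.getLogger(AggregateRepair.class.getName());

    /** Hypothetical mutable aggregate; field names are assumptions, not the RHQ schema. */
    static final class Aggregate {
        double min;
        double avg;
        double max;
        Aggregate(double min, double avg, double max) {
            this.min = min;
            this.avg = avg;
            this.max = max;
        }
    }

    /**
     * Raises max to avg when max is below avg and logs a warning.
     * Returns true if the aggregate was adjusted, so the caller knows
     * it should persist the updated value.
     */
    static boolean repairInvalidMax(Aggregate aggregate) {
        // Written as a negated comparison so that NaN values are left untouched.
        if (!(aggregate.max < aggregate.avg)) {
            return false;
        }
        LOG.warning("Invalid max " + aggregate.max + " is below avg " + aggregate.avg
            + "; setting max to avg");
        aggregate.max = aggregate.avg;
        return true;
    }

    public static void main(String[] args) {
        // Values taken from the exception in the description; min is made up.
        Aggregate bad = new Aggregate(1.0E10, 1.1475666666666666E10, 1.1431E10);
        System.out.println(repairInvalidMax(bad) + " " + bad.max);
    }
}
```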

Comment 6 John Sanda 2014-06-06 14:02:54 UTC
There was an additional commit from the release/jon3.2.x branch that I needed to cherry-pick over to master.

commit hash: a5afcb2b5f0

Comment 7 Heiko W. Rupp 2014-12-15 11:36:16 UTC
Bulk close of items fixed in RHQ 4.12

If you think this is not solved, then please open a *new* BZ and link to this one.