Bug 913131 - Integer out of range error when calculating OOBs
Summary: Integer out of range error when calculating OOBs
Keywords:
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Database
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-02-20 13:13 UTC by Jirka Kremser
Modified: 2022-03-31 04:28 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Jirka Kremser 2013-02-20 13:13:36 UTC
Description of problem:
When calculating OOBs, the following SQL query is used:

INSERT INTO rhq_measurement_oob_tmp (oob_factor, schedule_id, time_stamp)
  ( SELECT max(mx*100) as mxdiff, id, ?
      FROM ( SELECT max(((d.maxvalue - b.bl_max) / (b.bl_max - b.bl_min))) AS mx, d.schedule_id as id
               FROM rhq_measurement_bline b, rhq_measurement_data_num_1h d, rhq_measurement_sched sc, rhq_measurement_def def
              WHERE b.schedule_id = d.schedule_id
                AND sc.id = b.schedule_id
                AND d.value > b.bl_max
                AND d.time_stamp = ?
                AND (b.bl_max - b.bl_min) > 0.1
                AND (d.maxvalue - b.bl_max) > 0
                AND sc.enabled = true
                AND sc.definition = def.id
                AND def.numeric_type = 0
           GROUP BY d.schedule_id
             UNION ALL
             SELECT max(((b.bl_min - d.minvalue) / (b.bl_max - b.bl_min))) AS mx, d.schedule_id as id
               FROM rhq_measurement_bline b, rhq_measurement_data_num_1h d, rhq_measurement_sched sc, rhq_measurement_def def
              WHERE b.schedule_id = d.schedule_id
                AND sc.id = b.schedule_id
                AND d.value < b.bl_max
                AND d.time_stamp = ?
                AND (b.bl_max - b.bl_min) > 0.1
                AND (b.bl_min - d.minvalue) > 0
                AND sc.enabled = true
                AND sc.definition = def.id
                AND def.numeric_type = 0
           GROUP BY d.schedule_id
          ) data
   GROUP BY id, mx
     HAVING mx > 0.05 )

The problem is that the sub-query:

SELECT max(((d.maxvalue - b.bl_max) / (b.bl_max - b.bl_min))) AS mx, d.schedule_id as id 
                  FROM rhq_measurement_bline b, rhq_measurement_data_num_1h d, rhq_measurement_sched sc, rhq_measurement_def def 
                 WHERE b.schedule_id = d.schedule_id 
                   AND sc.id = b.schedule_id 
                   AND d.value > b.bl_max 
                   AND d.time_stamp = 1361358000000 
                   AND (b.bl_max - b.bl_min) > 0.1 
                   AND (d.maxvalue - b.bl_max) >0 
                    AND sc.enabled = true 
                   AND sc.definition = def.id 
                   AND def.numeric_type = 0 
              GROUP BY d.schedule_id ;

has the "mx" value set to 1746108415 on my box, and when multiplying it by 100 and inserting it to rhq_measurement_oob_tmp, the psql exception is thrown, because the range of integer is not sufficient for such a big value. The measurement definition responsible for this value is the "Native.MemoryInfo.actualUsed" and it comes from natural usage of RHQ (I didn't insert it to the DB directly).


Version-Release number of selected component (if applicable):
4.6.0-SNAPSHOT

How reproducible:
Always on my box.

Steps to Reproduce:
1. HTTP GET -> http://localhost:7080/admin/test/control.jsp?mode=calculateOOBs
  
Actual results:
 an "integer out of range" psql error is thrown

Expected results:
 no errors

Comment 1 Jirka Kremser 2013-02-20 14:59:45 UTC
I can provide the pg_dump file, but it is 267.1 MB (compressed). I guess I could pg_dump only the relevant tables.

Comment 2 Jay Shaughnessy 2013-02-21 20:19:53 UTC
I think the analysis above is correct. The:

 SELECT max(mx*100) ... INTO rhq_measurement_oob_tmp (oob_factor, ...

is dangerous if mx is large, as in the description.

I don't think this is a problem on Oracle because Oracle uses a NUMBER column as opposed to an actual integer. Hibernate will likely deal with the conversion to the entity's int field later.

But Postgres uses an actual integer column for rhq_measurement_oob_tmp.oob_factor and can't store the result. I suppose there is also a chance that the out-of-range error is generated during the calculation itself.

I guess the resulting value is supposed to be a percentage, thus explaining the mx*100, although I'm not sure.

There are 2 main options I see:
1) Avoid storing a number that is > MAXINT
2) Change the datatype to something that can handle the larger value.

Option 1: Avoid storing a number that is > MAXINT

After playing around interactively, this may not be as simple as it seems, although there are some alternatives, one of which is easy to implement. The most obvious approach to me was to cast the value to an integer before doing the insert. This did not work for me; it seems Postgres does not allow a dangerous downcast like the one we want to do.
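
For reference, a clamped cast along these lines (just a sketch I have not tried against the real query, using Postgres' LEAST and the integer maximum 2147483647) would avoid the out-of-range failure by capping the value instead:

-- hypothetical clamped downcast; caps at the Postgres integer maximum instead of failing
SELECT CAST(LEAST(1746108415 * 100.0, 2147483647) AS integer);
-- returns 2147483647 rather than raising "integer out of range"

Wrapping max(mx*100) in the real query the same way would cap the stored oob_factor at MAXINT rather than failing the whole insert.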

The second solution would be to break up the insert and select portions of the query. This probably would not be difficult: just get the results back, massage the values as needed in the code, and do the inserts.

The third solution is to step back and ask, "Is this gigantic value actually valid?" It seems to me that an OOB of this magnitude is likely bogus. We're talking about a difference (I think) between baseline and actual value of > MAXINT percent! I wonder whether we should throw out ridiculously high values in addition to what we do currently, which is throwing out values < 5%. The query could change:
  HAVING mx > 0.05
to:
  HAVING mx > 0.05 AND mx < 1000.0

Or something like that; i.e., throw out values > 100,000%.


Option 2: Change the datatype to something that can handle the larger value

In this case we'd have to make a DB update to change the column from integer to something bigger, and ensure that any code changes needed to support the new type are also made.
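
On the Postgres side, the schema change itself would be something along these lines (a sketch only; the equivalent Oracle change and any db-upgrade scripting are not shown):

-- hypothetical widening of the tmp table column to 64-bit; existing rows are preserved
ALTER TABLE rhq_measurement_oob_tmp ALTER COLUMN oob_factor TYPE bigint;

Any other table or entity field that receives this value would need the same treatment.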


Note that with Cassandra coming, the calculation will be done differently, and the ability to massage the data before storage may come naturally. But that won't help with backporting to older versions.

