Bug 1292948

Summary: Some MeasurementData may not be processed by alerting
Product: [Other] RHQ Project
Reporter: Jay Shaughnessy <jshaughn>
Component: Alerts
Assignee: Jay Shaughnessy <jshaughn>
Status: POST
QA Contact:
Severity: high
Docs Contact:
Priority: urgent
Version: 4.10
CC: hrupp
Target Milestone: ---   
Target Release: RHQ 4.14   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1293368
Environment:
Last Closed:
Type: Bug
Bug Blocks: 1293368    

Description Jay Shaughnessy 2015-12-18 19:50:16 UTC
A regression due to work in Bug 1028624.

Only one MeasurementData per unique timestamp, per measurement report, is forwarded for alerting evaluation.

So, if a MeasurementReport contains multiple data points with the same timestamp, only one will make it to alerting for evaluation.  In this situation, if an alert definition depends on the omitted data, it will not fire as needed.

This issue is exacerbated when many metrics share the same collection schedule.  To some degree it can be alleviated by collecting the relevant metrics on a schedule that does not coincide with other collections performed by the same plugin (not so easy to do when a lot of metrics are being collected).
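
The data loss described above can be reproduced in isolation. The following is a minimal sketch, not RHQ code: `Datum`, `BY_TIMESTAMP_ONLY`, and `countForwarded` are hypothetical names, and the schedule ids and values are made up. It only shows how a sorted Set built with a comparator that looks at nothing but the timestamp keeps a single element per timestamp.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class TimestampOnlyDemo {
    /** Hypothetical stand-in for one measurement datum (not the RHQ class). */
    static final class Datum {
        final int scheduleId;
        final long timestamp;
        final double value;

        Datum(int scheduleId, long timestamp, double value) {
            this.scheduleId = scheduleId;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Compares on timestamp alone, so two datums with equal timestamps compare as 0. */
    static final Comparator<Datum> BY_TIMESTAMP_ONLY = new Comparator<Datum>() {
        public int compare(Datum a, Datum b) {
            return a.timestamp < b.timestamp ? -1 : (a.timestamp > b.timestamp ? 1 : 0);
        }
    };

    /** Adds three datums from different schedules with one shared timestamp. */
    static int countForwarded(long sharedTimestamp) {
        Set<Datum> forwarded = new TreeSet<Datum>(BY_TIMESTAMP_ONLY);
        forwarded.add(new Datum(101, sharedTimestamp, 0.0));
        forwarded.add(new Datum(102, sharedTimestamp, 512.0));
        forwarded.add(new Datum(103, sharedTimestamp, 0.75));
        // A TreeSet treats compare() == 0 as "already present", so only the first add succeeds.
        return forwarded.size();
    }

    public static void main(String[] args) {
        System.out.println(countForwarded(1450468216000L)); // prints 1
    }
}
```

In this sketch the datums for schedules 102 and 103 never reach the set, which mirrors an alert definition silently losing the data it depends on.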

To simulate the issue you can try the following:
1) Create an alert def on Free Swap Space for a platform. 
   ** Choose a threshold value that should always cause the alert to fire.
2) Set collection interval for Free Swap Space to 1 minute
   ** note - I am assuming all default collection intervals, so all of the
      other metrics should have much slower collection times.
   ** you should start to see an alert every minute
3) After several alerts have fired, select all of the active metrics,
   including Free Swap Space (so it is synced up), and set the collection
   intervals for all of them to 1 minute (all together, not one at a time)
   ** The alerting will likely stop, as now all of those metrics likely have
      the same timestamp and only one is making it to alerting.
4) After a pause, reset all but Free Swap Space to 20 minutes.
   ** Alerting should resume.

Comment 1 Jay Shaughnessy 2015-12-18 20:03:53 UTC
master commit ae88be6e4cf1fe2b9039d87416c5706aab72aff7
Author: Jay Shaughnessy <jshaughn>
Date:   Fri Dec 18 15:01:42 2015 -0500

 Because this Comparator is used to build a Set, it has to be careful not
 to allow equality just on equal timestamps, equality has to also include
 the schedule id otherwise data gets omitted.
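
The idea in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the contents of the commit: `Datum`, `BY_TIMESTAMP_THEN_SCHEDULE`, and `countForwarded` are hypothetical names. The comparator breaks timestamp ties on the schedule id, and it uses plain comparisons rather than `Long.compare` or `Integer.compare`, which were only added in Java 7.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class TimestampAndScheduleDemo {
    /** Hypothetical stand-in for one measurement datum (not the RHQ class). */
    static final class Datum {
        final int scheduleId;
        final long timestamp;

        Datum(int scheduleId, long timestamp) {
            this.scheduleId = scheduleId;
            this.timestamp = timestamp;
        }
    }

    /** Orders by timestamp first; equal timestamps fall back to the schedule id. */
    static final Comparator<Datum> BY_TIMESTAMP_THEN_SCHEDULE = new Comparator<Datum>() {
        public int compare(Datum a, Datum b) {
            if (a.timestamp != b.timestamp) {
                return a.timestamp < b.timestamp ? -1 : 1;
            }
            if (a.scheduleId != b.scheduleId) {
                return a.scheduleId < b.scheduleId ? -1 : 1;
            }
            return 0; // same schedule and same timestamp: a true duplicate
        }
    };

    /** Adds three distinct schedules plus one exact duplicate, all at one timestamp. */
    static int countForwarded(long sharedTimestamp) {
        Set<Datum> forwarded = new TreeSet<Datum>(BY_TIMESTAMP_THEN_SCHEDULE);
        forwarded.add(new Datum(101, sharedTimestamp));
        forwarded.add(new Datum(102, sharedTimestamp));
        forwarded.add(new Datum(103, sharedTimestamp));
        forwarded.add(new Datum(101, sharedTimestamp)); // duplicate, still rejected
        return forwarded.size();
    }

    public static void main(String[] args) {
        System.out.println(countForwarded(1450468216000L)); // prints 3
    }
}
```

With the tie-break in place, each schedule keeps its own datum at a shared timestamp, while an exact duplicate (same schedule, same timestamp) is still collapsed.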

Comment 2 Jay Shaughnessy 2016-01-04 15:18:28 UTC
Master commit 452cf503eab714c90109778d23274c5660ad86a3
Author: Jay Shaughnessy <jshaughn>
Date:   Mon Jan 4 10:17:49 2016 -0500

    [BZ 1292948] update fix to be Java 1.6 compatible

Comment 3 Mike McCune 2016-03-28 22:49:28 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.