Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1293368

Summary: Some MeasurementData may not be processed by alerting
Product: [JBoss] JBoss Operations Network
Reporter: Larry O'Leary <loleary>
Component: Monitoring - Alerts
Assignee: Michael Burman <miburman>
Status: CLOSED ERRATA
QA Contact: vsorokin <vsorokin>
Severity: high
Priority: urgent
Version: JON 3.3.0, JON 3.3.1, JON 3.3.2, JON 3.3.3, JON 3.3.4
CC: bkramer, fbrychta, jshaughn, mfoley, spinder, vsorokin
Target Milestone: ER01
Keywords: Triaged
Target Release: JON 3.3.5
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Clone Of: 1292948
Last Closed: 2016-02-03 15:04:38 UTC
Type: Bug
Bug Depends On: 1292948

Description Larry O'Leary 2015-12-21 15:12:01 UTC
This fix needs to be pulled into JBoss ON 3.3.z as soon as possible. 

+++ This bug was initially created as a clone of Bug #1292948 +++

This is a regression introduced by the work in Bug 1028624.

Only one MeasurementData per unique timestamp, per measurement report, is forwarded for alerting evaluation.

So, if a MeasurementReport contains multiple datums with the same timestamp, only one will make it to alerting for evaluation. In this situation, any alert definition that depends on the omitted data will not fire when it should.
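
A minimal Java sketch of the symptom described above, assuming the data in a report is gathered into a sorted Set whose comparator looks only at the timestamp. The Datum class, the schedule ids, and forwardToAlerting are hypothetical stand-ins for illustration, not the actual JBoss ON code.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TimestampOnlyDemo {

    // Hypothetical stand-in for one datum in a measurement report.
    static final class Datum {
        final int scheduleId;
        final long timestamp;
        final double value;

        Datum(int scheduleId, long timestamp, double value) {
            this.scheduleId = scheduleId;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Assumption: data is collected into a sorted Set keyed on timestamp alone.
    // A TreeSet treats two elements as duplicates when compare() returns 0,
    // so all data sharing a timestamp collapses to a single element.
    static Set<Datum> forwardToAlerting(List<Datum> report) {
        Set<Datum> forwarded =
            new TreeSet<>(Comparator.comparingLong((Datum d) -> d.timestamp));
        forwarded.addAll(report);
        return forwarded;
    }

    public static void main(String[] args) {
        long t = 1450710721000L; // same collection time for every metric
        List<Datum> report = Arrays.asList(
            new Datum(101, t, 0.42),
            new Datum(102, t, 2048.0),
            new Datum(103, t, 17.0));

        Set<Datum> forwarded = forwardToAlerting(report);
        System.out.println("in report: " + report.size()
            + ", forwarded: " + forwarded.size());
    }
}
```

Run as written, this prints "in report: 3, forwarded: 1": two of the three datums never reach the (hypothetical) alerting step, which matches the behavior reported here.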

This issue is exacerbated when many metrics share the same collection schedule. To some degree it can be alleviated by collecting the relevant metrics on a schedule that does not coincide with other collections performed by the same plugin, although this is not easy to do when a lot of metrics are being collected.

To simulate the issue you can try the following:
1) Create an alert def on Free Swap Space for a platform. 
   ** Choose a threshold value that should always cause the alert to fire.
2) Set collection interval for Free Swap Space to 1 minute
   ** note - I am assuming all default collection intervals, so all of the
      other metrics should have much slower collection times.
   ** you should start to see an alert every minute
3) After several alerts have fired, select all of the active metrics,
   including Free Swap Space (so it is synced up), and set the collection
   intervals for all of them to 1 minute (all together, not one at a time)
   ** The alerting will likely stop, as now all of those metrics likely have
      the same timestamp and only one is making it to alerting.
4) After a pause, reset all but Free Swap Space to 20 minutes.
   ** Alerting should resume.

--- Additional comment from Jay Shaughnessy on 2015-12-18 15:03:53 EST ---


master commit ae88be6e4cf1fe2b9039d87416c5706aab72aff7
Author: Jay Shaughnessy <jshaughn>
Date:   Fri Dec 18 15:01:42 2015 -0500

 Because this Comparator is used to build a Set, it has to be careful not
 to allow equality just on equal timestamps, equality has to also include
 the schedule id otherwise data gets omitted.
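
The comparator contract described in the commit message can be sketched as follows. This is an illustration under assumptions, not the committed code: Datum, BY_TIME_THEN_SCHEDULE, and countRetained are hypothetical names, and only the idea of comparing the timestamp and then the schedule id comes from the commit message.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class ScheduleAwareComparatorDemo {

    // Hypothetical stand-in for a measurement datum.
    static final class Datum {
        final int scheduleId;
        final long timestamp;

        Datum(int scheduleId, long timestamp) {
            this.scheduleId = scheduleId;
            this.timestamp = timestamp;
        }
    }

    // Orders by timestamp first, then breaks ties on the schedule id.
    // compare() returns 0 only when BOTH fields match, so a Set built with
    // this comparator keeps data from different schedules that share a timestamp.
    static final Comparator<Datum> BY_TIME_THEN_SCHEDULE =
        Comparator.comparingLong((Datum d) -> d.timestamp)
                  .thenComparingInt(d -> d.scheduleId);

    // Returns how many of the given data survive insertion into the sorted set.
    static int countRetained(Datum... data) {
        TreeSet<Datum> set = new TreeSet<>(BY_TIME_THEN_SCHEDULE);
        for (Datum d : data) {
            set.add(d);
        }
        return set.size();
    }

    public static void main(String[] args) {
        long t = 1450710721000L;
        // Three different schedules collected at the same instant.
        System.out.println("retained: " + countRetained(
            new Datum(101, t), new Datum(102, t), new Datum(103, t)));
    }
}
```

With the tie-break in place, only a datum that repeats both the timestamp and the schedule id of an existing element is treated as a duplicate.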

Comment 5 Simeon Pinder 2016-01-16 08:17:35 UTC
Moving to ON_QA for testing with the following build:

https://brewweb.devel.redhat.com//buildinfo?buildID=474795

http://download.devel.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/75/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip
 *Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
 jon-server-3.3.0.GA-update-05.zip.

Comment 6 vsorokin 2016-01-21 19:31:16 UTC
Steps of verification:

1) Create a new alert definition (Free Swap Space for 'vso-jbosson-n1'):
   Metric Value Baseline [Free Swap Space < 100% of max],
   notification by direct email.

2) Set collection interval for Free Swap Space to 1 minute

OBSERVED: Alerts started to arrive by email and were displayed in the appropriate subwindows.

3) After several alerts had fired, selected all of the active metrics
 and set the collection intervals for all of them to 1 minute.

4) Changed the collection intervals between 1 and 20 minutes several times, both for all metrics together and for individual metrics separately.

OBSERVED: Alerts continued to arrive at the corresponding collection intervals.

Comment 8 errata-xmlrpc 2016-02-03 15:04:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0118.html