Bug 535756 (RHQ-2419) - Alerting for call-time data
Summary: Alerting for call-time data
Keywords:
Status: CLOSED UPSTREAM
Alias: RHQ-2419
Product: RHQ Project
Classification: Other
Component: Alerts
Version: unspecified
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Joseph Marques
QA Contact: Sudhir D
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
: RHQ-311 (view as bug list)
Depends On:
Blocks: 607341
 
Reported: 2009-09-11 11:19 UTC by Frank Brueseke
Modified: 2010-09-20 17:05 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 607341 (view as bug list)
Environment:
Last Closed: 2010-07-20 00:10:02 UTC
Embargoed:


Attachments
patchRev5155.diff (69.93 KB, application/octet-stream)
2009-09-11 11:19 UTC, Frank Brueseke

Description Frank Brueseke 2009-09-11 11:19:00 UTC
Concept:
Alerting is a useful feature in RHQ. However, RHQ is so far not able to raise alerts on the basis of call-time data. Call-time data pre-aggregates data into max, min, and avg (total/count) in a map indexed by call destination.

I propose to implement alerting for call-time data. Alerting should have two features: comparison to absolute values and change detection (i.e. comparison to old values).

General issues: Alerts for call-time data will always work on one of the separate values included in the pre-aggregated structure, i.e. max, min, avg, and count. Moreover, only those call-time data map entries whose call destination matches an (optional) regular expression will be checked.

Comparison to absolute values: Alerts when a chosen value of this call-time metric (i.e. max, min, avg, count) is less than, equal to, or greater than a given absolute value.
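
To illustrate the absolute comparison together with the optional destination filter, here is a minimal Java sketch. It is not the code from the attached patch: the class, enum, and method names (CallTimeAbsoluteCheck, Aggregate, Field, Op, matchingDestinations) are hypothetical, and it assumes the regular expression only needs to match somewhere in the call destination (Matcher.find()), which the description does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class CallTimeAbsoluteCheck {
    enum Field { MIN, MAX, AVG, COUNT }
    enum Op { LESS, EQUAL, GREATER }

    /** One pre-aggregated call-time entry for a single call destination. */
    static final class Aggregate {
        final double min;
        final double max;
        final double total;
        final long count;

        Aggregate(double min, double max, double total, long count) {
            this.min = min;
            this.max = max;
            this.total = total;
            this.count = count;
        }

        double get(Field field) {
            switch (field) {
                case MIN: return min;
                case MAX: return max;
                case AVG: return count == 0 ? 0.0 : total / count; // avg = total/count
                default:  return count;
            }
        }
    }

    /**
     * Returns the call destinations whose chosen field satisfies the comparison.
     * A null filter means "check every destination".
     */
    static List<String> matchingDestinations(Map<String, Aggregate> data, Pattern filter,
                                             Field field, Op op, double threshold) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Aggregate> entry : data.entrySet()) {
            // Assumption: a partial match of the destination is enough.
            if (filter != null && !filter.matcher(entry.getKey()).find()) {
                continue;
            }
            double value = entry.getValue().get(field);
            boolean fires;
            switch (op) {
                case LESS:    fires = value < threshold; break;
                case EQUAL:   fires = value == threshold; break;
                default:      fires = value > threshold; break;
            }
            if (fires) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

For example, a condition "avg greater than 100" with the filter ^/shop/ would only ever report destinations under /shop/, regardless of how slow other destinations are.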

Change detection: Alerts when the current value of this call-time metric (i.e. max, min, avg, count) moves outside a certain percentage band around the previous value. The current value is only compared to a previous value if the previous value had the same call destination. Change detection knows three operators: shrinks, changes, grows.
Let "p" be the previous value and "c" be the current value. Both belong to the same call destination. Moreover, let "x%" be a percent value specified by the user. Then an alert will be raised when:
Shrinks: c < p - (x% * p)
Grows: c > p + (x% * p)
Changes: (c < p - (x% * p)) || (c > p + (x% * p))
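
The three operators can be written down directly from the inequalities above. The following Java sketch is only an illustration of those formulas; the names CallTimeChangeCheck, ChangeOp, and fires are hypothetical and do not come from the patch.

```java
// Hypothetical sketch of the three change-detection operators.
final class CallTimeChangeCheck {
    enum ChangeOp { SHRINKS, GROWS, CHANGES }

    /**
     * previous = p, current = c, percent = x (e.g. 10 for 10%).
     * Both values must belong to the same call destination.
     */
    static boolean fires(ChangeOp op, double previous, double current, double percent) {
        double band = previous * percent / 100.0;   // x% * p
        boolean shrinks = current < previous - band; // c < p - (x% * p)
        boolean grows = current > previous + band;   // c > p + (x% * p)
        switch (op) {
            case SHRINKS: return shrinks;
            case GROWS:   return grows;
            default:      return shrinks || grows;   // CHANGES
        }
    }
}
```

With p = 100 and x = 10, the band is 90..110: a current value of 111 triggers "grows" and "changes", 89 triggers "shrinks" and "changes", and a value of exactly 110 triggers nothing because the comparisons are strict.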

I supply an implementation as a (subversion) patch based on RHQ revision 5155. I donate the code on behalf of Orga Systems GmbH. We publish the code subject to the LGPL.

Implementation notes:
- Change detection compares previous and current value. We cache the previous value in the specific CacheElement. This means that the previous value is lost when the CacheElement is reinstantiated (i.e. when the cache is reloaded). Until a new previous value has been recorded, we might miss data constellations which would otherwise cause an alert.
- Change detection relies on the CallTimeDataValue objects being processed in chronological order. We implement this by sorting the CallTimeDataValue objects of one data chunk before processing the data. It is possible, but very unlikely, that call-time data *for the same resource and metric* are processed concurrently. In that case our implementation will not work correctly, i.e. it may deliver an alert when it shouldn't or deliver no alert when it should, because the chronological ordering is broken by interleaving checks for two data sets.
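
The two notes above can be sketched as follows. This is a simplified stand-in, not the patch itself: PreviousValueTracker and Sample are hypothetical names (the real code works with CacheElement and CallTimeDataValue), only the "grows" operator is shown, and the synchronized keyword is an assumption added here to keep two chunks from interleaving. It does not guarantee that concurrent chunks arrive in chronological order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-destination previous-value cache plus chunk sorting.
final class PreviousValueTracker {
    /** Simplified stand-in for one call-time value of one destination. */
    static final class Sample {
        final String destination;
        final long beginTime;
        final double value;

        Sample(String destination, long beginTime, double value) {
            this.destination = destination;
            this.beginTime = beginTime;
            this.value = value;
        }
    }

    // Lost whenever this object is recreated, like a reloaded cache element.
    private final Map<String, Double> previousByDestination = new HashMap<>();

    /** Returns "destination@beginTime" for every sample that grew by more than percent. */
    synchronized List<String> processChunk(List<Sample> chunk, double percent) {
        // Sort a copy so that samples are checked in chronological order.
        List<Sample> sorted = new ArrayList<>(chunk);
        sorted.sort(Comparator.comparingLong((Sample s) -> s.beginTime));

        List<String> alerts = new ArrayList<>();
        for (Sample s : sorted) {
            // Store the current value and fetch the previous one for this destination.
            Double previous = previousByDestination.put(s.destination, s.value);
            if (previous != null && s.value > previous + previous * percent / 100.0) {
                alerts.add(s.destination + "@" + s.beginTime);
            }
        }
        return alerts;
    }
}
```

A freshly created tracker has no previous values, so the first sample per destination can never raise an alert. That is the window after a cache reload described in the first note.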

Comment 1 Frank Brueseke 2009-09-11 11:19:55 UTC
Patch including the requested feature (alerting on call-time data) based on revision 5155.

Comment 2 Red Hat Bugzilla 2009-11-10 21:04:05 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2419
Imported an attachment (id=368796)


Comment 4 Heiko W. Rupp 2010-02-26 15:42:37 UTC
*** Bug 535907 has been marked as a duplicate of this bug. ***

Comment 5 Justin Harris 2010-03-03 16:35:55 UTC
This patches the old(er) struts UI code, and we have recently ported the alerting UI to JSF/Facelets - so this will likely take some substantial effort to work into the newer UI.

Comment 6 Heiko W. Rupp 2010-03-03 16:48:19 UTC
The original patch has been applied in the alertPlus branch,
http://git.fedorahosted.org/git/rhq/rht.git?p=rhq/rhq.git;a=shortlog;h=refs/heads/alertPlus

Comment 7 Joseph Marques 2010-06-24 15:51:44 UTC
commit 2e1983f0d518b4a1133bf2922f441ca5b0d281f8
Author: Joseph Marques <joseph>
Date:   Wed Jun 23 00:59:55 2010 -0400

    BZ-535756: integrate patch (with minimal tweaks) from 'fbrueske' to support alerting for call-time data

Comment 8 Charles Crouch 2010-07-01 18:04:53 UTC
JON 2.4 build:
Test that this feature is not available, and not able to be enabled

RHQ 3.0 build:
Test the feature is available by default and works as expected.

Comment 9 Sudhir D 2010-07-07 11:33:37 UTC
Tested on 2.4 and this feature is not available.

Comment 10 Corey Welton 2010-07-20 00:10:02 UTC
QA Closing - rhq upstream bug.  This bug is presumed to be fixed in the upstream branch.  If this is not the case, this bug can be reopened.

Comment 11 John Mazzitelli 2010-09-20 17:05:56 UTC
the new gwt alert def editor now has the ability to add calltime conditions. see git commit 665349f2a221a7aed276f67197ca9289fba322b2

