Bug 670607 - RFE: Make baseline alert condition definitions less confusing
Keywords:
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Alerts
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-01-18 19:38 UTC by Larry O'Leary
Modified: 2022-03-31 04:28 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
JON 2.4.0 on RHEL5
Last Closed:
Embargoed:



Description Larry O'Leary 2011-01-18 19:38:01 UTC
Description of problem:
An alert definition configured to fire when a metric reaches a set percentage of its baseline does not appear to be evaluated, so the alert condition is never met and no alert is fired.

Version-Release number of selected component (if applicable):
3.0.0 (JON 2.4.0 GA)

How reproducible:
Every time

Steps to Reproduce:
1. Navigate to RHQ Server's RHQDS resource
2. Set "Active Connections" (Monitor -> Schedules) to 1 minute interval
3. Create new Alert Definition (Alert -> Definition) "New Definition"
   * Name: High connection percentage
   * Description: Current active database connections exceeds the specified threshold percentage.
   * Expression: All
   * If Condition: Active Connections
   * is Greater than 0 % of Max Value

  
Actual results:
No alert is fired (Alert -> History shows nothing) even though Active Connections is 1 and max connections is 1, i.e. the metric is at 100% of the max value.

Expected results:
An alert should be fired every minute and appear on the Alert -> History page as "High connection percentage".

Additional info:
In the test case, the baseline for RHQDS shows 1 active connection with a max active connections of 1 (min and avg active connections are also 1). The alert criterion is that <Current Active Connections> is greater than 0% of Max Active Connections. Here the current value is 1/1 = 100% of the max, and 100% > 0%, so the condition should be true. However, the condition never seems to be evaluated: the debug log output shows no indication of it, unlike with absolute criteria.
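
The arithmetic above can be sketched as follows. This is only an illustration of the comparison the reporter expects, not RHQ's actual evaluation code; the function names and the way the threshold is applied are assumptions.

```python
def percent_of_baseline(current: float, baseline: float) -> float:
    """Return the current value as a percentage of the baseline value."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return current / baseline * 100.0


def condition_met(current: float, baseline: float, threshold_percent: float) -> bool:
    """True when current is greater than threshold_percent of baseline (assumed semantics)."""
    return current > baseline * threshold_percent / 100.0


# Values from this report: 1 active connection, baseline max of 1.
ratio = percent_of_baseline(1, 1)      # 100.0
fires_at_0 = condition_met(1, 1, 0)    # True: 1 > 0
fires_at_10 = condition_met(1, 1, 10)  # True: 1 > 0.1
```

Under these assumed semantics, both the 0% and the 10% thresholds would be satisfied by the values in the test case.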

Comment 1 Larry O'Leary 2011-01-18 20:34:04 UTC
I retested using 10% instead of 0% (to ensure that it was not the 0% causing the issue) and the problem still exists. This does not appear to be an issue with the actual value of the percentage.

Comment 2 Jay Shaughnessy 2011-01-21 22:00:01 UTC
This has to do with the UI being confusing. I was experiencing the same behavior and did some investigation. It turns out that this is the normal behavior if you don't have a baseline set for the metric in question. In that case the alert condition is not even evaluated; it's basically invalid until a baseline exists.
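
A minimal sketch of that behavior, assuming a missing baseline is modeled as None (the function name and return convention are hypothetical, not taken from the RHQ code base):

```python
from typing import Optional


def evaluate_baseline_condition(
    current: float,
    baseline_value: Optional[float],
    threshold_percent: float,
) -> Optional[bool]:
    """Evaluate 'current > threshold_percent of baseline'.

    Returns None when no baseline exists, meaning the condition is
    skipped rather than evaluated to False.
    """
    if baseline_value is None:
        return None  # no baseline yet: condition is not evaluated at all
    return current > baseline_value * threshold_percent / 100.0


without_baseline = evaluate_baseline_condition(1, None, 0)  # None (skipped)
with_baseline = evaluate_baseline_condition(1, 1.0, 0)      # True
```

The point is the three-way outcome: a skipped condition produces neither an alert nor any evaluation log entry, which matches what the reporter observed.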

There are a couple of things that make this confusing:
1) We allow a user to specify the condition even when no baseline exists. It's
   fair that we do because it could be a template, or just intentional. But
   perhaps we could do something more here.

2) It's easy to miss that this condition relates to baseline values.
   Min/Average/Max really means BaselineMin/BaselineAverage/BaselineMax. It's
   confusing because the Tables subtab also shows Min/Average/Max and a user
   may think those are the values being tested against. They aren't; those are
   just calculated values for the Date Range being applied to the table.

3) It's not very easy to actually see if a Baseline is set (or when it may get
   set if it isn't already). And as a doc-note, baseline generation is currently
   only covered in the FAQ.

One more technical note: if this type of condition exists and the baseline is generated afterwards, it's not clear to me that the agent condition cache gets refreshed in a timely manner so that the new baseline value is picked up. We need to understand what happens in this scenario.

I'm changing the title of this to be an RFE for RHQ4's GUI.

Comment 3 Charles Crouch 2011-09-30 23:26:31 UTC
FutureFeature Improvement

