Bug 787227 - Using an Availability condition on a recovery Alert doesn't trigger Alert or Recovery
Summary: Using an Availability condition on a recovery Alert doesn't trigger Alert or ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Monitoring - Alerts
Version: JON 3.0.0
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: JON 3.1.2
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 801504
Reported: 2012-02-03 15:55 UTC by dsteigne
Modified: 2018-11-26 17:25 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 801504 (view as bug list)
Environment:
Last Closed: 2012-11-16 03:23:51 UTC
Type: ---
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 878224 0 high CLOSED Updated alert defs may not fire in an HA environment 2021-02-22 00:41:40 UTC

Internal Links: 878224

Description dsteigne 2012-02-03 15:55:41 UTC
Description of problem:
Using an Availability condition on a recovery Alert doesn't trigger the Alert or the Recovery. If you define an Alert for Availability "Comes Up" and use it as a Recovery Alert to re-enable another Alert Definition, you never receive the alert for the Availability change, nor does it re-enable the other alert.

Version-Release number of selected component (if applicable):
3.0

How reproducible:
Every time

Steps to Reproduce:
1. Define an alert "OOM" using the condition type "event detection", the event severity "error" and the regular expression "java.lang.OutOfMemoryError".
2. Add email notification and the "Restart" Resource Operation 
3. Select "Yes" for "Disable When Fired"
4. Define another alert "OOM (Recovery)" using condition type "Availability Change", Availability "Comes Up"
5. On the Recovery tab select the "OOM" Alert for Recover Alert.
6. Trigger an OOM error on the EAP server (I set the Max Heap size low).
7. The alert will trigger for OOM, it will be disabled and the Restart is triggered.
8. You can see that the EAP is restarted and running, but the "OOM (Recovery)" alert is not triggered, nor does it re-enable the "OOM" Alert.
9. Then change the condition type on "OOM (Recovery)" to "event detection" with severity INFO, matching the Microcontainer "Started in" message:
INFO  [org.jboss.bootstrap.microcontainer.ServerImpl] (main) JBoss (Microcontainer) [5.2.0.GA_SOA (build: SVNTag=5.2.0.GA_SOA date=201111090730)] Started in 1m:34s:817ms
10. With this condition type everything works, "OOM (Recovery)" alert is fired and it re-enables the "OOM" Alert.
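
The relationship between the two definitions in the steps above can be sketched as a toy model. This is not RHQ/JON code: `AlertDefinition` and `process_event` are hypothetical names used only for illustration, the model covers only event-detection conditions, "Disable When Fired" and the recovery link, and the regular expression for the "Started in" message is an assumption (the report does not give one).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertDefinition:
    """Hypothetical stand-in for an alert definition; not the RHQ API."""
    name: str
    severity: str                  # event severity the condition matches
    pattern: str                   # regular expression applied to the event detail
    disable_when_fired: bool = False
    recovers: Optional["AlertDefinition"] = None  # definition to re-enable on fire
    enabled: bool = True

    def matches(self, severity: str, detail: str) -> bool:
        return (self.enabled
                and severity == self.severity
                and re.search(self.pattern, detail) is not None)


def process_event(definitions, severity, detail):
    """Return the names of the definitions fired by one log event."""
    fired = []
    for definition in definitions:
        if definition.matches(severity, detail):
            fired.append(definition.name)
            if definition.disable_when_fired:
                definition.enabled = False          # "Disable When Fired" = Yes
            if definition.recovers is not None:
                definition.recovers.enabled = True  # recovery re-enables its target
    return fired


# Step 1-3: the "OOM" definition, using the regular expression from the report.
oom = AlertDefinition("OOM", "ERROR", "java.lang.OutOfMemoryError",
                      disable_when_fired=True)
# Step 9: the event-based recovery definition (pattern is an assumption).
recovery = AlertDefinition(
    "OOM (Recovery)", "INFO",
    r"\[org\.jboss\.bootstrap\.microcontainer\.ServerImpl\].*Started in",
    recovers=oom)
definitions = [oom, recovery]

print(process_event(definitions, "ERROR",
                    "java.lang.OutOfMemoryError: Java heap space"))  # ['OOM']
print(oom.enabled)                                                   # False
print(process_event(definitions, "INFO",
                    "[org.jboss.bootstrap.microcontainer.ServerImpl] (main) "
                    "JBoss (Microcontainer) Started in 1m:34s:817ms"))  # ['OOM (Recovery)']
print(oom.enabled)                                                   # True
```

In this model the recovery definition fires only when its own condition is observed, which is the behavior step 10 reports for the event-detection variant.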

  
Actual results:

Alert is not triggered and recover alert is not re-enabled.


Expected results:

Alert is triggered and recover alert is re-enabled.

Comment 1 Charles Crouch 2012-03-08 16:09:24 UTC
Are you sure that the OOM of the AS instance and the restart operation actually triggered a change in the availability of the EAP Server? If you choose the EAP server from the inventory then go to its Monitoring>Availability subtab do you see a row in there indicating the EAP instance was unavailable at the time of the OOM condition and then showing available again after the restart operation completed?

Comment 2 dsteigne 2012-03-08 16:59:43 UTC
Yes, both conditions show on the Availability tab

Comment 3 Jay Shaughnessy 2012-05-09 19:38:55 UTC
See comments in bug 801504

Comment 4 Charles Crouch 2012-11-05 21:26:38 UTC
Jay was specifically referring to https://bugzilla.redhat.com/show_bug.cgi?id=801504#c3

Comment 5 Charles Crouch 2012-11-06 20:10:02 UTC
Triage: If there is nothing to do here, we should close.

Comment 6 Larry O'Leary 2012-11-16 03:23:51 UTC
I agree with comment https://bugzilla.redhat.com/show_bug.cgi?id=801504#c3 in upstream bug 801504. The recovery alert never gets triggered here because the availability never actually changed. When the OOM is thrown in the EAP log and the "restart" alert is fired due to the event, the availability was never actually down. So the restart operation is invoked, but because the restart happens between availability checks, the EAP instance was never seen as DOWN and therefore cannot be seen as Goes Up.

This is exactly as Charles described it in comment 1. After reviewing the original case that raised this issue, it appears that the user was experiencing a configuration issue combined with what comment 1 suggested. I am closing this as NOTABUG.
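
The timing argument in this comment can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 60-second and 15-second check intervals and the 20-second restart window are made-up numbers, not JON defaults, and `polled_states` and `detected_changes` are hypothetical helpers rather than anything in the product.

```python
def polled_states(timeline, interval):
    """Sample the true per-second UP/DOWN timeline every `interval` seconds."""
    return [timeline[t] for t in range(0, len(timeline), interval)]


def detected_changes(samples):
    """Return (previous, current) pairs for each change between consecutive samples."""
    return [(a, b) for a, b in zip(samples, samples[1:]) if a != b]


# True state per second: up for 70 s, down for 20 s during the restart, then up.
timeline = ["UP"] * 70 + ["DOWN"] * 20 + ["UP"] * 90

# Checks at t=0, 60, 120 all see UP, so no availability change is recorded.
print(detected_changes(polled_states(timeline, 60)))  # []

# A shorter interval happens to sample t=75, while the server is down.
print(detected_changes(polled_states(timeline, 15)))  # [('UP', 'DOWN'), ('DOWN', 'UP')]
```

With the longer interval the restart falls entirely between two checks, so a condition that waits for a DOWN to UP transition has nothing to react to.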

