Bug 1038202 - recovery alert fired twice
Summary: recovery alert fired twice
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Core Server
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: JON 3.3.0
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-12-04 15:41 UTC by Armine Hovsepyan
Modified: 2015-09-03 00:02 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-27 13:13:17 UTC
Type: Bug


Attachments
doubleRecoveryAlert (163.36 KB, image/png), 2013-12-04 15:41 UTC, Armine Hovsepyan
s1_server.log (530.12 KB, text/x-log), 2013-12-04 15:42 UTC, Armine Hovsepyan
s1_agent.log (5.54 MB, text/x-log), 2013-12-04 15:43 UTC, Armine Hovsepyan
s2_agent.log (754.94 KB, text/x-log), 2013-12-04 15:45 UTC, Armine Hovsepyan
s2_server.log (248.17 KB, text/x-log), 2013-12-04 15:46 UTC, Armine Hovsepyan

Description Armine Hovsepyan 2013-12-04 15:41:48 UTC
Created attachment 832709 [details]
doubleRecoveryAlert

Description of problem:
The recovery alert fired twice in an HA environment.

Version-Release number of selected component (if applicable):
JON 3.2 CR1

How reproducible:
noticed once

Steps to Reproduce (steps taken from bug 1030108):
1.  Install and start EAP 6 standalone server.
2.  Install a two server (HA) JON 3.1.2 system.
4.  Start server-02 of JON HA system and wait for it to come up.
5.  Start server-01 of JON HA system and wait for it to come up.
    Be sure server 2 is started first so Quartz is running there.
6.  Start agent in foreground.
7.  Import EAP 6 standalone server into inventory and configure connection settings.
8.  Create new _stays down for 2 minutes_ alert definition for EAP resource:

    *Name*: `Alert - Profile Down`
    *Condition*:
        *Fire alert when*:          _ANY_
        *Condition Type*:           _Availability Duration_
        *Availability Duration*:    _Stays Down_
        *Duration*:                 `2` _minutes_
    *Recovery*:
        *Disable When Fired*:   _Yes_

9.  Create new _recovery_ alert definition for EAP default resource:

    *Name*: `Recovery - Profile Down`
    *Condition*:
        *Fire alert when*:  _ANY_
        *Condition Type*:   _Availability Change_
        *Availability*:     _Goes up_
    *Recovery*:
        *Recovery Alert*:   _Alert - Profile Down_

10. From outside of JBoss ON, shut down the EAP server.
11. From the agent prompt, execute `avail -f`.
12. Verify EAP availability shows DOWN.
13. Wait approximately 1 minute and 45 seconds.
14. Start the EAP server from outside of JON.
15. Wait approximately 15 seconds.
16. From the agent prompt, execute `avail -f`.
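
The waits in steps 13 and 15 add up to the 2-minute duration configured in step 8, so the UP report in step 16 arrives at about the moment the _Stays Down_ check comes due. The sketch below only does that arithmetic. It assumes the duration timer starts when DOWN is reported in step 11, and it does not model how JON actually schedules the check. The constant and function names are illustrative.

```python
from datetime import timedelta

# Values taken from the steps above; names are illustrative only.
WAIT_BEFORE_RESTART = timedelta(minutes=1, seconds=45)  # step 13
WAIT_AFTER_RESTART = timedelta(seconds=15)              # step 15
STAYS_DOWN_DURATION = timedelta(minutes=2)              # step 8


def timeline():
    """Return event offsets measured from the DOWN report in step 11.

    Assumption: the "stays down" timer starts at that DOWN report.
    """
    restarted_at = WAIT_BEFORE_RESTART
    up_reported_at = restarted_at + WAIT_AFTER_RESTART
    return {
        "eap_restarted": restarted_at,
        "up_reported": up_reported_at,
        "duration_check_due": STAYS_DOWN_DURATION,
    }


events = timeline()
# Time between the UP report and the moment the duration check is due.
margin = events["duration_check_due"] - events["up_reported"]
print(f"UP reported at {events['up_reported']}, "
      f"check due at {events['duration_check_due']}, margin {margin}")
```

Because both waits are approximate, the real margin can fall on either side of zero, which appears to be why the steps exercise a timing-sensitive boundary.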

Actual results:
The recovery alert fired twice, 1 second apart.

Expected results:
The recovery alert fires once.
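
One way to check an alert history for this symptom is to look for repeated firings of the same definition within a short window. This is a minimal sketch, assuming the history can be obtained as (definition name, timestamp) pairs. The function name, the 2-second window, and the sample timestamps are hypothetical and are not taken from the attached logs.

```python
from datetime import datetime, timedelta


def find_duplicate_firings(firings, window=timedelta(seconds=2)):
    """Return (name, earlier, later) for consecutive firings of the same
    definition that are at most `window` apart."""
    last_seen = {}
    duplicates = []
    # Process firings in chronological order.
    for name, fired_at in sorted(firings, key=lambda f: f[1]):
        previous = last_seen.get(name)
        if previous is not None and fired_at - previous <= window:
            duplicates.append((name, previous, fired_at))
        last_seen[name] = fired_at
    return duplicates


# Hypothetical sample data, for illustration only.
sample = [
    ("Recovery - Profile Down", datetime(2013, 12, 4, 15, 0, 0)),
    ("Recovery - Profile Down", datetime(2013, 12, 4, 15, 0, 1)),
    ("Recovery - Profile Down", datetime(2013, 12, 4, 15, 5, 0)),
]
duplicates = find_duplicate_firings(sample)
print(duplicates)
```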


Additional info:
Screenshot and logs attached.

Comment 1 Armine Hovsepyan 2013-12-04 15:42:22 UTC
Created attachment 832710 [details]
s1_server.log

Comment 2 Armine Hovsepyan 2013-12-04 15:43:14 UTC
Created attachment 832712 [details]
s1_agent.log

Comment 3 Armine Hovsepyan 2013-12-04 15:45:34 UTC
Created attachment 832713 [details]
s2_agent.log

Comment 4 Armine Hovsepyan 2013-12-04 15:46:00 UTC
Created attachment 832714 [details]
s2_server.log

Comment 5 Jay Shaughnessy 2014-04-04 20:48:18 UTC
I can't really explain this, but there are some odd entries in the s1 log that could indicate some strangeness in the DB or possibly something off with the CR build. Availability record repair was applied in one place, and there are odd issues with CriteriaQueryRunner and Alert.recoveryAlertDefinition.

Unless we see this in a GA or current master build I'm not sure there is anything to do here.

Comment 6 Jay Shaughnessy 2014-08-26 22:25:38 UTC
Armine, there is nothing to do here that I can think of. Shall we close it or do you have an idea for pursuing further?

Comment 7 Armine Hovsepyan 2014-08-27 12:36:53 UTC
I haven't seen this again either, so I guess this can be closed as wont-fix or works-for-me.

Comment 8 Jay Shaughnessy 2014-08-27 13:13:17 UTC
Thanks, Armine. Closing as WorksForMe.

