Bug 725320 - alerts will spontaneously stop firing for some of the agents
Summary: alerts will spontaneously stop firing for some of the agents
Status: CLOSED DUPLICATE of bug 725445
Alias: None
Product: RHQ Project
Classification: Other
Component: Alerts
Version: unspecified
Hardware: Unspecified
OS: Unspecified
medium vote
Target Milestone: ---
Assignee: Simeon Pinder
QA Contact: Mike Foley
Depends On:
Blocks: jon3 rhq41beta
Reported: 2011-07-25 05:40 UTC by Simeon Pinder
Modified: 2011-08-11 03:05 UTC (History)

Clone Of:
Last Closed: 2011-07-27 20:25:49 UTC

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Bugzilla 725429 None None None Never
Red Hat Bugzilla 725445 None None None Never

Internal Trackers: 725429 725445

Description Simeon Pinder 2011-07-25 05:40:42 UTC
Description of problem: 
With three agents monitoring identical DefaultDS resources for the same JBoss AS version, alerts spontaneously stop firing for some of the agents over time. An alert template is used to apply the same alert definitions to the resources on all the agents. The baseline value for 'Available Connections' on the DefaultDS is set manually, and three alerts are defined so that at least one alert condition is always met. This setup should therefore produce an alert on every metric collection for the resource.
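
As a minimal sketch of the alert setup described above (not RHQ code), the three alerts can be thought of as conditions that partition all possible metric values around the baseline, so exactly one fires per collection. The names `AlertDefinition`, `build_definitions`, and `fired_alerts` are hypothetical, and the below/at/above split is an assumption; the report only states that the three alerts guarantee at least one will fire.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AlertDefinition:
    """Hypothetical stand-in for an alert definition created from a template."""
    name: str
    condition: Callable[[float, float], bool]  # (value, baseline) -> fires?


def build_definitions() -> List[AlertDefinition]:
    # Assumed conditions: together they cover every value relative to the baseline.
    return [
        AlertDefinition("below-baseline", lambda value, baseline: value < baseline),
        AlertDefinition("at-baseline", lambda value, baseline: value == baseline),
        AlertDefinition("above-baseline", lambda value, baseline: value > baseline),
    ]


def fired_alerts(value: float, baseline: Optional[float],
                 definitions: List[AlertDefinition]) -> List[str]:
    """Return the names of the alerts that fire for one metric collection."""
    if baseline is None:
        # Per the report: with no baseline set, no alerts are generated.
        return []
    return [d.name for d in definitions if d.condition(value, baseline)]
```

With a baseline set, every collected 'Available Connections' value matches one of the three conditions, which is why a resource that stops producing alerts indicates a problem outside the alert definitions themselves.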

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install several agents on different platforms and register them with the same JON server.
2. Install the same JBoss AS instance on each of the agent platforms, then discover and import them.
3. Go to Admin > Template definition, locate the Datasource template, and define three different alerts based on the baseline value such that one of the alert conditions will always fire.
4. Navigate to the DefaultDS on each of the agents and verify that the defined alerts are correctly associated with each of the resources.
5. Set the metric collection intervals to 5 minutes and verify that alerts are being created.
Actual results:
Over time alerts will stop firing for one of the resources, despite correct alert conditions and valid baseline values. 
*Note* If no baseline is set, no alerts are generated; this is valid behavior.

Expected results:
All alerts continue to fire as expected.

Additional info:

Comment 1 Simeon Pinder 2011-07-27 20:25:49 UTC
After further investigation, BZ 725445 is likely the cause of this behavior.  When the agent count no longer correctly reflects the number of agents connected/reporting to the JON server, the Alert Conditions for the missing agents can be purged from the alert evaluation process, which produces the reported behavior.
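
The failure mode described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not the RHQ server implementation: `ConditionCache`, `purge_unknown`, and the agent names are all hypothetical, and the sketch simply assumes that conditions are held per agent and dropped for any agent missing from the server's agent list.

```python
from typing import Callable, Dict, List, Set, Tuple

# (condition name, predicate on the collected metric value)
Condition = Tuple[str, Callable[[float], bool]]


class ConditionCache:
    """Hypothetical per-agent store of alert conditions used during evaluation."""

    def __init__(self) -> None:
        self._by_agent: Dict[str, List[Condition]] = {}

    def register(self, agent: str, conditions: List[Condition]) -> None:
        self._by_agent[agent] = list(conditions)

    def purge_unknown(self, known_agents: Set[str]) -> None:
        # Drop conditions for every agent the server does not list as known.
        for agent in list(self._by_agent):
            if agent not in known_agents:
                del self._by_agent[agent]

    def evaluate(self, agent: str, value: float) -> List[str]:
        # An agent with no cached conditions can never produce an alert.
        return [name for name, check in self._by_agent.get(agent, []) if check(value)]


def demo() -> Dict[str, List[str]]:
    agents = ("agent-1", "agent-2", "agent-3")
    cache = ConditionCache()
    for agent in agents:
        cache.register(agent, [("always-fires", lambda value: True)])

    # Assumed mismatch: the server's agent list has lost track of agent-3,
    # although agent-3 is still connected and reporting metrics.
    stale_agent_list = {"agent-1", "agent-2"}
    cache.purge_unknown(stale_agent_list)

    return {agent: cache.evaluate(agent, 7.0) for agent in agents}
```

In this model, `demo()` yields alerts for `agent-1` and `agent-2` but an empty list for `agent-3`, even though its condition would always be met, which matches the symptom of alerts silently stopping for only some agents.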

This BZ was one of several bugs reported when a JON Server<->Agent mismatch occurred.

*** This bug has been marked as a duplicate of bug 725445 ***
