Red Hat Bugzilla – Bug 725320
alerts will spontaneously stop firing for some of the agents
Last modified: 2011-08-10 23:05:18 EDT
Description of problem:
With three agents monitoring identical DefaultDS resources for the same JBoss AS version, alerts spontaneously stop firing for some of the agents over time. The alerts are defined through an alert template, so the same alert conditions apply on all the agents. The baseline values for 'Available Connections' on the DefaultDS were set manually, and three alerts were defined such that at least one of them always fires, so every metric collection for the resource should produce an alert.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install several agents on different platforms and register them with the same JON server.
2. Install the same JBoss AS instance on all of the agents, then discover and import them.
3. Go to Admin > Template definition, locate the Datasource template, and define three different alerts based on the baseline value such that one of the alert conditions will always fire.
4. Navigate to the DefaultDS on each of the agents and verify that the defined alerts are correctly associated with each of the resources.
5. Set the metric collection intervals to 5 minutes and verify that alerts are being created.
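The "one condition always fires" setup from step 3 can be sketched as follows. This is only an illustrative model, not JON/RHQ code: the `AlertCondition` class, the condition names, and `fired_alerts` are hypothetical, and it assumes the three conditions are simple below/equal/above comparisons against the baseline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AlertCondition:
    """Hypothetical stand-in for one alert defined on the template."""
    name: str
    fires: Callable[[float, float], bool]  # (collected value, baseline) -> bool


# Three conditions that together cover every possible numeric value,
# so exactly one of them matches on each metric collection.
CONDITIONS = [
    AlertCondition("below-baseline", lambda value, baseline: value < baseline),
    AlertCondition("at-baseline", lambda value, baseline: value == baseline),
    AlertCondition("above-baseline", lambda value, baseline: value > baseline),
]


def fired_alerts(value: float, baseline: Optional[float]) -> List[str]:
    """Return the names of the conditions that match this collected value."""
    if baseline is None:
        # Without a baseline there is nothing to compare against,
        # so no alert is generated (valid behavior per this report).
        return []
    return [c.name for c in CONDITIONS if c.fires(value, baseline)]
```

With a baseline set, `fired_alerts` returns a non-empty list for any collected value, which is why a resource that stops producing alerts indicates a problem outside the alert conditions themselves.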
Actual results:
Over time alerts stop firing for one of the resources, despite correct alert conditions and valid baseline values.
*Note* If no baseline is set, no alerts are generated; this is valid behavior.
Expected results:
All alerts continue to fire as expected.
After further investigation, BZ 725445 is likely the cause of this behavior. When the agent count no longer correctly reflects the number of agents connected and reporting to the JON server, the alert conditions for the missing agents can be purged from the alert evaluation process, which produces the reported behavior.
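The suspected mechanism can be illustrated with a toy model. This is an assumption-laden sketch and not the server's actual implementation: `prune_conditions`, `agents_with_alerts`, and the agent names are hypothetical, and the real condition cache is not described in this report.

```python
from typing import Dict, Iterable, List, Set


def prune_conditions(
    conditions_by_agent: Dict[str, List[str]], known_agents: Set[str]
) -> Dict[str, List[str]]:
    """Keep conditions only for agents the server believes are connected."""
    return {
        agent: conditions
        for agent, conditions in conditions_by_agent.items()
        if agent in known_agents
    }


def agents_with_alerts(
    conditions_by_agent: Dict[str, List[str]], reporting_agents: Iterable[str]
) -> List[str]:
    """Return reporting agents that still have conditions to evaluate."""
    return [a for a in reporting_agents if conditions_by_agent.get(a)]


# All three agents start with the same templated condition.
cache = {
    "agent-1": ["always-fires"],
    "agent-2": ["always-fires"],
    "agent-3": ["always-fires"],
}

# Hypothetical mismatch: the server's view omits agent-3 even though
# agent-3 is still connected and reporting metrics.
cache = prune_conditions(cache, known_agents={"agent-1", "agent-2"})

print(agents_with_alerts(cache, ["agent-1", "agent-2", "agent-3"]))
```

In this model agent-3 keeps sending metric reports, but nothing is evaluated against them, which matches the symptom of alerts silently stopping for one resource while the others continue.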
This BZ was one of several bugs reported when a JON Server<->Agent mismatch occurred.
*** This bug has been marked as a duplicate of bug 725445 ***