Set up, e.g., a metric sender that sends 1,0,0,0,0,0,1,0,0,0,... (that is: 1, five zeros, 1, five zeros, ...) and put an alert on the condition "value == 0" for three consecutive occurrences. You will see that it fires at index 4 (the third zero in the series above, counting from 1) and at index 8, which is the first 0 after the second 1. This is wrong, because the second 1 breaks the sequence of three consecutive 0s.

Now set up a second alert with the condition slightly altered to "value < 0.1". This alert correctly fires at index 4 and index 10, each time three consecutive zeros have been detected.

The difference between == and < is in org.rhq.enterprise.server.alert.engine.model.AlertConditionOperator:

    LESS_THAN(Type.STATEFUL), //
    EQUALS(Type.STATELESS), //

Type.STATELESS prohibits sending a "start counting" message in org.rhq.enterprise.server.alert.engine.internal.AbstractConditionCache#processCacheElements
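To make the difference easier to see, here is a minimal sketch of "N consecutive matches" counting. It is not the RHQ implementation: the class DampeningSketch and the method fireIndexes are hypothetical, and the model simply assumes that a stateless operator never reports a non-matching value, so the counter is not reset, while a stateful operator resets it. Under that assumption it reproduces the indexes reported above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of consecutive-count dampening; hypothetical, not RHQ code. */
final class DampeningSketch {

    /**
     * Returns the 1-based indexes at which an alert on "value == 0" fires.
     * stateful == true:  a non-matching value resets the consecutive counter.
     * stateful == false: non-matching values are ignored (assumed model of the bug).
     */
    static List<Integer> fireIndexes(int[] values, int required, boolean stateful) {
        List<Integer> fired = new ArrayList<>();
        int consecutive = 0;
        for (int i = 0; i < values.length; i++) {
            boolean matches = values[i] == 0;
            if (matches) {
                consecutive++;
                if (consecutive == required) {
                    fired.add(i + 1);   // report 1-based index
                    consecutive = 0;    // start counting again after firing
                }
            } else if (stateful) {
                consecutive = 0;        // the 1 breaks the run of zeros
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        int[] series = {1, 0, 0, 0, 0, 0, 1, 0, 0, 0};
        System.out.println("stateless: " + fireIndexes(series, 3, false)); // [4, 8]
        System.out.println("stateful:  " + fireIndexes(series, 3, true));  // [4, 10]
    }
}
```

In the stateless run the zeros at index 5 and 6 are still counted when the 1 arrives at index 7, so the zero at index 8 completes a "run" of three.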
OK, so we have a workaround for the current release, but this should be fixed in master.
Created attachment 493531 Screenshot of the alert history.

Screenshot of fired alerts. The 1s are sent at:

XX Wed Apr 20 17:28:06 CEST 2011 -- 1
XX Wed Apr 20 17:34:06 CEST 2011 -- 1
XX Wed Apr 20 17:40:06 CEST 2011 -- 1
XX Wed Apr 20 17:46:06 CEST 2011 -- 1
XX Wed Apr 20 17:52:06 CEST 2011 -- 1
XX Wed Apr 20 17:58:06 CEST 2011 -- 1

Metrics are sent once a minute, so the alert has to trigger 3 minutes after each 1 is sent. This happens correctly in the less-than ('<') case.
Created attachment 493532 Plugin that generates the pattern mentioned in the case.
Changed '=' (EQUALS) to stateful in commit 1bba9e4a61e82160afc7924e167f3da56187cabb
Note that you can use the new pattern-plugin to generate sequences of 1s and 0s of arbitrary lengths to test this.
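For anyone who wants to reason about expected firing times without deploying the plug-in, the kind of sequence it produces can be sketched as below. This is only an illustration, not the plug-in's code: PatternSketch and pattern are hypothetical names, and the sketch assumes each period starts with the 1s followed by the 0s.

```java
/** Hypothetical generator for repeating runs of 1s and 0s; not the plug-in's code. */
final class PatternSketch {

    /**
     * Builds a series of the given length that repeats
     * `ones` values of 1 followed by `zeros` values of 0.
     */
    static int[] pattern(int ones, int zeros, int length) {
        int period = ones + zeros;
        if (ones < 0 || zeros < 0 || period == 0 || length < 0) {
            throw new IllegalArgumentException("need a non-empty period and non-negative sizes");
        }
        int[] out = new int[length];
        for (int i = 0; i < length; i++) {
            // position inside the current period decides between 1 and 0
            out[i] = (i % period) < ones ? 1 : 0;
        }
        return out;
    }

    public static void main(String[] args) {
        // One 1 followed by five 0s, as in the original report
        System.out.println(java.util.Arrays.toString(pattern(1, 5, 10)));
    }
}
```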
Verified using the attached 'pattern' plug-in. Set an alert, as described, with dampening. The alert fired and was then correctly dampened.
Bulk closing of issues that were VERIFIED, had no target release and where the status changed more than a year ago.