1) Start a server and a new agent.
2) Import the new platform.
3) Create an alert on the platform resource: a "Goes Down" availability alert.
4) In the agent prompt, invoke "shutdown" (or just kill the agent).
5) Notice that no alert is fired. This is the bug.
6) Restart the agent (or type "start" if you are still at the agent prompt).
7) Repeat step 4 (shut down the agent).
8) Notice that an alert IS fired.

Why does the alert fire the second time, but not the first?
Documenting that this is OK in JON 3.1:

<mfoley_> trying this now
<mfoley_> ok ... it worked for me 1st time in JON 3.1
<mfoley_> but i can retest
<mfoley_> this is working for me in JON 3.1
<viet> it worked for me too first time in CR3
I am seeing this, but not 100% of the time. I just tried again, starting with a fresh DB and a newly imported platform. I start the server and, when it's up, I start the agent. I import the RHQ Agent and the platform. On the platform, I create a Going Down alert. I shut down the agent. In the server logs, I see this:

13:45:34,901 INFO [CoreServerServiceImpl] Agent [mazztower][4.5.0-SNAPSHOT(c96fb05)] would like to connect to this server
13:45:35,018 INFO [CoreServerServiceImpl] Agent [mazztower] has connected to this server at Fri Jun 08 13:45:35 EDT 2012
13:45:52,170 INFO [CoreServerServiceImpl] Got agent registration request for existing agent: mazztower[192.168.1.2:16163][4.5.0-SNAPSHOT(c96fb05)] - Will not regenerate a new token
13:46:30,143 INFO [CacheConsistencyManagerBean] localhost took [49]ms to reload cache for 1 agents
13:46:41,767 INFO [AgentManagerBean] Agent with name [mazztower] just went down
13:47:00,200 INFO [CacheConsistencyManagerBean] localhost took [43]ms to reload global cache
13:47:00,258 INFO [CacheConsistencyManagerBean] localhost took [43]ms to reload cache for 1 agents

I think it might have something to do with the reloading of the caches.
I just tried again with a clean DB and a new agent. This time, the alert fired. But here's something different: I did not see the alert caches get reloaded after the agent went down:

14:11:54,500 INFO [CoreServerServiceImpl] Got agent registration request for existing agent: mazztower[192.168.1.2:16163][4.5.0-SNAPSHOT(c96fb05)] - Will not regenerate a new token
14:12:38,094 INFO [CacheConsistencyManagerBean] localhost took [51]ms to reload global cache
14:12:38,158 INFO [CacheConsistencyManagerBean] localhost took [49]ms to reload cache for 1 agents
14:12:56,487 INFO [AgentManagerBean] Agent with name [mazztower] just went down

Note that in comment #2, where the alert didn't fire, the two alert caches reloaded after the agent went down.
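
To make the suspected ordering problem concrete, here is a toy model in Python. It is only a sketch and not RHQ code: the class, method, and event names are all hypothetical, and it assumes that a newly created alert definition is only evaluated once a cache reload has picked it up. The logs above do not show when the definition was created, so that part is an assumption too.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAlertServer:
    """Toy model only; not RHQ code. All names here are hypothetical."""
    definitions: set = field(default_factory=set)  # definitions stored in the DB
    cache: set = field(default_factory=set)        # definitions the cache knows about
    fired: list = field(default_factory=list)      # alerts fired so far

    def create_definition(self, name):
        # Assumption: a new definition is saved but not yet visible in the cache.
        self.definitions.add(name)

    def reload_cache(self):
        # Assumption: a periodic job copies the saved definitions into the cache.
        self.cache = set(self.definitions)

    def agent_went_down(self):
        # Only definitions already in the cache can match the "down" event.
        for name in sorted(self.cache):
            self.fired.append(name)


def run(events):
    """Replay a sequence of events and return the alerts that fired."""
    server = ToyAlertServer()
    for event in events:
        if event == "create":
            server.create_definition("Goes Down")
        elif event == "reload":
            server.reload_cache()
        elif event == "down":
            server.agent_went_down()
    return server.fired


# Ordering like comment #2: the agent goes down before the cache reloads.
print(run(["create", "down", "reload"]))   # []
# Ordering like this comment: the cache reloads before the agent goes down.
print(run(["create", "reload", "down"]))   # ['Goes Down']
```

If the real server behaves like this model, whether the alert fires depends only on whether a cache reload happens between creating the definition and shutting down the agent, which would explain why the problem does not show up 100% of the time.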
This is to be expected. See the new FAQ I added so I don't forget this again 3 years from now :) http://rhq-project.org/display/JOPR2/FAQ#FAQ-IcreatedanalertdefinitionandIknowimmediatelythereaftermyagentshouldhavereporteddatathatshouldhavetriggeredthealertbutmyalertdidnotfire.Wheredidmyalertgo%3F