Bug 701092

Summary: Turning off agent doesn't cause platform to show as down for 15mins
Product: [Other] RHQ Project
Reporter: Charles Crouch <ccrouch>
Component: Monitoring
Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: unspecified
Priority: medium
Version: 4.0.0
CC: hbrock, hrupp, jshaughn
Target Milestone: ---
Keywords: FutureFeature
Target Release: RHQ 4.4.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Last Closed: 2013-09-01 10:05:07 UTC
Bug Blocks: 741450    
Attachments: platform monitoring tab (flags: none)

Description Charles Crouch 2011-05-01 02:42:48 UTC
RHQ 4.0.0.CR

I turned off an agent monitoring a platform and see this in the server log:

2011-04-30 22:15:35,232 INFO  [org.rhq.enterprise.server.core.AgentManagerBean] Agent with name [i-b21d37dd] just went down

It was then 15 minutes before any of the resources on the platform started showing red.

a) I thought we were backfilling after 10 minutes?
b) If we know the agent went down, don't we want to backfill more aggressively than normal?

Comment 1 Charles Crouch 2011-05-01 02:44:31 UTC
Created attachment 496017
platform monitoring tab

Comment 2 Jay Shaughnessy 2012-02-28 19:57:10 UTC
This has basically been implemented in the jshaughn/avail branch.
The backfill delay has been reduced to 5 minutes. Also, a graceful
agent shutdown no longer depends on suspect job detection; the
backfilling is performed immediately.

For more on the planned avail changes, see:

http://rhq-project.org/display/RHQ/Design-Availability+Checking#Design-AvailabilityChecking-DesignandChanges
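
The rule described in this comment can be sketched roughly as follows. This is only an illustration, not the RHQ code: the function name, its parameters, and the constant are hypothetical, and the 5-minute value is taken from the comment text.

```python
from datetime import datetime, timedelta

# Assumption: the 5-minute backfill delay mentioned in this comment.
BACKFILL_AFTER = timedelta(minutes=5)


def should_backfill(last_report: datetime, now: datetime, graceful_shutdown: bool) -> bool:
    """Hypothetical helper: decide whether an agent's resources should be marked down."""
    if graceful_shutdown:
        # The agent announced that it is stopping, so backfill immediately.
        return True
    # Otherwise backfill only once the agent has been silent for the full delay.
    return now - last_report >= BACKFILL_AFTER
```

With this rule, an agent that shuts down cleanly is backfilled right away, while an agent that simply stops reporting is backfilled once the 5-minute delay has passed.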

Comment 3 Jay Shaughnessy 2012-03-30 20:31:11 UTC
This is in Master.

Comment 4 Heiko W. Rupp 2013-09-01 10:05:07 UTC
Bulk closing of items that are ON_QA and in old RHQ releases that have been out for a long time, where the issue has not been re-opened since.