Bug 536307 (RHQ-671)

Summary: agent failover mechanism
Product: [Other] RHQ Project
Reporter: Joseph Marques <jmarques>
Component: Agent
Assignee: John Mazzitelli <mazz>
Status: CLOSED NEXTRELEASE
QA Contact: Corey Welton <cwelton>
Severity: medium
Priority: medium
Version: unspecified
Keywords: SubFeature
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
URL: http://jira.rhq-project.org/browse/RHQ-671
Fixed In Version: 1.1
Doc Type: Enhancement
Bug Blocks: 536277    

Description Joseph Marques 2008-07-14 19:44:00 UTC
* agent must receive and store an ordered list of servers to connect to
* agent needs to be able to detect when its currently connected server goes down, and switch to one of its backups
* after the switch to a backup server happens, the agent needs to sit and wait for a green light from that server, because the server needs to perform some work on its side to get ready for this agent, notably warming up the alerts cache (and potentially others)
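
The three requirements above can be sketched roughly as follows. This is a minimal illustration only, not the RHQ agent's actual code: the class FailoverSketch, the nested Server type, and the methods failover and greenLight are hypothetical names, and the choice to try backups in list order (wrapping around) is an assumption, since the description only says "one of its backups".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of the agent-side failover state; not RHQ's real classes. */
class FailoverSketch {

    /** Stand-in for one server endpoint in the ordered failover list. */
    static final class Server {
        final String host;
        final int port;

        Server(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    private final List<Server> servers; // ordered list as received from the server
    private int current = 0;            // index of the server we are connected to
    private boolean readyToSend = true; // false while waiting for the green light

    FailoverSketch(List<Server> orderedServers) {
        if (orderedServers.isEmpty()) {
            throw new IllegalArgumentException("failover list must not be empty");
        }
        this.servers = new ArrayList<>(orderedServers);
    }

    Server currentServer() {
        return servers.get(current);
    }

    boolean isReadyToSend() {
        return readyToSend;
    }

    /**
     * Called once the current server has been detected as down. Tries the
     * other servers in list order (assumption: wrapping around) and switches
     * to the first reachable one. Returns true if a switch happened.
     */
    boolean failover(Predicate<Server> isReachable) {
        for (int i = 1; i < servers.size(); i++) {
            int candidate = (current + i) % servers.size();
            if (isReachable.test(servers.get(candidate))) {
                current = candidate;
                readyToSend = false; // hold traffic until the new server is ready
                return true;
            }
        }
        return false; // no backup reachable; stay on the current entry
    }

    /** Called when the new server signals that it is ready for this agent. */
    void greenLight() {
        readyToSend = true;
    }
}
```

In this sketch the agent would check isReadyToSend() before sending anything after a switch, which models the wait while the server warms up its caches.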

Comment 1 Joseph Marques 2008-08-10 02:30:10 UTC
rev1207 - implement a value object / transfer object pattern with FailoverListComposite; 
added FailoverListManager to act as a centralized interface for manipulating these objects (with LookupUtil piece to expose it); 
agent registration (found in CoreServerServiceImpl) now generates a FailoverListComposite and adds it to the AgentRegistrationResults;
the agent takes the results (found in AgentMain) and adds the FailoverListComposite to its agent configuration, which persists it through restarts; 
wrote new FailoverPromptCommand to display the results of the most recent failover list received from the server cloud (and added necessary i18n tokens for it); 
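
For illustration, a value object that can be stored in a single configuration setting and restored after a restart might look roughly like the sketch below. It is not the real FailoverListComposite: the class name FailoverListSketch, its methods, and the "host:port,host:port" text format are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a failover list value object; format is assumed. */
final class FailoverListSketch {

    /** One server entry: address plus port. */
    static final class Entry {
        final String address;
        final int port;

        Entry(String address, int port) {
            this.address = address;
            this.port = port;
        }
    }

    private final List<Entry> entries; // immutable, order is significant

    FailoverListSketch(List<Entry> entries) {
        this.entries = Collections.unmodifiableList(new ArrayList<>(entries));
    }

    List<Entry> entries() {
        return entries;
    }

    /** Serializes to "host:port,host:port" so it fits one config value. */
    String writeAsText() {
        StringBuilder sb = new StringBuilder();
        for (Entry e : entries) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.address).append(':').append(e.port);
        }
        return sb.toString();
    }

    /** Parses the text produced by writeAsText, preserving the order. */
    static FailoverListSketch readAsText(String text) {
        List<Entry> parsed = new ArrayList<>();
        if (!text.isEmpty()) {
            for (String item : text.split(",")) {
                // split on the last colon so addresses containing colons survive
                int colon = item.lastIndexOf(':');
                String address = item.substring(0, colon);
                int port = Integer.parseInt(item.substring(colon + 1));
                parsed.add(new Entry(address, port));
            }
        }
        return new FailoverListSketch(parsed);
    }
}
```

A round trip through writeAsText and readAsText keeps the entries in their original order, which matters because the list is ordered by connection preference.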

Comment 2 John Mazzitelli 2008-09-12 00:32:38 UTC
rev1435 is the first attempt at this

Comment 3 John Mazzitelli 2008-09-13 03:04:35 UTC
rev1447 completes the initial HA implementation.  let the testing begin!

Comment 4 Red Hat Bugzilla 2009-11-10 21:14:27 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-671