Bug 1322068

Summary: [RFE] Native Highly Available (HA) VMDB Deployment
Product: Red Hat CloudForms Management Engine
Reporter: Dustin Scott <dscott>
Component: Appliance
Assignee: Nick Carboni <ncarboni>
Status: CLOSED ERRATA
QA Contact: Alex Newman <anewman>
Severity: high
Docs Contact:
Priority: high
Version: 5.5.0
CC: abellott, anewman, byount, greartes, jhardy, jocarter, ldixon, ncarboni, obarenbo, simaishi
Target Milestone: GA
Keywords: FutureFeature
Target Release: 5.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: database
Fixed In Version: 5.7.0.0
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-04 12:54:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dustin Scott 2016-03-29 17:43:25 UTC
-What is the nature and description of the request?
Ability to natively deploy an HA VMDB for CloudForms without having to configure a complicated reference architecture.
  
-Why does the customer need this? (List the business requirements here)
Eliminate the single point of failure with only one VMDB in a single region to meet uptime requirements.
  
-How would the customer like to achieve this? (List the functional requirements here)
Possibly via an HA PostgreSQL cluster, with the ability to deploy the cluster either via the appliance_console or the UI.
  
-For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
Test operability of the HA cluster to see if it meets customer needs.  Ensure that the cluster can be deployed in an automated fashion.  Ensure cluster failover happens properly when one VMDB fails.
  
-Is there already an existing RFE upstream or in Red Hat Bugzilla?
I have not found one.
  
-Does the customer have any specific timeline dependencies and which release would they like to target (i.e. RHEL5, RHEL6)?
Before the customer goes into production.  This is tentatively scheduled for now depending on the success of the current implementation in development.
  
-Is the sales team involved in this request and do they have any additional input?
Yes, sales is involved.  Input is pending, depending on a future meeting.
  
-List any affected packages or components.
cfme-appliance-5.5.2.4-1.el7cf.x86_64
cfme-5.5.2.4-1.el7cf.x86_64

Comment 2 Nick Carboni 2016-07-14 20:38:44 UTC
Work on this is being tracked via the Pivotal Tracker epic here: https://www.pivotaltracker.com/epic/show/2561569

Comment 3 Nick Carboni 2016-08-31 14:26:52 UTC
The Pivotal epic is now complete, and native HA can now be configured using the upstream appliances.

This comes with one important requirement:

The PostgreSQL database servers must be running on our shipped virtual appliances, because failover relies on specific packages and services that those appliances provide. This may pose a problem for customers using "external" PostgreSQL servers. In the future, we will look into documenting how to set up the packages required to run this HA implementation.

For now, the steps to set up the newly implemented HA architecture are as follows (for a two-database cluster):

1. Deploy an appliance to serve as the primary database server (DB1).
  - Configure this appliance as a *database-only* appliance
  - When prompted "Do you also want to use this server as an application server?" in the console, answer no.
2. Deploy an appliance to serve as the EVM (application) server.
  - Configure this appliance to point to the database appliance configured in step 1 (DB1)
3. On DB1, configure the database for replication using the appliance_console
  - Select "Configure Database Replication"
  - Select "Configure Server as Primary"
  - Follow the prompts
  - Note: The IP address entered here must be the address that all the *standby* servers and all the *application* servers will use to contact the primary server
4. Deploy a standby database server appliance (DB2)
  - Select "Configure Database Replication"
  - Select "Configure Server as Standby"
  - Follow the prompts
  - Note: The IP address entered for the standby server must be the address that all the *other standby* servers and all the *application* servers will use to contact this standby server
  - This initialization may take some time
5. On all application servers (not database servers), start the failover monitor service
  - In the appliance_console select "Configure Application Database Failover Monitor" -> "Start Database Failover Monitor"
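
Because the notes in steps 3 and 4 require that the entered IP addresses be reachable from every standby and application server, a quick connectivity check run from each of those servers may save some debugging. The following is only a minimal sketch and is not part of the product: the function names are made up for illustration, port 5432 is assumed because it is the PostgreSQL default, and the addresses in the usage comment are placeholders.

```python
import socket
from typing import Iterable, List


def can_reach(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 5432 is the PostgreSQL default and is an assumption here; adjust it
    if the database appliances listen elsewhere.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_nodes(addresses: Iterable[str], port: int = 5432,
                      timeout: float = 3.0) -> List[str]:
    """Return the addresses that could not be contacted, in input order."""
    return [addr for addr in addresses if not can_reach(addr, port, timeout)]


# Example usage (placeholder addresses for DB1 and DB2):
#   missing = unreachable_nodes(["192.0.2.10", "192.0.2.11"])
#   if missing:
#       print("Cannot reach:", ", ".join(missing))
```

This only confirms that a TCP connection can be opened; it says nothing about whether replication or authentication is configured correctly.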

Now the database servers will execute automatic failover upon failure of the primary server and the application servers will detect the new primary server upon failover.
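
One way to confirm that a failover actually happened is to ask each database server whether it is in recovery. The sketch below is an illustration under stated assumptions, not the mechanism the failover monitor uses: it assumes the standby accepts read-only queries, so that PostgreSQL's `pg_is_in_recovery()` returns false on the primary and true on a standby, and it leaves the querying itself to whatever client is at hand. The names `classify` and `failover_succeeded` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional

# pg_is_in_recovery() is a standard PostgreSQL function: it returns false on
# a primary and true on a standby that is replaying WAL.
ROLE_QUERY = "SELECT pg_is_in_recovery()"


class ClusterState(NamedTuple):
    primaries: List[str]
    standbys: List[str]
    unreachable: List[str]


def classify(in_recovery: Dict[str, Optional[bool]]) -> ClusterState:
    """Group servers by role.

    in_recovery maps a server name to the result of ROLE_QUERY on that
    server, or to None if the server could not be queried.
    """
    primaries, standbys, unreachable = [], [], []
    for name, flag in in_recovery.items():
        if flag is None:
            unreachable.append(name)
        elif flag:
            standbys.append(name)
        else:
            primaries.append(name)
    return ClusterState(primaries, standbys, unreachable)


def failover_succeeded(old_primary: str, state: ClusterState) -> bool:
    """True if exactly one primary exists and it is not the old primary."""
    return len(state.primaries) == 1 and state.primaries[0] != old_primary


# Example: DB1 was primary and has been shut down; DB2 now reports
# pg_is_in_recovery() = false.
after = classify({"DB1": None, "DB2": False})
print(failover_succeeded("DB1", after))
```

Requiring exactly one primary also flags the split case where both servers report that they are not in recovery.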

Comment 8 errata-xmlrpc 2017-01-04 12:54:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0012.html