Bug 971939

Summary: RHEV-M Assisted/Automatic DB Recovery
Product: Red Hat Enterprise Virtualization Manager Reporter: Luca Villa <luvilla>
Component: RFEs Assignee: Andrew Cathrow <acathrow>
Status: CLOSED DUPLICATE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: unspecified CC: acathrow, iheim, lpeer, pablo.iranzo, pzhukov, rbalakri
Target Milestone: --- Keywords: FutureFeature
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: integration
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-03 11:22:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Luca Villa 2013-06-07 15:53:11 UTC
*. What is the nature and description of the request?
We would like RHEV-M to be able to manage DB recovery with minimal data loss. In particular, we would like to be able to restore all RHEV-M configuration and data if the internal DB fails.
For example, this could be achieved by analyzing the PostgreSQL write-ahead log (WAL) or by gathering information provided by the distributed VDSM agents.

*. Why does the customer need this? (List the business requirements here)
We expect a high provisioning rate on that infrastructure, and we need to minimize service unavailability during RHEV-M DB recovery.
Basically, we need to restore the DB to the state of the latest commit before a crash.
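
To make "restore to the latest commit before a crash" concrete, here is a minimal, purely illustrative sketch of log replay: start from the last snapshot and re-apply only the changes that belong to committed transactions. This is a toy model and not PostgreSQL's actual WAL format or recovery code; `LogRecord`, `replay`, and the sample data are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One entry of a toy write-ahead log (hypothetical format)."""
    txid: int
    kind: str        # "set", "delete", or "commit"
    key: str = ""
    value: str = ""


def replay(snapshot, log):
    """Return the snapshot plus the effects of every committed transaction."""
    # A transaction counts only if its commit record reached the log.
    committed = {r.txid for r in log if r.kind == "commit"}
    state = dict(snapshot)
    for r in log:
        if r.txid not in committed or r.kind == "commit":
            continue  # skip uncommitted work and the commit markers themselves
        if r.kind == "set":
            state[r.key] = r.value
        elif r.kind == "delete":
            state.pop(r.key, None)
    return state


# Snapshot taken at the last backup, followed by the log written since then.
snapshot = {"vm-1": "up"}
log = [
    LogRecord(1, "set", "vm-2", "down"),
    LogRecord(1, "commit"),
    LogRecord(2, "set", "vm-3", "up"),  # crash happened before this committed
]
restored = replay(snapshot, log)
```

In this toy run, `vm-2` survives because transaction 1 committed, while `vm-3` is dropped because transaction 2 never did.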

*. How would the customer like to achieve this? (List the functional requirements here)
The desiderata is to have 
a) RHEV-M to be able to manage inconsistencies between DB data (e.g., restored data) and the infrastructure status, and to propose a way to discard or include differences. 
b) RHEV-M always up and running
c) Supported mechanism to achieve the goal

*. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
a) For example, if a virtual machine has been created or modified after the last DB recovery and a DB failure occurs, then after the data is restored RHEV-M is able to import the VM, or the changes to it, back into the DB.
b) RHEV-M never has to be stopped to make a DB backup.
c) N/A
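
Requirement a) and its test could be sketched roughly as follows: compare the VM records found in the restored DB with what the hosts currently report, list the differences, and let the operator choose which ones to include. This is only an assumption about how such a reconciliation might look; `diff_inventory`, `reconcile`, and the data layout are hypothetical and do not correspond to any RHEV-M or VDSM API.

```python
def diff_inventory(db_vms, reported_vms):
    """Compare VM records restored from backup with those reported by hosts.

    Both arguments map a VM id to a dict of properties (hypothetical layout).
    """
    db_ids, live_ids = set(db_vms), set(reported_vms)
    return {
        # Running on the infrastructure but unknown to the restored DB.
        "missing_from_db": sorted(live_ids - db_ids),
        # Present in the restored DB but not reported by any host.
        "missing_from_infra": sorted(db_ids - live_ids),
        # Known to both sides, with differing properties.
        "changed": sorted(
            vm for vm in db_ids & live_ids if db_vms[vm] != reported_vms[vm]
        ),
    }


def reconcile(db_vms, reported_vms, include):
    """Return a new DB view that imports only the VM ids listed in include."""
    merged = {vm: dict(props) for vm, props in db_vms.items()}
    for vm in include:
        if vm in reported_vms:
            merged[vm] = dict(reported_vms[vm])
    return merged


restored_db = {"vm-a": {"mem_mb": 2048}, "vm-b": {"mem_mb": 1024}}
from_hosts = {"vm-a": {"mem_mb": 4096}, "vm-c": {"mem_mb": 512}}

report = diff_inventory(restored_db, from_hosts)
merged = reconcile(restored_db, from_hosts, include=["vm-a", "vm-c"])
```

Differences that the operator does not list in `include` are simply discarded, which leaves the restored DB record untouched.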

Comment 4 Pavel Zhukov 2013-11-20 11:58:49 UTC
(In reply to Luca Villa from comment #0)
> The desiderata is to have 
> a) RHEV-M to be able to manage inconsistencies between DB data (e.g.,
> restored data) and the infrastructure status, and to propose a way to
> discard or include differences. 
Is it possible to use DB replication for this? Config files are not being changed often 
> b) RHEV-M always up and running
> c) Supported mechanism to achieve the goal
>

Comment 5 Luca Villa 2013-11-21 15:24:15 UTC
(In reply to Pavel Zhukov from comment #4)
> (In reply to Luca Villa from comment #0)
> > The desiderata is to have 
> > a) RHEV-M to be able to manage inconsistencies between DB data (e.g.,
> > restored data) and the infrastructure status, and to propose a way to
> > discard or include differences. 
> Is it possible to use DB replication for this? Config files are not being
> changed often 

We were actually considering replicating data to a hot standby DB or to transaction logs stored elsewhere.
True, config files are not changed often, but it can happen, and we need to ensure we are able to restore the system to an up-to-date and fully consistent state.

> > b) RHEV-M always up and running
> > c) Supported mechanism to achieve the goal
> >

Comment 6 Pavel Zhukov 2013-11-21 17:12:08 UTC
(In reply to Luca Villa from comment #5)

> Config files are not changed often true, but it could happen and we need to
> ensure we are able to restore the system up-to-date and in a fully
> consistent way.

IMHO this is a job for backup solutions (an application, a script, the system administrator, etc.), not for RHEV. The next step would be something like "backup /etc/sysconfig"... It should be a "standard" task for the system administrator, like "trigger a backup after important config files change".
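
As an illustration of the kind of "standard" administrator task meant here, a minimal sketch could record a checksum for each watched config file and report which files changed since the previous run, so that a scheduled job can trigger a backup only when needed. The function names and the JSON state file are assumptions made up for this example, not part of RHEV.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_files(paths, state_file):
    """Return watched paths whose content changed since the last call.

    Digests are remembered in state_file (a JSON file, hypothetical layout),
    which is rewritten on every call.
    """
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {str(p): file_digest(p) for p in paths if Path(p).is_file()}
    changed = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    state_path.write_text(json.dumps(current, indent=2))
    return changed
```

A scheduled job could call `changed_files` with the list of important config files and start whatever backup tool the site already uses whenever the returned list is non-empty.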

Comment 7 Itamar Heim 2013-12-03 11:22:58 UTC
We added a simple backup/restore script.
We will add support for importing an existing data domain and reconciling differences (note that in 3.2 you can already get a list of disks in the storage and register them to the engine, but that will be extended further when we cover importing data domains).

*** This bug has been marked as a duplicate of bug 1015321 ***