
Bug 1533404

Summary: [DR] - Fail back scenario - VM statuses should be preserved in a file and not in memory
Product: [oVirt] ovirt-ansible-collection
Reporter: Maor <mlipchuk>
Component: disaster-recovery
Assignee: Maor <mlipchuk>
Status: CLOSED CURRENTRELEASE
QA Contact: Kevin Alon Goldblatt <kgoldbla>
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: lsvaty, tnisan, ylavi
Target Milestone: ovirt-4.2.7
Flags: rule-engine: ovirt-4.2+, ylavi: exception+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: DR
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: As part of the fail back scenario, VMs that were running on the secondary site should run on the primary site after the fail back finishes. This is currently done by keeping the list of running VMs in memory while the Ansible play runs.
Reason: If the Ansible play process crashes in the middle of the fail back flow, the user loses the information about which VMs were running, even though the play may already have shut them down.
Result: For a more robust solution, the running VMs should be saved in a file, so that if the Ansible process is killed the user can continue the process without losing the information about which VMs should run. The file will contain a dictionary of VM name, VM GUID, and whether the VM is highly available. Highly available VMs will be started before the other "regular" VMs on the target site. The default location of the file will be in the tmp folder; the user can change this in defaults/main.yml. The validator operation should check whether the file exists (using the location set in defaults/main.yml), warn if it does, and ask whether the user wants to remove it. If the file exists, the previous failback run probably failed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-02 14:31:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Maor 2018-01-11 09:41:55 UTC
Description of problem:
As part of the fail back scenario, VMs that were running on the secondary site should run on the primary site after the fail back finishes.
This is currently supported by keeping the list of running VMs in memory while the Ansible play runs.
The problem with that solution is that if the Ansible play process crashes in the middle of the fail back flow, the user loses the information about which VMs were running, even though the play may already have shut them down.

For a more robust solution, the running VMs should be saved in a file, so that if the Ansible process is killed the user can continue the process without losing the information about which VMs should run.
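
A minimal sketch of the idea, not the role's actual code: the list of running VMs is written to disk before any VM is shut down, using a temporary file plus an atomic rename so that a crash in the middle of the write cannot leave a truncated file. The function names, the JSON layout, and the record keys (name, id, ha) are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_running_vms(vms, path):
    """Persist the running-VM records to `path` before shutdown begins.

    `vms` is a list of dicts; the keys used by callers are hypothetical.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename atomically,
    # so a crash never leaves a half-written state file behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(vms, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_running_vms(path):
    """Read back the records written by save_running_vms()."""
    with open(path) as handle:
        return json.load(handle)
```

With something like this in place, a killed play leaves either the complete file or no file at all, which is what lets a later run pick up where the earlier one stopped.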


Comment 1 Maor 2018-09-02 14:19:24 UTC
The solution for this bug is to write a file containing a dictionary of VM name, VM GUID, and whether the VM is highly available.
Highly available VMs will be started before the other "regular" VMs on the target site.
The default location of the file will be in the tmp folder; the user can change this in defaults/main.yml.
The validator operation should check whether the file exists (using the location set in defaults/main.yml), warn if it does, and ask whether the user wants to remove it.
If the file exists, the previous failback run probably failed.
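
The ordering rule and the validator check described above could look roughly like the following. This is a sketch under assumptions, not the code that was merged: the function names, the record keys (name, id, ha), and the warning wording are all hypothetical.

```python
import os


def start_order(vms):
    """Return VM records with highly available VMs first.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(vms, key=lambda vm: not vm["ha"])


def validate_state_file(path):
    """Return a warning string if a leftover state file exists, else None."""
    if os.path.exists(path):
        return (
            "State file %s already exists; a previous failback probably "
            "failed. Remove it to start from scratch." % path
        )
    return None
```

The validator here only reports; whether to delete the file is left to the user, as the comment describes.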

Comment 2 Kevin Alon Goldblatt 2018-10-03 09:10:52 UTC
The following cases need to be tested:

1. A failback that fails (Ctrl + Break) and is run again uses the same file, with the previous status of the VMs.

2. After a failed failback, running validate reports that the file exists.
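
The behaviour that case 1 exercises can be sketched as below. This is an illustration only: vms_to_restore and the collect_running_vms callback are hypothetical names, the JSON format is an assumption, and the plain write is kept simple on purpose.

```python
import json
import os


def vms_to_restore(path, collect_running_vms):
    """Return the VM records to start on the primary site.

    If `path` exists, a previous failback was interrupted, so its contents
    are reused instead of querying again (the VMs may already be down).
    Otherwise `collect_running_vms()` is called and its result is saved.
    """
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    vms = collect_running_vms()
    with open(path, "w") as handle:
        json.dump(vms, handle)
    return vms
```

A second call with the same path returns the saved records even if the callback would now report no running VMs, which is the situation after an interrupted run.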

Comment 3 Kevin Alon Goldblatt 2018-10-15 10:38:07 UTC
Verified with the following code:
------------------------------------------
ovirt-ansible-disaster-recovery-1.1.2-1.el7ev.noarch


Verified with the following scenario:
------------------------------------------
1. Perform failover of a domain with VMs
2. Start the VMs on the secondary site
3. Perform failback; on completion the VMs are up


Moving to VERIFIED!

Comment 4 Sandro Bonazzola 2018-11-02 14:31:57 UTC
This bugzilla is included in oVirt 4.2.7 release, published on November 2nd 2018.

Since the problem described in this bug report should be resolved in oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.