Bug 1956467

Summary: If a successfully migrated source VM is deleted, a new migration plan with different VMs fails because the scheduler is looking for the deleted VM
Product: Migration Toolkit for Virtualization
Reporter: David Vaanunu <dvaanunu>
Component: General
Assignee: Sam Lucidi <slucidi>
Status: CLOSED ERRATA
QA Contact: David Vaanunu <dvaanunu>
Severity: medium
Docs Contact: Avital Pinnick <apinnick>
Priority: high
Version: 2.0.0
CC: apinnick, fdupont, istein, slucidi
Target Milestone: ---
Target Release: 2.0.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-10 17:11:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description David Vaanunu 2021-05-03 18:25:11 UTC
Description of problem:

When a migration plan has already finished successfully and the migrated VM is then deleted from the source system, a new plan fails to run.

The plan status is "Running - preparing for migration"
and the progress status is "Not Started".



Version-Release number of selected component (if applicable):

MTV-2.0.0-21
CNV-2.6.2-25

How reproducible:


Steps to Reproduce:
1. Create a migration plan and run it (it should finish successfully)
2. Delete the migrated VM from the source system, but do not delete its migration plan
3. Create another migration plan and try to run it

Actual results:

The new migration plan does not start running.


Expected results:
The new migration plan should run.


Additional info:

Comment 1 Fabien Dupont 2021-05-03 18:29:41 UTC
What are the VMs in the second plan? The same as the first plan?
I fail to understand what you are trying to do here.

Comment 2 David Vaanunu 2021-05-03 18:58:52 UTC
The second plan uses different VMs.

Comment 3 Ilanit Stein 2021-05-04 09:29:39 UTC
I saw similar behavior when I created a plan from the CLI, ran it from the CLI, and then deleted the plan from the UI.
The migration CR was not deleted, and when I tried to run the plan again, it didn't start, as described in this bug.
Maybe in your flow the previous migration CR was not deleted?

There is a bug stating that when a plan CR is deleted in the CLI, the migration CR should be deleted too.
But I think that even if we run a plan while its previous CR still exists, it should not leave the plan in a "not starting" state, but rather fail.

@Fabien,
wdyt?

Comment 4 Fabien Dupont 2021-05-04 12:21:23 UTC
Would it be possible to have access to an environment where this is happening? I still don't understand the exact flow.
@slucidi, could you please help troubleshoot this?

Comment 5 Sam Lucidi 2021-05-04 16:41:12 UTC
The issue here is a specific edge case in the scheduler when it encounters a completed Plan, but one or more of the source VMs on that Plan have been removed from the inventory. VMs on completed Plans can't affect the schedule because they're not running, but they shouldn't be considered by the scheduler at all. The scheduler needs to A) disregard completed Plans, and B) not fail if a VM can't be found in the inventory. This is unrelated to any issues regarding Migration CRs being left behind after a plan is deleted.

Comment 7 Fabien Dupont 2021-05-06 21:37:47 UTC
The fix is in build 2.0.0-10 / iib:73160.

Comment 8 David Vaanunu 2021-05-11 11:51:46 UTC
The second plan uses different VMs.

Comment 9 David Vaanunu 2021-05-13 08:43:19 UTC
Verified on MTV_2.0.0-12 / iib:73572

Comment 12 errata-xmlrpc 2021-06-10 17:11:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (MTV 2.0.0 images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:2381