Bug 1953145 - Restarting a partially successful plan migrates VMs that were already migrated successfully
Summary: Restarting a partially successful plan migrates VMs that were already migrated successfully
Keywords:
Status: CLOSED DUPLICATE of bug 1959731
Alias: None
Product: Migration Toolkit for Virtualization
Classification: Red Hat
Component: General
Version: 2.0.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 2.1.0
Assignee: Sam Lucidi
QA Contact: Nandini Chandra
Docs Contact: Avital Pinnick
URL:
Whiteboard:
Depends On:
Blocks: 1953978
 
Reported: 2021-04-24 05:46 UTC by Nandini Chandra
Modified: 2021-06-02 20:37 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1953978 (view as bug list)
Environment:
Last Closed: 2021-06-02 20:37:03 UTC
Target Upstream Version:
Embargoed:



Description Nandini Chandra 2021-04-24 05:46:32 UTC
Description of problem:
----------------------
I created a plan with multiple VMs. After starting the plan, I canceled a few VMs and let the other VMs migrate. After those VMs migrated successfully, I restarted the plan so that the canceled VMs would migrate. When the plan restarted, I noticed that an attempt was made to migrate the VMs that had already migrated successfully.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
MTV 2.0.0-20


How reproducible:
----------------
Always


Steps to Reproduce:
-------------------
1.
2.
3.

Actual results:
--------------
When a partially successful plan is restarted, an attempt is made to migrate VMs that had already migrated successfully.


Expected results:
-----------------
Restarting a partially successful plan should only migrate canceled or failed VMs.


Additional info:
----------------

Comment 1 Fabien Dupont 2021-04-26 11:49:43 UTC
What do you mean by "an attempt"? Does MTV create a new VirtualMachineImport CR?

Comment 2 Nandini Chandra 2021-04-26 18:41:39 UTC
Hi Fabien,

Yes, MTV creates a new VirtualMachineImport CR for the VM (that was already migrated successfully) upon restarting the canceled plan.

I'm able to reproduce this issue consistently. I can share the details of the cluster on which I've reproduced the issue.
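
For reference, one way to see the duplicate import is to list the VirtualMachineImport CRs after restarting the plan. This is only a minimal sketch, assuming the VirtualMachineImport CRD that vm-import-operator serves at v2v.kubevirt.io/v1beta1; the "openshift-mtv" namespace below is just an example, so substitute the namespace the plan targets.

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// VirtualMachineImport as served by vm-import-operator in MTV 2.0.
	gvr := schema.GroupVersionResource{
		Group:    "v2v.kubevirt.io",
		Version:  "v1beta1",
		Resource: "virtualmachineimports",
	}

	// List the imports in the namespace the plan targets ("openshift-mtv"
	// is only an example). A fresh CR for a VM that already migrated
	// successfully is the behavior reported in this bug.
	imports, err := client.Resource(gvr).Namespace("openshift-mtv").List(
		context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, item := range imports.Items {
		fmt.Println(item.GetName(), item.GetCreationTimestamp())
	}
}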

Comment 3 Sam Lucidi 2021-05-11 14:48:39 UTC
A Plan where some VMs were canceled but all the rest succeeded and none failed is a successful plan, and shouldn't be able to be retried.

https://github.com/konveyor/forklift-controller/pull/259
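
To illustrate the rule, a minimal sketch using hypothetical type and field names rather than the actual forklift-controller API:

package plan

// VMStatus is a hypothetical per-VM outcome record; the real
// forklift-controller status types differ.
type VMStatus struct {
	Succeeded bool
	Failed    bool
	Canceled  bool
}

// Retryable captures the rule above: a plan in which some VMs were
// canceled, none failed, and the rest succeeded counts as successful
// and is not offered for retry. Only a plan with at least one failed
// VM remains retryable.
func Retryable(vms []VMStatus) bool {
	for _, vm := range vms {
		if vm.Failed {
			return true
		}
	}
	return false
}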

Comment 4 Fabien Dupont 2021-05-11 17:04:36 UTC
Really? The VMs may have been cancelled because the environment is unstable and the user prefers to postpone the migration. Would this require creating a new plan?

Comment 5 Sam Lucidi 2021-05-11 19:39:49 UTC
I guess my main question is how frequently do you think the user will want to cancel some number of VMs from a running plan and then retry them later with absolutely no changes, vs changing their mind about migrating a handful of VMs entirely or with different mappings? If it's more of the former, then I can see allowing retries of otherwise successful plans. But if it's more of the latter, it seems to me that showing the user a "retry" button next to a plan that appears to be successful is going to be confusing.

Comment 6 Fabien Dupont 2021-05-11 19:58:34 UTC
@miguel what do you think? It's a fairly trivial change, but it has a huge impact on usability.

Comment 7 Mike Turley 2021-05-13 12:32:35 UTC
Determined in https://github.com/konveyor/forklift-ui/pull/593 (merged) that we will block these plans (where some VMs were canceled but all the rest succeeded and none failed) from being retried. The UX for creating a new plan for the canceled VMs will be enhanced later, in part by the RFE to add a clone plan feature (https://bugzilla.redhat.com/show_bug.cgi?id=1951660).

Comment 8 Nandini Chandra 2021-05-13 17:12:37 UTC
This issue occurs with failed plans too. In other words, if a plan has 2 VMs, of which one (VM#1) migrates successfully and the other (VM#2) fails for whatever reason, then upon restarting the failed plan an attempt is made to migrate VM#1 again and a VirtualMachineImport CR is created for VM#1. How would this issue be fixed?

This happens with both cold and warm migration.

Steps to reproduce:
1) Create a plan with 2 VMs and start the plan through the MTV UI.
2) Make sure VM#2 fails and VM#1 succeeds. For a warm migration to fail, disable CBT on the VM (see the sketch after these steps).
3) The MTV UI shows that the plan failed. Restart the failed plan through the MTV UI.
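
A sketch of step 2's CBT change using govmomi; the vCenter URL, credentials, and inventory path are placeholders. Warm migration depends on Changed Block Tracking, so disabling it should make VM#2 fail.

package main

import (
	"context"
	"net/url"

	"github.com/vmware/govmomi"
	"github.com/vmware/govmomi/find"
	"github.com/vmware/govmomi/vim25/types"
)

func main() {
	ctx := context.Background()

	// Placeholder vCenter endpoint and credentials.
	u, err := url.Parse("https://vcenter.example.com/sdk")
	if err != nil {
		panic(err)
	}
	u.User = url.UserPassword("administrator@vsphere.local", "password")

	client, err := govmomi.NewClient(ctx, u, true) // true: skip TLS verification
	if err != nil {
		panic(err)
	}

	// Placeholder inventory path to the VM that should fail (VM#2).
	finder := find.NewFinder(client.Client, true)
	vm, err := finder.VirtualMachine(ctx, "/Datacenter/vm/test-vm-2")
	if err != nil {
		panic(err)
	}

	// Turn off Changed Block Tracking, which warm migration relies on.
	disabled := false
	task, err := vm.Reconfigure(ctx, types.VirtualMachineConfigSpec{
		ChangeTrackingEnabled: &disabled,
	})
	if err != nil {
		panic(err)
	}
	if err := task.Wait(ctx); err != nil {
		panic(err)
	}
}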

Actual result:
A VirtualMachineImport CR is created for VM#1, and the MTV UI shows that an attempt is made to re-migrate VM#1.

Expected Result:
1) Only failed VMs should be retried.
2) The migration status from the previous migration run should be checked for every VM. If a VM migrated successfully in a previous run, a new VirtualMachineImport CR shouldn't be created for that VM on the restarted run (sketched below).
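
Expressed as a sketch (hypothetical names, not the actual controller code), the restarted run should filter each VM on its previous status before creating any VirtualMachineImport CR:

package plan

// PriorStatus is a hypothetical record of a VM's outcome in the
// previous migration run.
type PriorStatus struct {
	VM        string
	Succeeded bool
}

// vmsToRetry returns the VMs that still need a VirtualMachineImport CR
// on the restarted run: anything that already migrated successfully is
// skipped, so only canceled or failed VMs are attempted again.
func vmsToRetry(previous []PriorStatus) []string {
	var pending []string
	for _, s := range previous {
		if s.Succeeded {
			continue // already migrated; no new CR
		}
		pending = append(pending, s.VM)
	}
	return pending
}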

Comment 9 Sam Lucidi 2021-05-13 17:17:55 UTC
That issue is addressed by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1959731

Comment 10 Mike Turley 2021-05-13 17:47:09 UTC
@slucidi does that mean we should revert https://github.com/konveyor/forklift-ui/pull/593 and allow the user to re-migrate a plan that succeeded with canceled VMs, now that it wouldn't cause errors?

Comment 11 Nandini Chandra 2021-05-14 20:34:23 UTC
(In reply to Sam Lucidi from comment #9)
> That issue is addressed by the fix for
> https://bugzilla.redhat.com/show_bug.cgi?id=1959731

Hi Sam,

Fabien emailed us today that a new build with the fix for bug 1959731 is available.

build 2.0.0-15 / iib:75270

I tested 2.0.0-15, and the issue I've described in comment 8 doesn't seem to be fixed. Could you please look into this?
Let me know if you need access to my cluster.

Comment 12 Sam Lucidi 2021-05-17 13:00:45 UTC
Yeah, I think so, Mike. Probably a good thing to discuss in the sync call on Wednesday, if it can wait that long.

I will investigate, Nandini.

Comment 16 Nandini Chandra 2021-06-02 20:37:03 UTC
Closing this as a duplicate of bug 1959731.

*** This bug has been marked as a duplicate of bug 1959731 ***

