Description of problem:
----------------------
I created a plan with multiple VMs. After starting the plan, I canceled a few VMs and let the other VMs migrate. After those VMs migrated successfully, I restarted the plan so that the canceled VMs would migrate. When the plan restarted, I noticed that an attempt was made to migrate VMs that had already migrated successfully.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
MTV 2.0.0-20

How reproducible:
----------------
Always

Steps to Reproduce:
-------------------
1. Create a plan with multiple VMs and start it.
2. Cancel a few VMs and let the rest migrate successfully.
3. Restart the plan so that the canceled VMs are migrated.

Actual results:
--------------
When a partially successful plan is restarted, an attempt is made to migrate VMs that had already migrated successfully.

Expected results:
-----------------
Restarting a partially successful plan should only migrate canceled or failed VMs.

Additional info:
----------------
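For illustration only, a minimal Go sketch of the expected restart behavior described above. This is not the forklift-controller implementation; the VMHistory type and field names are hypothetical stand-ins for whatever per-VM status the controller keeps from the previous run.

// Illustrative sketch only -- not the forklift-controller implementation.
// It models the expected restart behavior: only VMs that were canceled or
// failed in the previous run are queued again; succeeded VMs are skipped.
package main

import "fmt"

// VMHistory is a hypothetical per-VM record from the previous migration run.
type VMHistory struct {
	Name      string
	Succeeded bool
	Canceled  bool
	Failed    bool
}

// vmsToRetry returns the VMs that should get a new VirtualMachineImport CR
// when the plan is restarted.
func vmsToRetry(history []VMHistory) []VMHistory {
	var retry []VMHistory
	for _, vm := range history {
		if vm.Succeeded {
			continue // already migrated successfully; do not re-import
		}
		if vm.Canceled || vm.Failed {
			retry = append(retry, vm)
		}
	}
	return retry
}

func main() {
	history := []VMHistory{
		{Name: "vm-a", Succeeded: true},
		{Name: "vm-b", Canceled: true},
		{Name: "vm-c", Failed: true},
	}
	for _, vm := range vmsToRetry(history) {
		fmt.Println("retrying:", vm.Name) // prints vm-b and vm-c only
	}
}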
What do you mean by "an attempt"? Does MTV create a new VirtualMachineImport CR?
Hi Fabien,

Yes, MTV creates a new VirtualMachineImport CR for the VM (that had already migrated successfully) upon restarting the canceled plan. I'm able to reproduce this issue consistently. I can share the details of the cluster on which I've reproduced the issue.
A Plan where some VMs were canceled but all the rest succeeded and none failed is a successful plan, and shouldn't be retryable.
https://github.com/konveyor/forklift-controller/pull/259
Really? The VMs may have been canceled because the environment is unstable and the user prefers to postpone the migration. Would this require creating a new plan?
I guess my main question is: how frequently do you think the user will want to cancel some VMs from a running plan and then retry them later with absolutely no changes, versus changing their mind about migrating a handful of VMs entirely or with different mappings? If it's more of the former, then I can see allowing retries of otherwise successful plans. But if it's more of the latter, it seems to me that showing the user a "retry" button next to a plan that appears to be successful is going to be confusing.
@miguel what do you think? It's a fairly trivial change, but it has a huge impact on usability.
Determined in https://github.com/konveyor/forklift-ui/pull/593 (merged) that we will block these plans (where some VMs were canceled but all the rest succeeded and none failed) from being retried. The UX for creating a new plan for the canceled VMs will be enhanced later, in part by the RFE to add a clone plan feature (https://bugzilla.redhat.com/show_bug.cgi?id=1951660).
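For clarity, the rule agreed on above can be written as a tiny predicate. This is a hypothetical helper for illustration, not code from forklift-ui or forklift-controller: a finished plan is offered a retry only when at least one VM failed; a plan whose VMs all either succeeded or were canceled counts as successful and is blocked from retry.

package main

import "fmt"

// planRetryable is a hypothetical helper illustrating the rule above (not
// real forklift code): a finished plan offers a retry only when at least one
// VM failed. A canceled-plus-succeeded plan counts as successful -- no retry.
func planRetryable(failedVMs int) bool {
	return failedVMs > 0
}

func main() {
	fmt.Println(planRetryable(0)) // false: only canceled and succeeded VMs
	fmt.Println(planRetryable(1)) // true: failed VMs can be retried
}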
This issue occurs with failed plans too. In other words, if a plan has 2 VMs, of which one VM (VM#1) migrates successfully and the other (VM#2) fails for whatever reason, then upon restarting the failed plan an attempt is made to migrate VM#1 again and a VirtualMachineImport CR is created for VM#1. How would this issue be fixed? This happens with both cold and warm migration.

Steps to reproduce:
1) Create a plan with 2 VMs and start the plan through the MTV UI.
2) Make sure VM#2 fails and VM#1 succeeds. For a warm migration to fail, disable CBT on the VM.
3) The MTV UI shows that the plan failed. Restart the failed plan through the MTV UI.

Actual result:
A VirtualMachineImport CR is created for VM#1, and the MTV UI shows that an attempt is made to re-migrate VM#1.

Expected result:
1) Only failed VMs should be retried.
2) The migration status from the previous run should be checked for every VM. If a VM migrated successfully in a previous run, a new VirtualMachineImport CR shouldn't be created for that VM on the restarted run (a minimal sketch of this check follows below).
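A minimal sketch of expected result (2), again with hypothetical types rather than the real controller API: before creating a VirtualMachineImport CR for a VM on a restarted run, the controller would first consult the previous run's status for that VM and skip it if it already succeeded.

package main

import "fmt"

// priorStatus maps a VM name to whether it succeeded in a previous run.
// In the real controller this would come from the plan's migration history;
// the map here is only a stand-in for illustration.
type priorStatus map[string]bool

// shouldCreateImport reports whether a new VirtualMachineImport CR should be
// created for the named VM on a restarted migration run.
func shouldCreateImport(vm string, prior priorStatus) bool {
	return !prior[vm] // skip VMs that already migrated successfully
}

func main() {
	prior := priorStatus{"VM1": true, "VM2": false}
	for _, vm := range []string{"VM1", "VM2"} {
		if shouldCreateImport(vm, prior) {
			fmt.Println("would create VirtualMachineImport for", vm) // only VM2 reaches this branch
		} else {
			fmt.Println("skipping", vm, "- already migrated")
		}
	}
}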
That issue is addressed by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1959731
@slucidi does that mean we should revert https://github.com/konveyor/forklift-ui/pull/593 and allow the user to re-migrate a plan that succeeded with canceled VMs, now that it wouldn't cause errors?
(In reply to Sam Lucidi from comment #9)
> That issue is addressed by the fix for
> https://bugzilla.redhat.com/show_bug.cgi?id=1959731

Hi Sam,

Fabien emailed us today that a new build with the fix for bug 1959731 is available: build 2.0.0-15 / iib:75270. I tested 2.0.0-15, and the issue I described in comment 8 doesn't seem to be fixed. Could you please look into this? Let me know if you need access to my cluster.
Yeah, I think so, Mike. Probably a good thing to discuss in the sync call on Wednesday if it can wait that long. I will investigate, Nandini.
Closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1959731 ***