Description of problem:
Nothing prevents a user from creating multiple live migration objects for the same VMI in a quick burst, and nothing ensures those will be handled correctly. That results in all of them trying to run in parallel, potentially creating race conditions.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create a VMI, either directly or by starting a VM
2. Create multiple migrations for that VMI, either in a single yaml file or in multiple files created in quick succession
3.

Actual results:
*Usually* the migration that was created last succeeds; the other ones fail and leave behind a Completed virt-launcher pod.

Expected results:
Either the creation of more than one migration for a single VMI is denied (could be hard to prevent all race conditions), or all the migrations run one after the other (seems rather pointless), or all but one fail "gracefully", i.e. before a target virt-launcher pod is even created (probably a good compromise).

Additional info:
This situation sounds easy to avoid by simply not creating multiple migrations at once. However, it is reasonable to assume that workflows exist which make it more likely. The impact is fairly significant: if this is triggered, it can cause data corruption.
https://github.com/kubevirt/kubevirt/pull/5242
Master PR merged. PR backported to:
- release-0.36 (CNV 2.6.z): https://github.com/kubevirt/kubevirt/pull/5365
- release-0.34 (CNV 2.5.z): https://github.com/kubevirt/kubevirt/pull/5366
To verify:
1) Attempt to create 2 migrations at the same time (scripting this is suggested).
2) Observe that the second migration is rejected.
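The verification steps above could be scripted along the following lines. This is a minimal sketch, not the exact script used by QE: the manifest/file names and the VMI name `vm2-rhel84-secref` are taken from the verification output in this report, and the `oc apply` loop is left commented out since it needs a live cluster with a running VMI.

```shell
#!/bin/sh
# Sketch: generate two VirtualMachineInstanceMigration manifests that both
# target the same VMI, then apply them back-to-back. With the fix in place,
# the second apply should be denied by the migration-create-validator webhook.

# Hypothetical helper: $1 = migration name, $2 = target VMI name.
make_migration_manifest() {
  cat <<EOF
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  name: $1
  namespace: default
spec:
  vmiName: $2
EOF
}

# Two migrations for the same VMI (name taken from the report).
make_migration_manifest job22-multi1 vm2-rhel84-secref > mig1.yaml
make_migration_manifest job22-multi2 vm2-rhel84-secref > mig2.yaml

# On a live cluster (not run here):
#   for f in mig1.yaml mig2.yaml; do oc apply -f "$f"; done
# Expected: the first migration is created; the second is rejected with
# an "in-flight migration detected" error from the admission webhook.
```

The loop applies both manifests as fast as the API server accepts them, which is what exercises the race the webhook is meant to close.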
Summary: Multiple migrations are rejected successfully, as seen below.

[kbidarka@localhost migration]$ cat migration-job2-multi1.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  creationTimestamp: null
  name: job22-multi1
  namespace: default
spec:
  vmiName: vm2-rhel84-secref
status: {}

[kbidarka@localhost migration]$ cat migration-job2-multi2.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  creationTimestamp: null
  name: job22-multi2
  namespace: default
spec:
  vmiName: vm2-rhel84-secref
status: {}

[kbidarka@localhost migration]$ oc get vmi
NAME                AGE   PHASE     IP            NODENAME
vm2-rhel84          45h   Running   10.xxx.y.zz   node-07.redhat.com
vm2-rhel84-secref   18m   Running   10.xxx.y.mm   node-06.redhat.com

[kbidarka@localhost migration]$ for i in migration-job2-multi1.yaml migration-job2-multi2.yaml
> do
> oc apply -f $i
> done
virtualmachineinstancemigration.kubevirt.io/job22-multi1 created
Error from server: error when creating "migration-job2-multi2.yaml": admission webhook "migration-create-validator.kubevirt.io" denied the request: in-flight migration detected. Active migration job (0d9a0dac-bd4a-4a4d-8e67-b5de565dd846) is currently already in progress for VMI vm2-rhel84-secref.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.8.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2920