Bug 1945593

Summary: Live migration should be blocked for VMs with host devices
Product: Container Native Virtualization (CNV)
Reporter: Fabian Deutsch <fdeutsch>
Component: Virtualization
Assignee: Barak <bmordeha>
Status: CLOSED ERRATA
QA Contact: Akriti Gupta <akrgupta>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.6.0
CC: cnv-qe-bugs, edwardh, kbidarka, kmajcher, nunnatsa, phoracek, sgott, vromanso
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-09-14 19:28:21 UTC
Type: Bug

Description Fabian Deutsch 2021-04-01 11:08:04 UTC
Description of problem:
VMs using host devices can currently be live migrated, but this should not be possible due to technical limitations (planned to be fixed in the future).

Version-Release number of selected component (if applicable):
2.6

How reproducible:


Steps to Reproduce:
1. Create a VM with a host device, e.g. a passed-through PCI device, or with a local PV
2. Live migrate the VM

Actual results:
Live migration is started

Expected results:
Live migration should be blocked

Additional info:
Over time we should align with what is done for SR-IOV (unplug, plug). An epic will be created.

Comment 1 Fabian Deutsch 2021-08-18 10:03:16 UTC
@phoracek @edwardh @vromanso some things have changed since I filed this bug.

If I am not mistaken the live migration is permitted with SR-IOV devices, is this correct?

Is it also correct that LM does currently not work with arbitrary passed through PCI devices or mediated devices?

Comment 2 Edward Haas 2021-08-18 11:57:18 UTC
(In reply to Fabian Deutsch from comment #1)
> If I am not mistaken the live migration is permitted with SR-IOV devices, is
> this correct?

Yes.

> 
> Is it also correct that LM does currently not work with arbitrary passed
> through PCI devices or mediated devices?

For PCI based devices (like SR-IOV), libvirt will block migration, so this is correct.
For MDEV based devices, I am unsure as I have never tested it.

The hot-{un}plug code needs to be extracted from the SR-IOV handling so that the same logic can be applied to all device types that require it.

Comment 3 Fabian Deutsch 2021-08-18 14:32:01 UTC
Thanks Edy

Comment 4 Vladik Romanovsky 2021-08-18 15:27:41 UTC
(In reply to Fabian Deutsch from comment #1)
> @phoracek @edwardh @vromanso some things
> have changed since I filed this bug.
> 
> If I am not mistaken the live migration is permitted with SR-IOV devices, is
> this correct?
> 
> Is it also correct that LM does currently not work with arbitrary passed
> through PCI devices or mediated devices?

We don't block migration with host devices right now. Hopefully, we can reuse the plug/unplug mechanism we have in place for SR-IOV.
At the same time, we should be able to migrate with mdevs.

Comment 5 Fabian Deutsch 2022-01-26 20:37:21 UTC
For now, let's block live-migration

Comment 7 Edward Haas 2022-01-30 06:41:36 UTC
(In reply to Fabian Deutsch from comment #5)
> For now, let's block live-migration

I think this one took care of it: https://github.com/kubevirt/kubevirt/pull/6379

Comment 8 Nahshon Unna-Tsameret 2022-02-01 08:10:09 UTC
@edwardh is right. This bug was fixed, and the fix is part of KV v0.49.0.

I'll move it to ON_QA

Comment 9 Akriti Gupta 2022-05-11 10:56:21 UTC
Verified with: iib:219905
kubevirt-virtctl-4.11.0-525.el8.x86_64.rpm

[akrgupta@fedora Downloads]$ virtctl migrate vm-rhel84-ocs
VM vm-rhel84-ocs was scheduled to migrate
[akrgupta@fedora Downloads]$ oc describe vm rhel8-excited-crocodile

Status:
  Conditions:
    Last Probe Time:       <nil>
    Last Transition Time:  2022-05-06T13:20:53Z
    Status:                True
    Type:                  Ready
    Last Probe Time:       <nil>
    Last Transition Time:  <nil>
    Message:               cannot migrate VMI: PVC rhel8-excited-crocodile-rootdisk-x7x0e is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
    Reason:                DisksNotLiveMigratable
    Status:                False
    Type:                  LiveMigratable
    Last Probe Time:       2022-05-06T13:21:40Z
    Last Transition Time:  <nil>
    Status:                True
    Type:                  AgentConnected
  Created:                 true
  Printable Status:        Running
  Ready:                   true


kubevirt-virtctl-4.11.0-580.el9.x86_64.rpm

[akrgupta@fedora Downloads]$ virtctl migrate vm-rhel84-ocs
VM vm-rhel84-ocs was scheduled to migrate
[akrgupta@fedora Downloads]$ oc describe vm rhel8-excited-crocodile

Status:
  Conditions:
    Last Probe Time:       <nil>
    Last Transition Time:  2022-05-06T13:20:53Z
    Status:                True
    Type:                  Ready
    Last Probe Time:       <nil>
    Last Transition Time:  <nil>
    Message:               cannot migrate VMI: PVC rhel8-excited-crocodile-rootdisk-x7x0e is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
    Reason:                DisksNotLiveMigratable
    Status:                False
    Type:                  LiveMigratable
    Last Probe Time:       2022-05-06T13:21:40Z
    Last Transition Time:  <nil>
    Status:                True
    Type:                  AgentConnected
  Created:                 true
  Printable Status:        Running
  Ready:                   true


With a GPU as the host device:

[akrgupta@fedora auth]$ virtctl migrate vm-rhel84-ocs
Error migrating VirtualMachine Internal error occurred: admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: HostDeviceNotLiveMigratable, Message: VMI uses a PCI host devices

[akrgupta@fedora auth]$ oc describe vm vm-rhel84-ocs

Status:
  Conditions:
    Last Probe Time:       <nil>
    Last Transition Time:  2022-05-11T10:46:18Z
    Status:                True
    Type:                  Ready
    Last Probe Time:       <nil>
    Last Transition Time:  <nil>
    Message:               VMI uses a PCI host devices
    Reason:                HostDeviceNotLiveMigratable
    Status:                False
    Type:                  LiveMigratable
    Last Probe Time:       2022-05-11T10:46:33Z
    Last Transition Time:  <nil>
    Status:                True
    Type:                  AgentConnected
  Created:                 true
  Printable Status:        Running
  Ready:                   true
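
For anyone scripting this verification, the check visible in the output above can be sketched roughly as follows. This is only an illustration, not the KubeVirt webhook code: the function names are hypothetical, the lowercase keys (`type`, `status`, `reason`, `message`) assume the JSON form of the conditions rather than the `oc describe` rendering, and treating a missing LiveMigratable condition as "not blocked" is an assumption of this sketch. The denial text mirrors the message format seen in the `virtctl migrate` error above.

```python
from typing import Dict, List, Optional

Condition = Dict[str, str]


def find_condition(conditions: List[Condition], cond_type: str) -> Optional[Condition]:
    """Return the first condition whose type matches, or None if absent."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def migration_denial(conditions: List[Condition]) -> Optional[str]:
    """Return a denial message if LiveMigratable is explicitly False, else None.

    Assumption of this sketch: a missing LiveMigratable condition is
    treated as "not blocked".
    """
    cond = find_condition(conditions, "LiveMigratable")
    if cond is None or cond.get("status") != "False":
        return None
    return "Cannot migrate VMI, Reason: {}, Message: {}".format(
        cond.get("reason", ""), cond.get("message", "")
    )


# Conditions transcribed from the GPU case above (JSON key names assumed).
gpu_vm_conditions = [
    {"type": "Ready", "status": "True"},
    {
        "type": "LiveMigratable",
        "status": "False",
        "reason": "HostDeviceNotLiveMigratable",
        "message": "VMI uses a PCI host devices",
    },
    {"type": "AgentConnected", "status": "True"},
]

print(migration_denial(gpu_vm_conditions))
```

The same function applied to the conditions from the two earlier runs would report DisksNotLiveMigratable instead, since it only looks at the LiveMigratable condition and not at why it is False.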

Comment 12 errata-xmlrpc 2022-09-14 19:28:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6526