Bug 1273044 - VFIO: Snapshot of VM with GPU attached cannot be created.
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 3.6.0.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-3.6.3
Target Release: 3.6.3
Assigned To: Martin Betak
QA Contact: Nisim Simsolo
Depends On:
Blocks:

Reported: 2015-10-19 08:49 EDT by Nisim Simsolo
Modified: 2016-02-29 11:03 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 06:01:17 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-3.6.z+
rule-engine: planning_ack+
tjelinek: devel_ack+
rule-engine: testing_ack+


Attachments
engine.log (11.37 MB, text/plain), 2015-10-19 08:50 EDT, Nisim Simsolo
vdsm.log (12.86 MB, text/plain), 2015-10-19 08:52 EDT, Nisim Simsolo


External Trackers
Tracker       ID     Priority          Status  Summary                                                     Last Updated
oVirt gerrit  49644  master            MERGED  backend: Disable Live Snapshot on VMs with PCI passthrough  Never
oVirt gerrit  50018  ovirt-engine-3.6  MERGED  backend: Disable Live Snapshot on VMs with PCI passthrough  Never

Description Nisim Simsolo 2015-10-19 08:49:39 EDT
Description of problem:
Snapshot of VM with GPU attached cannot be created.
- Relevant only for PCI devices.
- Relevant for both Windows and Linux VMs.

Version-Release number of selected component (if applicable):
rhevm-3.6.0.1-0.1.el6
sanlock-3.2.4-1.el7.x86_64
vdsm-4.17.9-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64

How reproducible:
Consistently.

Steps to Reproduce:
1. Run a VM with a GPU device attached.
2. Try to create a snapshot.

Actual results:
Snapshot failed:

- engine log: 2015-10-19 15:20:16,730 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-18) [2fbfc07e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM intel-vfio.tlv.redhat.com command failed: Snapshot failed

- vdsm.log: Thread-2855::ERROR::2015-10-19 15:20:42,616::vm::3158::virt.vm::(snapshot) vmId=`103623d3-9e88-42f5-9fcf-3aea6c4ace96`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3156, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: Requested operation is not valid: domain has assigned non-USB host devices
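
For triage, the ERROR record above can be pulled out of a large vdsm.log by splitting on the "::" separators. The following is a minimal sketch that assumes every record follows the layout of the single line quoted above (thread, level, timestamp, module, line number, logger, then the message); the function name and field names are hypothetical and are not part of vdsm.

```python
import re

# Assumed layout, inferred only from the one log line quoted in this report:
# thread::level::timestamp::module::lineno::logger::(function) message
VM_ID_RE = re.compile(r"vmId=`([0-9a-f-]{36})`")


def parse_vdsm_line(line):
    """Split one vdsm.log record into named fields.

    Returns None for lines that do not fit the assumed layout, such as
    the traceback lines that follow an ERROR record.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, timestamp, module, lineno, logger, rest = parts
    if not lineno.isdigit():
        return None
    match = VM_ID_RE.search(rest)
    return {
        "thread": thread,
        "level": level,
        "timestamp": timestamp,
        "module": module,
        "lineno": int(lineno),
        "logger": logger,
        "vm_id": match.group(1) if match else None,
        "message": rest,
    }


sample = (
    "Thread-2855::ERROR::2015-10-19 15:20:42,616::vm::3158::virt.vm::"
    "(snapshot) vmId=`103623d3-9e88-42f5-9fcf-3aea6c4ace96`::Unable to take snapshot"
)
record = parse_vdsm_line(sample)
```

Filtering the attached log for records whose level is ERROR and whose vm_id matches the VM listed under "Additional info" should narrow it down to the failure shown here.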

Expected results:
Snapshot should be created.

Additional info:
Vdsm and engine logs attached.
vmId (win10_intel1): '103623d3-9e88-42f5-9fcf-3aea6c4ace96'

Issue occurred at (engine time): 15:20:11,379
Comment 1 Nisim Simsolo 2015-10-19 08:50 EDT
Created attachment 1084401
engine.log
Comment 2 Nisim Simsolo 2015-10-19 08:52 EDT
Created attachment 1084402
vdsm.log
Comment 3 Red Hat Bugzilla Rules Engine 2015-10-20 11:14:13 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 4 Yaniv Lavi 2015-10-29 08:45:50 EDT
In oVirt, testing is done on a single release by default. Therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note that we might not have testing resources to handle the 4.0 clone.
Comment 5 Martin Polednik 2015-12-01 05:16:10 EST
We should disable snapshots for VMs with non-USB hostdev assigned (as libvirt states). Moving to Martin as that is an engine request.
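
The rule libvirt enforces ("domain has assigned non-USB host devices") can be illustrated as a pre-check on the domain XML. This is a minimal sketch, not the merged engine change: the function names are hypothetical, and the trimmed domain XML, including its PCI address and USB vendor/product IDs, is made up for illustration. It relies only on the fact that libvirt describes passed-through host devices as `<hostdev>` elements whose `type` attribute is `pci`, `usb`, and so on.

```python
import xml.etree.ElementTree as ET


def non_usb_hostdevs(domain_xml):
    """Return the type of every <hostdev> under <devices> that is not USB."""
    root = ET.fromstring(domain_xml)
    return [
        dev.get("type")
        for dev in root.findall("./devices/hostdev")
        if dev.get("type") != "usb"
    ]


def live_snapshot_allowed(domain_xml):
    """Hypothetical pre-check: allow a snapshot only without non-USB hostdevs."""
    return not non_usb_hostdevs(domain_xml)


# Hypothetical, heavily trimmed domain XML: one PCI device and one USB device.
GPU_VM = """
<domain type='kvm'>
  <name>gpu-vm</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1234'/>
        <product id='0xbeef'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

# Same idea with USB passthrough only.
USB_ONLY_VM = """
<domain type='kvm'>
  <name>usb-vm</name>
  <devices>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1234'/>
        <product id='0xbeef'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

blocking = non_usb_hostdevs(GPU_VM)
```

Rejecting the request before it reaches libvirt lets the caller report a specific reason instead of the generic "Snapshot failed" seen in the engine log.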
Comment 6 Red Hat Bugzilla Rules Engine 2016-01-28 07:10:38 EST
Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.
Comment 12 Tomas Jelinek 2016-01-29 02:27:03 EST
Moving back to ON_QA with the correct target release.
Comment 13 Nisim Simsolo 2016-02-11 10:28:28 EST
Verified using builds: 
rhevm-3.6.3.1-0.1.el6
sanlock-3.2.4-1.el7.x86_64
libvirt-client-1.2.17-13.el7_2.2.x86_64
vdsm-4.17.20-0.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.4.x86_64

Verification scenario:
1. Run a VM with a GPU attached and try to create a snapshot.
Snapshot creation is rejected by webadmin with the following error (shown in the "Operation Canceled" dialog):
    "Error while executing action:
    1_VFIO_rhel7_amd:
    Cannot create Snapshot. VM has PCI host devices attached."
2. Run a VM with USB passthrough only and create a snapshot.
Verify that the snapshot is created and that preview/commit/clone actions can be performed on it.
3. Run a VM with both PCI and USB devices attached. Try to create a snapshot.
Verify that snapshot creation is rejected by webadmin with the same error as in step 1.
4. Power off the VM, detach all devices, run the VM and create a snapshot.
Verify that the snapshot is created and that preview/commit/clone actions can be performed on it.
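
The four scenarios above can be written down as a small table of expected outcomes. The sketch below is only an illustration of the verified behavior, assuming a simplified device model; `HostDevice`, `validate_snapshot`, and the device names are hypothetical and do not come from ovirt-engine. Only the error text is taken from the dialog quoted in step 1.

```python
from dataclasses import dataclass
from typing import List, Optional

# Error text as quoted in step 1 of the verification scenario.
PCI_ERROR = "Cannot create Snapshot. VM has PCI host devices attached."


@dataclass
class HostDevice:
    """Hypothetical, simplified model of a host device attached to a VM."""
    name: str
    bus: str  # "pci" or "usb"


def validate_snapshot(devices: List[HostDevice]) -> Optional[str]:
    """Return an error message if the snapshot must be rejected, else None."""
    if any(dev.bus == "pci" for dev in devices):
        return PCI_ERROR
    return None


gpu = HostDevice("gpu", "pci")
usb = HostDevice("usb-stick", "usb")

# One entry per step of the verification scenario.
scenarios = {
    "1. GPU attached": [gpu],
    "2. USB passthrough only": [usb],
    "3. PCI and USB attached": [gpu, usb],
    "4. all devices detached": [],
}
results = {name: validate_snapshot(devs) for name, devs in scenarios.items()}
```

In this model a single PCI device is enough to reject the request, which matches step 3, where the USB device alone would have been accepted.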
Comment 14 Michal Skrivanek 2016-02-12 04:14:07 EST
(In reply to Nisim Simsolo from comment #13)
> 2. Run VM with USB passthrough only and create a snapshot.
> Verify snapshot created and preview/commit/clone actions can be done on it.

I don't think we can support USB passthrough operations. I think we lack the checks on resume/dst for the same host device.
Martin?
Comment 15 Martin Betak 2016-02-29 11:03:03 EST
@Michal: after discussion with mpolednik and the libvirt devs, this seems to be safe for the guest. The only risk is that some pending writes may not be persisted for "stateful" devices (e.g. a USB stick); otherwise it should be equivalent to not removing a USB device safely, so it may be worth a warning in the frontend. Beyond that, USB drivers and guest OSs seem to be quite used to this kind of plug-and-play behavior.
