Bug 1273044

Summary: VFIO: Snapshot of VM with GPU attached cannot be created.
Product: [oVirt] ovirt-engine
Reporter: Nisim Simsolo <nsimsolo>
Component: BLL.Virt
Assignee: Martin Betak <mbetak>
Status: CLOSED CURRENTRELEASE
QA Contact: Nisim Simsolo <nsimsolo>
Severity: high
Docs Contact:
Priority: medium
Version: 3.6.0.1
CC: bugs, mbetak, michal.skrivanek, mpoledni, nsimsolo, tjelinek
Target Milestone: ovirt-3.6.3
Flags: rule-engine: ovirt-3.6.z+, rule-engine: planning_ack+, tjelinek: devel_ack+, rule-engine: testing_ack+
Target Release: 3.6.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 11:01:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log none
vdsm.log none

Description Nisim Simsolo 2015-10-19 12:49:39 UTC
Description of problem:
Snapshot of VM with GPU attached cannot be created.
- Relevant only for PCI devices.
- Relevant for both Windows and Linux VMs.

Version-Release number of selected component (if applicable):
rhevm-3.6.0.1-0.1.el6
sanlock-3.2.4-1.el7.x86_64
vdsm-4.17.9-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64

How reproducible:
Consistently.

Steps to Reproduce:
1. Run VM with GPU device attached.
2. Try to create snapshot.

Actual results:
Snapshot failed:

- engine log: 2015-10-19 15:20:16,730 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-18) [2fbfc07e] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM intel-vfio.tlv.redhat.com command failed: Snapshot failed

- vdsm.log: Thread-2855::ERROR::2015-10-19 15:20:42,616::vm::3158::virt.vm::(snapshot) vmId=`103623d3-9e88-42f5-9fcf-3aea6c4ace96`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3156, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2581, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: Requested operation is not valid: domain has assigned non-USB host devices

Expected results:
Snapshot should be created.

Additional info:
Vdsm and engine logs attached.
vmId (win10_intel1): '103623d3-9e88-42f5-9fcf-3aea6c4ace96'

Issue occurred at (engine time): 15:20:11,379

Comment 1 Nisim Simsolo 2015-10-19 12:50:53 UTC
Created attachment 1084401 [details]
engine.log

Comment 2 Nisim Simsolo 2015-10-19 12:52:06 UTC
Created attachment 1084402 [details]
vdsm.log

Comment 3 Red Hat Bugzilla Rules Engine 2015-10-20 15:14:13 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 4 Yaniv Lavi 2015-10-29 12:45:50 UTC
In oVirt, testing is done on a single release by default. Therefore, I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have testing resources to handle the 4.0 clone.

Comment 5 Martin Polednik 2015-12-01 10:16:10 UTC
We should disable snapshots for VMs with non-USB hostdev assigned (as libvirt states). Moving to Martin as that is an engine request.
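
For illustration, the check described above could be sketched as follows. This is a minimal sketch, not the engine's or vdsm's actual code: it assumes the VM's libvirt domain XML is available as a string, and the helper names non_usb_hostdevs and can_snapshot are hypothetical.

```python
import xml.etree.ElementTree as ET


def non_usb_hostdevs(domain_xml):
    """Return the 'type' of every <hostdev> in the domain XML that is not USB."""
    root = ET.fromstring(domain_xml)
    return [
        dev.get("type")
        for dev in root.findall("./devices/hostdev")
        if dev.get("type") != "usb"
    ]


def can_snapshot(domain_xml):
    """Hypothetical pre-check: allow a snapshot only if no non-USB hostdev is assigned."""
    return not non_usb_hostdevs(domain_xml)


# Example domain XML fragment with one PCI host device (e.g. a GPU).
GPU_VM = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'/>
  </devices>
</domain>
"""

if not can_snapshot(GPU_VM):
    print("Snapshot refused: non-USB host devices assigned")
```

Running such a check before calling snapshotCreateXML would let the caller report a clear validation error instead of surfacing the libvirtError from the traceback.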

Comment 6 Red Hat Bugzilla Rules Engine 2016-01-28 12:10:38 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 7 Red Hat Bugzilla Rules Engine 2016-01-28 12:14:10 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 8 Red Hat Bugzilla Rules Engine 2016-01-28 12:18:36 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 9 Red Hat Bugzilla Rules Engine 2016-01-28 12:24:15 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 10 Red Hat Bugzilla Rules Engine 2016-01-28 12:39:22 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 11 Red Hat Bugzilla Rules Engine 2016-01-28 13:02:13 UTC
Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 12 Tomas Jelinek 2016-01-29 07:27:03 UTC
Moving back to ON_QA with the correct target release.

Comment 13 Nisim Simsolo 2016-02-11 15:28:28 UTC
Verified using builds: 
rhevm-3.6.3.1-0.1.el6
sanlock-3.2.4-1.el7.x86_64
libvirt-client-1.2.17-13.el7_2.2.x86_64
vdsm-4.17.20-0.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.4.x86_64

Verification scenario: 
1. Run VM with GPU attached and try to create a snapshot.
Snapshot creation is rejected by webadmin with the following error (operation canceled dialog message):
    "Error while executing action:
    1_VFIO_rhel7_amd:
    Cannot create Snapshot. VM has PCI host devices attached."
2. Run VM with USB passthrough only and create a snapshot.
Verify snapshot created and preview/commit/clone actions can be done on it.
3. Run VM with both PCI and USB devices attached to it. Try to create a snapshot.
Verify snapshot creation rejected by webadmin, with the same error as in step 1.
4. Power off VM, detach all devices, run VM and create snapshot.
Verify snapshot created and preview/commit/clone actions can be done on it.
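
The four scenarios above can be restated as a small table-driven check. This is only a sketch under stated assumptions: validate_snapshot and the "pci"/"usb" labels are hypothetical stand-ins for the engine-side validation, and only the error text is taken from the webadmin message quoted in step 1.

```python
# Error text as shown by webadmin in step 1 of the verification scenario.
PCI_ERROR = "Cannot create Snapshot. VM has PCI host devices attached."


def validate_snapshot(attached_bus_types):
    """Hypothetical validation: return an error message if the snapshot
    must be rejected, or None if it may proceed."""
    if "pci" in attached_bus_types:
        return PCI_ERROR
    return None


# (scenario, attached host device types, expected validation result)
SCENARIOS = [
    ("1. GPU (PCI) attached", ["pci"], PCI_ERROR),
    ("2. USB passthrough only", ["usb"], None),
    ("3. PCI and USB attached", ["pci", "usb"], PCI_ERROR),
    ("4. all devices detached", [], None),
]

for name, devices, expected in SCENARIOS:
    assert validate_snapshot(devices) == expected, name
```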

Comment 14 Michal Skrivanek 2016-02-12 09:14:07 UTC
(In reply to Nisim Simsolo from comment #13)
> 2. Run VM with USB passthrough only and create a snapshot.
> Verify snapshot created and preview/commit/clone actions can be done on it.

I don't think we could support USB passthrough operations. I think we lack the checks on resume/dst for the same host device.
Martin?

Comment 15 Martin Betak 2016-02-29 16:03:03 UTC
@Michal: After discussion with mpolednik and the libvirt devs, this seems to be safe for the guest. The only risk is that some pending writes may not be persisted for "stateful" devices (e.g. a USB stick), but otherwise it should be equivalent to not removing a USB device safely, so it may be worth a warning in the frontend. The USB drivers and guest OSs seem to be quite used to this kind of plug-and-play behavior.