Bug 1020326

Summary: vfio does not work
Product: [Fedora] Fedora
Reporter: James Hubbard <jameshubbard>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED DEFERRED
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: alex.williamson, amit.shah, bdas, berrange, cfergeau, clalancette, crobinso, dwmw2, ehabkost, extras-orphan, itamar, knoel, markmc, notting, pbonzini, quintela, rjones, scottt.tw, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-17 19:51:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description     Flags
dmesg output    none
lspci output    none
lsmod output    none

Description James Hubbard 2013-10-17 13:01:29 UTC
Created attachment 813313 [details]
dmesg output

Description of problem:
vfio does not work.
I have a Windows 7 Pro virtual machine to which I have attached a PCIe graphics card.  When I attempt to start the virtual machine I get an error.

IOMMU is enabled in the kernel. SR-IOV shows up.  I have to modprobe vfio-pci to get the module loaded.  After loading vfio-pci, I start the VM that has the attached PCI device.  I get the following error.

# virsh start win7pro-cli5
error: Failed to start domain win7pro-cli5
error: internal error: Invalid device 0000:83:00.0 driver file /sys/bus/pci/devices/0000:83:00.0/driver is not a symlink
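
The two host-side preconditions described above (IOMMU active, vfio-pci loaded) can be checked from sysfs before starting the domain. The following is a minimal sketch, not part of the original report: the function name vfio_precheck and the SYSFS_ROOT override are hypothetical, introduced only so the check can be pointed at a test tree. It assumes the usual sysfs layout, where IOMMU groups appear under /sys/kernel/iommu_groups and a registered vfio-pci driver appears under /sys/bus/pci/drivers/vfio-pci.

```shell
#!/bin/sh
# Sketch only: vfio_precheck and SYSFS_ROOT are hypothetical names.
# SYSFS_ROOT defaults to /sys; override it to run against a fake tree.

vfio_precheck() {
    root="${SYSFS_ROOT:-/sys}"
    rc=0

    # With the IOMMU active, the kernel creates one numbered directory
    # per IOMMU group under kernel/iommu_groups.
    if [ -n "$(ls -A "$root/kernel/iommu_groups" 2>/dev/null)" ]; then
        echo "iommu: groups present"
    else
        echo "iommu: no groups found"
        rc=1
    fi

    # The driver directory exists once vfio-pci has been registered,
    # for example after "modprobe vfio-pci".
    if [ -d "$root/bus/pci/drivers/vfio-pci" ]; then
        echo "vfio-pci: registered"
    else
        echo "vfio-pci: not registered"
        rc=1
    fi

    return "$rc"
}
```

Running vfio_precheck as root on the host returns a non-zero status if either condition is missing. It does not say anything about whether a particular device is bound to vfio-pci.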

Using a Supermicro X9DRD-iF motherboard with the latest firmware (903). It didn't work with the 1.0b firmware.  The PCI card is an Nvidia GT520.

Version-Release number of selected component (if applicable):
virt-preview repo installed
libvirt: 1.1.3-2fc19
virt-manager: 0.10.0-4.git79196cdf.fc19
uname: 3.11.4-201.fc19.x86_64 #1 SMP Thu Oct 10 14:11:18 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux



How reproducible:

Happens every time.


Actual results:
# virsh start win7pro-cli5
error: Failed to start domain win7pro-cli5
error: internal error: Invalid device 0000:83:00.0 driver file /sys/bus/pci/devices/0000:83:00.0/driver is not a symlink


Expected results:
VM starts with no errors.

Additional info:
nouveau is blacklisted so that the kernel doesn't try to load it. Also, it's removed from the initrd.
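
Whether the blacklist actually kept nouveau out of the running kernel can be confirmed from /proc/modules. A minimal sketch follows, assuming the standard /proc/modules format in which each line begins with the module name followed by a space. The function name module_loaded and the PROC_MODULES override are hypothetical and exist only for illustration and testing.

```shell
#!/bin/sh
# Sketch only: module_loaded and PROC_MODULES are hypothetical names.

module_loaded() {
    modules_file="${PROC_MODULES:-/proc/modules}"
    # /proc/modules lists names with underscores (vfio_pci, not vfio-pci),
    # so normalise the argument before matching.
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$modules_file"
}

# Example use:
#   module_loaded nouveau  && echo "nouveau is loaded"
#   module_loaded vfio-pci || echo "vfio-pci is not loaded"
```

This only reports loadable modules that are currently loaded. A driver built into the kernel does not appear in /proc/modules.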

ls -al  /sys/bus/pci/devices/0000\:83\:00.0/
total 0
drwxr-xr-x. 3 root root         0 Oct 17 12:56 .
drwxr-xr-x. 6 root root         0 Oct 17 12:56 ..
-r--r--r--. 1 root root      4096 Oct 17 12:59 boot_vga
-rw-r--r--. 1 root root      4096 Oct 17 12:59 broken_parity_status
-r--r--r--. 1 root root      4096 Oct 17 12:56 class
-rw-r--r--. 1 root root      4096 Oct 17 12:56 config
-r--r--r--. 1 root root      4096 Oct 17 12:59 consistent_dma_mask_bits
-rw-r--r--. 1 root root      4096 Oct 17 12:59 d3cold_allowed
-r--r--r--. 1 root root      4096 Oct 17 12:56 device
-r--r--r--. 1 root root      4096 Oct 17 12:59 dma_mask_bits
-rw-------. 1 root root      4096 Oct 17 12:59 enable
lrwxrwxrwx. 1 root root         0 Oct 17 12:56 iommu_group -> ../../../../kernel/iommu_groups/31
-r--r--r--. 1 root root      4096 Oct 17 12:56 irq
-r--r--r--. 1 root root      4096 Oct 17 12:59 local_cpulist
-r--r--r--. 1 root root      4096 Oct 17 12:59 local_cpus
-r--r--r--. 1 root root      4096 Oct 17 12:59 modalias
-rw-r--r--. 1 root root      4096 Oct 17 12:59 msi_bus
-r--r--r--. 1 root root      4096 Oct 17 12:59 numa_node
drwxr-xr-x. 2 root root         0 Oct 17 12:59 power
--w--w----. 1 root root      4096 Oct 17 12:59 remove
--w--w----. 1 root root      4096 Oct 17 12:59 rescan
-r--r--r--. 1 root root      4096 Oct 17 12:56 resource
-rw-------. 1 root root  16777216 Oct 17 12:59 resource0
-rw-------. 1 root root 134217728 Oct 17 12:59 resource1
-rw-------. 1 root root 134217728 Oct 17 12:59 resource1_wc
-rw-------. 1 root root  33554432 Oct 17 12:59 resource3
-rw-------. 1 root root  33554432 Oct 17 12:59 resource3_wc
-rw-------. 1 root root       128 Oct 17 12:59 resource5
-rw-------. 1 root root    524288 Oct 17 12:59 rom
lrwxrwxrwx. 1 root root         0 Oct 17 12:56 subsystem -> ../../../../bus/pci
-r--r--r--. 1 root root      4096 Oct 17 12:59 subsystem_device
-r--r--r--. 1 root root      4096 Oct 17 12:59 subsystem_vendor
-rw-r--r--. 1 root root      4096 Oct 17 12:56 uevent
-r--r--r--. 1 root root      4096 Oct 17 12:56 vendor
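
The listing above contains no driver entry, which is what the libvirt error is reporting: sysfs only creates the driver symlink while a driver is bound to the device. A minimal sketch for reading the bound driver of a device follows. The function name pci_driver_of and the SYSFS_ROOT override are hypothetical, and this is not how libvirt itself performs the check.

```shell
#!/bin/sh
# Sketch only: pci_driver_of and SYSFS_ROOT are hypothetical names.

pci_driver_of() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name,
        # e.g. ../../../../bus/pci/drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        # No symlink means no driver is currently bound.
        echo "none"
    fi
}

# Example use:
#   pci_driver_of 0000:83:00.0
```

For a device in the state shown in the listing, with no driver symlink, the function prints none.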


ls -l /dev/vfio
total 0
crw-rw-rw-. 1 root root 246, 0 Oct 17 12:57 vfio
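
Only the vfio control node is present here. There is no /dev/vfio/31 node for the device's IOMMU group, which is consistent with no device in that group being bound to vfio-pci at this point. Since VFIO hands devices to a guest per IOMMU group, it can help to see which devices share the group. A minimal sketch, assuming the standard iommu_group/devices layout in sysfs; the function name iommu_group_members and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Sketch only: iommu_group_members and SYSFS_ROOT are hypothetical names.

iommu_group_members() {
    dev_dir="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1"
    # iommu_group is a symlink to kernel/iommu_groups/<N>; its devices/
    # subdirectory holds one entry per device in the group.
    for entry in "$dev_dir"/iommu_group/devices/*; do
        [ -e "$entry" ] || continue   # glob matched nothing
        basename "$entry"
    done
}

# Example use:
#   iommu_group_members 0000:83:00.0
```

The output is one PCI address per line. If it prints nothing, the device has no IOMMU group, which usually means the IOMMU is not active for it.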

Comment 1 James Hubbard 2013-10-17 13:02:39 UTC
Created attachment 813315 [details]
lspci output

lspci output

Comment 2 James Hubbard 2013-10-17 13:03:23 UTC
Created attachment 813316 [details]
lsmod output

Comment 3 James Hubbard 2013-10-29 16:50:54 UTC
The above error only occurs when the nouveau driver isn't loaded.  I thought having it blacklisted would prevent the bus reset problems.  When I load the nouveau driver (modprobe) and attempt to start the VM, I get the following error.

# virsh start win7pro-cli5
error: Failed to start domain win7pro-cli5
error: internal error: Unable to reset PCI device 0000:83:00.0: internal error: Active 0000:83:00.1 devices on bus with 0000:83:00.0, not doing bus reset
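
The second error names 0000:83:00.1, another function of the same card (on graphics cards this is often the HDMI audio function, though the report does not say so). To see what is holding each function, the sibling functions of the slot and their bound drivers can be listed from sysfs. A minimal sketch; the function name pci_sibling_functions and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Sketch only: pci_sibling_functions and SYSFS_ROOT are hypothetical names.

pci_sibling_functions() {
    root="${SYSFS_ROOT:-/sys}"
    slot="${1%.*}"    # strip the function number: 0000:83:00.0 -> 0000:83:00
    for dev in "$root"/bus/pci/devices/"$slot".*; do
        [ -e "$dev" ] || continue   # glob matched nothing
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="none"
        fi
        # Print "<address> <driver>" for every function in the slot.
        echo "$(basename "$dev") $drv"
    done
}

# Example use:
#   pci_sibling_functions 0000:83:00.0
```

Any function that still shows a host driver other than vfio-pci is a candidate for the "Active ... devices on bus" complaint.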

Comment 4 Cole Robinson 2013-11-17 19:51:19 UTC
Hi James, thanks for the report. That error message reported in comment #0 is a libvirt issue, fixed upstream, and the fix is in rawhide and f20 fedora-virt-preview.

That said, VGA passthrough still has lots of problems. There's been recent work focusing on GPU passthrough of a few specific NVIDIA cards (Quadro and GRID, but I don't know the particulars), but that's not the same thing as VGA passthrough. Google 'vfio vga passthrough' for some more info.

So given that this stuff is still under active development upstream, there's not much use in tracking this bug against F19. If you are really interested in this, my recommendation would be to upgrade to F20 at the least, set up fedora-virt-preview, and follow the latest upstream news about what's happening in this space.

Comment 5 James Hubbard 2013-11-18 20:23:24 UTC
I suspected as much after I began following the efforts of the users on the following Arch Linux thread.
https://bbs.archlinux.org/viewtopic.php?id=162768&p=1