Bug 1273172 - Q35 configuration with NVIDIA GPU behind PCIe root port results in Code 12 in Win7 guest
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Vadim Rozenfeld
QA Contact: Yanan Fu
Docs Contact:
Depends On:
Blocks: 1311684
Reported: 2015-10-19 16:32 EDT by Alex Williamson
Modified: 2017-10-03 21:25 EDT
CC List: 17 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
win7-64 hckx log (2.06 MB, application/zip)
2015-12-17 07:59 EST, lijin
win7_64_q35_nvidia_GPU whql log (2.14 MB, application/zip)
2016-02-19 02:20 EST, Peixiu Hou

Description Alex Williamson 2015-10-19 16:32:18 EDT
Description of problem:
NVIDIA GPU device assignment does not work for Windows 7 guests when using the q35 machine type and placing the GPU behind a PCIe root port as viewed by the guest.  The result is a Code 12 error in the guest (insufficient resources).  An identical configuration works with either Windows 8.1 or RHEL7 guests.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-29.el7.x86_64
kernel-3.10.0-322.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Assign a supported NVIDIA GPU behind a PCIe root port using the q35 VM machine type
2. Attempt to use the GPU with a Windows 7 guest OS

Actual results:
Code 12 - "This device cannot find enough free resources that it can use."

Expected results:
The GPU works in the Windows 7 guest, as it does with Windows 8.1 and RHEL7 guests.

Additional info:
/usr/libexec/qemu-kvm -name win7-q35 -S -machine pc-q35-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu IvyBridge,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -m 8192 -realtime mlock=off -smp 6,sockets=1,cores=6,threads=1 -uuid 6b18fbc5-9fbc-47a7-bf3f-4654c168a00b -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-win7-q35/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=pcie.1 -device ioh3420,bus=pcie.0,addr=1c.1,port=2,chassis=2,id=pcie.2 -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device ich9-usb-ehci1,id=usb,bus=pci.2,addr=0x3.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.2,multifunction=on,addr=0x3 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.2,addr=0x3.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.2,addr=0x3.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x4 -drive file=/dev/rhel/win7-q35,if=none,id=drive-sata0-0-0,format=raw,cache=none,aio=native -device ide-hd,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=2 -drive if=none,media=cdrom,id=drive-sata0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.1,drive=drive-sata0-0-1,id=sata0-0-1,bootindex=1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:07:22:7e,bus=pci.2,addr=0x1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device 
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1 -device intel-hda,id=sound0,bus=pci.2,addr=0x2 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -device vfio-pci,host=06:00.0,id=hostdev0,bus=pcie.1,multifunction=on,addr=0x0 -device vfio-pci,host=03:00.0,id=hostdev1,bus=pcie.2,multifunction=on,addr=0x0 -device vfio-pci,host=03:00.1,id=hostdev2,bus=pcie.2,addr=0x0.0x1 -device vfio-pci,host=03:00.2,id=hostdev3,bus=pcie.2,addr=0x0.0x2 -device vfio-pci,host=03:00.3,id=hostdev4,bus=pcie.2,addr=0x0.0x3 -device vfio-pci,host=03:00.4,id=hostdev5,bus=pcie.2,addr=0x0.0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x5 -msg timestamp=on

06:00.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro K4000] (rev a1)
03:00.0 USB controller: Teradici Corp. Device 2200
03:00.1 USB controller: Teradici Corp. Device 2200
03:00.2 Audio device: Teradici Corp. Device 2200
03:00.3 Serial bus controller [0c80]: Teradici Corp. Device 2240
03:00.4 Serial bus controller [0c80]: Teradici Corp. Device 2240
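
To make the guest topology in the command line above easier to read, the following is a minimal sketch (not part of the original report) that groups each `-device` argument by its `bus=` property. The helper name `devices_by_bus` and the shortened `excerpt` string are illustrative assumptions; the device strings themselves are copied from the command line above.

```python
import shlex
from collections import defaultdict


def devices_by_bus(cmdline: str) -> dict:
    """Group every `-device` argument by the value of its bus= property."""
    args = shlex.split(cmdline)
    buses = defaultdict(list)
    # Walk (flag, value) pairs so each -device is matched with its argument.
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        driver, _, rest = value.partition(",")
        props = dict(p.split("=", 1) for p in rest.split(",") if "=" in p)
        # Fall back to the driver name when the device has no id= property.
        buses[props.get("bus", "(default)")].append(props.get("id", driver))
    return dict(buses)


# Shortened excerpt of the q35 command line (hypothetical selection of devices).
excerpt = (
    "-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=pcie.1 "
    "-device vfio-pci,host=06:00.0,id=hostdev0,bus=pcie.1,multifunction=on,addr=0x0 "
    "-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,"
    "vgamem_mb=16,bus=pcie.0,addr=0x1"
)
topology = devices_by_bus(excerpt)
print(topology)
```

In this excerpt the GPU (`hostdev0`) sits on `pcie.1`, the bus provided by the `ioh3420` root port, while the QXL device is directly on `pcie.0`.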
Comment 2 Alex Williamson 2015-10-19 16:50:33 EDT
Win7 also reports Code 12 if the GPU is attached to the pci.2 bus, which is currently the default for any libvirt-attached device.  Moving the GPU to pcie.0 by changing bus to 0x00 in the XML allows the device to work; note, however, that we don't support hotplug on pcie.0.  Perhaps the default configuration is how we can report this to NVIDIA.

[Also note that in the configuration listed in comment 0, a Teradici card is also placed behind a separate PCIe root port in the VM; this device reports no driver issues]
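
The XML change described above can be sketched as follows. This is an illustration under assumptions, not the exact edit that was made: the `<hostdev>` fragment is a hypothetical example in libvirt's domain XML format, the helper `move_to_root_bus` is invented for this sketch, and slot `0x08` is an arbitrary choice that must be a free slot on pcie.0 in the actual guest.

```python
import xml.etree.ElementTree as ET

# Hypothetical libvirt <hostdev> fragment: host GPU at 06:00.0, currently
# placed on guest bus 0x02.
HOSTDEV_XML = """
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</hostdev>
"""


def move_to_root_bus(hostdev_xml: str, slot: str) -> str:
    """Point the guest-side address at bus 0x00 (pcie.0) and the given slot."""
    hostdev = ET.fromstring(hostdev_xml)
    # The direct <address> child is the guest-side address; the one nested
    # under <source> is the host address and is left untouched.
    guest_addr = hostdev.find("address")
    guest_addr.set("bus", "0x00")
    guest_addr.set("slot", slot)
    return ET.tostring(hostdev, encoding="unicode")


updated = move_to_root_bus(HOSTDEV_XML, slot="0x08")  # slot is an assumption
print(updated)
```

As noted above, a device placed on pcie.0 this way cannot be hotplugged.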
Comment 3 Alex Williamson 2015-10-20 16:38:20 EDT
Reported to NVIDIA:

https://partners.nvidia.com/bug/viewbug/1696961

Since Windows 7 appears to get a Code 12 any time the GPU is behind a PCI-bridge, I reported the case of a 440FX VM configured with a pci-bridge and the GPU attached to the secondary bus.
Comment 4 Vadim Rozenfeld 2015-10-25 21:33:20 EDT
Hi Alex,

Can you take the following steps:

1. Set the SetupApi LogLevel to -1 (0xffffffff) (https://msdn.microsoft.com/en-us/library/windows/hardware/ff550808%28v=vs.85%29.aspx)
2. Reboot the VM.
3. Try to install the NVIDIA driver and share the setupapi installation log (https://msdn.microsoft.com/en-us/library/windows/hardware/ff550863%28v=vs.85%29.aspx)

Thanks,
Vadim.
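
Step 1 above can be sketched as a small script that generates a `.reg` file for import inside the guest. This is a sketch under assumptions: the registry key `HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup` and the `LogLevel` DWORD value are taken to be what the linked MSDN page describes and should be checked against it; the function name and output file name are hypothetical.

```python
def setupapi_loglevel_reg(level: int = -1) -> str:
    """Build .reg file text that sets the SetupAPI LogLevel DWORD.

    The key path is an assumption; verify it against the MSDN page linked
    in the steps above before importing the file.
    """
    key = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup"
    # Mask to 32 bits so that -1 is written as dword:ffffffff.
    dword = level & 0xFFFFFFFF
    return (
        "Windows Registry Editor Version 5.00\r\n\r\n"
        f"[{key}]\r\n"
        f'"LogLevel"=dword:{dword:08x}\r\n'
    )


if __name__ == "__main__":
    # Hypothetical output file name; import it in the guest, then reboot.
    with open("setupapi-loglevel.reg", "w", newline="") as fh:
        fh.write(setupapi_loglevel_reg())
```

On Windows 7 the device installation log is normally written to `%windir%\inf\setupapi.dev.log`, which is likely the file to share for step 3.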
Comment 9 juzhang 2015-12-10 22:11:42 EST
It's ok Lijin and Amnon.

Hi Zhiyi,

Could you handle this issue?

Best Regards,
Junyi
Comment 17 lijin 2015-12-17 07:59 EST
Created attachment 1106717 [details]
win7-64 hckx log

The PNP jobs all failed at the setup stage with the error "WDTF_TEST : Found a device that has a non-zero problem code or is phantom. Logging device info."

The PCI hardware compliance test failed with the error "Assertion 1587DC0B-FE59-494E-85B5-C2A59D0CC098: FAILED. Bit 6 (Common Clock Configuration) in the Link Control register (offset 10h) in the PCI Express Capability table must be read-writable."

Please check the hckx log for details.
Comment 18 Vadim Rozenfeld 2016-02-10 04:56:21 EST
(In reply to lijin from comment #17)
> Created attachment 1106717 [details]
> win7-64 hckx log
> 
> the PNP job all failed at setup stage with error "WDTF_TEST : Found a device
> that has a non-zero problem code or is phantom. Logging device info."
> 
> the PCI hardware compliance test failed due to error "Assertion
> 1587DC0B-FE59-494E-85B5-C2A59D0CC098: FAILED. Bit 6 (Common Clock
> Configuration) in the Link Control register (offset 10h) in the PCI Express
> Capability table must be read-writable ."
> 
> please check the hckx log for details

Hi, Lijin

Yes, according to the PCI hardware compliance test there are three problematic bits (6, 7, and 8), all of which must be RW. Apart from that, a number of PnP tests failed due to a Configuration Manager probe conflict.
I wonder if we can run the same set of tests on a different configuration, where the NVIDIA board is located on the primary PCI bus?
It would also be helpful to know whether the same problem can be reproduced on a "legacy" i440-based system.

Thanks,
Vadim.
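
The read-write requirement on bits 6, 7, and 8 can be illustrated with a small sketch. It assumes that the Link Control register (offset 10h in the PCI Express capability) has been read once, written back with the three bits inverted, and read again; the names for bits 7 and 8 come from the PCI Express specification rather than from the test log, and the sample register values are hypothetical.

```python
LINK_CONTROL_OFFSET = 0x10  # offset within the PCI Express capability

# Bits flagged by the compliance test; only bit 6 is named in the log.
FLAGGED_BITS = {
    6: "Common Clock Configuration",
    7: "Extended Synch",
    8: "Enable Clock Power Management",
}


def read_only_bits(before: int, after_toggle: int, bits=FLAGGED_BITS) -> dict:
    """Return flagged bits whose value did not change after a toggle write.

    `before` is the register value read first; `after_toggle` is the value
    read back after writing `before` with the flagged bits inverted.
    """
    stuck = {}
    for bit, name in bits.items():
        mask = 1 << bit
        if (before & mask) == (after_toggle & mask):
            stuck[bit] = name  # the write had no effect, so the bit is not RW
    return stuck


# Hypothetical values: all three bits start clear and ignore the write.
print(read_only_bits(0x0000, 0x0000))
# Hypothetical values: all three bits accept the write (0x01C0 = bits 6-8).
print(read_only_bits(0x0000, 0x01C0))
```

An empty result means every flagged bit accepted the write, which is the behavior the compliance test requires.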
Comment 19 lijin 2016-02-13 21:54:31 EST
(In reply to Vadim Rozenfeld from comment #18)
> (In reply to lijin from comment #17)
> > Created attachment 1106717 [details]
> > win7-64 hckx log
> > 
> > the PNP job all failed at setup stage with error "WDTF_TEST : Found a device
> > that has a non-zero problem code or is phantom. Logging device info."
> > 
> > the PCI hardware compliance test failed due to error "Assertion
> > 1587DC0B-FE59-494E-85B5-C2A59D0CC098: FAILED. Bit 6 (Common Clock
> > Configuration) in the Link Control register (offset 10h) in the PCI Express
> > Capability table must be read-writable ."
> > 
> > please check the hckx log for details
> 
> Hi, Lijin
> 
> Yes, according to the PCI hardware compliance test there are three
> problematic bits - 6,7,and 8 which all must be RW. Apart from that there are
> a bunch of PnP tests that failed due to Configuration Manager probe
> conflict. 
> I wonder if we can run the same set of tests on a different configuration,
> where the NVIDIA board located on the primary PCI bus?
> And it also will be helpful to know if the same problem can be reproduced on
> a "legacy" i440-based system.

QE will update the results once the tests finish.

Hi zhguo, is the environment from comment #11 still available?
Comment 21 Peixiu Hou 2016-02-19 02:20 EST
Created attachment 1128483 [details]
win7_64_q35_nvidia_GPU whql log

Reproduced on a q35 system:
The PNP jobs all failed with the error "WDTF_TEST: Found a device that has a non-zero problem code or is phantom. Logging device info."

The PCI hardware compliance test passed.

Reproduced on an i440 system:
Passthrough of the NVIDIA K5000 to the guest succeeds, but after updating the GPU driver and rebooting, the guest OS hangs on the 'Starting Windows' screen. Tried 4 times with the same result.


Booted an i440 system with virt-manager; the command lines are as follows:
/usr/libexec/qemu-kvm -name win7-2 -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu SandyBridge,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -m 4096 -realtime mlock=off -smp 8,sockets=2,cores=4,threads=1 -uuid d518b1aa-f1e4-41fe-bf9c-98c0489504fb -qmp tcp::4444,server,nowait -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -vga cirrus -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=test.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=en_windows_7_ultimate_x64_dvd_x15-65922.iso,if=none,id=drive-ide0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=0 -netdev tap,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:01:a8:eb,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -vnc 0.0.0.0:2
qemu     17427 14.9 17.9 5367728 4394944 ?     SLl  11:11  36:39 /usr/libexec/qemu-kvm -name win7 -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off,vmport=off -cpu SandyBridge,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -uuid d83fc7f7-cb55-4b87-a40b-e86243a78ed8 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-win7/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/home/peixiu_handle_bug/i440.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/home/peixiu_handle_bug/en_windows_7_ultimate_x64_dvd_x15-65922.iso,if=none,id=drive-ide0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:19:2f:ef,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device 
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.0,addr=0x8 -device vfio-pci,host=04:00.1,id=hostdev1,bus=pci.0,addr=0x9 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on

Best Regards!
Peixiu Hou
Comment 22 Sergey 2016-02-28 05:16:02 EST
Vadim,

It seems the problem is not NVIDIA-dependent. I observe the same 'Code 12' error with Win7 and an ATI Radeon R9 290 adapter on both the 440 and q35 chipsets, whether the adapter is connected to the RC, a bridge, or a pcie-root-port, as soon as a QXL VGA adapter is present. When I remove the QXL adapter, the Radeon driver starts just fine.

I am using the latest F20 kernel on the host, libvirt 1.2.21 (I recompiled the virtpreview package for F20 myself), and qemu-2.1.2-7.

Please let me know if you need any specific details.
Comment 23 Vadim Rozenfeld 2016-02-29 02:50:48 EST
(In reply to Sergey from comment #22)
> Vadim,
> 
> it seems the problem is not Nvidia-dependent. I observe the same 'code 12'
> error with Win7 and ATI Radeon R9 290 adapter with both 440 and q35 chipsets
> while the adapter is connected to RC or bridge or pcie-root-port as soon as
> QXL VGA adapter is present. When I remove QXL adapter then the Radeon driver
> starts just fine. 
> 
> I am using latest F20 kernel on the host, libvirt 1.2.21 (recompiled
> virtpreview package myself for F20) and qemu-2.1.2-7.
> 
> Please let me know if you need any specific details.

Hi Sergey,

We are still investigating this issue.
It might be related to the resource allocation mechanism;
at least, we are observing some weird problems when running
WHQL PnP-related tests on the viostor driver sitting behind a
pci-to-pci bridge, and never when the virtio-blk device is attached
to the primary PCI bus.

Best regards,
Vadim.
Comment 24 Sergey 2016-02-29 13:36:03 EST
(In reply to Vadim Rozenfeld from comment #23)
> 
> Hi Sergey,
> 
> We are still investigating this issue.
> It might be related to the resource allocation mechanism,
> at least we are observing some weird problems when running
> WHQL PnP related tests on viostor driver sitting behind 
> pci-to-pci bridge, and never if virtio-blk device attached
> to the primary pci bus.
> 
> Best regards,
> Vadim.

Vadim, 

Yesterday I gave up trying to force Win7 to do what I need. I first tried Win8.1, but for some reason Windows setup did not recognize any disk drives in any configuration (IDE/VirtIO/SCSI, and I supplied the drivers where appropriate). Finally I succeeded in setting up Win10 and got the Radeon _and_ QXL adapters working simultaneously. Simply setting up a VM using virt-manager and later adding the host PCI device, without any extra tricks (except the Radeon Catalyst driver installation), works as expected: both devices are functional.

Kind regards,
Sergey.
