Description of problem:
When assigning a PCI device to a guest, the device-dependent PCI config space emulation is broken. KVM hardcodes the offsets of the PCI capabilities for MSI and MSI-X. These offsets may conflict with existing capabilities or device registers, causing the guest OS device driver to fail.

Version-Release number of selected component (if applicable):
>= kvm-83-105.el5

How reproducible:
100% reproducible for devices that place capabilities or registers in the area where KVM puts the MSI and MSI-X capabilities. So reproducible, but device dependent.

Steps to Reproduce:
1. Launch a guest with an assigned PCI device
2. Start the device driver in the guest
3. The driver fails to initialize
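To make the conflict described above concrete, here is a minimal sketch (not KVM code) that walks a device's capability list and reports any capability whose header starts inside a region reserved for an emulated capability. The config space is synthetic, the helper names are made up for illustration, and `HYPOTHETICAL_OFFSET` is a placeholder: the offsets KVM actually hardcodes are not stated in this report. The register locations (Status at 0x06, Capabilities Pointer at 0x34) and the capability IDs are the standard PCI ones.

```python
# Sketch only: walk a PCI capability list and flag collisions with an
# emulated capability placed at a fixed offset.

CAP_LIST_STATUS_BIT = 0x10   # Status register (0x06), bit 4: capability list present
CAP_POINTER = 0x34           # Capabilities Pointer register

CAP_NAMES = {0x01: "Power Management", 0x03: "VPD", 0x05: "MSI",
             0x10: "PCI Express", 0x11: "MSI-X"}


def build_example_config():
    """Synthetic 256-byte config space with the capability chain
    40 -> 70 -> a0 -> c0 -> 50 (only ID and next-pointer bytes are set)."""
    cfg = bytearray(256)
    cfg[0x06] = CAP_LIST_STATUS_BIT
    cfg[CAP_POINTER] = 0x40
    chain = [(0x40, 0x01), (0x70, 0x10), (0xA0, 0x11), (0xC0, 0x03), (0x50, 0x05)]
    for i, (offset, cap_id) in enumerate(chain):
        cfg[offset] = cap_id
        cfg[offset + 1] = chain[i + 1][0] if i + 1 < len(chain) else 0
    return bytes(cfg)


def walk_capabilities(cfg):
    """Return [(offset, cap_id), ...] in chain order."""
    caps = []
    if not cfg[0x06] & CAP_LIST_STATUS_BIT:
        return caps
    ptr = cfg[CAP_POINTER] & 0xFC          # low two bits are reserved
    seen = set()                           # guard against a looping chain
    while ptr >= 0x40 and ptr not in seen:
        seen.add(ptr)
        caps.append((ptr, cfg[ptr]))
        ptr = cfg[ptr + 1] & 0xFC
    return caps


def find_conflicts(caps, start, length):
    """Return capabilities whose header starts inside [start, start + length).
    This ignores each capability's own length, so it can miss overlaps."""
    return [(off, cap_id) for off, cap_id in caps if start <= off < start + length]


if __name__ == "__main__":
    caps = walk_capabilities(build_example_config())
    HYPOTHETICAL_OFFSET = 0x40   # placeholder, NOT the value KVM uses
    MSIX_CAP_LEN = 12            # an MSI-X capability structure is 12 bytes
    for off, cap_id in find_conflicts(caps, HYPOTHETICAL_OFFSET, MSIX_CAP_LEN):
        print(f"conflict at 0x{off:02x}: {CAP_NAMES.get(cap_id, hex(cap_id))}")
```

A check like this only catches capabilities that are part of the chain. Device-specific registers that live outside the capability list would not be detected this way.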
I have upgraded to the BIOS that AMD says supports IOMMU. There is still an issue with the 10G Neterion Inc. X3100 Series cards, with the following errors:

Mar 16 09:47:42 perf31 kernel: PCI: Failed to allocate mem resource #12:8000000@d0000000 for 0000:03:00.0
Mar 16 09:47:42 perf31 kernel: vxge 0000:03:00.0: not enough MMIO resources for SR-IOV

This is a known-good module: SR-IOV works on an Intel system, where it is able to create VFs. The 1G KUWALA cards are still functional and able to create VFs.

/var/log/messages:
Mar 16 09:47:12 perf31 kernel: AMD IOMMU: Using protection domain 24 for device 03:00.0
Mar 16 09:47:12 perf31 kernel: ACPI: PCI interrupt for device 0000:03:00.0 disabled
Mar 16 09:47:42 perf31 kernel: vxge: Copyright(c) 2002-2009 Neterion Inc
Mar 16 09:47:42 perf31 kernel: vxge: Driver version: 2.0.6.18937-k
Mar 16 09:47:42 perf31 kernel: PCI: Enabling device 0000:03:00.0 (0140 -> 0142)
Mar 16 09:47:42 perf31 kernel: ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LN32] -> GSI 32 (level, high) -> IRQ 138
Mar 16 09:47:42 perf31 kernel: PCI: Failed to allocate mem resource #12:8000000@d0000000 for 0000:03:00.0
Mar 16 09:47:42 perf31 kernel: vxge 0000:03:00.0: not enough MMIO resources for SR-IOV
Mar 16 09:47:42 perf31 kernel: eth0: SERIAL NUMBER: SXC0949004
Mar 16 09:47:42 perf31 kernel: eth0: PART NUMBER: X3110SR0001
Mar 16 09:47:42 perf31 kernel: eth0: Neterion X3110 Single-Port SR 10GbE Server Adapter
Mar 16 09:47:42 perf31 kernel: eth0: MAC ADDR: 00:0C:FC:01:00:C7
Mar 16 09:47:42 perf31 kernel: eth0: Link Width x8
Mar 16 09:47:42 perf31 kernel: eth0: Firmware version : 1.4.4 Date : 09/17/2009
Mar 16 09:47:42 perf31 kernel: eth0: Single Root IOV Mode Enabled
Mar 16 09:47:42 perf31 kernel: eth0: 1 Vpath(s) opened
Mar 16 09:47:42 perf31 kernel: eth0: Interrupt type MSI-X
Mar 16 09:47:42 perf31 kernel: eth0: RTH steering enabled for TCP_IPV4
Mar 16 09:47:42 perf31 kernel: eth0: Tx port steering enabled
Mar 16 09:47:42 perf31 kernel: eth0: Generic receive offload enabled
Mar 16 09:47:42 perf31 kernel: eth0: Rx doorbell mode enabled
Mar 16 09:47:42 perf31 kernel: eth0: VLAN tag stripping Enabled
Mar 16 09:47:42 perf31 kernel: eth0: Ring blocks : 2
Mar 16 09:47:42 perf31 kernel: eth0: Fifo blocks : 14
Mar 16 09:47:42 perf31 kernel: eth0: MTU is 1500

./lspci
03:00.0 Class 0200: Device 17d5:5833 (rev 02)
    Subsystem: Device 17d5:6030
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 138
    Region 0: Memory at ce800000 (64-bit, prefetchable) [size=8M]
    Region 2: Memory at ce201000 (64-bit, prefetchable) [size=4K]
    Region 4: Memory at ce200000 (64-bit, prefetchable) [size=256]
    [virtual] Expansion ROM at ce280000 [disabled] [size=512K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [70] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <2us, L1 <2us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <256ns, L1 <4us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [a0] MSI-X: Enable+ Count=4 Masked-
        Vector table: BAR=2 offset=00000000
        PBA: BAR=2 offset=00000800
    Capabilities: [c0] Vital Product Data
        Not readable
    Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [110] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150] Power Budgeting <?>
    Capabilities: [160] Device Serial Number 00-0c-fc-00-00-01-00-c7
    Capabilities: [170] Single Root I/O Virtualization (SR-IOV)
        IOVCap: Migration-, Interrupt Message Number: 000
        IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
        IOVSta: Migration-
        Initial VFs: 16, Total VFs: 16, Number of VFs: 1, Function Dependency Link: 00
        VF offset: 1, stride: 1, Device ID: 5833
        Supported Page Size: 000007ff, System Page Size: 00000001
        Region 0: Memory at 0000000000000000 (64-bit, prefetchable)
        Region 2: Memory at 0000000000000000 (64-bit, prefetchable)
        Region 4: Memory at 0000000000000000 (64-bit, prefetchable)
        VF Migration: offset: 00000000, BIR: 0
    Capabilities: [1b0] Vendor Specific Information <?>
    Kernel driver in use: vxge
    Kernel modules: vxge

cat /proc/iomem
00010000-0009d7ff : System RAM
0009d800-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000cafff : Video ROM
000cb000-000cbfff : Adapter ROM
000d1800-000d19ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-c7e7ffff : System RAM
  00200000-00484429 : Kernel code
  0048442a-005c9b8f : Kernel data
  08000000-0bffffff : GART
c7e80000-c7e8afff : ACPI Tables
c7e8b000-c7e8cfff : ACPI Non-volatile Storage
c7e8d000-c7ffffff : reserved
c8004000-c8007fff : amd_iommu
c8024000-c80243ff : 0000:00:11.0
  c8024000-c80243ff : ahci
c8024400-c80244ff : 0000:00:12.2
  c8024400-c80244ff : ehci_hcd
c8024800-c80248ff : 0000:00:13.2
  c8024800-c80248ff : ehci_hcd
c8025000-c8025fff : 0000:00:12.0
  c8025000-c8025fff : ohci_hcd
c8026000-c8026fff : 0000:00:12.1
  c8026000-c8026fff : ohci_hcd
c8027000-c8027fff : 0000:00:13.0
  c8027000-c8027fff : ohci_hcd
c8028000-c8028fff : 0000:00:13.1
  c8028000-c8028fff : ohci_hcd
c8029000-c8029fff : 0000:00:14.5
  c8029000-c8029fff : ohci_hcd
c8100000-c8bfffff : PCI Bus #01
  c8100000-c8103fff : 0000:01:00.0
    c8100000-c8103fff : igb
  c8104000-c8107fff : 0000:01:00.1
    c8104000-c8107fff : igb
  c8120000-c813ffff : 0000:01:00.0
    c8120000-c813ffff : igb
  c8140000-c815ffff : 0000:01:00.1
    c8140000-c815ffff : igb
  c8160000-c817ffff : 0000:01:00.0
    c8160000-c8163fff : 0000:01:10.0
      c8160000-c8163fff : kvm_assigned_device
    c8164000-c8167fff : 0000:01:10.2
      c8164000-c8167fff : igbvf
  c8180000-c819ffff : 0000:01:00.0
    c8180000-c8183fff : 0000:01:10.0
      c8180000-c8183fff : kvm_assigned_device
    c8184000-c8187fff : 0000:01:10.2
      c8184000-c8187fff : igbvf
  c81a0000-c81bffff : 0000:01:00.1
    c81a0000-c81a3fff : 0000:01:10.1
      c81a0000-c81a3fff : igbvf
    c81a4000-c81a7fff : 0000:01:10.3
      c81a4000-c81a7fff : igbvf
  c81c0000-c81dffff : 0000:01:00.1
    c81c0000-c81c3fff : 0000:01:10.1
      c81c0000-c81c3fff : igbvf
    c81c4000-c81c7fff : 0000:01:10.3
      c81c4000-c81c7fff : igbvf
  c8400000-c87fffff : 0000:01:00.0
    c8400000-c87fffff : igb
  c8800000-c8bfffff : 0000:01:00.1
    c8800000-c8bfffff : igb
ca000000-cdffffff : PCI Bus #02
  ca000000-cbffffff : 0000:02:00.0
    ca000000-cbffffff : bnx2
  cc000000-cdffffff : 0000:02:00.1
    cc000000-cdffffff : bnx2
ce000000-ce0fffff : PCI Bus #04
  ce000000-ce003fff : 0000:04:00.0
    ce000000-ce003fff : qla2xxx
  ce004000-ce007fff : 0000:04:00.1
    ce004000-ce007fff : qla2xxx
ce100000-ce1fffff : PCI Bus #05
  ce100000-ce10ffff : 0000:05:06.0
  ce120000-ce13ffff : 0000:05:06.0
ce200000-ceffffff : PCI Bus #03
  ce200000-ce2000ff : 0000:03:00.0
    ce200000-ce2000ff : vxge
  ce201000-ce201fff : 0000:03:00.0
    ce201000-ce201fff : vxge
  ce280000-ce2fffff : 0000:03:00.0
  ce800000-ceffffff : 0000:03:00.0
    ce800000-ceffffff : vxge
cf000000-cf7fffff : PCI Bus #01
  cf000000-cf3fffff : 0000:01:00.0
  cf400000-cf7fffff : 0000:01:00.1
cf800000-cf8fffff : PCI Bus #02
  cf800000-cf81ffff : 0000:02:00.0
  cf820000-cf83ffff : 0000:02:00.1
cf900000-cf9fffff : PCI Bus #03
cfa00000-cfafffff : PCI Bus #04
  cfa00000-cfa3ffff : 0000:04:00.0
  cfa40000-cfa7ffff : 0000:04:00.1
d0000000-d7ffffff : PCI Bus #05
  d0000000-d7ffffff : 0000:05:06.0
e0000000-efffffff : reserved
fec00000-fec0ffff : reserved
fee00000-fee00fff : reserved
fff00000-ffffffff : reserved
100000000-837ffffff : System RAM

BIOS version:
# dmidecode 2.10
SMBIOS 2.5 present.
56 structures occupying 1852 bytes.
Table at 0xC7EDA000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
    Vendor: Phoenix Technologies Ltd.
    Version: PDNAX1-A
    Release Date: 01/26/2010
    Address: 0xE3230
    Runtime Size: 118224 bytes

2.6.18-191.el5 #1 SMP Mon Mar 1 15:59:02 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

kmod-kvm-83-160.el5
kvm-83-160.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-160.el5

bob
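As a rough sanity check on the "not enough MMIO resources for SR-IOV" message, the sketch below compares the failed request against the PCI Bus #03 windows listed in /proc/iomem. It reads `8000000@d0000000` as size@address, and it assumes each VF's BAR 0 is as large as the PF's Region 0 (8M); neither point is confirmed anywhere in this report, and the helper names are made up for illustration.

```python
# Sketch only: does a 0x8000000-byte request fit in the bridge windows
# that /proc/iomem shows for PCI Bus #03?

MIB = 1024 * 1024


def parse_iomem_line(line):
    """Split 'start-end : name' into (start, end, name) with integer bounds."""
    span, name = line.split(" : ", 1)
    start, end = (int(part, 16) for part in span.strip().split("-"))
    return start, end, name.strip()


def window_size(line):
    """Size in bytes of an inclusive /proc/iomem range."""
    start, end, _ = parse_iomem_line(line)
    return end - start + 1


def fits_in_any(request_size, window_lines):
    """True if some window is at least request_size bytes in total.
    Space already claimed inside the window is ignored, so this is optimistic."""
    return any(window_size(line) >= request_size for line in window_lines)


# Taken from the /proc/iomem listing above.
BUS03_WINDOWS = [
    "ce200000-ceffffff : PCI Bus #03",
    "cf900000-cf9fffff : PCI Bus #03",
]

REQUEST_SIZE = 0x8000000          # from "resource #12:8000000@d0000000"
TOTAL_VFS = 16                    # from the SR-IOV capability in lspci
ASSUMED_VF_BAR0_SIZE = 8 * MIB    # ASSUMPTION: same size as PF Region 0

if __name__ == "__main__":
    for line in BUS03_WINDOWS:
        print(f"{line}: {window_size(line) // MIB} MiB")
    print(f"request: {REQUEST_SIZE // MIB} MiB")
    print("16 VFs x assumed BAR size matches request:",
          TOTAL_VFS * ASSUMED_VF_BAR0_SIZE == REQUEST_SIZE)
    print("fits in a Bus #03 window:", fits_in_any(REQUEST_SIZE, BUS03_WINDOWS))
```

If those assumptions hold, the request is far larger than either Bus #03 window, which would be consistent with the allocation failure. It does not explain why the same card works on the Intel system.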
Will clone this bug for RHEL6. No bug is open at this time, but Exar (formerly Neterion) is indicating a failure on the latest RHEL6 build.
Don, is there an update on this for 5.6 based on your discussion with Alex? If not, we need to push this to 5.7.
(In reply to comment #7)
> Don, is there an update on this for 5.6 based on your discussion with Alex?
> If not, we need to push this to 5.7.

This will need to be pushed to 5.7. Alex is working on a solution for upstream. Once that's completed and accepted upstream, we can work on backporting that support to 5.7 (& rhel6.x). It's fairly invasive (requires adding vfio and qemu interfaces to it), so it's going to be a while before the solution settles upstream and we have enough testing to be confident it's ready for enterprise use.
I added Alex to the cc: list so he can directly comment if he wants to add more.
Any updated status on this bug? Will it make 5.6?
Pushed to 5.7