Bug 1181409
| Summary: | PCI pass-through device works improperly due to the PHB's index being set to a big value | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Xu Han <xuhan> |
| Component: | qemu-kvm-rhev | Assignee: | David Gibson <dgibson> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.1 | CC: | dgibson, hhuang, knoel, michen, ngu, sherold, virt-maint, ypu, zhengtli |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | ppc64 | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.2.0-8.el7 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-12-04 16:25:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Also tested with a negative value (-0x4000); the device did not function properly either.
PCI info (part):
----------------
# lspci -vvv -s 0001:00:01.0
0001:00:01.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
Subsystem: IBM Device 04b2
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 19
Region 0: Memory at fffc0100b0000000 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at fffc0100b0010000 (64-bit, non-prefetchable) [size=8K]
...
Qtree info:
-----------
dev: spapr-pci-vfio-host-bridge, id "vfiohost"
iommu = 3 (0x3)
index = -16384 (0xffffffffffffc000)
buid = 576460752840278016 (0x80000001fffc000)
liobn = 2147467264 (0x7fffc000)
mem_win_addr = -1124797710860288 (0xfffc0100a0000000)
mem_win_size = 536870912 (0x20000000)
io_win_addr = -1124798247731200 (0xfffc010080000000)
io_win_size = 65536 (0x10000)
irq 0
bus: vfiohost.0
type PCI
dev: vfio-pci, id ""
host = "0003:03:00.0"
x-intx-mmap-timeout-ms = 1100 (0x44c)
x-vga = false
bootindex = -1 (0xffffffffffffffff)
addr = 01.0
romfile = ""
rombar = 1 (0x1)
multifunction = false
command_serr_enable = true
class USB controller, addr 00:01.0, pci id 104c:8241 (sub 1014:04b2)
bar 0: mem at 0x90000000 [0x9000ffff]
bar 2: mem at 0x90010000 [0x90011fff]
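The qtree values reported for index=-0x4000 and index=0x4000 are consistent with a simple linear derivation from the index using 64-bit wraparound arithmetic. The sketch below is only an illustration of that arithmetic: the constant names and values are assumptions inferred from the two qtree dumps in this report, not copied from the QEMU source, and `phb_layout` is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

# Assumed constants, inferred from the qtree dumps in this report.
BASE_BUID = 0x800000020000000
BASE_LIOBN = 0x80000000
WINDOW_BASE = 0x10000000000
WINDOW_SPACING = 0x1000000000
MEM_WIN_OFF = 0xA0000000
IO_WIN_OFF = 0x80000000


def phb_layout(index):
    """Derive the PHB properties from an index, wrapping like 64-bit C arithmetic."""
    idx = index & MASK64  # a negative index becomes a huge unsigned value
    windows = (WINDOW_BASE + idx * WINDOW_SPACING) & MASK64
    return {
        "buid": (BASE_BUID + idx) & MASK64,
        "liobn": (BASE_LIOBN + idx) & MASK32,
        "mem_win_addr": (windows + MEM_WIN_OFF) & MASK64,
        "io_win_addr": (windows + IO_WIN_OFF) & MASK64,
    }


if __name__ == "__main__":
    for index in (0x1, 0x4000, -0x4000):
        layout = phb_layout(index)
        print(hex(index), {name: hex(value) for name, value in layout.items()})
```

Under these assumptions, each index step moves the MMIO and I/O windows by 0x1000000000, so an index of 0x4000 or -0x4000 places them at the very high guest addresses that appear in the lspci output above.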
Ok, it's expected that such large (or negative) index values won't work. But qemu should provide a better error message. I'll work on this upstream.

I'm still working on this upstream, but I've made a preliminary downstream build here:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8523816
That should give a proper error message for out-of-range index values.
Can you give that a test, please?

(In reply to David Gibson from comment #4)
> I'm still working on this upstream, but I've made a preliminary downstream
> build here:
>
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8523816
>
> That should give a proper error message for out-of-range index values.
>
> Can you give that a test, please?

I ran the following tests.

Step:
# ./qemu-kvm ... \
-device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index={ 0x4000 | 0xff | 0x100 | -0x1 | 0x100000000 } \
-device vfio-pci,host=0003:03:00.0,bus=vfiohost.0,addr=0x1

Results:
[index=0x4000]:
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x4000: "index" for PAPR PHB is too large (max 255)
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x4000: Device 'spapr-pci-vfio-host-bridge' could not be initialized

[index=0xff]:
# lspci -vvv -s 0001:00:01.0
0001:00:01.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
Subsystem: IBM Device 04b2
...
Kernel driver in use: xhci_hcd

[index=0x100]:
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x100: "index" for PAPR PHB is too large (max 255)
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x100: Device 'spapr-pci-vfio-host-bridge' could not be initialized

[index=-0x1]:
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=-0x1: Parameter 'index' expects uint32_t

[index=0x100000000]:
qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x100000000: Parameter 'index' expects uint32_t

Thanks.

Upstream patch posted at http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01540.html

Fix included in qemu-kvm-rhev-2.2.0-8.el7

Reproduced the problem with the environment below:
Host:
3.10.0-295.el7.ppc64
qemu-kvm-rhev-2.1.2-21.el7

And passed with the environment below:
Host:
3.10.0-295.el7.ppc64
qemu-kvm-rhev-2.3.0-2.el7

[root@ibm-p8-rhevm-15 ~]# sh boot.sh
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) 2015-08-07T06:05:35.129717Z qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=2,index=0x4000: "index" for PAPR PHB is too large (max 255)
2015-08-07T06:05:35.129752Z qemu-kvm: -device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=2,index=0x4000: Device 'spapr-pci-vfio-host-bridge' could not be initialized

So I think the bug is verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html
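The test results in the comments show two distinct rejections: values that do not fit in a uint32_t are refused at parse time, and values above 255 are refused by the PHB itself. As a rough illustration only, the following sketch mimics that two-stage check; `check_phb_index` is a hypothetical helper, not the QEMU implementation, and the message strings are the ones quoted in this report.

```python
UINT32_MAX = 0xFFFFFFFF
MAX_PHB_INDEX = 255  # limit quoted in the error message in this report


def check_phb_index(text):
    """Return the expected error message for an index string, or None if accepted."""
    value = int(text, 0)  # base 0 accepts "0x..." and a leading minus sign
    if value < 0 or value > UINT32_MAX:
        # Stage 1: the property type rejects anything outside uint32_t.
        return "Parameter 'index' expects uint32_t"
    if value > MAX_PHB_INDEX:
        # Stage 2: the PHB range check rejects indexes above 255.
        return '"index" for PAPR PHB is too large (max 255)'
    return None


if __name__ == "__main__":
    for text in ("0x4000", "0xff", "0x100", "-0x1", "0x100000000"):
        print(text, "->", check_phb_index(text) or "accepted")
```

The five inputs are the same ones exercised in the test step above, so the printed classification can be compared directly with the reported results.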
Description of problem:
Tested this issue with two index values:
----------------------------------------
[1] index=0x1 (good value)
(guest)# ls /sys/bus/pci/drivers/xhci_hcd/
0001:00:01.0 bind module new_id remove_id uevent unbind

[2] index=0x4000 (bad value)
(guest)# ls /sys/bus/pci/drivers/xhci_hcd/
bind module new_id remove_id uevent unbind
(guest)# dmesg | grep xhci
[ 3.349434] xhci_hcd 0001:00:01.0: xHCI Host Controller
[ 3.349484] xhci_hcd 0001:00:01.0: new USB bus registered, assigned bus number 2
[ 3.452413] xhci_hcd 0001:00:01.0: Host not halted after 16000 microseconds.
[ 3.452415] xhci_hcd 0001:00:01.0: can't setup: -110
[ 3.452493] xhci_hcd 0001:00:01.0: USB bus 2 deregistered
[ 3.453011] xhci_hcd 0001:00:01.0: init 0001:00:01.0 fail, -110
[ 3.453039] xhci_hcd: probe of 0001:00:01.0 failed with error -110

The difference of PCI info between these two values:
----------------------------------------------------
# diff -u good-value bad-value
--- good-value 2015-01-13 14:06:01.208379809 +0800
+++ bad-value 2015-01-13 14:06:29.137472307 +0800
@@ -1,11 +1,10 @@
 0001:00:01.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
 Subsystem: IBM Device 04b2
-Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
+Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
-Latency: 0
 Interrupt: pin A routed to IRQ 19
-Region 0: Memory at 110b0000000 (64-bit, non-prefetchable) [size=64K]
-Region 2: Memory at 110b0010000 (64-bit, non-prefetchable) [size=8K]
+Region 0: Memory at 40100b0000000 (64-bit, non-prefetchable) [size=64K]
+Region 2: Memory at 40100b0010000 (64-bit, non-prefetchable) [size=8K]
 Capabilities: [40] Power Management version 3
 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
@@ -27,7 +26,7 @@
 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
+Capabilities: [c0] MSI-X: Enable- Count=8 Masked-
 Vector table: BAR=2 offset=00000000
 PBA: BAR=2 offset=00001000
 Capabilities: [100 v2] Advanced Error Reporting
@@ -38,4 +37,3 @@
 CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
 AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
 Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
-Kernel driver in use: xhci_hcd

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-17.el7.ppc64

How reproducible:
100%

Steps to Reproduce:
1. /usr/libexec/qemu-kvm ... \
-device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x4000 \
-device vfio-pci,host=0003:03:00.0,bus=vfiohost.0,addr=0x1

Actual results:
PCI pass-through device works improperly.

Expected results:

Additional info:
QEMU cmdline:
-------------
/usr/libexec/qemu-kvm \
-name vfio-test-xuhan \
-machine pseries,accel=kvm,usb=off \
-m 32768 \
-realtime mlock=off \
-cpu POWER8 \
-smp 20,sockets=2,cores=10,threads=1 \
-uuid 5125cf27-4b01-4493-b46d-734d08becc6b \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,path=monitor,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-boot strict=on \
-device pci-ohci,id=usb,bus=pci.0,addr=0x1 \
-device spapr-vscsi,id=scsi0,reg=0x2000 \
-drive file=/home/mnt/xuhan/nfs/install-test.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2 \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
-netdev tap,id=hostnet0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,ifname=vnetvfio \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5d:c7:9e,bus=pci.0,addr=0x2 \
-chardev socket,id=charserial0,path=serial,server,nowait \
-device spapr-vty,chardev=charserial0,reg=0x30000000 \
-device spapr-pci-vfio-host-bridge,id=vfiohost,iommu=3,index=0x4000 \
-device vfio-pci,host=0003:03:00.0,bus=vfiohost.0,addr=0x1 \
-device usb-kbd,id=input0 \
-device usb-mouse,id=input1 \
-vnc 0.0.0.0:50 \
-k en-us \
-device VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x3 \
-global spapr-nvram.reg=0x3000 \
-monitor stdio

PCI info (with the bad index value):
------------------------------------
0001:00:01.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
Subsystem: IBM Device 04b2
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 19
Region 0: Memory at 40100b0000000 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at 40100b0010000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <512ns, L1 <8us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+ RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <128ns, L1 <32us ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [c0] MSI-X: Enable- Count=8 Masked-
Vector table: BAR=2 offset=00000000
PBA: BAR=2 offset=00001000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00

Qtree info:
-----------
dev: spapr-pci-vfio-host-bridge, id "vfiohost"
iommu = 3 (0x3)
index = 16384 (0x4000)
buid = 576460752840310784 (0x800000020004000)
liobn = 2147500032 (0x80004000)
mem_win_addr = 1127002102824960 (0x40100a0000000)
mem_win_size = 536870912 (0x20000000)
io_win_addr = 1127001565954048 (0x4010080000000)
io_win_size = 65536 (0x10000)
irq 0
bus: vfiohost.0
type PCI
dev: vfio-pci, id ""
host = "0003:03:00.0"
x-intx-mmap-timeout-ms = 1100 (0x44c)
x-vga = false
bootindex = -1 (0xffffffffffffffff)
addr = 01.0
romfile = ""
rombar = 1 (0x1)
multifunction = false
command_serr_enable = true
class USB controller, addr 00:01.0, pci id 104c:8241 (sub 1014:04b2)
bar 0: mem at 0x90000000 [0x9000ffff]
bar 2: mem at 0x90010000 [0x90011fff]