Description of problem:

Guest view of an i915-GVTg_V4_1:

00:0a.0 VGA compatible controller: Intel Corporation HD Graphics 5500 (rev 09) (prog-if 00 [VGA controller])
    Subsystem: Intel Corporation Device 2058
    Physical Slot: 10
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 24
    Region 0: Memory at 140000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 180000000 (64-bit, prefetchable) [size=1G]
    Region 4: Memory at <ignored> (32-bit, non-prefetchable)
    Region 5: Memory at <ignored> (32-bit, non-prefetchable)
    Expansion ROM at febd6000 [disabled] [size=2K]
    Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee01000  Data: 4042
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [a4] PCI Advanced Features
        AFCap: TP+ FLR+
        AFCtrl: FLR-
        AFStatus: TP-
    Kernel driver in use: i915
    Kernel modules: i915

Note that BAR4 and 5 are listed as <ignored>.

[root@localhost ~]# setpci -s a.0 BASE_ADDRESS_4
febd4000
[root@localhost ~]# setpci -s a.0 BASE_ADDRESS_4=ffffffff
[root@localhost ~]# setpci -s a.0 BASE_ADDRESS_4
ffffffff
[root@localhost ~]# setpci -s a.0 BASE_ADDRESS_4=00000000
[root@localhost ~]# setpci -s a.0 BASE_ADDRESS_4
00000000

BAR5 behaves the same way. KVMGT has essentially implemented these as scratch registers. This does not adhere to the PCI spec for implementation of these registers. On the physical device BAR4 is an I/O port space and BAR5 is unimplemented. If I run the same test on BAR5 of the physical device, we see:

[root@nuc ~]# setpci -s 2.0 BASE_ADDRESS_5
00000000
[root@nuc ~]# setpci -s 2.0 BASE_ADDRESS_5=ffffffff
[root@nuc ~]# setpci -s 2.0 BASE_ADDRESS_5
00000000

The register is read-only.
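For reference, the setpci probe above can be modeled in a few lines. This is only an illustrative sketch, not kernel or KVMGT code: the BarRegister class, its writable_mask field, and classify() are hypothetical names. It assumes the usual PCI behavior that an unimplemented BAR is hardwired to zero, while an implemented BAR latches only its address bits and keeps its low type bits fixed.

```python
from dataclasses import dataclass


@dataclass
class BarRegister:
    """Toy model of one 32-bit BASE_ADDRESS slot (hypothetical, for illustration)."""
    value: int = 0
    writable_mask: int = 0  # bits software may change; 0 models an unimplemented BAR

    def read(self) -> int:
        return self.value

    def write(self, data: int) -> None:
        # Only bits covered by writable_mask latch; all other bits keep their value.
        kept = self.value & ~self.writable_mask
        latched = data & self.writable_mask
        self.value = (kept | latched) & 0xFFFFFFFF


def classify(bar: BarRegister) -> str:
    """Write all-ones, then all-zeros, and compare the read-backs."""
    original = bar.read()
    bar.write(0xFFFFFFFF)
    ones = bar.read()
    bar.write(0x00000000)
    zeros = bar.read()
    bar.write(original)  # put the original value back
    if ones == 0 and zeros == 0:
        return "unimplemented"  # reads as zero no matter what is written
    if ones == 0xFFFFFFFF and zeros == 0:
        return "scratch"        # every bit latches: the behavior reported here
    return "implemented"        # some bits are fixed, as in a sized BAR


physical_bar5 = BarRegister()                                        # hardwired to zero
guest_bar4 = BarRegister(value=0xFEBD4000, writable_mask=0xFFFFFFFF)  # all bits writable
sized_bar = BarRegister(value=0x00000004, writable_mask=0xFF000000)   # assumed 16M memory BAR
```

With these inputs, classify() separates the physical BAR5 behavior from the guest BAR4 behavior shown in the setpci output, and a normally sized BAR falls into neither of those two cases.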
Additionally, when I look at drivers/gpu/drm/i915/gvt/kvmgt.c:intel_vgpu_rw(), I see:

    switch (index) {
    case VFIO_PCI_CONFIG_REGION_INDEX:
        ...
        break;
    case VFIO_PCI_BAR0_REGION_INDEX:
    case VFIO_PCI_BAR1_REGION_INDEX:
        ...
        break;
    case VFIO_PCI_BAR2_REGION_INDEX:
    case VFIO_PCI_BAR3_REGION_INDEX:
    case VFIO_PCI_BAR4_REGION_INDEX:
    case VFIO_PCI_BAR5_REGION_INDEX:
    case VFIO_PCI_VGA_REGION_INDEX:
    case VFIO_PCI_ROM_REGION_INDEX:
    default:
        gvt_vgpu_err("unsupported region: %u\n", index);
    }

This raises two concerns. First, the BAR0 and BAR1 regions are aliased together. I assume this is done because BAR0 is a 64bit BAR and therefore effectively consumes both BASE_ADDRESS_0 and BASE_ADDRESS_1 in PCI config space. But this doesn't mean that there is a BAR1 that's aliased to BAR0, it just means that there isn't a BAR1. vfio-pci would handle a 64bit BAR by exposing only the region for the starting BAR.

The second concern is that the above PCI listing shows Region 0 and 2 implemented, equating to BAR0 and BAR2, but the read-write access function backing the regions doesn't support access to BAR2. Does this imply that BAR2 is *only* accessible via mmap? We only need to enable vfio-pci.x-no-mmap=on on QEMU to break this. VFIO really does not have a way to express an mmap-only region. Read-write access should be supported.

Version-Release number of selected component (if applicable):
3.10.0-675.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. See above

Actual results:
See above

Expected results:
BAR4 & BAR5 should be read-only to be considered unimplemented, read-write access to BAR2 should be provided.

Additional info:
The Moon is Waxing Gibbous (53% of Full)
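To illustrate the first concern, here is a rough sketch of how the six BASE_ADDRESS dwords decode when 64-bit BARs are present. decode_bars(), regions_with_size(), and GUEST_BARS are hypothetical names, and the dword values are assumptions reconstructed from the Region 0 and Region 2 addresses in the lspci listing, not values read from a real device. The point is that the slot following a 64-bit BAR holds the upper address bits and is not a BAR of its own.

```python
def decode_bars(dwords):
    """Label each BASE_ADDRESS slot; illustrative only, not vfio or KVMGT code."""
    kinds = {}
    i = 0
    while i < len(dwords):
        raw = dwords[i]
        if raw == 0:
            # Zero alone cannot distinguish "unassigned" from "unimplemented".
            kinds[i] = "zero"
        elif raw & 0x1:
            kinds[i] = "io"          # bit 0 set: I/O space BAR
        elif ((raw >> 1) & 0x3) == 0x2:
            kinds[i] = "mem64"       # type field 10b: 64-bit memory BAR
            if i + 1 < len(dwords):
                kinds[i + 1] = "upper32"  # consumed by the 64-bit BAR, not a BAR itself
                i += 1
        else:
            kinds[i] = "mem32"
        i += 1
    return kinds


def regions_with_size(kinds):
    """Slots that should back a region with a non-zero size."""
    return [i for i, kind in sorted(kinds.items()) if kind in ("io", "mem32", "mem64")]


# Assumed raw values: 0x140000000 non-prefetchable and 0x180000000 prefetchable,
# both 64-bit, with BAR4 and BAR5 reading as zero.
GUEST_BARS = [0x40000004, 0x00000001, 0x8000000C, 0x00000001, 0x0, 0x0]
```

Note that the upper dword of each 64-bit BAR here is 0x1, which a decoder that did not skip the slot would misread as an I/O BAR.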
Moved to RHEL 7.5.
Will be fixed by the two patches below:
https://lists.freedesktop.org/archives/intel-gvt-dev/2017-August/001690.html
https://lists.freedesktop.org/archives/intel-gvt-dev/2017-August/001691.html
(In reply to Changbin Du from comment #3)
> Will fixed by below two:
> https://lists.freedesktop.org/archives/intel-gvt-dev/2017-August/001690.html
> https://lists.freedesktop.org/archives/intel-gvt-dev/2017-August/001691.html

I don't see that the BAR4 and BAR5 register implementations are fixed by either of these. Is there a third patch for that?
(In reply to Alex Williamson from comment #4)
>
> I don't see that the BAR4 and BAR5 register implementations are fixed by
> either of these. Is there a third patch for that?

Yes, there is another fix in progress. I found some problems with the PCI BAR read/write emulation. Sorry for missing this one.
(In reply to Changbin Du from comment #5)
> (In reply to Alex Williamson from comment #4)
> >
> > I don't see that the BAR4 and BAR5 register implementations are fixed by
> > either of these. Is there a third patch for that?
>
> Yes, there is another fix under cooking. I found there are some problems of
> PCI BAR read/write emulation. Sorry for missed this one.

Any status update on the third patch?
patch in drm-intel: drm/i915: Add interface to reserve fence registers for vGPU https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-nightly&id=969b0950a188750bd6ad12693fa3b6e8d63036fb
(In reply to weinanli from comment #7)
> patch in drm-intel:
> drm/i915: Add interface to reserve fence registers for vGPU
> https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-
> nightly&id=969b0950a188750bd6ad12693fa3b6e8d63036fb

This appears entirely unrelated. How is it relevant to this bz?
(In reply to Alex Williamson from comment #8)
> (In reply to weinanli from comment #7)
> > patch in drm-intel:
> > drm/i915: Add interface to reserve fence registers for vGPU
> > https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-
> > nightly&id=969b0950a188750bd6ad12693fa3b6e8d63036fb
>
> This appears entirely unrelated. How is it relevant to this bz?

Sorry, my bad. Please ignore it; this patch is for #1449711.
patches in drm-intel https://cgit.freedesktop.org/drm-intel/

commit 5d5fe176155e6cfa4a53accb90e4010baa5266d0
    drm/i915/kvmgt: Sanitize PCI bar emulation

commit f090a00df9ecdab5d066b099c1797e0070e27a36
    drm/i915/gvt: Add emulation for BAR2 (aperture) with normal file RW approach

commit f1751362d6357a90bc6e53176cec715ff2dbed74
    drm/i915/gvt: Fix incorrect PCI BARs reporting

commit 02d578e5edd980eac3fbed15db4d9e5665f22089
    drm/i915/gvt: Add support for PCIe extended configuration space
Hello,

All the related patches are now upstream. Can anyone help close this issue, or is there a pending procedure? Bugzilla has started complaining to me about 'Outstanding Requests', but we Intel folks do not seem to have permission to close it.

Thanks.
The patches have not even been posted for the RHEL kernel, moving back to ASSIGNED. This bug is tracking a RHEL bug, not upstream. In order to progress from ASSIGNED, *backports* of these patches need to be posted via the internal process.
(In reply to Alex Williamson from comment #12)
> The patches have not even been posted for the RHEL kernel, moving back to
> ASSIGNED. This bug is tracking a RHEL bug, not upstream. In order to
> progress from ASSIGNED, *backports* of these patches need to be posted via
> the internal process.

I see. Thanks for your reply.
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing
Patch(es) available on kernel-3.10.0-816.el7
This issue has been fixed and cannot be reproduced at RHEL7.5 Alpha (with kernel 3.10.0-799.el7), so close it.
(In reply to Terrence Xu from comment #19)
> This issue has been fixed and cannot be reproduced at RHEL7.5 Alpha (with
> kernel 3.10.0-799.el7), so close it.

This fix will be captured in a RHEL errata when released publicly in 7.5. The RH tools will advance the bug state from here to GA.
Tested against kernel-3.10.0-823.el7.x86_64 (host & guest) and qemu-kvm-rhev-2.10.0-12.el7.x86_64.
vGPU used: i915-GVTg_V5_4

For BAR4 & BAR5:

Inside guest:
# lspci -vv -s 00:05.0
00:05.0 VGA compatible controller: Intel Corporation Iris Pro Graphics 580 (rev 09) (prog-if 00 [VGA controller])
    Subsystem: Intel Corporation Device 2064
    Physical Slot: 5
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 24
    Region 0: Memory at 140000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 180000000 (64-bit, prefetchable) [size=1G]
    Expansion ROM at febf1000 [disabled] [size=2K]
    Capabilities: [40] Vendor Specific Information: Len=0c <?>
    Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee01000  Data: 4061
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: i915
    Kernel modules: i915

BAR4 and BAR5 are now hidden.
For BAR2, try enabling x-no-mmap=on. QEMU command line:

/usr/libexec/qemu-kvm -name input-test -m 4G \
-cpu Broadwell,enforce \
-smp 2 \
-device VGA \
-netdev tap,id=idinWyYp,vhost=on -device e1000,mac=42:ce:a9:d2:4d:d7,id=idlbq7eA,netdev=idinWyYp \
-uuid 215e11b2-a869-41b5-91cd-6a32a907be7e \
-device ich9-usb-uhci6 \
-drive file=/home/V8_1.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-scsi-disk0 \
-qmp unix:/tmp/input-port,server,nowait \
-monitor stdio \
-vnc :0 \
-device usb-tablet \
-device vfio-pci,id=kvmgt,x-no-mmap=on,sysfsdev=/sys/bus/mdev/devices/31efe556-0821-460b-b1f8-ccce561b25ca \

The host hangs immediately, with this log:

...
<3>[ 133.932958] ip_va=ffffa09e4843cfe0:
<3>[ 133.936528] cccccccc cccccccc
<3>[ 133.940137] cccccccc cccccccc
<3>[ 133.943785] cccccccc cccccccc
<3>[ 133.947528] cccccccc cccccccc
<3>[ 133.951135]
<6>[ 142.588262] [drm] GPU HANG: ecode 9:0:0xeada1d47, reason: Hang on rcs0, action: reset
<6>[ 142.596774] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[ 142.606780] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[ 142.616602] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[ 142.627078] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[ 142.636855] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5>[ 142.643943] i915 0000:00:02.0: Resetting rcs0 after gpu hang
<3>[ 143.642590] ip_va=ffffa09e4843d1f8: cccccccc cccccccc cccccccc cccccccc
FailQA per comment 21
We see the GPU hang that Zhiyi reported, but the host did not crash. The GPU hang was exposed once the old issue was fixed. The old behavior was that the VM could not boot with the "x-no-mmap=on" option; our bug-fix patches resolved that, so the GPU hang is new behavior. After talking with Zhiyi, we will keep using this bug, under its current title, for tracking.
Created attachment 1371987 [details] host gpu hang dmesg log
As noted in bug 1533634:

1) BAR 4 & 5 are exposed as scratch registers, as evidenced by the lspci and setpci examples run from within the guest in the bug report.

2) BAR0 and BAR1 are aliases of each other, which is an improper implementation of a 64bit BAR through the vfio API. The proper implementation is that only vfio BAR region 0 should report a size, BAR region 1 should be zero sized.

3) read/write backing of regions is not implemented as evidenced by the failure of the VM to work properly with the QEMU vfio-pci device option x-no-mmap=on.

Let's handle 3) in bug 1533634.

For verification of this bug, comment 21 partially verifies 1). I would suggest also using setpci from within the guest on the vGPU as shown in the original example.

We can verify 2) using gdb. Install qemu-kvm-rhev-debuginfo, create a vGPU and use gdb as follows:

# gdb /usr/libexec/qemu-kvm
...
(gdb) b vfio_bars_setup
Breakpoint 1 at 0x2d316c: file /usr/include/bits/unistd.h, line 99.
(gdb) run -m 1G -net none -monitor stdio --enable-kvm -serial none -vga none -nographic -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/7611722f-5296-40c0-a248-cb3da38fb7a5 -S <replace UUID with that of vGPU on test system>
Starting program: /usr/libexec/qemu-kvm -m 1G -net none -monitor stdio --enable-kvm -serial none -vga none -nographic -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/7611722f-5296-40c0-a248-cb3da38fb7a5 -S
...
Breakpoint 1, vfio_realize (pdev=0x555557e3a000, errp=0x7fffffffdc80) at /usr/src/debug/qemu-2.10.0/hw/vfio/pci.c:2818
2818        vfio_bars_setup(vdev);
...
(gdb) p vdev->bars[0]
$1 = {region = {vbasedev = 0x555557e3a8e0, fd_offset = 0, mem = 0x555556cf71d0, size = 16777216, flags = 3, nr_mmaps = 0, mmaps = 0x0, nr = 0 '\000'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[1]
$2 = {region = {vbasedev = 0x555557e3a8e0, fd_offset = 1099511627776, mem = 0x0, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 1 '\001'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[2]
$3 = {region = {vbasedev = 0x555557e3a8e0, fd_offset = 2199023255552, mem = 0x555556cf73b0, size = 1073741824, flags = 15, nr_mmaps = 1, mmaps = 0x555556dccc60, nr = 2 '\002'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[3]
$4 = {region = {vbasedev = 0x555557e3a8e0, fd_offset = 3298534883328, mem = 0x0, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 3 '\003'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) quit

The key information here is the "size = " value. It should be non-zero for indexes 0 & 2 and zero for indexes 1 & 3, as shown above.
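If the gdb output is captured to a file or string, the size check can be scripted. The following is only a convenience sketch: region_sizes(), sizes_look_correct(), and SAMPLE are hypothetical names, SAMPLE is an abbreviated copy of the output above, and the regular expression assumes gdb prints the struct fields in the order shown here.

```python
import re

# Abbreviated copy of the gdb output above (raw string keeps '\000' literal).
SAMPLE = r"""
$1 = {region = {fd_offset = 0, size = 16777216, flags = 3, nr_mmaps = 0, mmaps = 0x0, nr = 0 '\000'}}
$2 = {region = {fd_offset = 1099511627776, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 1 '\001'}}
$3 = {region = {fd_offset = 2199023255552, size = 1073741824, flags = 15, nr_mmaps = 1, nr = 2 '\002'}}
$4 = {region = {fd_offset = 3298534883328, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 3 '\003'}}
"""


def region_sizes(gdb_text):
    """Return {region index: size} from printed vdev->bars[N] structs."""
    # "nr = " (with the space) does not match "nr_mmaps = ", so the lazy
    # match stops at the region's own index field.
    pairs = re.findall(r"size = (\d+).*?\bnr = (\d+)", gdb_text, flags=re.S)
    return {int(nr): int(size) for size, nr in pairs}


def sizes_look_correct(sizes):
    """Non-zero size for indexes 0 and 2, zero for indexes 1 and 3."""
    return (sizes.get(0, 0) > 0 and sizes.get(2, 0) > 0
            and sizes.get(1) == 0 and sizes.get(3) == 0)
```

Calling sizes_look_correct(region_sizes(text)) on a saved session gives a quick pass/fail for item 2) without reading each struct by eye.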
(In reply to Alex Williamson from comment #25)
> As noted in bug 1533634:
>
> 1) BAR 4 & 5 are exposed as scratch registers, as evidenced by the lspci and
> setpci examples run from within the guest in the bug report.
>
> 2) BAR0 and BAR1 are aliases of each, which is improper implementation of a
> 64bit BAR through the vfio API. The proper implementation is that only vfio
> BAR region 0 should report a size, BAR region 1 should be zero sized.
>
> 3) read/write backing of regions is not implemented as evidenced by the
> failure of the VM to work properly with the QEMU vfio-pci device option
> x-no-mmap=on.
>
>
> Let's handle 3) in bug 1533634.
>
> For verification of this bug, comment 21 partially verifies 1). I would
> suggest also using setpci from within the guest on the vGPU as shown in the
> original example.
>
> We can verify 2) using gdb. Install qemu-kvm-rhev-debuginfo, create a vGPU
> and use gdb as follows:
> <...>
> The key information here is the "size = " value. It should be non-zero for
> indexes 0 & 2 and zero for indexes 1 & 3, as shown above.

So am I correct in assuming there are no coding changes expected for this BZ and we only need verification from QE? If yes, please change the status back to ON_QA.
Yes
Tested on a Skylake host with an Iris Pro Graphics 580 GPU.

Packages used:
3.10.0-830.el7.x86_64 (host & guest)
qemu-kvm-rhev-2.10.0-17.el7.x86_64

Device info:
# lspci -vv -s 00:05.0
00:05.0 VGA compatible controller: Intel Corporation Iris Pro Graphics 580 (rev 09) (prog-if 00 [VGA controller])
    Subsystem: Intel Corporation Device 2064
    Physical Slot: 5
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 24
    Region 0: Memory at 140000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 180000000 (64-bit, prefetchable) [size=1G]
    Expansion ROM at febf1000 [disabled] [size=2K]
    Capabilities: [40] Vendor Specific Information: Len=0c <?>
    Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee00000  Data: 4041
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: i915
    Kernel modules: i915

For 1) in comment 25:

Read the value of BAR4:
# setpci -s 5.0 BASE_ADDRESS_4
00000000
Try to write to it:
# setpci -s 5.0 BASE_ADDRESS_4=ffffffff
Check whether BAR4 is writable:
# setpci -s 5.0 BASE_ADDRESS_4
00000000
This proves BAR4 is readable but not writable.

Apply the same test to BAR5:
# setpci -s 5.0 BASE_ADDRESS_5
00000000
# setpci -s 5.0 BASE_ADDRESS_5=ffffffff
# setpci -s 5.0 BASE_ADDRESS_5
00000000
BAR5 is also readable but not writable.

For 2), follow the instructions provided by Alex:

# gdb /usr/libexec/qemu-kvm
(gdb) b vfio_bars_setup
Breakpoint 1 at 0x2d319c: file /usr/include/bits/unistd.h, line 99.
(gdb) run -m 1G -net none -monitor stdio --enable-kvm -serial none -vga none -nographic -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/0042ed6e-cfd8-44bf-b825-ba22f1b1005f -S
Starting program: /usr/libexec/qemu-kvm -m 1G -net none -monitor stdio --enable-kvm -serial none -vga none -nographic -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/0042ed6e-cfd8-44bf-b825-ba22f1b1005f -S
Breakpoint 1, vfio_realize (pdev=0x555557e38000, errp=0x7fffffffdd40) at /usr/src/debug/qemu-2.10.0/hw/vfio/pci.c:2818
2818        vfio_bars_setup(vdev);
(gdb) p vdev->bars[0]
$1 = {region = {vbasedev = 0x555557e388e0, fd_offset = 0, mem = 0x555556d911d0, size = 16777216, flags = 3, nr_mmaps = 0, mmaps = 0x0, nr = 0 '\000'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[1]
$2 = {region = {vbasedev = 0x555557e388e0, fd_offset = 1099511627776, mem = 0x0, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 1 '\001'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[2]
$3 = {region = {vbasedev = 0x555557e388e0, fd_offset = 2199023255552, mem = 0x555556d913b0, size = 1073741824, flags = 15, nr_mmaps = 1, mmaps = 0x555556dc6c60, nr = 2 '\002'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}
(gdb) p vdev->bars[3]
$4 = {region = {vbasedev = 0x555557e388e0, fd_offset = 3298534883328, mem = 0x0, size = 0, flags = 0, nr_mmaps = 0, mmaps = 0x0, nr = 3 '\003'}, ioport = false, mem64 = false, quirks = {lh_first = 0x0}}

*size* of bar0 is 16777216
*size* of bar1 is 0
*size* of bar2 is 1073741824
*size* of bar3 is 0

This matches the expected result.
Verified per comment 28
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1062