Bug 1529618
Summary: [Q35] MMIO hint is not passed to the guest OS when set mem-reserve=4G
Product: Red Hat Enterprise Linux 7
Component: ovmf
Version: 7.5
Status: CLOSED NOTABUG
Severity: high
Priority: high
Reporter: jingzhao <jinzhao>
Assignee: Laszlo Ersek <lersek>
QA Contact: FuXiangChun <xfu>
CC: chayang, jinzhao, juzhang, marcel, virt-maint
Target Milestone: rc
Target Release: 7.4
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Clone Of: 1390346
Last Closed: 2018-01-05 12:40:04 UTC
Type: Bug

Comment 2
Laszlo Ersek
2018-01-03 11:14:36 UTC
Hi Laszlo,

I have summarized the test results below; I hope they are clear.

1. Used the default value "-device pcie-root-port,id=root.3,slot=4" and hot-plugged memory to the pcie-root-port:

{'execute': 'object-add', 'arguments': {'id': 'shmmem-shmem0', 'qom-type': 'memory-backend-ram', 'props': {'policy': 'default', 'size': 4294967296}}}
{"return": {}}
{'execute': 'device_add', 'arguments':{'id': 'shmem0','driver': 'ivshmem-plain', 'memdev': 'shmmem-shmem0', 'bus':'root.3'}}
{"return": {}}

Test result:

[root@localhost ~]# lspci -vvv -t
-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0 Red Hat, Inc. QXL paravirtual graphic card
           +-02.0-[01]----00.0 Red Hat, Inc. Virtio block device
           +-03.0-[02-06]----00.0-[03-06]--+-00.0-[04]--
           |                               +-01.0-[05]--
           |                               \-02.0-[06]--
           +-04.0-[07]----00.0 Red Hat, Inc. Virtio network device
           +-05.0-[08]----00.0 Red Hat, Inc. Inter-VM shared memory
           +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
           +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
           \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller

[root@localhost ~]# lspci -v -s 08:00.0
08:00.0 RAM memory: Red Hat, Inc. Inter-VM shared memory (rev 01)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Physical Slot: 4
        Flags: fast devsel
        Memory at 98000000 (32-bit, non-prefetchable) [size=256]
        Memory at <unassigned> (64-bit, prefetchable)
        Kernel modules: virtio_pci

2. Used "mem-reserve=1G" (-device pcie-root-port,id=root.3,slot=4,mem-reserve=1G) and repeated the hotplug operation:

[root@localhost ~]# lspci -v -s 08:00.0
08:00.0 RAM memory: Red Hat, Inc. Inter-VM shared memory (rev 01)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Physical Slot: 4
        Flags: fast devsel
        Memory at 98b00000 (32-bit, non-prefetchable) [size=256]
        Memory at <unassigned> (64-bit, prefetchable)
        Kernel modules: virtio_pci

[root@localhost ~]# lspci -vvv -t
-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0 Red Hat, Inc. QXL paravirtual graphic card
           +-02.0-[01]----00.0 Red Hat, Inc. Virtio block device
           +-03.0-[02-06]----00.0-[03-06]--+-00.0-[04]--
           |                               +-01.0-[05]--
           |                               \-02.0-[06]--
           +-04.0-[07]----00.0 Red Hat, Inc. Virtio network device
           +-05.0-[08]----00.0 Red Hat, Inc. Inter-VM shared memory
           +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
           +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
           \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller

3. Used "mem-reserve=4G" (-device pcie-root-port,id=root.3,slot=4,mem-reserve=4G) and repeated the hotplug operation:

[root@localhost ~]# lspci -vvv -t
-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0 Red Hat, Inc. QXL paravirtual graphic card
           +-02.0-[01]----00.0 Red Hat, Inc. Virtio block device
           +-03.0-[02-06]----00.0-[03-06]--+-00.0-[04]--
           |                               +-01.0-[05]--
           |                               \-02.0-[06]--
           +-04.0-[07]----00.0 Red Hat, Inc. Virtio network device
           +-05.0-[08]----00.0 Red Hat, Inc. Inter-VM shared memory
           +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
           +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
           \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller

[root@localhost ~]# lspci -v -s 08:00.0
08:00.0 RAM memory: Red Hat, Inc. Inter-VM shared memory (rev 01)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Physical Slot: 4
        Flags: fast devsel
        Memory at 98b00000 (32-bit, non-prefetchable) [size=256]
        Memory at <unassigned> (64-bit, prefetchable)
        Kernel modules: virtio_pci

For the detailed dmesg and OVMF logs, please check the attachments.

Created attachment 1376671 [details]
dmesg of default parameter
Created attachment 1376672 [details]
dmesg log of mem reserve 1G
Created attachment 1376673 [details]
dmesg of mem reserve 4G
Created attachment 1376674 [details]
ovmf log of default
Created attachment 1376675 [details]
ovmf log of mem reserve 1G
Created attachment 1376676 [details]
ovmf log of mem reserve 4G
Hi Jing,

thank you for summarizing the steps in comment 3. I have executed the same test (with ivshmem) earlier, successfully. And, I don't even need to check the logs in comments 4 through 9; I can see the problem in comment 3 already. Let me explain:

(1) The ivshmem device has two BARs:

- a small, fixed size (256 byte) MMIO BAR that is non-prefetchable,
- a large MMIO BAR that is prefetchable; this BAR size is controlled by the end-user (on the QEMU command line or in the monitor command)

(2) A "prefetchable" BAR means that it can be read by the system (not just the CPU, but also by other system components) at any time, without side effects to the device. A "non-prefetchable" BAR may only be read by the system if the CPU actually wants the device to do something.

Therefore, a "prefetchable" BAR may be placed in both prefetched and non-prefetched apertures. If it is allocated from a prefetched aperture, then the system might perform spurious reads from the BAR, but the device is fine with that. If the BAR is allocated from a non-prefetched aperture, then the system will simply not perform spurious reads.

Conversely, a "non-prefetchable" BAR may only be placed in non-prefetched apertures. Otherwise, the system might perform a spurious read to the BAR (through the prefetched aperture), and the device would *not* like that.

This means that the small BAR of the ivshmem device may only be allocated from non-prefetched aperture (*all* reads to this BAR will have an effect on the device). Whereas the large BAR of the ivshmem device (which is actually used for inter-VM memory sharing) may be allocated from both prefetched and non-prefetched apertures -- the device couldn't care less about "spurious" reads to the shared guest RAM.

(3) The "pcie-root-port" device has 3 properties that control MMIO aperture reservations:

(a) mem-reserve: reserve non-prefetched MMIO aperture, 32-bit *only*
(b) pref32-reserve: reserve prefetched MMIO aperture, 32-bit
(c) pref64-reserve: reserve prefetched MMIO aperture, 64-bit

There are two rules about them:

- each one of the three reservation hints is optional,
- the "pref32-reserve" and "pref64-reserve" hints are mutually exclusive.

(4) Given that you want to make the "large BAR" of the ivshmem device 4GB in size, you have to reserve *at least* 4GB aperture on the PCI Express root port level. Furthermore, given the >=4GB reservation size, *only* the "pref64-reserve" reservation hint is suitable.

NEEDINFO: Therefore, please replace the "mem-reserve" property on your QEMU command line, with "pref64-reserve", and repeat the test with the original (unchanged) QMP commands. The expected results are:

- The non-prefetchable "small BAR" of the ivshmem device will be allocated from the non-prefetched, 32-bit only, MMIO aperture that OVMF reserves by default for the PCI Express root port -- 2MB in size;
- The prefetchable "large BAR" of the device will be allocated from the prefetched, 64-bit MMIO aperture that OVMF will reserve due to the "pref64-reserve" property.

(5) Side topic: let's say you only want to use a 256MB ivshmem device. This means we need to reserve at least 256MB prefetched or non-prefetched aperture for the large BAR, and 256B non-prefetched aperture for the small BAR. Any of the following would work for that:

- mem-reserve=512M: the non-prefetched aperture reservation would contain both BARs
- pref32-reserve=256M: the small BAR would go into the default 2MB non-pref aperture, the large BAR would go into the 256MB 32-bit pref reservation
- pref64-reserve=256M: the small BAR would go into the default 2MB non-pref aperture, the large BAR would go into the 256MB 64-bit pref reservation

Thanks!

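The placement rule from point (2) and the hint selection from points (3) through (5) can be sketched in a few lines of Python. This is only an illustration, not code from QEMU or OVMF: the function names are hypothetical, and rounding the combined "mem-reserve" size up to a power of two is an assumption chosen to reproduce the 512M figure in point (5). The sketch also ignores how much 32-bit MMIO space a real guest actually has available.

```python
GIB = 1 << 30
MIB = 1 << 20
LIMIT_32BIT = 4 * GIB  # a 32-bit aperture cannot hold a 4 GiB (or larger) BAR


def may_place(bar_prefetchable: bool, aperture_prefetched: bool) -> bool:
    """Point (2): the only forbidden combination is a non-prefetchable BAR
    in a prefetched aperture."""
    return bar_prefetchable or not aperture_prefetched


def next_pow2(n: int) -> int:
    """Smallest power of two >= n (assumed rounding, see lead-in)."""
    return 1 << (n - 1).bit_length()


def reservation_options(large_bar: int, small_bar: int = 256) -> dict:
    """Return {property: size}; any single entry alone would fit the device.

    large_bar: size of the prefetchable ivshmem BAR in bytes
    small_bar: size of the non-prefetchable BAR (256 bytes for ivshmem)
    """
    options = {}
    # mem-reserve is non-prefetched and 32-bit only, so it must hold both BARs.
    both = next_pow2(large_bar + small_bar)
    if both < LIMIT_32BIT:
        options["mem-reserve"] = both
    # pref32-reserve holds only the large BAR; the small BAR uses the
    # default 2MB non-prefetched aperture.
    if large_bar < LIMIT_32BIT:
        options["pref32-reserve"] = large_bar
    # pref64-reserve works for any size.
    options["pref64-reserve"] = large_bar
    return options


for size in (256 * MIB, 4 * GIB):
    print(size // MIB, "MiB:", reservation_options(size))
```

For a 4 GiB large BAR the only entry left is "pref64-reserve", which is the change requested in the NEEDINFO above.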
Hi Laszlo,

Thanks for your detailed explanation. I tested again with "pref64-reserve=4G"; the detailed results follow.

1. Boot guest with qemu command line

/usr/libexec/qemu-kvm \
  -M q35 \
  -cpu SandyBridge \
  -nodefaults -rtc base=utc \
  -m 4G \
  -smp 2,sockets=2,cores=1,threads=1 \
  -enable-kvm \
  -name rhel7.4 \
  -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
  -k en-us \
  -serial unix:/tmp/console,server,nowait \
  -boot menu=on \
  -qmp tcp::8887,server,nowait \
  -vga qxl \
  -drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
  -drive file=/home/test/OVMF_VARS.fd,if=pflash,format=raw,unit=1,readonly=on \
  -spice port=5932,disable-ticketing \
  -debugcon file:/home/ovmf.log \
  -global isa-debugcon.iobase=0x402 \
  -device pcie-root-port,id=root.0,slot=1 \
  -drive file=/home/test/rhel75-ovmf-bk.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
  -device virtio-blk-pci,bus=root.0,drive=drive-virtio-disk0,id=virtio-disk0,disable-legacy=on,disable-modern=off,bootindex=1 \
  -device pcie-root-port,id=root.1,slot=2 \
  -device x3130-upstream,bus=root.1,id=upstream1 \
  -device xio3130-downstream,bus=upstream1,id=downstream1,chassis=1 \
  -device xio3130-downstream,bus=upstream1,id=downstream2,chassis=2 \
  -device xio3130-downstream,bus=upstream1,id=downstream3,chassis=3 \
  -device pcie-root-port,id=root.2,slot=3 \
  -netdev tap,id=hostnet1 \
  -device virtio-net-pci,netdev=hostnet1,id=net1,mac=54:52:00:B6:40:22,bus=root.2 \
  -device pcie-root-port,id=root.3,slot=4,pref64-reserve=4G \
  -monitor stdio \

2. Hotplug memory to root.3

3. Check the result in the guest

[root@localhost home]# lspci
00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
00:01.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 04)
00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:04.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:05.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
02:00.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Upstream) (rev 02)
03:00.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
03:01.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
03:02.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
07:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
08:00.0 RAM memory: Red Hat, Inc. Inter-VM shared memory (rev 01)

[root@localhost home]# lspci -vvv -t
-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0 Red Hat, Inc. QXL paravirtual graphic card
           +-02.0-[01]----00.0 Red Hat, Inc. Virtio block device
           +-03.0-[02-06]----00.0-[03-06]--+-00.0-[04]--
           |                               +-01.0-[05]--
           |                               \-02.0-[06]--
           +-04.0-[07]----00.0 Red Hat, Inc. Virtio network device
           +-05.0-[08]----00.0 Red Hat, Inc. Inter-VM shared memory
           +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
           +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
           \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller

[root@localhost home]# lspci -v -s 08:00.0
08:00.0 RAM memory: Red Hat, Inc. Inter-VM shared memory (rev 01)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Physical Slot: 4
        Flags: fast devsel
        Memory at 98000000 (32-bit, non-prefetchable) [size=256]
        Memory at 800000000 (64-bit, prefetchable) [size=4G]
        Kernel modules: virtio_pci

For the dmesg and OVMF logs, please check the attachments.

Created attachment 1377359 [details]
dmesg with pref64-reserve=4G
Created attachment 1377360 [details]
ovmf log with pref64-reserve=4G
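For anyone repeating the hotplug step without typing the QMP commands by hand, the following Python sketch builds the same two commands used in the first test and sends them over the QMP socket from the command line above ("-qmp tcp::8887,server,nowait"). The helper names `hotplug_commands` and `send_qmp` are hypothetical. The nested 'props' argument mirrors the commands as they appear in this bug and may need adjusting for other QEMU versions.

```python
import json
import socket


def hotplug_commands(size_bytes: int, bus: str) -> list:
    """Build the object-add / device_add pair used in the test above."""
    return [
        {"execute": "object-add",
         "arguments": {"id": "shmmem-shmem0",
                       "qom-type": "memory-backend-ram",
                       "props": {"policy": "default", "size": size_bytes}}},
        {"execute": "device_add",
         "arguments": {"id": "shmem0",
                       "driver": "ivshmem-plain",
                       "memdev": "shmmem-shmem0",
                       "bus": bus}},
    ]


def send_qmp(commands: list, host: str = "127.0.0.1", port: int = 8887) -> list:
    """Send commands over QMP and return the reply to each one."""
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        stream.readline()  # QMP greeting banner
        # Capabilities negotiation must come before any other command.
        for cmd in [{"execute": "qmp_capabilities"}] + commands:
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            while True:
                line = stream.readline()
                if not line:
                    raise ConnectionError("QMP connection closed")
                msg = json.loads(line)
                # Skip asynchronous events; keep only command replies.
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies


# Example (needs a running guest started with the command line above):
#   send_qmp(hotplug_commands(4 * 1024**3, "root.3"))
```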
* From the OVMF log:

The hint is parsed fine in OVMF (see "Prefetchable64BitMmio"):

> PciBus: Discovered PPB @ [00|05|00]
> GetResourcePadding: Address=00:05.0 DevicePath=PciRoot(0x0)/Pci(0x5,0x0)
> GetResourcePadding: BusNumbers=0xFFFFFFFF Io=0xFFFFFFFFFFFFFFFF NonPrefetchable32BitMmio=0xFFFFFFFF
> GetResourcePadding: Prefetchable32BitMmio=0xFFFFFFFF Prefetchable64BitMmio=0x100000000
>    Padding: Type = Mem32; Alignment = 0x1FFFFF; Length = 0x200000
>    Padding: Type = Io; Alignment = 0x1FF; Length = 0x200
>    Padding: Type = PMem64; Alignment = 0xFFFFFFFF; Length = 0x100000000
>    BAR[0]: Type = Mem32; Alignment = 0xFFF; Length = 0x1000; Offset = 0x10

You can see the default, non-pref, 32-bit only, aperture reservation, 2MB in size ("Mem32"). Similarly the default 512B IO-port reservation, "Io" (which would be disabled by "io-reserve=0" in other use cases). The "PMem64" padding stands for the 4GB reservation from "pref64-reserve=4G". The final entry ("BAR[0]") is the root port's SHPC (standard hot plug controller) BAR.

> PciBus: Resource Map for Bridge [00|05|00]
> Type = Io16; Base = 0x6000; Length = 0x200; Alignment = 0xFFF
>    Base = Padding; Length = 0x200; Alignment = 0x1FF
> Type = Mem32; Base = 0x98000000; Length = 0x200000; Alignment = 0x1FFFFF
>    Base = Padding; Length = 0x200000; Alignment = 0x1FFFFF
> Type = Mem32; Base = 0x98C03000; Length = 0x1000; Alignment = 0xFFF
> Type = PMem64; Base = 0x800000000; Length = 0x100000000; Alignment = 0xFFFFFFFF
>    Base = Padding; Length = 0x100000000; Alignment = 0xFFFFFFFF

This shows where the aperture reservations were actually allocated. The non-pref 32-bit-only 2MB reservation is allocated at 0x9800_0000. The pref 64-bit 4GB reservation is allocated at 0x8_0000_0000.

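The hexadecimal lengths in the "Padding:" lines are easy to misread, so here is a small Python sketch that decodes them into sizes. It is not part of OVMF; the regular expression and helper names are hypothetical and only match the line format quoted above. Sizes are printed in binary units (MiB, GiB), which correspond to the "2MB" and "4GB" figures in the comment.

```python
import re

# Matches lines such as:
#   Padding: Type = Mem32; Alignment = 0x1FFFFF; Length = 0x200000
PADDING_RE = re.compile(
    r"Padding: Type = (\w+); Alignment = (0x[0-9A-Fa-f]+); Length = (0x[0-9A-Fa-f]+)"
)


def human(n: int) -> str:
    """Format a byte count using the largest binary unit that divides it."""
    for unit, shift in (("GiB", 30), ("MiB", 20), ("KiB", 10)):
        if n >= (1 << shift) and n % (1 << shift) == 0:
            return f"{n >> shift} {unit}"
    return f"{n} B"


def parse_paddings(log_text: str) -> dict:
    """Return {resource type: (alignment mask, length)} from an OVMF log."""
    return {
        m.group(1): (int(m.group(2), 16), int(m.group(3), 16))
        for m in PADDING_RE.finditer(log_text)
    }


LOG = """\
Padding: Type = Mem32; Alignment = 0x1FFFFF; Length = 0x200000
Padding: Type = Io; Alignment = 0x1FF; Length = 0x200
Padding: Type = PMem64; Alignment = 0xFFFFFFFF; Length = 0x100000000
"""

paddings = parse_paddings(LOG)
for kind, (align, length) in paddings.items():
    # In these three lines the alignment mask equals length - 1.
    print(kind, human(length), "mask == length - 1:", align == length - 1)
```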
* From the guest kernel dmesg:

> [ 54.047108] pciehp 0000:00:05.0:pcie004: Slot(4): Attention button pressed
> [ 54.047117] pciehp 0000:00:05.0:pcie004: Slot(4): Card present
> [ 54.047143] pciehp 0000:00:05.0:pcie004: Slot(4) Powering on due to button press
> [ 55.149156] pci 0000:08:00.0: [1af4:1110] type 00 class 0x050000
> [ 55.149198] pci 0000:08:00.0: reg 0x10: [mem 0x00000000-0x000000ff]
> [ 55.149262] pci 0000:08:00.0: reg 0x18: [mem 0x00000000-0xffffffff 64bit pref]
> [ 55.149888] pci 0000:08:00.0: BAR 2: assigned [mem 0x800000000-0x8ffffffff 64bit pref]
> [ 55.161699] pci 0000:08:00.0: BAR 0: assigned [mem 0x98000000-0x980000ff]
> [ 55.162091] pcieport 0000:00:05.0: PCI bridge to [bus 08]
> [ 55.162098] pcieport 0000:00:05.0: bridge window [io 0x6000-0x6fff]
> [ 55.162681] pcieport 0000:00:05.0: bridge window [mem 0x98000000-0x981fffff]
> [ 55.176399] pcieport 0000:00:05.0: bridge window [mem 0x800000000-0x8ffffffff 64bit pref]

This shows -- matching the last "lspci" output from comment 11 -- that Linux allocates the ivshmem device's "small BAR" from the root port's 2MB reservation at 0x9800_0000, and the "large BAR" from the root port's 4GB reservation at 0x8_0000_0000. This matches the "expected results" from my comment 10 bullet (4).

Thus, everything's fine. Closing this BZ as NOTABUG -- using the "mem-reserve=4G" property was the issue; "pref64-reserve=4G" proved OK.
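The containment check described above (each BAR lies inside the matching bridge window) can be reproduced mechanically. The Python sketch below uses the address ranges quoted from the dmesg; the helper names are hypothetical and the regular expression only covers the "[mem 0x...-0x...]" form seen in these lines.

```python
import re

RANGE_RE = re.compile(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)")


def mem_range(line: str) -> tuple:
    """Extract the inclusive (start, end) range from a dmesg line."""
    m = RANGE_RE.search(line)
    if m is None:
        raise ValueError("no [mem ...] range in line")
    return int(m.group(1), 16), int(m.group(2), 16)


def contains(window: tuple, bar: tuple) -> bool:
    """True if the BAR range lies entirely within the bridge window."""
    return window[0] <= bar[0] and bar[1] <= window[1]


def size(rng: tuple) -> int:
    """Size in bytes of an inclusive range."""
    return rng[1] - rng[0] + 1


bar2 = mem_range("pci 0000:08:00.0: BAR 2: assigned [mem 0x800000000-0x8ffffffff 64bit pref]")
bar0 = mem_range("pci 0000:08:00.0: BAR 0: assigned [mem 0x98000000-0x980000ff]")
win32 = mem_range("pcieport 0000:00:05.0: bridge window [mem 0x98000000-0x981fffff]")
win64 = mem_range("pcieport 0000:00:05.0: bridge window [mem 0x800000000-0x8ffffffff 64bit pref]")

print("small BAR in 32-bit window:", contains(win32, bar0))
print("large BAR in 64-bit pref window:", contains(win64, bar2))
```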