Bug 2174749
Summary: | [edk2] re-enable dynamic mmio window | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Gerd Hoffmann <kraxel> |
Component: | edk2 | Assignee: | Gerd Hoffmann <kraxel> |
Status: | CLOSED ERRATA | QA Contact: | Xueqiang Wei <xuwei> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 9.2 | CC: | alex.williamson, berrange, coli, germano, gveitmic, jinzhao, juzhang, kraxel, mcasquer, nanliu, nilal, pbonzini, vgoyal, virt-bugs, virt-maint, xiaohli, xuwei, yanghliu, yduan, ymankad, yuma, zhguo |
Target Milestone: | rc | Keywords: | RFE, Triaged |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | edk2-20230524-2.el9 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 2174605 | Environment: | |
Last Closed: | 2023-11-07 08:24:29 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2171860 | ||
Bug Blocks: | 2055123, 2209005, 2209571 |
Description
Gerd Hoffmann
2023-03-02 11:24:03 UTC
https://gitlab.com/kraxel/centos-edk2/-/commits/bz2174749-enable-mmio-window
https://kojihub.stream.centos.org/koji/taskinfo?taskID=2134405

Tested the edk2 test loop with the scratch build on AMD and Intel hosts; no new bug was found.

Versions:
kernel-5.14.0-299.el9.x86_64
qemu-kvm-8.0.0-1.el9
edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152.noarch

Guest: rhel9.3, win11

Job link:
amd host: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7775914
Existing bug: Bug 2168446 - Booting VM failed on AMD EPYC 7252 host with npt=0
intel host: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7775924

Hi, I'm trying to migrate a RHEL 9.3 OVMF guest from the src host (Xeon(R) Silver 4110) to the destination host (Xeon(R) CPU E3-1240 v5) on edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152.noarch.

The address size of the source is 46 bits physical; the destination is 39 bits physical.

When the guest is booted without host-phys-bits-limit=39, the dst qemu dumps core after migration completes:
qemu-kvm: error: failed to set MSR 0x202 to 0xe000000000
qemu-kvm: ../target/i386/kvm/kvm.c:3177: int kvm_buf_set_msrs(X86CPU *): Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

But with host-phys-bits-limit=39, migration succeeds, and the VM works well after migration.

The above test results are what we expect, right?
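A minimal sketch of the arithmetic behind the failed MSR write, under two assumptions that the report itself does not state: that MSR 0x202 is IA32_MTRR_PHYSBASE1 (its number in the Intel SDM), and that the write is rejected because the base address needs more physical address bits than the destination CPU provides. The helper name `fits_in_phys_bits` is hypothetical.

```python
def fits_in_phys_bits(value: int, phys_bits: int) -> bool:
    """Return True if no bit at or above position `phys_bits` is set."""
    return value >> phys_bits == 0


# Value taken from the error message "failed to set MSR 0x202 to 0xe000000000".
MTRR_BASE_VALUE = 0xE000000000

if __name__ == "__main__":
    # 46 = source host, 39 = destination host (address sizes from the report).
    for phys_bits in (46, 39):
        verdict = "fits" if fits_in_phys_bits(MTRR_BASE_VALUE, phys_bits) else "does not fit"
        print(f"{MTRR_BASE_VALUE:#x} {verdict} in {phys_bits} physical address bits")
```

The value has bit 39 set, so it needs 40 address bits. That is consistent with migration working once the guest is limited to 39 bits with host-phys-bits-limit=39, since the firmware then lays out the address space below that limit.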
(In reply to Li Xiaohui from comment #6)
> Hi, I'm trying to migrate RHEL 9.3 OVMF guest from the src host: Xeon(R)
> Silver 4110 to the destination host: Xeon(R) CPU E3-1240 v5 on
> edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152.noarch,
>
> the Address sizes of the source is 46 bits physical, the destination is 39
> bits physical
>
> when boot guest without host-phys-bits-limit=39, then dst qemu would core
> dump after migration completion:
> qemu-kvm: error: failed to set MSR 0x202 to 0xe000000000
> qemu-kvm: ../target/i386/kvm/kvm.c:3177: int kvm_buf_set_msrs(X86CPU *):
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
>
> But with host-phys-bits-limit=39, migration succeeds, and VM works well
> after migration.
>
>
> The above test results are our expectation, right?

Yes. If you check /proc/iomem and /proc/mtrr in the guest (running on the host with 46 bits physical) you can see the different address space layouts used with/without host-phys-bits-limit=39.

(In reply to Gerd Hoffmann from comment #3)
> https://gitlab.com/kraxel/centos-edk2/-/commits/bz2174749-enable-mmio-window
> https://kojihub.stream.centos.org/koji/taskinfo?taskID=2134405

I use the above build to test Bug 2055123 - [Q35] Failed to hot-plug a device whose membar > 2M into the vm

My test result shows Bug 2055123 still can be reproduced.
Test env:
# uname -r
5.14.0-310.el9.x86_64
# rpm -q qemu-kvm
qemu-kvm-8.0.0-1.el9.x86_64
# rpm -qa|grep edk2
edk2-tools-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152.x86_64
edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152.noarch

Test step:
(1) start a domain
(2) hot-plug a XL710 PF into domain
# lspci -s 87:00.0
87:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
# /bin/virsh attach-device rhel93 /tmp/device/0000:87:00.0.xml
Device attached successfully
(3) check the PF status in the domain
# ifconfig
<-- I can not get any PF info here
# dmesg
[ 96.606506] pci 0000:04:00.0: [8086:1583] type 00 class 0x020000
[ 96.607008] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit pref]
[ 96.608131] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit pref]
[ 96.608873] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 96.609398] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max 2048)
[ 96.619171] pci 0000:04:00.0: BAR 0: no space for [mem size 0x01000000 64bit pref]
[ 96.619176] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit pref]
[ 96.619179] pci 0000:04:00.0: BAR 6: assigned [mem 0x80200000-0x8027ffff pref]
[ 96.619184] pci 0000:04:00.0: BAR 3: assigned [mem 0x80400000-0x80407fff 64bit pref]
[ 96.674774] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 96.674776] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 96.675213] i40e 0000:04:00.0: enabling device (0000 -> 0002)
[ 96.676685] i40e 0000:04:00.0: Cannot map registers, bar size 0x0 too small, aborting
[ 96.677242] i40e: probe of 0000:04:00.0 failed with error -12

> I use the above build to test Bug 2055123 - [Q35] Failed to hot-plug a
> device whose membar > 2M into the vm
>
> My test result shows Bug 2055123 still can be reproduced.
What is the exact qemu command line (or libvirt xml)?
(In reply to Gerd Hoffmann from comment #9)
> > I use the above build to test Bug 2055123 - [Q35] Failed to hot-plug a
> > device whose membar > 2M into the vm
> >
> > My test result shows Bug 2055123 still can be reproduced.
>
> What is the exact qemu command line (or libvirt xml)?

The virt-install I used to import a domain:
# virt-install --machine=q35 --noreboot --name=rhel93 --memory=4096 --vcpus=4 --graphics type=vnc,port=5993,listen=0.0.0.0 --boot=uefi --network bridge=switch,model=virtio,mac=52:54:00:00:93:93 --import --noautoconsole --disk path=/home/images/RHEL93.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20 --osinfo detect=on,require=off

The detailed domain xml:
<domain type='kvm'>
  <name>rhel93</name>
  <uuid>317e4316-4bc9-4997-b74a-acffc0056e4e</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
    <firmware>
      <feature enabled='yes' name='enrolled-keys'/>
      <feature enabled='yes' name='secure-boot'/>
    </firmware>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/rhel93_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <smm state='on'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'/>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='threads'/>
      <source file='/home/images/RHEL93.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x17'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x18'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x19'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='11' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='11' port='0x1a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1b'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1c'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x4'/>
    </controller>
    <controller type='pci' index='14' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='14' port='0x1d'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x5'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:00:93:93'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5993' autoport='no' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <audio id='1' type='none'/>
    <video>
      <model type='bochs' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <watchdog model='itco' action='reset'/>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
</domain>

> <domain type='kvm'>
> <cpu mode='host-passthrough' check='none' migratable='on'/>
Looks good.
Can you add 'lspci -v' output for the device and the pcie bridge
it is connected to (inside the guest, after hotplug)?
Hi Gerd, please check : [root@vm-210-139 ~]# lspci 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller 00:01.0 Display controller: Device 1234:1111 (rev 02) 00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.6 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:02.7 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:03.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port 00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03) 00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03) 00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03) 00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03) 00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02) 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02) 01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01) 02:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01) 03:00.0 Unclassified device [00ff]: Red Hat, Inc. 
Virtio memory balloon (rev 01) 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02) [root@vm-210-139 ~]# lspci -v 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0 00:01.0 Display controller: Device 1234:1111 (rev 02) Subsystem: Red Hat, Inc. Device 1100 Flags: bus master, fast devsel, latency 0 Memory at c0000000 (32-bit, prefetchable) [size=16M] Memory at c2650000 (32-bit, non-prefetchable) [size=4K] Expansion ROM at 80600000 [disabled] [size=32K] Capabilities: [80] Express Root Complex Integrated Endpoint, MSI 00 Kernel driver in use: bochs-drm Kernel modules: bochs 00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264f000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 00001000-00001fff [size=4K] Memory behind bridge: c2500000-c25fffff [size=1M] Prefetchable memory behind bridge: 0000385000000000-00003850000fffff [size=1M] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.1 PCI bridge: Red Hat, Inc. 
QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264e000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=02, subordinate=02, sec-latency=0 I/O behind bridge: 00002000-00002fff [size=4K] Memory behind bridge: c2400000-c24fffff [size=1M] Prefetchable memory behind bridge: 0000385000100000-00003850001fffff [size=1M] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264d000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=03, subordinate=03, sec-latency=0 I/O behind bridge: 00003000-00003fff [size=4K] Memory behind bridge: 80000000-801fffff [size=2M] Prefetchable memory behind bridge: 0000385000200000-00003850002fffff [size=1M] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264c000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=04, subordinate=04, sec-latency=0 I/O behind bridge: 00004000-00004fff [size=4K] Memory behind bridge: 80200000-803fffff [size=2M] Prefetchable memory behind bridge: 0000000080400000-00000000805fffff [size=2M] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. 
Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264b000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=05, subordinate=05, sec-latency=0 I/O behind bridge: 0000f000-0000ffff [size=4K] Memory behind bridge: c2200000-c23fffff [size=2M] Prefetchable memory behind bridge: 0000380000000000-00003807ffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c264a000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=06, subordinate=06, sec-latency=0 I/O behind bridge: 0000e000-0000efff [size=4K] Memory behind bridge: c2000000-c21fffff [size=2M] Prefetchable memory behind bridge: 0000380800000000-0000380fffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.6 PCI bridge: Red Hat, Inc. 
QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c2649000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=07, subordinate=07, sec-latency=0 I/O behind bridge: 0000d000-0000dfff [size=4K] Memory behind bridge: c1e00000-c1ffffff [size=2M] Prefetchable memory behind bridge: 0000381000000000-00003817ffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:02.7 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c2648000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=08, subordinate=08, sec-latency=0 I/O behind bridge: 0000c000-0000cfff [size=4K] Memory behind bridge: c1c00000-c1dfffff [size=2M] Prefetchable memory behind bridge: 0000381800000000-0000381fffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2647000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=09, subordinate=09, sec-latency=0 I/O behind bridge: 0000b000-0000bfff [size=4K] Memory behind bridge: c1a00000-c1bfffff [size=2M] Prefetchable memory behind bridge: 0000382000000000-00003827ffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. 
Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2646000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=0a, subordinate=0a, sec-latency=0 I/O behind bridge: 0000a000-0000afff [size=4K] Memory behind bridge: c1800000-c19fffff [size=2M] Prefetchable memory behind bridge: 0000382800000000-0000382fffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2645000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=0b, subordinate=0b, sec-latency=0 I/O behind bridge: 00009000-00009fff [size=4K] Memory behind bridge: c1600000-c17fffff [size=2M] Prefetchable memory behind bridge: 0000383000000000-00003837ffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.3 PCI bridge: Red Hat, Inc. 
QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2644000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=0c, subordinate=0c, sec-latency=0 I/O behind bridge: 00008000-00008fff [size=4K] Memory behind bridge: c1400000-c15fffff [size=2M] Prefetchable memory behind bridge: 0000383800000000-0000383fffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2643000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=0d, subordinate=0d, sec-latency=0 I/O behind bridge: 00007000-00007fff [size=4K] Memory behind bridge: c1200000-c13fffff [size=2M] Prefetchable memory behind bridge: 0000384000000000-00003847ffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:03.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 23 Memory at c2642000 (32-bit, non-prefetchable) [size=4K] Bus: primary=00, secondary=0e, subordinate=0e, sec-latency=0 I/O behind bridge: 00006000-00006fff [size=4K] Memory behind bridge: c1000000-c11fffff [size=2M] Prefetchable memory behind bridge: 0000384800000000-0000384fffffffff [size=32G] Capabilities: [54] Express Root Port (Slot+), MSI 00 Capabilities: [48] MSI-X: Enable+ Count=1 Masked- Capabilities: [40] Subsystem: Red Hat, Inc. 
Device 0000 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Access Control Services Kernel driver in use: pcieport 00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03) (prog-if 00 [UHCI]) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 16 I/O ports at 5040 [size=32] Kernel driver in use: uhci_hcd 00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03) (prog-if 00 [UHCI]) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 17 I/O ports at 5060 [size=32] Kernel driver in use: uhci_hcd 00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03) (prog-if 00 [UHCI]) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 18 I/O ports at 5080 [size=32] Kernel driver in use: uhci_hcd 00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI]) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 19 Memory at c2641000 (32-bit, non-prefetchable) [size=4K] Kernel driver in use: ehci-pci 00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0 Kernel driver in use: lpc_ich Kernel modules: lpc_ich 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02) (prog-if 01 [AHCI 1.0]) Subsystem: Red Hat, Inc. 
QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 46 I/O ports at 50a0 [size=32] Memory at c2640000 (32-bit, non-prefetchable) [size=4K] Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [a8] SATA HBA v1.0 Kernel driver in use: ahci Kernel modules: ahci 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02) Subsystem: Red Hat, Inc. QEMU Virtual Machine Flags: bus master, fast devsel, latency 0, IRQ 16 I/O ports at 5000 [size=64] Kernel driver in use: i801_smbus Kernel modules: i2c_i801 01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01) Subsystem: Red Hat, Inc. Device 1100 Physical Slot: 0 Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c2500000 (32-bit, non-prefetchable) [size=4K] Memory at 385000000000 (64-bit, prefetchable) [size=16K] Expansion ROM at c2540000 [disabled] [size=256K] Capabilities: [dc] MSI-X: Enable+ Count=4 Masked- Capabilities: [c8] Vendor Specific Information: VirtIO: <unknown> Capabilities: [b4] Vendor Specific Information: VirtIO: Notify Capabilities: [a4] Vendor Specific Information: VirtIO: DeviceCfg Capabilities: [94] Vendor Specific Information: VirtIO: ISR Capabilities: [84] Vendor Specific Information: VirtIO: CommonCfg Capabilities: [7c] Power Management version 3 Capabilities: [40] Express Endpoint, MSI 00 Kernel driver in use: virtio-pci 02:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01) Subsystem: Red Hat, Inc. 
Device 1100 Physical Slot: 0-2 Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at c2400000 (32-bit, non-prefetchable) [size=4K] Memory at 385000100000 (64-bit, prefetchable) [size=16K] Capabilities: [dc] MSI-X: Enable+ Count=5 Masked- Capabilities: [c8] Vendor Specific Information: VirtIO: <unknown> Capabilities: [b4] Vendor Specific Information: VirtIO: Notify Capabilities: [a4] Vendor Specific Information: VirtIO: DeviceCfg Capabilities: [94] Vendor Specific Information: VirtIO: ISR Capabilities: [84] Vendor Specific Information: VirtIO: CommonCfg Capabilities: [7c] Power Management version 3 Capabilities: [40] Express Endpoint, MSI 00 Kernel driver in use: virtio-pci 03:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon (rev 01) Subsystem: Red Hat, Inc. Device 1100 Physical Slot: 0-3 Flags: bus master, fast devsel, latency 0, IRQ 22 Memory at 385000200000 (64-bit, prefetchable) [size=16K] Capabilities: [c8] Vendor Specific Information: VirtIO: <unknown> Capabilities: [b4] Vendor Specific Information: VirtIO: Notify Capabilities: [a4] Vendor Specific Information: VirtIO: DeviceCfg Capabilities: [94] Vendor Specific Information: VirtIO: ISR Capabilities: [84] Vendor Specific Information: VirtIO: CommonCfg Capabilities: [7c] Power Management version 3 Capabilities: [40] Express Endpoint, MSI 00 Kernel driver in use: virtio-pci 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02) Subsystem: Intel Corporation Ethernet Converged Network Adapter XL710-Q2 Physical Slot: 0-4 Flags: fast devsel Memory at <unassigned> (64-bit, prefetchable) Memory at 80400000 (64-bit, prefetchable) [size=32K] Expansion ROM at 80200000 [virtual] [disabled] [size=512K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ Capabilities: [70] MSI-X: Enable- Count=129 Masked- Capabilities: [a0] Express Endpoint, MSI 00 Capabilities: [e0] Vital Product Data Capabilities: 
[100] Advanced Error Reporting Capabilities: [140] Device Serial Number a8-90-15-ff-ff-fe-fd-3c Capabilities: [1a0] Transaction Processing Hints Capabilities: [1b0] Access Control Services Kernel modules: i40e

> 00:02.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal
> decode])
> Flags: bus master, fast devsel, latency 0, IRQ 22
> Memory at c264c000 (32-bit, non-prefetchable) [size=4K]
> Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
> I/O behind bridge: 00004000-00004fff [size=4K]
> Memory behind bridge: 80200000-803fffff [size=2M]
> Prefetchable memory behind bridge: 0000000080400000-00000000805fffff
> [size=2M]
> Capabilities: [54] Express Root Port (Slot+), MSI 00
> Capabilities: [48] MSI-X: Enable+ Count=1 Masked-
> Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [148] Access Control Services
> Kernel driver in use: pcieport

This should be the root port used by the nic. Has a 2M prefetchable memory window.

> 00:02.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal
> decode])
> Prefetchable memory behind bridge: 0000380000000000-00003807ffffffff
> [size=32G]
> 00:02.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal
> decode])
> Prefetchable memory behind bridge: 0000380800000000-0000380fffffffff
> [size=32G]

All other ports have 32G. Hmm. There is no difference in libvirt xml ...

> 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for
> 40GbE QSFP+ (rev 02)
> Subsystem: Intel Corporation Ethernet Converged Network Adapter
> XL710-Q2
> Physical Slot: 0-4
> Flags: fast devsel
> Memory at <unassigned> (64-bit, prefetchable)
> Memory at 80400000 (64-bit, prefetchable) [size=32K]
> Expansion ROM at 80200000 [virtual] [disabled] [size=512K]

Finally the NIC. Can you attach the complete kernel log (booting plus hotplug) please?
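A rough sketch of the size comparison behind the "BAR 0: no space for" message, assuming a BAR can only be placed when it is no larger than the bridge window. Alignment and other devices sharing the window are ignored, so this is a simplification of what the kernel actually does. The helper names `window_size` and `bar_fits` are hypothetical; the ranges and the BAR size come from the lspci and dmesg output above.

```python
def window_size(span: str) -> int:
    """Size in bytes of an lspci-style 'start-end' hex range (end inclusive)."""
    start, end = (int(part, 16) for part in span.split("-"))
    return end - start + 1


def bar_fits(bar_size: int, span: str) -> bool:
    """Simplified check: the BAR must not be larger than the window."""
    return bar_size <= window_size(span)


XL710_BAR0_SIZE = 0x01000000  # "BAR 0: no space for [mem size 0x01000000 64bit pref]"
PORT_02_3_PREF = "0000000080400000-00000000805fffff"  # root port used by the NIC
PORT_02_4_PREF = "0000380000000000-00003807ffffffff"  # one of the 32G ports

if __name__ == "__main__":
    for name, span in (("00:02.3", PORT_02_3_PREF), ("00:02.4", PORT_02_4_PREF)):
        size_mib = window_size(span) // (1024 * 1024)
        print(f"{name}: window {size_mib} MiB, 16 MiB BAR fits: {bar_fits(XL710_BAR0_SIZE, span)}")
```

Under this simplification the 16M BAR cannot fit behind port 00:02.3 with its 2M prefetchable window, while any of the 32G windows would have room for it.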
I'm wondering where these differences in pcie root port configuration are coming from. I'd expect all ports to have 32G windows ...

New test build:
https://kojihub.stream.centos.org/koji/taskinfo?taskID=2216516

Any change with this one?

(In reply to Gerd Hoffmann from comment #15)
> New test build:
> https://kojihub.stream.centos.org/koji/taskinfo?taskID=2216516
>
> Any change with this one?

Hi Gerd,

My test result shows the PF can be hot-plugged into domain successfully now.

Test env:
edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch

Test step:
(1) start a domain
# virt-install --machine=q35 --noreboot --name=rhel93 --memory=4096 --vcpus=4 --graphics type=vnc,port=5993,listen=0.0.0.0 --boot=uefi --network bridge=switch,model=virtio,mac=52:54:00:00:93:93 --import --noautoconsole --disk path=/home/images/RHEL93.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20 --osinfo detect=on,require=off

(2) hot-plug a XL710 PF into domain
# lspci -s 87:00.0
87:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
# /bin/virsh attach-device rhel93 /tmp/device/0000:87:00.0.xml
Device attached successfully

(3) check the PF status in the domain
# ifconfig
...
enp4s0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500 ether 3c:fd:fe:15:90:a8 txqueuelen 1000 (Ethernet) RX packets 0 bytes 0 (0.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 0 bytes 0 (0.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 # dmesg [ 38.456761] pci 0000:04:00.0: [8086:1583] type 00 class 0x020000 [ 38.457666] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit pref] [ 38.458598] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit pref] [ 38.459323] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref] [ 38.459874] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max 2048) [ 38.471403] pci 0000:04:00.0: BAR 0: assigned [mem 0x381800000000-0x381800ffffff 64bit pref] [ 38.472489] pci 0000:04:00.0: BAR 6: assigned [mem 0xc2400000-0xc247ffff pref] [ 38.472496] pci 0000:04:00.0: BAR 3: assigned [mem 0x381801000000-0x381801007fff 64bit pref] [ 38.514703] i40e: Intel(R) Ethernet Connection XL710 Network Driver [ 38.514705] i40e: Copyright (c) 2013 - 2019 Intel Corporation. 
[ 38.515275] i40e 0000:04:00.0: enabling device (0000 -> 0002) [ 38.536779] i40e 0000:04:00.0: fw 9.80.70867 api 1.15 nvm 9.00 0x8000cadc 21.5.9 [8086:1583] [8086:0006] [ 38.603332] i40e 0000:04:00.0: MAC address: 3c:fd:fe:15:90:a8 [ 38.604352] i40e 0000:04:00.0: FW LLDP is enabled [ 38.613157] i40e 0000:04:00.0: PCI-Express: Speed 8.0GT/s Width x8 [ 38.613832] i40e 0000:04:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 4 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA [ 38.625170] i40e 0000:04:00.0 enp4s0: renamed from eth0 The full domain dmesg of different PFs is as following: http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:15:19_XL710 http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:30:25_82599ES http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:58:13_MT2892 http://10.73.72.41/log/bug/Bug2174749/2023_05_15_09:01:27_QL41112 (In reply to Gerd Hoffmann from comment #15) > New test build: > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2216516 Hi Gerd, Does QE need to do sanity test with this build instead of edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152? Thanks. Best regards Nana Patch posted upstream https://edk2.groups.io/g/devel/message/104919 > > New test build: > > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2216516 > > Does QE need to do sanity test with this build instead of > edk2-ovmf-20230301gitf80f052277c8-2.el9.bz2174749.20230418.1152? Yes, please. The new scratch build has both patches, #1 which enables the dynamic mmio window, and #2 which fixes comment 8 problem. > > Hi Gerd, > > My test result shows the PF can be hot-plugged into domain successfully now. 
> > Test env: > edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch > > Test step: > (1) start a domain > # virt-install --machine=q35 --noreboot --name=rhel93 --memory=4096 > --vcpus=4 --graphics type=vnc,port=5993,listen=0.0.0.0 --boot=uefi --network > bridge=switch,model=virtio,mac=52:54:00:00:93:93 --import --noautoconsole > --disk > path=/home/images/RHEL93.qcow2,bus=virtio,cache=none,format=qcow2,io=threads, > size=20 --osinfo detect=on,require=off > > (2) hot-plug a XL710 PF into domain > # lspci -s 87:00.0 > 87:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for > 40GbE QSFP+ (rev 02) > # /bin/virsh attach-device rhel93 /tmp/device/0000:87:00.0.xml > Device attached successfully > > (3) check the PF status in the domain > # ifconfig > ... > enp4s0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500 > ether 3c:fd:fe:15:90:a8 txqueuelen 1000 (Ethernet) > RX packets 0 bytes 0 (0.0 B) > RX errors 0 dropped 0 overruns 0 frame 0 > TX packets 0 bytes 0 (0.0 B) > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 > # dmesg > [ 38.456761] pci 0000:04:00.0: [8086:1583] type 00 class 0x020000 > [ 38.457666] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit > pref] > [ 38.458598] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit > pref] > [ 38.459323] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref] > [ 38.459874] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max > 2048) > [ 38.471403] pci 0000:04:00.0: BAR 0: assigned [mem > 0x381800000000-0x381800ffffff 64bit pref] > [ 38.472489] pci 0000:04:00.0: BAR 6: assigned [mem 0xc2400000-0xc247ffff > pref] > [ 38.472496] pci 0000:04:00.0: BAR 3: assigned [mem > 0x381801000000-0x381801007fff 64bit pref] > [ 38.514703] i40e: Intel(R) Ethernet Connection XL710 Network Driver > [ 38.514705] i40e: Copyright (c) 2013 - 2019 Intel Corporation. 
> [ 38.515275] i40e 0000:04:00.0: enabling device (0000 -> 0002)
> [ 38.536779] i40e 0000:04:00.0: fw 9.80.70867 api 1.15 nvm 9.00 0x8000cadc
> 21.5.9 [8086:1583] [8086:0006]
> [ 38.603332] i40e 0000:04:00.0: MAC address: 3c:fd:fe:15:90:a8
> [ 38.604352] i40e 0000:04:00.0: FW LLDP is enabled
> [ 38.613157] i40e 0000:04:00.0: PCI-Express: Speed 8.0GT/s Width x8
> [ 38.613832] i40e 0000:04:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 4
> RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
> [ 38.625170] i40e 0000:04:00.0 enp4s0: renamed from eth0
>
> The full domain dmesg of different PFs is as following:
> http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:15:19_XL710
> http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:30:25_82599ES
> http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:58:13_MT2892
> http://10.73.72.41/log/bug/Bug2174749/2023_05_15_09:01:27_QL41112

Hi Gerd,

My test result shows the SFC9220 PF still cannot be hot-plugged into the domain, even with edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch.

The full domain dmesg:
http://10.73.72.41/log/bug/Bug2174749/2023_05_22_17:08:19_SFC9220_BZ

Only the SFC9220 PF test failed currently.
Could you please help check it?

> The full domain dmesg:
> http://10.73.72.41/log/bug/Bug2174749/2023_05_22_17:08:19_SFC9220_BZ
>
> Only SFC9220 PF test failed currently.
What is lspci output for this device (on the host)?
(In reply to Gerd Hoffmann from comment #23) > > The full domain dmesg: > > http://10.73.72.41/log/bug/Bug2174749/2023_05_22_17:08:19_SFC9220_BZ > > > > Only SPF9220 PF test failed currently. > > What is lspci output for this device (on the host)? Just like: # lspci -vv -s 0000:1a:00.0 1a:00.0 Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (rev 02) Subsystem: Solarflare Communications SFN8522-R2 8000 Series 10G Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Interrupt: pin A routed to IRQ 114 NUMA node: 0 IOMMU group: 23 Region 0: I/O ports at 4100 [size=256] Region 2: Memory at 9e000000 (64-bit, non-prefetchable) [size=8M] Region 4: Memory at a6904000 (64-bit, non-prefetchable) [size=16K] Expansion ROM at a6a40000 [disabled] [size=256K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [70] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <8us ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset- MaxPayload 256 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend- LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 8GT/s (ok), Width x8 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, 
ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp+ ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS- LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+ EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [b0] MSI-X: Enable+ Count=32 Masked- Vector table: BAR=4 offset=00000000 PBA: BAR=4 offset=00002000 Capabilities: [d0] Vital Product Data Product Name: Solarflare Flareon Ultra 8000 Series 10G Adapter Read-only fields: [PN] Part number: SFN8522 [SN] Serial number: 852200210000170117100443 [EC] Engineering changes: PCBR2:CCSA2 [V0] Vendor specific: 8.0.0 [VD] Vendor specific: 8.0.0 [VL] Vendor specific: [VA] Vendor specific: 0x0000000000000000 [VF] Vendor specific: 0x0000000000000000 [RV] Reserved: checksum good, 148 byte(s) reserved End Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+ MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [148 v1] Device Serial Number 00-0f-53-ff-ff-4d-8c-30 Capabilities: [158 v1] 
Alternative Routing-ID Interpretation (ARI) ARICap: MFVC- ACS-, Next Function: 1 ARICtl: MFVC- ACS-, Function Group: 0 Capabilities: [168 v1] Secondary PCI Express LnkCtl3: LnkEquIntrruptEn- PerformEqu- LaneErrStat: 0 Capabilities: [198 v1] Single Root I/O Virtualization (SR-IOV) IOVCap: Migration-, Interrupt Message Number: 000 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+ IOVSta: Migration- Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00 VF offset: 2, stride: 1, Device ID: 1a03 Supported Page Size: 00000553, System Page Size: 00000001 Region 0: Memory at 00000000a2800000 (64-bit, non-prefetchable) Region 2: Memory at 00000000a6908000 (64-bit, non-prefetchable) VF Migration: offset: 00000000, BIR: 0 Capabilities: [1d8 v1] Transaction Processing Hints Device specific mode supported No steering table available Capabilities: [26c v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=10us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=10us Kernel driver in use: sfc Kernel modules: sfc > > What is lspci output for this device (on the host)? > > # lspci -vv -s 0000:1a:00.0 > 1a:00.0 Ethernet controller: Solarflare Communications SFC9220 10/40G > Ethernet Controller (rev 02) > Region 2: Memory at 9e000000 (64-bit, non-prefetchable) [size=8M] So a 8M non-prefetchable bar. Hmm. The mmio window scaling is applied only to the prefetchable memory window (which usually represent device memory and can be quite big). Changing the window via '-device pcie-root-port.mem-reserve=...' should work as workaround here. The non-prefetchable memory window is 32-bit, so we don't have that much address space available there. Bumping the default size doesn't look like a good plan. There are two bars, so we'll need a 16M window, which consume 1G address space with only 64 pcie root ports. 
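
The address-space estimate in this comment (a 16M non-prefetchable window on each of 64 pcie root ports adds up to 1G) can be checked with a small sketch. The function name is hypothetical and the 4G figure is simply the size of the 32-bit address range; nothing here is taken from edk2 or QEMU code.

```python
# Illustrative arithmetic only; names are hypothetical, not edk2/QEMU APIs.
MIB = 1024 * 1024
GIB = 1024 * MIB


def total_window_space(window_bytes: int, root_ports: int) -> int:
    """Address space consumed if every root port reserves one window of this size."""
    return window_bytes * root_ports


# Numbers from the comment: a 16M window per port, 64 pcie root ports.
total = total_window_space(16 * MIB, 64)
share_of_32bit = total / (4 * GIB)  # fraction of the whole 32-bit address range

print(f"{total // GIB} GiB total, {share_of_32bit:.0%} of the 32-bit range")
```

The usable 32-bit MMIO range is smaller than the full 4G because guest RAM below 4G and other fixed regions also live there, so the real share would be higher than this sketch suggests.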
The 'PF' in the test name suggests this is a SR/IOV device. Is there a specific reason you assign the complete PF device instead of the VFs? Tested edk2 test loop with the scratch build mentioned in Comment 15, no new issue was found. Versions: kernel-5.14.0-306.el9.x86_64 qemu-kvm-8.0.0-2.el9 edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch Edk2 test loop with scratch build edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch Job link: http://virtqetools.lab.eng.pek2.redhat.com/kvm_autotest_job_log/?jobid=7875589 (In reply to Gerd Hoffmann from comment #25) > > > What is lspci output for this device (on the host)? > > > > # lspci -vv -s 0000:1a:00.0 > > 1a:00.0 Ethernet controller: Solarflare Communications SFC9220 10/40G > > Ethernet Controller (rev 02) > > > Region 2: Memory at 9e000000 (64-bit, non-prefetchable) [size=8M] > > So a 8M non-prefetchable bar. Hmm. > > The mmio window scaling is applied only to the prefetchable memory > window (which usually represent device memory and can be quite big). > > Changing the window via '-device pcie-root-port.mem-reserve=...' > should work as workaround here. > > The non-prefetchable memory window is 32-bit, so we don't have that > much address space available there. Bumping the default size doesn't > look like a good plan. There are two bars, so we'll need a 16M window, > which consume 1G address space with only 64 pcie root ports. > > The 'PF' in the test name suggests this is a SR/IOV device. Is there > a specific reason you assign the complete PF device instead of the VFs? I think this question might be better to be answered by Alex, and see if we need to open another bug to search for a solution or document such limitation only (In reply to Guo, Zhiyi from comment #27) > (In reply to Gerd Hoffmann from comment #25) > > > > What is lspci output for this device (on the host)? 
> > > > > > # lspci -vv -s 0000:1a:00.0 > > > 1a:00.0 Ethernet controller: Solarflare Communications SFC9220 10/40G > > > Ethernet Controller (rev 02) > > > > > Region 2: Memory at 9e000000 (64-bit, non-prefetchable) [size=8M] > > > > So a 8M non-prefetchable bar. Hmm. > > > > The mmio window scaling is applied only to the prefetchable memory > > window (which usually represent device memory and can be quite big). > > > > Changing the window via '-device pcie-root-port.mem-reserve=...' > > should work as workaround here. > > > > The non-prefetchable memory window is 32-bit, so we don't have that > > much address space available there. Bumping the default size doesn't > > look like a good plan. There are two bars, so we'll need a 16M window, > > which consume 1G address space with only 64 pcie root ports. > > > > The 'PF' in the test name suggests this is a SR/IOV device. Is there > > a specific reason you assign the complete PF device instead of the VFs? > > I think this question might be better to be answered by Alex, and see if we > need to open another bug to search for a solution or document such > limitation only I'm not sure what I'm supposed to answer here. It's very unusual for a device to report a 64-bit, non-prefetchable BAR requirement, there is no way for a bridge to provide anything other than 32-bit non-prefetchable apertures, so these must fit within the 32-bit MMIO space. It's even more absurd that the SR-IOV BARs for the device are also non-prefetchable. Is there maybe a firmware update for this device? We were recently pointed to an Insights report for customer NIC configurations (https://issues.redhat.com/browse/INSPEC-395). There's not a single SolarFlare card there. As Gerd says, we cannot allocate arbitrarily large non-prefetchable space, there's a small, finite range of 32-bit MMIO. 
It seems sufficient to me if there are workarounds to make this device hot-pluggable, this extent of non-prefetchable space is simply not tenable for a default aperture margin. Hi Gerd, CPU sanity test failed with booting old cpu models (Skylake-Client-noTSX, Broadwell-noTSX, Haswell-noTSX or older). Test Env: Host: edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch 5.14.0-316.el9.x86_64 qemu-kvm-8.0.0-2.el9.x86_64 intel-eaglestream-spr-07.khw1.lab.eng.bos.redhat.com Model name: Intel(R) Xeon(R) Platinum 8468H Guest: RHEL9.3 And get the error firmware log CPU[0BF] APIC ID=00DF SMBASE=7FFAF000 SaveState=7FFBEC00 Size=00000400 Stacks - 0x7F9B4000 mSmmStackSize - 0x6000 PcdCpuSmmStackGuard - 0x1 mXdSupported - 0x1 One Semaphore Size = 0x40 Total Semaphores Size = 0xC140 PhysicalAddressBits = 48, 5LPageTable = 0. 5LevelPaging Needed - 0 1GPageTable Support - 0 PcdCpuSmmRestrictedMemoryAccess - 1 PhysicalAddressBits - 40 Initialize IDT IST field for SMM Stack Guard InstallProtocolInterface: 26EEB3DE-B689-492E-80F0-BE8BD7DA4BA7 7FFD4130 SMM IPL registered SMM Entry Point address 7FFEFD89 SmmInstallProtocolInterface: EB346B97-975F-4A9F-8B22-F8E92BB3D569 7FFD4170 SmmInstallProtocolInterface: 69B792EA-39CE-402D-A2A6-F721DE351DFE 7FFD4070 CpuSmm: SpinLock Size = 0x40, PcdCpuSmmMpTokenCountPerChunk = 0x40 SmmInstallProtocolInterface: 5D5450D7-990C-4180-A803-8E63F0608307 7FFD4220 SmmInstallProtocolInterface: 1D202CAB-C8AB-4D5C-94F7-3CFCC0D3D335 7FFD41E0 SmmInstallProtocolInterface: AA00D50B-4911-428F-B91A-A59DDB13E24C 7FFD4020 SMM CPU Module exit from SMRAM with EFI_SUCCESS SMM IPL closed SMRAM window CcMeasurementProtocol is not installed. - Not Found Tcg2Protocol is not installed. - Not Found None of Tcg2Protocol/CcMeasurementProtocol is installed. 
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7D86F118 SmmInstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7FFEB6C0 Loading SMM driver at 0x0007F96C000 EntryPoint=0x0007F96F8F7 FvbServicesSmm.efi QEMU Flash: Attempting flash detection at FFC00010 QemuFlashDetected => FD behaves as FLASH QemuFlashDetected => Yes Installing QEMU flash SMM FVB SmmInstallProtocolInterface: D326D041-BD31-4C01-B5A8-628BE87F0653 7F96BEB0 SmmInstallProtocolInterface: 09576E91-6D3F-11D2-8E39-00A0C969723B 7F96BE18 CcMeasurementProtocol is not installed. - Not Found Tcg2Protocol is not installed. - Not Found None of Tcg2Protocol/CcMeasurementProtocol is installed. InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7D86F918 SmmInstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7FFEB2C0 Loading SMM driver at 0x0007F0B7000 EntryPoint=0x0007F1008BB VariableSmm.efi mSmmMemLibInternalMaximumSupportAddress = 0xFFFFFFFFFF VarCheckLibRegisterSetVariableCheckHandler - 0x7F0FC652 Success VarCheckLibRegisterSetVariableCheckHandler - 0x7F0FADBF Success Variable driver common space: 0x3FF9C 0x3FF9C 0x3FF9C Variable driver will work with auth variable format! ASSERT_EFI_ERROR (Status = Out of Resources) ASSERT /builddir/build/BUILD/edk2-f80f052277c8/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c(1164): !(((INTN)(RETURN_STATUS)(Status)) < 0) Could you please help this, thanks. Best regards Nana Add the full command line for Comment 30. 
/usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
    -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
    -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/home/filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \
    -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
    -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
    -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
    -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \
    -nodefaults \
    -device '{"driver": "VGA", "bus": "pcie-pci-bridge-0", "addr": "0x1"}' \
    -m 125952 \
    -object '{"size": 132070244352, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' \
    -smp 192,maxcpus=192,cores=96,threads=1,dies=1,sockets=2 \
    -cpu 'Skylake-Client-noTSX-IBRS',enforce,+kvm_pv_unhalt \
    -chardev socket,server=on,wait=off,path=/tmp/monitor-qmpmonitor,id=qmp_id_qmpmonitor1 \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -device '{"ioport": 1285, "driver": "pvpanic", "id": "idVJSPrN"}' \
    -chardev socket,server=on,wait=off,path=/tmp/serial-serial0,id=chardev_serial0 \
    -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' \
    -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
    -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
    -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
    -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
    -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
    -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \
    -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
    -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
    -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
    -device '{"driver": "virtio-net-pci", "mac": "9a:49:ba:f9:e1:88", "id": "idx7j24n", "netdev": "idLn9MLW", "bus": "pcie-root-port-3", "addr": "0x0"}' \
    -netdev tap,id=idLn9MLW,vhost=on \
    -vnc :0 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -chardev file,id=firmware,path=/tmp/edk2.log \
    -device isa-debugcon,iobase=0x402,chardev=firmware \

> -smp 192,maxcpus=192,cores=96,threads=1,dies=1,sockets=2 \
With that many cpus you most likely need more tseg memory.
From docs/interop/firmware.json (qemu.git):
<quote>
# Furthermore, a large guest-physical address space
# (comprising guest RAM, memory hotplug range, and 64-bit
# PCI MMIO aperture), and/or a high VCPU count, may
# present high SMRAM requirements from the firmware. On
# the "pc-q35-*" machine types of the @i386 and @x86_64
# emulation targets, the SMRAM size may be increased
# above the default 16MB with the "-global
# mch.extended-tseg-mbytes=uint16" option. As a rule of
# thumb, the default 16MB size suffices for 1TB of
# guest-phys address space and a few tens of VCPUs; for
# every further TB of guest-phys address space, add 8MB
# of SMRAM. 48MB should suffice for 4TB of guest-phys
# address space and 2-3 hundred VCPUs.
</quote>
I'd suggest to start with '-global mch.extended-tseg-mbytes=32'.
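
As a rough aid, the address-space part of the rule of thumb quoted above (16MB covers 1TB, add 8MB for every further TB) can be written down as a small sketch. The function name is hypothetical, and the sketch deliberately ignores the VCPU-count contribution, which the quoted text does not quantify; treat the result as a lower bound and increase it for guests with many VCPUs, such as the 192-VCPU guest in this report.

```python
import math


def tseg_mbytes_for_address_space(guest_phys_tb: float,
                                  default_mb: int = 16,
                                  per_extra_tb_mb: int = 8) -> int:
    """Lower-bound TSEG size from the quoted rule of thumb.

    Covers only the guest-physical address space term: the default 16MB
    for the first TB, plus 8MB for every further (started) TB. The extra
    SMRAM needed for high VCPU counts is NOT modelled here.
    """
    extra_tb = max(0, math.ceil(guest_phys_tb) - 1)
    return default_mb + per_extra_tb_mb * extra_tb


# Example: 4TB of guest-phys address space gives 40MB from this term alone;
# the quoted text recommends 48MB once 2-3 hundred VCPUs are also involved.
mb = tseg_mbytes_for_address_space(4)
print(f"-global mch.extended-tseg-mbytes={mb}")
```

Rounding partial terabytes up is a conservative choice made for this sketch, not something the quoted documentation specifies.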
(In reply to Gerd Hoffmann from comment #33)
> > -smp 192,maxcpus=192,cores=96,threads=1,dies=1,sockets=2 \
>
> With that many cpus you most likely need more tseg memory.
>
> From docs/interop/firmware.json (qemu.git):
> <quote>
> # Furthermore, a large guest-physical address space
> # (comprising guest RAM, memory hotplug range, and 64-bit
> # PCI MMIO aperture), and/or a high VCPU count, may
> # present high SMRAM requirements from the firmware. On
> # the "pc-q35-*" machine types of the @i386 and @x86_64
> # emulation targets, the SMRAM size may be increased
> # above the default 16MB with the "-global
> # mch.extended-tseg-mbytes=uint16" option. As a rule of
> # thumb, the default 16MB size suffices for 1TB of
> # guest-phys address space and a few tens of VCPUs; for
> # every further TB of guest-phys address space, add 8MB
> # of SMRAM. 48MB should suffice for 4TB of guest-phys
> # address space and 2-3 hundred VCPUs.
> </quote>
>
> I'd suggest to start with '-global mch.extended-tseg-mbytes=32'.

Thanks, the VM works after adding this option to the qemu command line.
Will we set this value as the default automatically in future?

Best regards
Nana

> > I'd suggest to start with '-global mch.extended-tseg-mbytes=32'.
>
> Thanks, the VM works after adding this option to the qemu command line.
> Will we set this value as the default automatically in future?
Default is 16 not 32. Usually 16 works fine, but for large VMs it might not be enough ...
There are no plans to change the default.
(In reply to liunana from comment #34)
> (In reply to Gerd Hoffmann from comment #33)
> > > -smp 192,maxcpus=192,cores=96,threads=1,dies=1,sockets=2 \
> >
> > With that many cpus you most likely need more tseg memory.
> >
> > From docs/interop/firmware.json (qemu.git):
> > <quote>
> > # Furthermore, a large guest-physical address space
> > # (comprising guest RAM, memory hotplug range, and 64-bit
> > # PCI MMIO aperture), and/or a high VCPU count, may
> > # present high SMRAM requirements from the firmware. On
> > # the "pc-q35-*" machine types of the @i386 and @x86_64
> > # emulation targets, the SMRAM size may be increased
> > # above the default 16MB with the "-global
> > # mch.extended-tseg-mbytes=uint16" option. As a rule of
> > # thumb, the default 16MB size suffices for 1TB of
> > # guest-phys address space and a few tens of VCPUs; for
> > # every further TB of guest-phys address space, add 8MB
> > # of SMRAM. 48MB should suffice for 4TB of guest-phys
> > # address space and 2-3 hundred VCPUs.
> > </quote>
> >
> > I'd suggest to start with '-global mch.extended-tseg-mbytes=32'.
>
> Thanks, the VM works after adding this option to the qemu command line.
> Will we set this value as the default automatically in future?
>
>
> Best regards
> Nana

Hi Nana,

There is a bug, Bug 1866110 - automated TSEG size calculation. Igor set ITR to 9.4.

Note to self: when re-enabling also backport this commit:

commit c1e853769046b322690ad336fdb98966757e7414 (github.kraxel/master)
Author: Gerd Hoffmann <kraxel>
Date: Thu Jun 1 09:57:31 2023 +0200

OvmfPkg/PlatformInitLib: limit phys-bits to 46.

Older linux kernels have problems with phys-bits larger than 46,
ubuntu 18.04 (kernel 4.15) has been reported to be affected.

Reduce phys-bits limit from 47 to 46.

Reported-by: Fiona Ebner <f.ebner> Signed-off-by: Gerd Hoffmann <kraxel> (In reply to Gerd Hoffmann from comment #15) > New test build: > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2216516 Seems to be expired now, new test build (no changes): https://kojihub.stream.centos.org/koji/taskinfo?taskID=2399507 New scratch build (on top of the 2023-05 rebase this time): https://kojihub.stream.centos.org/koji/taskinfo?taskID=2424988 Testing: should be tested together with the upcoming libvirt-0.9.5 release (see bug 2171860, I just noticed there already release candidate builds). (In reply to Alex Williamson from comment #29) > > Hi Gerd, > > > > My test result shows the PF can be hot-plugged into domain successfully now. > > > > Test env: > > edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch > > > > Test step: > > (1) start a domain > > # virt-install --machine=q35 --noreboot --name=rhel93 --memory=4096 > > --vcpus=4 --graphics type=vnc,port=5993,listen=0.0.0.0 --boot=uefi --network > > bridge=switch,model=virtio,mac=52:54:00:00:93:93 --import --noautoconsole > > --disk > > path=/home/images/RHEL93.qcow2,bus=virtio,cache=none,format=qcow2,io=threads, > > size=20 --osinfo detect=on,require=off > > > > (2) hot-plug a XL710 PF into domain > > # lspci -s 87:00.0 > > 87:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for > > 40GbE QSFP+ (rev 02) > > # /bin/virsh attach-device rhel93 /tmp/device/0000:87:00.0.xml > > Device attached successfully > > > > (3) check the PF status in the domain > > # ifconfig > > ... 
> > enp4s0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500 > > ether 3c:fd:fe:15:90:a8 txqueuelen 1000 (Ethernet) > > RX packets 0 bytes 0 (0.0 B) > > RX errors 0 dropped 0 overruns 0 frame 0 > > TX packets 0 bytes 0 (0.0 B) > > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 > > # dmesg > > [ 38.456761] pci 0000:04:00.0: [8086:1583] type 00 class 0x020000 > > [ 38.457666] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit > > pref] > > [ 38.458598] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit > > pref] > > [ 38.459323] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref] > > [ 38.459874] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max > > 2048) > > [ 38.471403] pci 0000:04:00.0: BAR 0: assigned [mem > > 0x381800000000-0x381800ffffff 64bit pref] > > [ 38.472489] pci 0000:04:00.0: BAR 6: assigned [mem 0xc2400000-0xc247ffff > > pref] > > [ 38.472496] pci 0000:04:00.0: BAR 3: assigned [mem > > 0x381801000000-0x381801007fff 64bit pref] > > [ 38.514703] i40e: Intel(R) Ethernet Connection XL710 Network Driver > > [ 38.514705] i40e: Copyright (c) 2013 - 2019 Intel Corporation. 
> > [ 38.515275] i40e 0000:04:00.0: enabling device (0000 -> 0002) > > [ 38.536779] i40e 0000:04:00.0: fw 9.80.70867 api 1.15 nvm 9.00 0x8000cadc > > 21.5.9 [8086:1583] [8086:0006] > > [ 38.603332] i40e 0000:04:00.0: MAC address: 3c:fd:fe:15:90:a8 > > [ 38.604352] i40e 0000:04:00.0: FW LLDP is enabled > > [ 38.613157] i40e 0000:04:00.0: PCI-Express: Speed 8.0GT/s Width x8 > > [ 38.613832] i40e 0000:04:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 4 > > RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA > > [ 38.625170] i40e 0000:04:00.0 enp4s0: renamed from eth0 > > > > The full domain dmesg of different PFs is as following: > > http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:15:19_XL710 > > http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:30:25_82599ES > > http://10.73.72.41/log/bug/Bug2174749/2023_05_15_08:58:13_MT2892 > > http://10.73.72.41/log/bug/Bug2174749/2023_05_15_09:01:27_QL41112 > > > > Hi Gerd, > > My test result shows the SPF9220 PF can still not be hot-plugged into domain > even with > edk2-ovmf-20230301gitf80f052277c8-3.el9.bz2174749.20230515.1346.noarch. > > The full domain dmesg: > http://10.73.72.41/log/bug/Bug2174749/2023_05_22_17:08:19_SFC9220_BZ > > Only SPF9220 PF test failed currently. > My test result shows current edk2-ovmf build fix does not apply for hot-plugging a sfc PF/VF scenario . May I ask if we plan to fix it ? If no, can we request the doc team to draft a known issue for it ? > My test result shows current edk2-ovmf build fix does not apply for > hot-plugging a sfc PF/VF scenario . > > May I ask if we plan to fix it ? If no, can we request the doc team to > draft a known issue for it ? See comment 25, there is no easy automatic way for non-prefetchable bars, so we'll continue to depend on manual configuration of the bridge windows. (In reply to Gerd Hoffmann from comment #46) > > My test result shows current edk2-ovmf build fix does not apply for > > hot-plugging a sfc PF/VF scenario . 
> >
> > May I ask if we plan to fix it? If no, can we request the doc team to
> > draft a known issue for it?
>
> See comment 25, there is no easy automatic way for non-prefetchable bars,
> so we'll continue to depend on manual configuration of the bridge windows.

Hi Gerd,

I have checked the capabilities of the SFC9220 PF and VF: all of their memory BARs are non-prefetchable.
So the bugs whose scenario is hot-plugging an sfc PF/VF into a VM can be closed as WONTFIX, am I right?

Those bugs include:
Bug 2209571 - [sfc] no VF interface in the VM after attached the VF to the VM
Bug 2137782 - [sfc] could not enable MSI-X & failed to create NIC

# lspci -v -s 0000:1a:00.1
1a:00.1 Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (rev 02)
        Subsystem: Solarflare Communications SFN8522-R2 8000 Series 10G Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 363, NUMA node 0, IOMMU group 34
        I/O ports at 4000 [size=256]
        Memory at 9d800000 (64-bit, non-prefetchable) [size=8M]
        Memory at a6800000 (64-bit, non-prefetchable) [size=16K]
        Expansion ROM at a6a80000 [disabled] [size=256K]
        Capabilities: [40] Power Management version 3
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-0f-53-ff-ff-4d-8c-30
        Capabilities: [158] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [198] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1d8] Transaction Processing Hints
        Kernel driver in use: sfc
        Kernel modules: sfc

# lspci -v -s 0000:1a:08.2
1a:08.2 Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (Virtual Function) (rev 02)
        Subsystem: Solarflare Communications Device 8017
        Flags: bus master, fast devsel, latency 0, NUMA node 0, IOMMU group 188
        Memory at 9e800000 (64-bit, non-prefetchable) [virtual] [size=1M]
        Memory at a6804000 (64-bit, non-prefetchable) [virtual]
[size=16K]
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [110] Transaction Processing Hints
        Kernel driver in use: sfc
        Kernel modules: sfc

QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

> I have Checked the SFC9220 PF and VF capabilities, they are all
> non-prefetchable devices.

> # lspci -v -s 0000:1a:00.1
> 1a:00.1 Ethernet controller: Solarflare Communications SFC9220 10/40G
> Ethernet Controller (rev 02)
> Memory at 9d800000 (64-bit, non-prefetchable) [size=8M]

This is the PF, I assume? The non-prefetchable window size is 2M, so this BAR is too big. It requires manual pcie root port configuration using the mem-reserve= property, as discussed previously.

> # lspci -v -s 0000:1a:08.2
> 1a:08.2 Ethernet controller: Solarflare Communications SFC9220 10/40G
> Ethernet Controller (Virtual Function) (rev 02)
> Memory at 9e800000 (64-bit, non-prefetchable) [virtual] [size=1M]
> Memory at a6804000 (64-bit, non-prefetchable) [virtual] [size=16K]

This is probably the VF ... This should work; the two PCI BARs should fit into the 2M bridge window.

Tested GPU passthrough with edk2-20230524-2.el9 and GPU devices with large video memory; I don't see any problems.

The devices I used:

Passthrough of 4x Nvidia V100 GPUs into a single RHEL 9.3 and Windows 11 VM on an Intel host.
The GPU:
61:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
        Subsystem: NVIDIA Corporation Device 1249
        Flags: bus master, fast devsel, latency 0, IRQ 318, NUMA node 0, IOMMU group 5
        Memory at c4000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 3bf000000000 (64-bit, prefetchable) [size=32G]
        Memory at 3bf800000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [ac0] Designated Vendor-Specific: Vendor=10de ID=0001 Rev=1 Len=12 <?>
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_vgpu_vfio, nvidia

The host cpu:
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
BIOS Model name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz

Passthrough of 1x Nvidia A100 GPU into a single RHEL 9.3 and Windows 11 VM on an AMD host.
The GPU:
41:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 40GB] (rev a1)
        Subsystem: NVIDIA Corporation Device 145f
        Physical Slot: 1
        Flags: bus master, fast devsel, latency 0, IRQ 211, IOMMU group 43
        Memory at b0000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 26000000000 (64-bit, prefetchable) [size=64G]
        Memory at 28020000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] Null
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [c8] MSI-X: Enable+ Count=6 Masked-
        Capabilities: [100] Virtual Channel
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [c14] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
        Capabilities: [d00] Lane Margining at the Receiver <?>
        Capabilities: [e00] Data Link Feature <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau, nvidia_vgpu_vfio, nvidia

The host cpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD EPYC 7262 8-Core Processor
BIOS Model name: AMD EPYC 7262 8-Core Processor

These GPUs are identified correctly by the nvidia driver inside both the RHEL and Windows VMs, and the GPU function tests also pass.

qemu & kernel I used:
qemu-kvm-8.0.0-7.el9.x86_64 & 5.14.0-335.el9.x86_64

> Host CPU info:
> address sizes : 43 bits physical, 48 bits virtual
>
> Guest CPU info:
> address sizes : 48 bits physical, 48 bits virtual

The difference is due to the memory encryption bits.
The size of the address space is 43 bits, but the size of the *physical address field in page tables* is 48 bits on both host and guest. Because memory encryption is not enabled in the guest, the address space reduction due to encryption is not included in /proc/cpuinfo, which only reports the size of the physical address field in page tables. So it's ugly, but it's expected.

Test env:
edk2-20230524-2.el9

Test result:
2023-07-14 18:14:55 | PASS - hot plug 1 MT2892 pf into rhel93 domain
2023-07-14 18:17:06 | PASS - hot plug 2 MT2892 pf into rhel93 domain
2023-07-14 18:09:48 | PASS - hot plug 1 MT2892 vf into rhel93 domain
2023-07-14 18:12:50 | PASS - hot plug 7 MT2892 vf into rhel93 domain
2023-07-14 18:22:19 | PASS - hot plug 1 QL41112 pf into rhel93 domain
2023-07-14 18:27:46 | PASS - hot plug 2 QL41112 pf into rhel93 domain
2023-07-17 11:08:12 | PASS - hot plug 1 QL41112 vf into rhel93 domain
2023-07-17 11:14:37 | PASS - hot plug 7 QL41112 vf into rhel93 domain
2023-07-14 18:20:39 | PASS - hot plug 1 82599ES vf into rhel93 domain
2023-07-14 18:23:32 | PASS - hot plug 7 82599ES vf into rhel93 domain
2023-07-14 18:16:27 | PASS - hot plug 1 82599ES pf into rhel93 domain
2023-07-14 18:18:39 | PASS - hot plug 2 82599ES pf into rhel93 domain
2023-07-14 18:11:45 | PASS - hot plug 1 E810 vf into rhel93 domain
2023-07-14 18:14:36 | PASS - hot plug 7 E810 vf into rhel93 domain
2023-07-14 18:07:52 | PASS - hot plug 1 E810 pf into rhel93 domain
2023-07-14 18:09:52 | PASS - hot plug 2 E810 pf into rhel93 domain
2023-07-17 10:57:10 | PASS - hot plug 1 XXV710 pf into rhel93 domain
2023-07-17 11:02:17 | PASS - hot plug 2 XXV710 pf into rhel93 domain
2023-07-17 11:07:24 | PASS - hot plug 1 XXV710 vf into rhel93 domain
2023-07-17 11:13:18 | PASS - hot plug 7 XXV710 vf into rhel93 domain
2023-07-19 02:43:01 | PASS - hot plug 1 SFC9220 pf to rhel93 domain whose pcie-root-port is 16M
2023-07-19 02:48:23 | PASS - hot plug 2 SFC9220 pf into rhel93 domain whose pcie-root-port is 16M
2023-07-17 11:04:35 | PASS - hot plug 1 SFC9220 vf to rhel93 domain
2023-07-17 11:10:41 | PASS - hot plug 7 SFC9220 vf into rhel93 domain

CPU model sanity test on AMD Genoa: PASS.

Test Env:
edk2-ovmf-20230524-2.el9.noarch
5.14.0-341.el9.x86_64
qemu-kvm-8.0.0-8.el9.x86_64
libvirt-client-9.5.0-3.el9.x86_64
amd-genoa-02.khw1.lab.eng.bos.redhat.com
Guest: latest rhel9.3

Other CPU sanity tests on Intel Icelake: PASS.
Test results:
https://beaker.engineering.redhat.com/recipes/14283495#task163423385,task163423386

Ran the following test loops; no new bug was found.

Versions:
kernel-5.14.0-333.el9.x86_64
qemu-kvm-8.0.0-6.el9
edk2-ovmf-20230524-2.el9.noarch

1. Qemu_gating_test_rhel9
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/qemu_gating_test_rhel9_with_edk2-ovmf-20230524-2.el9/results.html

2. Rhel8.6, rhel8.7, rhel8.8, rhel8.9, rhel9.0, rhel9.1, rhel9.2 secure boot with edk2-20230524-2.el9
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/rhel860-rhel920_secure_boot_with_edk2-20230524-2.el9/results.html

3. win11_secure_boot_with_edk2-20230524-2.el9
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/win11_secure_boot_with_edk2-20230524-2.el9/results.html

4. edk2_test_loop_on_intel_host
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/edk2_test_with_edk2-20230524-2.el9/results.html

5. Edk2_test_loop_on_amd_host
Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/edk2_test_with_edk2-20230524-2.el9_amd_host/results.html
Existing bug:
Bug 2168446 - Booting VM failed on AMD EPYC 7252 host with npt=0

Tested the host-phys-bits-limit parameter on an AMD host and an Intel host; hit one low-priority issue, which is tracked by RHEL-917.

Details:

Test on an AMD host

Check host phys-bits:
# lscpu | grep "Address sizes"
Address sizes: 43 bits physical, 48 bits virtual

1. host-phys-bits-limit testing with -1 (i.e.
host-phys-bits=on,host-phys-bits-limit=-1): QEMU prints the error message:
(qemu) qemu-kvm: can't apply global EPYC-Rome-x86_64-cpu.host-phys-bits-limit=-1: Parameter 'host-phys-bits-limit' expects uint8_t

2. host-phys-bits-limit testing with 1 (i.e. host-phys-bits=on,host-phys-bits-limit=1): QEMU prints the message:
qemu-kvm: phys-bits should be between 32 and 52 (but is 1)

3. host-phys-bits-limit testing with 36 (i.e. host-phys-bits=on,host-phys-bits-limit=36): the guest boots successfully, and the guest phys-bits is 36
# lscpu | grep "Address sizes"
Address sizes: 36 bits physical, 48 bits virtual

4. host-phys-bits-limit testing with 40 (i.e. host-phys-bits=on,host-phys-bits-limit=40): the guest boots successfully, and the guest phys-bits is 40
# lscpu | grep "Address sizes"
Address sizes: 40 bits physical, 48 bits virtual

5. host-phys-bits-limit testing with 43 (i.e. host-phys-bits=on,host-phys-bits-limit=43): the guest boots successfully, and the guest phys-bits is 43
# lscpu | grep "Address sizes"
Address sizes: 43 bits physical, 48 bits virtual

6. host-phys-bits-limit testing with 48 (i.e. host-phys-bits=on,host-phys-bits-limit=48): the guest boots successfully, and the guest phys-bits is 48
# lscpu | grep "Address sizes"
Address sizes: 48 bits physical, 48 bits virtual
The edk2 debug log shows:
PlatformAddressWidthFromCpuid: Signature: 'AuthenticAMD', PhysBits: 48, QemuQuirk: On, Valid: Yes
PlatformAddressWidthFromCpuid: limit PhysBits to 46 (avoid 5-level paging)

7. host-phys-bits-limit testing with 53 (i.e. host-phys-bits=on,host-phys-bits-limit=53): the guest boots successfully with no error message. Tracked by RHEL-917.

Test on an Intel host

Check host phys-bits:
# lscpu | grep "Address sizes"
Address sizes: 46 bits physical, 57 bits virtual

1. host-phys-bits-limit testing with -1 (i.e.
host-phys-bits=on,host-phys-bits-limit=-1): QEMU prints the error message:
(qemu) qemu-kvm: can't apply global Icelake-Server-x86_64-cpu.host-phys-bits-limit=-1: Parameter 'host-phys-bits-limit' expects uint8_t

2. host-phys-bits-limit testing with 1 (i.e. host-phys-bits=on,host-phys-bits-limit=1): QEMU prints the message:
qemu-kvm: phys-bits should be between 32 and 52 (but is 1)

3. host-phys-bits-limit testing with 36 (i.e. host-phys-bits=on,host-phys-bits-limit=36): QEMU prints the message:
(qemu) qemu-kvm: Address space limit 0xfffffffff < 0x17bfffffff phys-bits too low (36)

4. host-phys-bits-limit testing with 39 (i.e. host-phys-bits=on,host-phys-bits-limit=39): the guest boots successfully, and the guest phys-bits is 39
# lscpu |grep "Address sizes"
Address sizes: 39 bits physical, 57 bits virtual

5. host-phys-bits-limit testing with 46 (i.e. host-phys-bits=on,host-phys-bits-limit=46): the guest boots successfully, and the guest phys-bits is 46
# lscpu |grep "Address sizes"
Address sizes: 46 bits physical, 57 bits virtual

6. host-phys-bits-limit testing with 52 (i.e. host-phys-bits=on,host-phys-bits-limit=52): the guest boots successfully, and the guest phys-bits is 46
# lscpu | grep "Address sizes"
Address sizes: 46 bits physical, 57 bits virtual

7. host-phys-bits-limit testing with 53 (i.e. host-phys-bits=on,host-phys-bits-limit=53): the guest boots successfully with no error message. Tracked by RHEL-917.

Test on an Intel host with 52 phys-bits

# lscpu |grep Address
Address sizes: 52 bits physical, 57 bits virtual

1. host-phys-bits-limit testing with 36 (i.e. host-phys-bits=on,host-phys-bits-limit=36): the guest boots successfully, and the guest phys-bits is 36.
# lscpu | grep "Address sizes"
Address sizes: 36 bits physical, 57 bits virtual

2. host-phys-bits-limit testing with 52 (i.e. host-phys-bits=on,host-phys-bits-limit=52): the guest boots successfully, and the guest phys-bits is 52.
# lscpu |grep "Address sizes"
Address sizes: 52 bits physical, 57 bits virtual
The edk2 debug log shows:
PlatformAddressWidthFromCpuid: Signature: 'GenuineIntel', PhysBits: 52, QemuQuirk: On, Valid: Yes
PlatformAddressWidthFromCpuid: limit PhysBits to 46 (avoid 5-level paging)

Migrated between hosts with the following physical address sizes on qemu-kvm-8.0.0-10.el9.x86_64 and edk2-ovmf-20230524-2.el9.noarch; all pass.

1. Intel
39 <-> 46
46 <-> 52
39 <-> 52

2. AMD
43 <-> 48

Thank you Zhiyi, Yanbin, Yanghang, Nana, Mario and Xiaohui. Many thanks.

According to Comment 51, Comment 57, Comment 59, Comment 60, Comment 61, Comment 62, Comment 63 and Comment 64, set status to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: edk2 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6330
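
For reference, the host-phys-bits-limit results reported in the test comments can be summarised as "clamp first, then range-check, then let the firmware apply its own cap". The Python sketch below only models that observed behaviour; it is not taken from the QEMU or edk2 sources. The function names `guest_phys_bits` and `firmware_phys_bits` are hypothetical, and the assumption that the limit can only lower the host value is inferred from the limit=52 and limit=53 results on the 46-bit Intel host. The separate "Address space limit ... phys-bits too low" check and the uint8_t parse error are not modelled.

```python
# Sketch only: models the behaviour observed in the test comments,
# not the actual QEMU or edk2 implementation.

QEMU_MIN_PHYS_BITS = 32  # from the message "between 32 and 52"
QEMU_MAX_PHYS_BITS = 52
EDK2_MAX_PHYS_BITS = 46  # from "limit PhysBits to 46 (avoid 5-level paging)"


def guest_phys_bits(host_cpuid_bits: int, limit: int = 0) -> int:
    """Hypothetical helper: guest phys-bits with host-phys-bits=on.

    Assumption: the limit can only lower the host value, and 0 means
    that no limit is set.
    """
    bits = host_cpuid_bits
    if limit and bits > limit:
        bits = limit
    # The range check is applied to the clamped value, not to the limit.
    if not QEMU_MIN_PHYS_BITS <= bits <= QEMU_MAX_PHYS_BITS:
        raise ValueError(
            f"phys-bits should be between {QEMU_MIN_PHYS_BITS} and "
            f"{QEMU_MAX_PHYS_BITS} (but is {bits})"
        )
    return bits


def firmware_phys_bits(guest_bits: int) -> int:
    """Hypothetical helper: address width the firmware uses after its cap."""
    return min(guest_bits, EDK2_MAX_PHYS_BITS)


if __name__ == "__main__":
    # Limits tried on the 46-bit Intel host in the comments.
    for limit in (39, 46, 52, 53):
        print(limit, guest_phys_bits(46, limit))
    # 52-bit Intel host with limit=52: the guest sees 52, the firmware caps it.
    print(firmware_phys_bits(guest_phys_bits(52, 52)))
```

Under this model an out-of-range limit such as 53 is never rejected on a 46-bit host, because the check sees the clamped value 46; that matches the behaviour tracked by RHEL-917. For the AMD host, the model only reproduces the reported results if the host CPUID value is taken as 48 rather than the 43 bits shown by lscpu, in line with the explanation about memory encryption bits.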