Bug 2001719 - fail to hotplug NIC with edk2 firmware
Summary: fail to hotplug NIC with edk2 firmware
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: edk2
Version: 9.0
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
: ---
Assignee: Gerd Hoffmann
QA Contact: Lei Yang
URL:
Whiteboard:
Depends On: 2018388
Blocks: 2005548
Reported: 2021-09-07 02:12 UTC by FuXiangChun
Modified: 2022-05-25 06:41 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2005548 (view as bug list)
Environment:
Last Closed: 2022-05-25 06:40:48 UTC
Type: ---
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-96357 0 None None None 2021-09-07 02:31:52 UTC

Description FuXiangChun 2021-09-07 02:12:50 UTC
Description of problem:
After hotplugging a NIC, the guest does not see it as a network interface, although lspci inside the guest does list the device.

Version-Release number of selected component (if applicable):

edk2-ovmf-20210527gite1999b264f1f-6.el9.noarch
qemu-kvm-core-6.1.0-1.el9.x86_64
5.14.0-0.rc7.54.el9.x86_64

How reproducible:

always

Steps to Reproduce:
1. /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-64-virtio-scsi_avocado-vt-vm1.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-device i6300esb,bus=pcie-pci-bridge-0,addr=0x1 \
-watchdog-action reset \
-m 30720 \
-object memory-backend-ram,size=30720M,id=mem-machine_mem  \
-smp 28,cores=14,threads=1,dies=1,sockets=2  \
-cpu 'Icelake-Server-noTSX',+kvm_pv_unhalt \
-device intel-hda,bus=pcie-pci-bridge-0,addr=0x2 \
-device hda-duplex \
-chardev socket,server=on,path=/tmp/avocado_bzlt5q3m/monitor-qmpmonitor1-20210903-071553-ZhkpziLY,wait=off,id=qmp_id_qmpmonitor1  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,server=on,path=/tmp/avocado_bzlt5q3m/monitor-catch_monitor-20210903-071553-ZhkpziLY,wait=off,id=qmp_id_catch_monitor  \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idYNIxWq \
-chardev socket,server=on,path=/tmp/avocado_bzlt5q3m/serial-serial0-20210903-071553-ZhkpziLY,wait=off,id=chardev_serial0 \
-device isa-serial,id=serial0,chardev=chardev_serial0 \
-object rng-random,filename=/dev/random,id=passthrough-dQJbnZAC \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-rng-pci,id=virtio-rng-pci-Ik565wM9,rng=passthrough-dQJbnZAC,bus=pcie-root-port-1,addr=0x0  \
-chardev socket,id=seabioslog_id_20210903-071553-ZhkpziLY,path=/tmp/avocado_bzlt5q3m/seabios-20210903-071553-ZhkpziLY,server=on,wait=off \
-device isa-debugcon,chardev=seabioslog_id_20210903-071553-ZhkpziLY,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=0x1d.0x7,multifunction=on,bus=pcie.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0x0,firstport=0,bus=pcie.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.0x2,firstport=2,bus=pcie.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.0x4,firstport=4,bus=pcie.0 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb2,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb2.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/root/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-no-hpet \
-enable-kvm \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device virtio-balloon-pci,id=balloon0,bus=pcie-root-port-4,addr=0x0 \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=6 \
-device pcie-root-port,id=pcie_extra_root_port_1,addr=0x3.0x1,bus=pcie.0,chassis=7 \
-device pcie-root-port,id=pcie_extra_root_port_2,addr=0x3.0x2,bus=pcie.0,chassis=8 \
-device pcie-root-port,id=pcie_extra_root_port_3,addr=0x3.0x3,bus=pcie.0,chassis=9 \
-monitor stdio \
-vnc :1

2. {"execute": "netdev_add", "arguments": {"type": "tap", "id": "netdev0"}}
{"return": {}}
3. {"execute": "device_add", "arguments": {"id": "nic0", "driver": "e1000e", "netdev": "netdev0", "mac": "9a:45:bb:a8:b1:90", "bus": "pcie_extra_root_port_0", "addr": "0x0"}}
{"return": {}}
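The two QMP commands in steps 2 and 3 can also be generated programmatically. The following is a minimal sketch, not part of the original report: the helper name `build_nic_hotplug_commands` is hypothetical, and it only serializes the messages shown above. Actually sending them requires a QMP connection on which `qmp_capabilities` has been negotiated first.

```python
import json


def build_nic_hotplug_commands(netdev_id, nic_id, bus, mac, driver="e1000e"):
    """Return the netdev_add and device_add QMP commands as JSON strings."""
    netdev_add = {
        "execute": "netdev_add",
        "arguments": {"type": "tap", "id": netdev_id},
    }
    device_add = {
        "execute": "device_add",
        "arguments": {
            "id": nic_id,
            "driver": driver,
            "netdev": netdev_id,
            "mac": mac,
            "bus": bus,       # id of the pcie-root-port to plug into
            "addr": "0x0",
        },
    }
    return [json.dumps(netdev_add), json.dumps(device_add)]


# Values taken from the reproducer above.
commands = build_nic_hotplug_commands(
    netdev_id="netdev0",
    nic_id="nic0",
    bus="pcie_extra_root_port_0",
    mac="9a:45:bb:a8:b1:90",
)
for line in commands:
    print(line)
```

Each printed line can be pasted into a QMP session; a successful command is answered with {"return": {}}.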


Actual results:
1. The hotplugged NIC does not show up in ifconfig / ip link show inside the guest.
2. lspci inside the guest does list the NIC.

Expected results:
The hotplugged NIC shows up in ifconfig inside the guest.

Additional info:
The same hotplug works with SeaBIOS.

Comment 1 Gerd Hoffmann 2021-09-07 05:58:46 UTC
What does 'lspci -v' print for the e1000e?
Any error messages in the guest kernel log?

Comment 2 Lei Yang 2021-09-07 07:50:50 UTC
(In reply to Gerd Hoffmann from comment #1)
> What does 'lspci -v' print for the e1000e?
> Any error messages in the guest kernel log?

Hi Gerd

Guest dmesg output after hotplugging the NIC:
# dmesg
......
[   73.867174] pci 0000:07:00.0: [8086:10d3] type 00 class 0x020000
[   73.867275] pci 0000:07:00.0: reg 0x10: [mem 0x00000000-0x0001ffff]
[   73.867311] pci 0000:07:00.0: reg 0x14: [mem 0x00000000-0x0001ffff]
[   73.867346] pci 0000:07:00.0: reg 0x18: [io  0x0000-0x001f]
[   73.867380] pci 0000:07:00.0: reg 0x1c: [mem 0x00000000-0x00003fff]
[   73.867495] pci 0000:07:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[   73.868638] pci 0000:07:00.0: BAR 6: no space for [mem size 0x00040000 pref]
[   73.868640] pci 0000:07:00.0: BAR 6: failed to assign [mem size 0x00040000 pref]
[   73.868642] pci 0000:07:00.0: BAR 0: no space for [mem size 0x00020000]
[   73.868643] pci 0000:07:00.0: BAR 0: failed to assign [mem size 0x00020000]
[   73.868644] pci 0000:07:00.0: BAR 1: no space for [mem size 0x00020000]
[   73.868645] pci 0000:07:00.0: BAR 1: failed to assign [mem size 0x00020000]
[   73.868646] pci 0000:07:00.0: BAR 3: no space for [mem size 0x00004000]
[   73.868647] pci 0000:07:00.0: BAR 3: failed to assign [mem size 0x00004000]
[   73.868648] pci 0000:07:00.0: BAR 2: no space for [io  size 0x0020]
[   73.868649] pci 0000:07:00.0: BAR 2: failed to assign [io  size 0x0020]
[   73.925950] e1000e: Intel(R) PRO/1000 Network Driver
[   73.925952] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   73.927725] e1000e: probe of 0000:07:00.0 failed with error -5

# lspci -v
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
        Subsystem: Intel Corporation Device 0000
        Physical Slot: 0-6
        Flags: fast devsel, IRQ 23
        I/O ports at <unassigned> [disabled]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [a0] MSI-X: Enable- Count=5 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 9a-45-bb-ff-ff-a8-b1-90
        Kernel modules: e1000e

Best Regards
Lei
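To pull the failed BAR assignments out of a log like the one above, a small parser can help. This is an illustrative sketch only; the function name `failed_bars` and the regular expression are assumptions based on the message format visible in this dmesg excerpt, and other kernel versions may word these messages differently.

```python
import re

# Matches lines such as:
#   pci 0000:07:00.0: BAR 0: failed to assign [mem size 0x00020000]
BAR_FAIL = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): failed to assign "
    r"\[(?P<kind>mem|io)\s+size (?P<size>0x[0-9a-f]+)"
)


def failed_bars(dmesg_text):
    """Return (device, BAR index, resource kind, size in bytes) per failed BAR."""
    results = []
    for match in BAR_FAIL.finditer(dmesg_text):
        results.append((
            match.group("dev"),
            int(match.group("bar")),
            match.group("kind"),
            int(match.group("size"), 16),
        ))
    return results


# Three lines copied from the log above.
sample = """\
[   73.868640] pci 0000:07:00.0: BAR 6: failed to assign [mem size 0x00040000 pref]
[   73.868643] pci 0000:07:00.0: BAR 0: failed to assign [mem size 0x00020000]
[   73.868649] pci 0000:07:00.0: BAR 2: failed to assign [io  size 0x0020]
"""
for dev, bar, kind, size in failed_bars(sample):
    print(f"{dev} BAR {bar}: {kind} {size} bytes unassigned")
```

Every BAR of the e1000e fails here, both memory and I/O, which points at the bridge windows rather than at one oversized BAR.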

Comment 3 Gerd Hoffmann 2021-09-07 08:43:10 UTC
So no address space for io / mmio.  Probably edk2 didn't assign anything to the bridge.

What does "lspci -v" print for the root port where you plug the e1000e in (should be 00:03.0)?

The linux kernel should be able to fix that up by assigning address space to the root port.
What do "/proc/iomem" and "/proc/ioports" look like?

Any change of behavior if you add io-reserve=4k and mem-reserve=2M properties to the root port?

Comment 4 Lei Yang 2021-09-07 10:01:33 UTC
(In reply to Gerd Hoffmann from comment #3)
> So no address space for io / mmio.  Probably edk2 didn't assign anything to
> the bridge.
Hi Gerd
> 
> What does "lspci -v' print for the root port where you plug the e1000e in
> (should be 00:03.0)?
Inside guest:
00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 23
        Memory at c1515000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: [disabled]
        Prefetchable memory behind bridge: [disabled]
        Capabilities: [90] Vendor Specific Information: Len=20 <?>
        Capabilities: [54] Express Root Port (Slot+), MSI 00
        Capabilities: [48] MSI-X: Enable+ Count=1 Masked-
        Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Access Control Services
        Kernel driver in use: pcieport
HMP info:
dev: pcie-root-port, id "pcie_extra_root_port_0"
        x-migrate-msix = true
        bus-reserve = 4294967295 (0xffffffff)
        io-reserve = 4096 (4 KiB)
        mem-reserve = 18446744073709551615 (16 EiB)
        pref32-reserve = 18446744073709551615 (16 EiB)
        pref64-reserve = 18446744073709551615 (16 EiB)
        x-speed = "16"
        x-width = "32"
        power_controller_present = true
        disable-acs = false
        chassis = 6 (0x6)
        slot = 0 (0x0)
        hotplug = true
        native-hotplug = false
        port = 0 (0x0)
        aer_log_max = 8 (0x8)
        addr = 03.0
        romfile = ""
        romsize = 4294967295 (0xffffffff)
        rombar = 1 (0x1)
        multifunction = true
        x-pcie-lnksta-dllla = true
        x-pcie-extcap-init = true
        failover_pair_id = ""
        acpi-index = 0 (0x0)
        class PCI bridge, addr 00:03.0, pci id 1b36:000c (sub 0000:0000)
        bar 0: mem at 0xc1515000 [0xc1515fff]
        bus: pcie_extra_root_port_0
          type PCIE
          dev: e1000e, id "nic0"
            mac = "9a:45:bb:a8:b1:90"
            netdev = "netdev0"
            disable_vnet_hdr = 0 (0x0)
            subsys_ven = 32902 (0x8086)
            subsys = 0 (0x0)
            init-vet = true
            addr = 00.0
            romfile = "efi-e1000e.rom"
            romsize = 262144 (0x40000)
            rombar = 1 (0x1)
            multifunction = false
            x-pcie-lnksta-dllla = true
            x-pcie-extcap-init = true
            failover_pair_id = ""
            acpi-index = 0 (0x0)
            class Ethernet controller, addr 07:00.0, pci id 8086:10d3 (sub 8086:0000)
            bar 0: mem at 0xffffffffffffffff [0x1fffe]
            bar 1: mem at 0xffffffffffffffff [0x1fffe]
            bar 2: i/o at 0xffffffffffffffff [0x1e]
            bar 3: mem at 0xffffffffffffffff [0x3ffe]
            bar 6: mem at 0xffffffffffffffff [0x3fffe]
> 
> The linux kernel should be able to fix that up by assigning address space to
> the root port.
> How does "/proc/iomem" and "/proc/ioports" look like?

Neither file in the guest shows an entry for the hotplugged NIC.

> 
> Any change of behavior if you add io-reserve=4k and mem-reserve=2M
> properties to the root port?

-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=6,io-reserve=4k,mem-reserve=2M \

With io-reserve=4k and mem-reserve=2M added to the root port, hotplug still fails. The guest dmesg output is the same as in comment 2.

Best Regards
Lei

Comment 5 Gerd Hoffmann 2021-09-08 05:33:12 UTC
> > What does "lspci -v' print for the root port where you plug the e1000e in
> > (should be 00:03.0)?
> Inside guest:
> 00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal
> decode])

>         I/O behind bridge: [disabled]
>         Memory behind bridge: [disabled]

Nothing assigned.

> > The linux kernel should be able to fix that up by assigning address space to
> > the root port.
> > How does "/proc/iomem" and "/proc/ioports" look like?
> 
> The guest does not display the hotplug nic device information

Sure, but can you add the files?  I'd like to see the full hierarchy.

And please attach the full (guest) kernel log too.

Comment 7 Lei Yang 2021-09-08 08:01:42 UTC
(In reply to Gerd Hoffmann from comment #5)
> > > What does "lspci -v' print for the root port where you plug the e1000e in
> > > (should be 00:03.0)?
> > Inside guest:
> > 00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal
> > decode])
> 
> >         I/O behind bridge: [disabled]
> >         Memory behind bridge: [disabled]
> 
> Nothing assigned.
> 
> > > The linux kernel should be able to fix that up by assigning address space to
> > > the root port.
> > > How does "/proc/iomem" and "/proc/ioports" look like?
> > 
> > The guest does not display the hotplug nic device information
> 
> Sure, but can you add the files?  I'd like to see the full hierarchy.
# cat /proc/iomem
00000000-00000fff : Reserved
00001000-0002ffff : System RAM
00030000-0004ffff : Reserved
00050000-0009efff : System RAM
0009f000-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000f0000-000fffff : System ROM
00100000-7d450fff : System RAM
  6a000000-79ffffff : Crash kernel
7d451000-7d477fff : Reserved
7d478000-7da30017 : System RAM
7da30018-7da39a57 : System RAM
7da39a58-7e8edfff : System RAM
7e8ee000-7eb6dfff : Reserved
7eb6e000-7eb7dfff : ACPI Tables
7eb7e000-7ebfdfff : ACPI Non-volatile Storage
7ebfe000-7effffff : System RAM
7f000000-7fffffff : Reserved
80000000-afffffff : PCI Bus 0000:00
b0000000-bfffffff : PCI MMCONFIG 0000 [bus 00-ff]
  b0000000-bfffffff : Reserved
    b0000000-bfffffff : pnp 00:04
c0000000-febfffff : PCI Bus 0000:00
  c0000000-c0ffffff : 0000:00:02.0
    c0000000-c0ffffff : bochs-drm
  c1000000-c12fffff : PCI Bus 0000:01
    c1000000-c11fffff : PCI Bus 0000:02
      c1000000-c1003fff : 0000:02:02.0
        c1000000-c1003fff : ICH HD audio
      c1004000-c100400f : 0000:02:01.0
        c1004000-c100400f : i6300ESB timer
    c1200000-c12000ff : 0000:01:00.0
  c1300000-c13fffff : PCI Bus 0000:05
    c1300000-c1300fff : 0000:05:00.0
  c1400000-c14fffff : PCI Bus 0000:04
    c1400000-c1403fff : 0000:04:00.0
      c1400000-c1403fff : xhci-hcd
  c1510000-c1510fff : 0000:00:1f.2
    c1510000-c1510fff : ahci
  c1511000-c1511fff : 0000:00:1d.7
    c1511000-c1511fff : ehci_hcd
  c1512000-c1512fff : 0000:00:03.3
  c1513000-c1513fff : 0000:00:03.2
  c1514000-c1514fff : 0000:00:03.1
  c1515000-c1515fff : 0000:00:03.0
  c1516000-c1516fff : 0000:00:02.0
    c1516000-c1516fff : bochs-drm
  c1517000-c1517fff : 0000:00:01.4
  c1518000-c1518fff : 0000:00:01.3
  c1519000-c1519fff : 0000:00:01.2
  c151a000-c151afff : 0000:00:01.1
  c151b000-c151bfff : 0000:00:01.0
fec00000-fec003ff : IOAPIC 0
fed1f410-fed1f414 : iTCO_wdt.1.auto
  fed1f410-fed1f414 : iTCO_wdt.1.auto iTCO_wdt.1.auto
fee00000-fee00fff : Local APIC
100000000-7ffffffff : System RAM
  76c000000-76ce02466 : Kernel code
  76d000000-76d8b4fff : Kernel rodata
  76da00000-76dfca8ff : Kernel data
  76e6ae000-76ebfffff : Kernel bss
800000000-fffffffff : PCI Bus 0000:00
  800000000-8000fffff : PCI Bus 0000:03
    800000000-800003fff : 0000:03:00.0
      800000000-800003fff : virtio-pci-modern
  800100000-8001fffff : PCI Bus 0000:05
    800100000-800103fff : 0000:05:00.0
      800100000-800103fff : virtio-pci-modern
  800200000-8002fffff : PCI Bus 0000:06
    800200000-800203fff : 0000:06:00.0
      800200000-800203fff : virtio-pci-modern
# cat /proc/ioports
0000-0cf7 : PCI Bus 0000:00
  0000-001f : dma1
  0020-0021 : pic1
  0040-0043 : timer0
  0050-0053 : timer1
  0060-0060 : keyboard
  0064-0064 : keyboard
  0070-0077 : rtc0
  0080-008f : dma page reg
  00a0-00a1 : pic2
  00c0-00df : dma2
  00f0-00ff : fpu
  03f8-03ff : serial
  0505-0505 : QEMU0001:00
  0510-051b : QEMU0002:00
    0510-051b : fw_cfg_io
  0600-067f : 0000:00:1f.0
    0600-0603 : ACPI PM1a_EVT_BLK
    0604-0605 : ACPI PM1a_CNT_BLK
    0608-060b : ACPI PM_TMR
    0620-062f : ACPI GPE0_BLK
    0630-0633 : iTCO_wdt.1.auto
      0630-0633 : iTCO_wdt
    0660-067f : iTCO_wdt.1.auto
      0660-067f : iTCO_wdt
0cf8-0cff : PCI conf1
0d00-ffff : PCI Bus 0000:00
  6000-6fff : PCI Bus 0000:01
    6000-6fff : PCI Bus 0000:02
  7000-703f : 0000:00:1f.3
    7000-703f : i801_smbus
  7040-705f : 0000:00:1f.2
    7040-705f : ahci
  7060-707f : 0000:00:1d.4
    7060-707f : uhci_hcd
  7080-709f : 0000:00:1d.2
    7080-709f : uhci_hcd
  70a0-70bf : 0000:00:1d.0
    70a0-70bf : uhci_hcd

> 
> And please attach the full (guest) kernel log too.
Added as an attachment: log
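One way to check a listing like the /proc/iomem output above for missing bridge windows is to collect the "PCI Bus" entries and see which buses have a range. The sketch below is an assumption-laden illustration (the helper name `pci_bus_windows` is hypothetical, and it relies on the line format shown in this comment); it is not a tool used in this thread.

```python
import re

# One /proc/iomem line: optional indent, "start-end : name"
LINE = re.compile(
    r"^(?P<indent> *)(?P<start>[0-9a-f]+)-(?P<end>[0-9a-f]+) : (?P<name>.*)$"
)


def pci_bus_windows(iomem_text):
    """Map each 'PCI Bus dddd:bb' entry to a list of (start, end) ranges."""
    windows = {}
    for raw in iomem_text.splitlines():
        match = LINE.match(raw)
        if not match or not match.group("name").startswith("PCI Bus "):
            continue
        bus = match.group("name")[len("PCI Bus "):].strip()
        windows.setdefault(bus, []).append(
            (int(match.group("start"), 16), int(match.group("end"), 16))
        )
    return windows


# A few lines copied from the listing above.
sample = """\
c0000000-febfffff : PCI Bus 0000:00
  c1300000-c13fffff : PCI Bus 0000:05
  c1400000-c14fffff : PCI Bus 0000:04
  c1515000-c1515fff : 0000:00:03.0
"""
windows = pci_bus_windows(sample)
print("0000:07" in windows)
```

In the full listing, bus 0000:07 (the secondary bus of root port 00:03.0) has no window at all, while the root ports that had a device at boot (buses 03 to 06) do.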

Comment 8 Gerd Hoffmann 2021-09-08 08:31:29 UTC
Hmm, ovmf doesn't assign resources to pci bridges without devices.
Can I get a firmware log?

Is this a regression?
If so: what are known-good versions?

Comment 10 Lei Yang 2021-09-09 10:42:15 UTC
(In reply to Gerd Hoffmann from comment #8)
> Hmm, ovmf doesn't assign resources to pci bridges without devices.

Hi Gerd

> Can I get a firmware log?

Sure, added as an attachment named OVMF_firmware.log.
> 
> Is this a regression?
> If so: what are known-good versions?

Yes, from a QE perspective this is a regression.

I tested the following combinations:

 1) qemu-kvm-6.1.0-1.el9.x86_64 + edk2-ovmf-20210527gite1999b264f1f-6.el9.noarch (Failed)
 2) qemu-kvm-6.1.0-1.el9.x86_64 + edk2-ovmf-20200602gitca407c7246bf-1.el9.noarch (Failed)
 3) qemu-kvm-6.0.0-13.el9_b.1.x86_64 + edk2-ovmf-20210527gite1999b264f1f-6.el9.noarch (Failed)
 4) qemu-kvm-6.0.0-13.el9_b.1.x86_64 + edk2-ovmf-20200602gitca407c7246bf-1.el9.noarch (Passed)

Based on these results the bug is a regression, so I added the "Regression" keyword.
Note that the test only passes when both edk2 and qemu-kvm are rolled back. Do I need to clone this bug to qemu-kvm? If so, I will file a new bug. Thanks in advance.

Best Regards
Lei

Comment 11 Gerd Hoffmann 2021-09-10 06:48:43 UTC
Can I get the extra verbose lspci output for the pcie root port please?
"sudo lspci -vvs3.0" should do the trick.
Any change in behavior if you use a rhel-8 machine type (-machine pc-q35-rhel8.5.0,...)?
(No bug cloning yet, let's find the root cause first.)

Comment 12 Gerd Hoffmann 2021-09-10 07:08:41 UTC
(In reply to Gerd Hoffmann from comment #11)
> Can I get the extra verbose lspci output for the pcie root port please?

Ok, scratch that, reproduced it locally:

kraxel@rhel9 ~# sudo lspci -vvs1.0 | grep -A2 -B2 Slt
                LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)
                        TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug- Surprise-
                                                                ^^^^^^^^
                        Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Off, PwrInd On, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
                        Changed: MRL- PresDet- LinkState-
                RootCap: CRSVisible-

So, qemu turns off the slot hotplug capability,
and ovmf doesn't reserve io/mmio resources because of that.
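The "HotPlug-" flag in the SltCap line above can be checked mechanically. The following sketch parses the flag list that lspci prints; the helper name `slot_capabilities` is hypothetical, and the parsing assumes the "Name+" / "Name-" notation seen in this output.

```python
import re


def slot_capabilities(lspci_text):
    """Return {flag_name: bool} parsed from the first 'SltCap:' line."""
    for line in lspci_text.splitlines():
        if "SltCap:" not in line:
            continue
        flags = {}
        rest = line.split("SltCap:", 1)[1]
        # Each flag is a word followed by '+' (set) or '-' (clear).
        for name, sign in re.findall(r"(\w+)([+-])(?=\s|$)", rest):
            flags[name] = sign == "+"
        return flags
    return {}


# Line copied from the lspci output above.
sample = "        SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug- Surprise-"
caps = slot_capabilities(sample)
print("hot-plug capable:", caps["HotPlug"])
```

With the output from this comment, the HotPlug flag parses as clear even though the port is still meant to accept hotplugged devices via ACPI.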

Comment 13 Gerd Hoffmann 2021-09-10 07:29:24 UTC
qemu change:

commit 17858a169508609ca9063c544833e5a1adeb7b52
Author: Julia Suvorova <jusual>
Date:   Tue Jul 13 02:42:04 2021 +0200

    hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35

Adding "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off"
should switch back to traditional behavior.
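For test automation that assembles the qemu command line as a list, the workaround can be appended in one place. This is only a sketch; the helper name `with_native_hotplug` is hypothetical, and the option string is the one quoted above.

```python
# The -global option quoted in this comment, split into its two argv tokens.
WORKAROUND = ("-global", "ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off")


def with_native_hotplug(argv):
    """Return a copy of argv with the -global workaround present exactly once."""
    argv = list(argv)
    for i in range(len(argv) - 1):
        if (argv[i], argv[i + 1]) == WORKAROUND:
            return argv  # already present, nothing to add
    return argv + list(WORKAROUND)


base = ["/usr/libexec/qemu-kvm", "-machine", "q35"]
print(" ".join(with_native_hotplug(base)))
```

The function leaves the input list untouched and does not add the option a second time if it is already there.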

Comment 14 Gerd Hoffmann 2021-09-10 08:01:20 UTC
>     hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35

The commit mentions bug 1752465 and bug 1690256,
both are about windows guest hotplug problems.

So it seems this change hasn't been tested with
linux + ovmf at all?

Comment 15 Lei Yang 2021-09-14 01:17:40 UTC
Hi Gerd

For the current bug, if you need any test information from QE, please feel free to let me know.

Best Regards
Lei

Comment 16 Gerd Hoffmann 2021-09-14 09:15:29 UTC
> For the current bug, if you need any test information from QE, please feel
> free to let me know.

It's more of a design issue.  q35 switched from native pcie hotplug
to acpi based hotplug (by default).  Now edk2 doesn't know the pcie slots are
hotpluggable because the pcie hotplug bit isn't set any more.  So edk2 simply
doesn't assign resources to these slots.  That breaks hotplug with linux, and
I wouldn't be surprised if it breaks hotplug with windows too.

Starting qemu with "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off"
switches back to native pcie hotplug and should make things work again.

[ can you verify that?
  can you also test whether acpi hotplug works with windows + edk2? ]

SeaBIOS is a bit less strict.  It only checks whether the root / downstream
port has the "Slot Implemented" bit set, but not whether the "Hot-Plug Capable"
bit is set.

commit adding the edk2 implementation:
https://github.com/tianocore/edk2/commit/ffdd337630cc0df4bf9f7ef853bc56bc9105fd43

I suspect simply dropping the check for the "Hot-Plug Capable" bit will be rather
hard to sell upstream, because strictly speaking the check is the correct thing to do.
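The difference between the two firmware policies described above can be modelled in a few lines. This is a simplified sketch of the behaviour as described in this comment, not the actual SeaBIOS or edk2 code: the `RootPort` class and both function names are hypothetical, and treating "Slot Implemented" as a precondition in the edk2 model is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RootPort:
    slot_implemented: bool  # "Slot Implemented" bit in the PCIe capabilities
    hotplug_capable: bool   # "Hot-Plug Capable" bit in the slot capabilities
    has_device: bool        # a device sat behind the port at boot


def seabios_reserves(port):
    """Model of the SeaBIOS policy as described: slot bit only."""
    return port.has_device or port.slot_implemented


def edk2_reserves(port):
    """Model of the edk2 policy as described: hot-plug bit required too."""
    return port.has_device or (port.slot_implemented and port.hotplug_capable)


# Empty root port under ACPI hotplug: the hot-plug bit is cleared by qemu.
acpi_port = RootPort(slot_implemented=True, hotplug_capable=False, has_device=False)
# Empty root port under native pcie hotplug: the hot-plug bit is set.
native_port = RootPort(slot_implemented=True, hotplug_capable=True, has_device=False)

print("SeaBIOS reserves for ACPI port:", seabios_reserves(acpi_port))
print("edk2 reserves for ACPI port:", edk2_reserves(acpi_port))
print("edk2 reserves for native port:", edk2_reserves(native_port))
```

In this model only the combination of an empty port, ACPI hotplug and the stricter check ends up without reserved io/mmio space, which matches the failure reported here.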

Comment 17 Lei Yang 2021-09-14 14:57:27 UTC
(In reply to Gerd Hoffmann from comment #16)
> > For the current bug, if you need any test information from QE, please feel
> > free to let me know.
> 
> It's more a design issue we have here.  q35 switched from native pcie hotplug
> to acpi based hotplug (by default).  Now edk2 doesn't know the pcie slots are
> hotpluggable because the pcie hotplug bit isn't set any more.  So edk2 simply
> doesn't assign ressources to these slots.  Which breaks hotplug with linux,
> and
> I wouldn't be surprised if it breaks hotplug with windows too.
> 

Hi Gerd

> Starting qemu with "-global
> ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off"
> switches back to native pcie hotplug and should make things work again.

I added this parameter to the qemu command line. In the same environment, NIC hotplug now succeeds.

> [ can you verify that?
>   can you also test whenever acpi hotplug works with windows + edk2? ]

On a Windows guest, hotplug succeeds without the above parameter. Tested 3 times.

> SeaBIOS is a bit less strict.  It only checks whenever the root / downstream
> port has the "Slot Implemented" bit set, but not whenever the "Hot-Plug
> Capable"
> bit is set.
> 
> commit adding the edk2 implementation:
> https://github.com/tianocore/edk2/commit/
> ffdd337630cc0df4bf9f7ef853bc56bc9105fd43
> 
> I suspect simply dropping the check for the "Hot-Plug Capable" bit will be
> rather
> hard to sell upstream, because strictly speaking it is the correct thing to
> do.

Best Regards
Lei

Comment 18 Gerd Hoffmann 2021-09-21 08:20:10 UTC
> I suspect simply dropping the check for the "Hot-Plug Capable" bit will be
> rather hard to sell upstream, because strictly speaking it is the correct
> thing to do.

The other option we have is extending the ovmf hotplug driver to cover this.
The problem is that without an AML interpreter (not present in OVMF) it is
impossible to figure out whether a given pcie root port uses acpi-based hotplug
or has hotplug disabled.

So just reserving io and mmio address space on all pcie root ports -- with
and without hotplug support -- is the only (slightly hackish) option we have
without doing host-side changes in qemu.

Comment 19 Gerd Hoffmann 2021-09-21 08:33:52 UTC
https://github.com/kraxel/edk2/commits/acpi-hotplug

Comment 27 Lei Yang 2022-02-08 07:30:09 UTC
==> Reproduced this problem on qemu-kvm-6.1.0-8.el9.x86_64.rpm

Test Version:
qemu-kvm-6.1.0-8.el9.x86_64.rpm
kernel-5.14.0-55.el9.x86_64
edk2-ovmf-20210527gite1999b264f1f-8.el9.noarch

Test steps

1. Boot a guest
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-64-virtio-scsi_avocado-vt-vm1.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 30720 \
-object memory-backend-ram,size=30720M,id=mem-machine_mem  \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
-cpu 'EPYC-Rome',+kvm_pv_unhalt \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-net none \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
-device pcie-root-port,id=pcie_extra_root_port_1,addr=0x3.0x1,bus=pcie.0,chassis=5 \
-device pcie-root-port,id=pcie_extra_root_port_2,addr=0x3.0x2,bus=pcie.0,chassis=6 \
-device pcie-root-port,id=pcie_extra_root_port_3,addr=0x3.0x3,bus=pcie.0,chassis=7 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait

2. Hotplug nic
{"execute": "netdev_add", "arguments": {"type": "tap", "id": "netdev0"}}
{"return": {}}
{"execute": "device_add", "arguments": {"id": "nic0", "driver": "e1000e", "netdev": "netdev0", "mac": "9a:45:bb:a8:b1:90", "bus": "pcie_extra_root_port_0", "addr": "0x0"}}
{"return": {}}

3. The hotplugged NIC does not show up in ifconfig / ip link show inside the guest.

4. Guest dmesg
[   29.349678] pci 0000:05:00.0: [8086:10d3] type 00 class 0x020000
[   29.350034] pci 0000:05:00.0: reg 0x10: [mem 0x00000000-0x0001ffff]
[   29.350356] pci 0000:05:00.0: reg 0x14: [mem 0x00000000-0x0001ffff]
[   29.350686] pci 0000:05:00.0: reg 0x18: [io  0x0000-0x001f]
[   29.350970] pci 0000:05:00.0: reg 0x1c: [mem 0x00000000-0x00003fff]
[   29.351336] pci 0000:05:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[   29.352578] pci 0000:05:00.0: BAR 6: no space for [mem size 0x00040000 pref]
[   29.352907] pci 0000:05:00.0: BAR 6: failed to assign [mem size 0x00040000 pref]
[   29.353252] pci 0000:05:00.0: BAR 0: no space for [mem size 0x00020000]
[   29.353571] pci 0000:05:00.0: BAR 0: failed to assign [mem size 0x00020000]
[   29.353895] pci 0000:05:00.0: BAR 1: no space for [mem size 0x00020000]
[   29.354212] pci 0000:05:00.0: BAR 1: failed to assign [mem size 0x00020000]
[   29.354534] pci 0000:05:00.0: BAR 3: no space for [mem size 0x00004000]
[   29.354852] pci 0000:05:00.0: BAR 3: failed to assign [mem size 0x00004000]
[   29.355180] pci 0000:05:00.0: BAR 2: no space for [io  size 0x0020]
[   29.355471] pci 0000:05:00.0: BAR 2: failed to assign [io  size 0x0020]
[   29.409349] e1000e: Intel(R) PRO/1000 Network Driver
[   29.409632] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   29.411114] e1000e: probe of 0000:05:00.0 failed with error -5

==> Test pass on qemu-kvm-6.2.0-6.el9.x86_64

Test Version:
qemu-kvm-6.2.0-6.el9.x86_64
kernel-5.14.0-55.el9.x86_64
edk2-ovmf-20210527gite1999b264f1f-8.el9.noarch

1. Boot a guest
2. Hotplug a nic
3. The guest sees the NIC via ifconfig / ip link show.

Based on the above test results, this bug is fixed in qemu-kvm-6.2.0-6.el9.x86_64.

Comment 28 Yanghang Liu 2022-02-09 05:34:29 UTC
For the vfio-pf part: this problem still exists in the latest qemu-kvm version.


Test env:
host:
  qemu-kvm-6.2.0-6.el9.x86_64
  5.14.0-55.el9.x86_64
  edk2-ovmf-20210527gite1999b264f1f-8.el9.noarch
guest:
  5.14.0-55.el9.x86_64


Test device:
Both MT2892 PF and SFC9220 PF can reproduce this problem.


Test Step:
(1) start a Q35 + OVMF domain
# virt-install --machine=q35 --noreboot --name=rhel90 --memory=4096 --vcpus=4 --graphics type=vnc,port=5990,listen=0.0.0.0 --import --noautoconsole  --network bridge=switch,model=virtio,mac=52:54:00:00:90:90 --disk path=/home/images/RHEL90.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20  --boot=uefi --boot nvram.template=/usr/share/edk2/ovmf/OVMF_VARS.fd
# virsh start rhel90

(2) hot-plug a PF into the domain
# virsh attach-device rhel90 0000\:3b\:00.0.xml

(3) check the PF info in the domain
# ifconfig <-- no PF info shows up here

# lspci
...
04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

# dmesg
[  279.937708] pci 0000:04:00.0: [15b3:101d] type 00 class 0x020000
[  279.941495] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x01ffffff 64bit pref]
[  279.945293] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
[  279.947345] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max 512)
[  279.950637] pci 0000:04:00.0: PME# supported from D3cold
[  279.954359] pci 0000:04:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x16 link at 0000:00:02.3 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
[  279.964555] pci 0000:04:00.0: BAR 0: no space for [mem size 0x02000000 64bit pref]
[  279.968099] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x02000000 64bit pref]
[  279.971755] pci 0000:04:00.0: BAR 6: assigned [mem 0xc1000000-0xc10fffff pref]
[  280.062655] mlx5_core 0000:04:00.0: Missing registers BAR, aborting
[  280.063989] mlx5_core 0000:04:00.0: mlx5_pci_init:768:(pid 1231): error requesting BARs, aborting
[  280.066176] mlx5_core 0000:04:00.0: probe_one:1480:(pid 1231): mlx5_pci_init failed with error code -19

Comment 29 Gerd Hoffmann 2022-02-14 11:04:01 UTC
(In reply to Yanghang Liu from comment #28)
> For vfio-pf part: This problem still existed in the latest qemu-kvm version

Is that actually a regression (i.e. does it work with qemu 6.0)?

> [  279.964555] pci 0000:04:00.0: BAR 0: no space for [mem size 0x02000000
> 64bit pref]

32 MB bar.  Default bridge window size is 2 MB.  So the pcie root port configuration must be tweaked.

# qemu-kvm -device pcie-root-port,help | grep reserve
  bus-reserve=<uint32>   -  (default: 4294967295)
  io-reserve=<size>      -  (default: 18446744073709551615)
  mem-reserve=<size>     -  (default: 18446744073709551615)
  pref32-reserve=<size>  -  (default: 18446744073709551615)
  pref64-reserve=<size>  -  (default: 18446744073709551615)

I think pref64-reserve is the one you need.
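
The size arithmetic behind this suggestion can be checked with a short Python sketch. It is illustrative only: the helper names are hypothetical, the 2 MiB default window is taken from the comment above, and rounding the reservation up to a power of two is an assumed rule of thumb rather than a documented QEMU requirement.

```python
import re

# 2 MiB default bridge window, as stated in the comment above.
DEFAULT_WINDOW = 2 * 1024 * 1024


def bar_size_from_dmesg(line):
    """Return the BAR size in bytes from a 'no space for [mem size 0x...]' line, or None."""
    m = re.search(r"no space for \[mem size (0x[0-9a-fA-F]+)", line)
    return int(m.group(1), 16) if m else None


def suggest_reserve(size):
    """Smallest power of two (in bytes) that is >= size (assumed rule of thumb)."""
    reserve = 1
    while reserve < size:
        reserve *= 2
    return reserve


line = "[  279.964555] pci 0000:04:00.0: BAR 0: no space for [mem size 0x02000000 64bit pref]"
size = bar_size_from_dmesg(line)
print(size // (1024 * 1024), "MiB BAR; fits default window:", size <= DEFAULT_WINDOW)
print("pref64-reserve should be at least", suggest_reserve(size) // (1024 * 1024), "M")
```

Any reservation at or above the computed value should leave room for the BAR; a larger value only costs guest address space.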

Comment 30 Yanghang Liu 2022-02-15 09:49:53 UTC
(In reply to Gerd Hoffmann from comment #29)
> > For vfio-pf part: This problem still existed in the latest qemu-kvm version
> 
> Is that actually a regression (i.e. does it work with qemu 6.0)?

Hi Gerd,

Yes. The problem goes away if I add "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" to the VM config.

I found several similar bugs that have already been closed, but my problem is still not completely fixed.

 Bug 2001732 - [virtual network][qemu-6.1.0-1] Fail to hotplug nic with rtl8139 driver
 Bug 2001719 - fail to hotplug NIC with edk2 firmware
 Bug 2004829 - [ovmf] The guest does not present hot-plugged disk
 Bug 2007129 - pcie hotplug emulation has various problems due to insufficient state tracking 

Should I open a separate bug to track this issue for PF/VF?

> > [  279.964555] pci 0000:04:00.0: BAR 0: no space for [mem size 0x02000000 64bit pref]
> 
> 32 MB bar.  Default bridge window size is 2 MB.  So the pcie root port configuration must be tweaked.
> 
> # qemu-kvm -device pcie-root-port,help | grep reserve
>   bus-reserve=<uint32>   -  (default: 4294967295)
>   io-reserve=<size>      -  (default: 18446744073709551615)
>   mem-reserve=<size>     -  (default: 18446744073709551615)
>   pref32-reserve=<size>  -  (default: 18446744073709551615)
>   pref64-reserve=<size>  -  (default: 18446744073709551615)
> 
> I think pref64-reserve is the one you need.

Could you please share the relevant qemu command line or domain XML with me?

I'll add it to the VM config and see whether it fixes my problem.

Comment 31 Gerd Hoffmann 2022-02-15 10:48:51 UTC
(In reply to Yanghang Liu from comment #30)
> (In reply to Gerd Hoffmann from comment #29)
> > > For vfio-pf part: This problem still existed in the latest qemu-kvm version
> > 
> > Is that actually a regression (i.e. does it work with qemu 6.0)?
> 
> Hi Gerd,
> 
> Yes. The problem can be fixed if I add "-global
> ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" into the vm cfg.

Ok.  Apparently yet another regression caused by acpi hotplug.
Most likely the guest kernel rearranges the bridge windows when
using pcie hotplug driver and doesn't when using acpi hotplug
driver.

Can you attach the guest kernel log for both working and
non-working case?

> May I ask if I need to open a separate bug to track this issue for PF/VF ?

Most likely this is not related to PF/VF, but to a pci device with a
memory bar larger than 2M.  Can probably also be reproduced using
'device_add pci-testdev,membar=4M'.

> Could you please share the related detailed qemu command line or domain xml
> with me ?
> 
> I'll add it into the VM cfg and see if it'll fix my problem.

Using this ...

  <qemu:commandline>
    <qemu:arg value='-global'/>
    <qemu:arg value='pcie-root-port.pref64-reserve=64M'/>
  </qemu:commandline>

... should give you pcie ports looking like this ...

00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 22
	Memory at c504c000 (32-bit, non-prefetchable) [size=4K]
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff [size=4K]
	Memory behind bridge: c4e00000-c4ffffff [size=2M]
	Prefetchable memory behind bridge: 0000000800000000-0000000803ffffff [size=64M]
                                                                             ^^^^^^^^^^
	Capabilities: <access denied>
	Kernel driver in use: pcieport

... which should be enough space for a 32M bar.
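
To confirm from inside the guest that the reservation took effect, the prefetchable window can be read out of the lspci text. The following Python sketch assumes the exact line layout shown in the output above (other pciutils versions may print it differently); the function name pref_window is hypothetical.

```python
import re

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def pref_window(lspci_text):
    """Return (start, end, size_bytes) of the prefetchable bridge window, or None."""
    m = re.search(
        r"Prefetchable memory behind bridge: "
        r"([0-9a-f]+)-([0-9a-f]+) \[size=(\d+)([KMG])\]",
        lspci_text,
    )
    if m is None:
        return None
    start, end = int(m.group(1), 16), int(m.group(2), 16)
    return start, end, int(m.group(3)) * UNITS[m.group(4)]


sample = "Prefetchable memory behind bridge: 0000000800000000-0000000803ffffff [size=64M]"
start, end, size = pref_window(sample)

# The address range and the printed size should agree with each other.
print("range matches printed size:", end - start + 1 == size)
print("window can hold a 32M BAR:", size >= 32 * 1024 ** 2)
```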

Comment 32 Yanghang Liu 2022-02-16 03:57:29 UTC
(In reply to Gerd Hoffmann from comment #31)

> > Yes. The problem can be fixed if I add "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" into the vm cfg.
> 
> Ok.  Apparently yet another regression caused by acpi hotplug.
> Most likely the guest kernel rearranges the bridge windows when using pcie hotplug driver and doesn't when using acpi hotplug driver.


> Can you attach the guest kernel log for both working and non-working case?

> Most likely this is not related to PF/VF, but to a pci device with a memory bar larger than 2M.  
> Can probably also be reproduced using 'device_add pci-testdev,membar=4M'.

Thanks Gerd for the explanation.

My test results seem to be consistent with what Gerd said.

1. Test "Hot-plug the pci-testdev whose membar=4M into the vm" scenario without "-global pcie-root-port.pref64-reserve=64M" in the vm cfg:
  
  # virsh qemu-monitor-command rhel90 --hmp "device_add pci-testdev,membar=4M,bus=pci.4"

    The related dmesg in the vm:

        # dmesg
        [   91.926757] pci 0000:04:00.0: [1b36:0005] type 00 class 0x00ff00
        [   91.930049] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00000fff]
        [   91.933174] pci 0000:04:00.0: reg 0x14: [io  0x0000-0x00ff]
        [   91.936015] pci 0000:04:00.0: reg 0x18: [mem 0x00000000-0x003fffff 64bit pref]
        [   91.941166] pci 0000:04:00.0: BAR 2: no space for [mem size 0x00400000 64bit pref]
        [   91.944529] pci 0000:04:00.0: BAR 2: failed to assign [mem size 0x00400000 64bit pref]
        [   91.947251] pci 0000:04:00.0: BAR 0: assigned [mem 0xc1000000-0xc1000fff]
        [   91.949654] pci 0000:04:00.0: BAR 1: assigned [io  0x6000-0x60ff]


2. Test "Hot-plug the pci-testdev whose membar=4M into the vm" scenario with "-global pcie-root-port.pref64-reserve=64M" in the vm cfg:

  # virsh qemu-monitor-command rhel90 --hmp "device_add pci-testdev,membar=4M,bus=pci.4"

   The related dmesg in the vm:

        # dmesg
        [   45.943065] pci 0000:04:00.0: [1b36:0005] type 00 class 0x00ff00
        [   45.946711] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00000fff]
        [   45.950183] pci 0000:04:00.0: reg 0x14: [io  0x0000-0x00ff]
        [   45.952725] pci 0000:04:00.0: reg 0x18: [mem 0x00000000-0x003fffff 64bit pref]
        [   45.958099] pci 0000:04:00.0: BAR 2: assigned [mem 0x80c000000-0x80c3fffff 64bit pref]
        [   45.962333] pci 0000:04:00.0: BAR 0: assigned [mem 0xc1000000-0xc1000fff]
        [   45.965884] pci 0000:04:00.0: BAR 1: assigned [io  0x6000-0x60ff]

Please feel free to let me know if we need to open a separate bug to track this issue.
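
The two dmesg outputs above differ in whether a "failed to assign" line appears for BAR 2. A minimal Python sketch for spotting such lines in a saved guest log is given here; the regular expression and the function name failed_bars are illustrative assumptions based only on the log lines quoted in this bug.

```python
import re

# Matches lines such as:
#   pci 0000:04:00.0: BAR 2: failed to assign [mem size 0x00400000 64bit pref]
FAILED = re.compile(
    r"pci (\S+): BAR (\d+): failed to assign \[(mem|io)\s+size (0x[0-9a-f]+)"
)


def failed_bars(dmesg_text):
    """Return a list of (device, bar_index, kind, size_bytes) for unassigned BARs."""
    found = []
    for line in dmesg_text.splitlines():
        m = FAILED.search(line)
        if m:
            found.append((m.group(1), int(m.group(2)), m.group(3), int(m.group(4), 16)))
    return found


log = """\
[   91.941166] pci 0000:04:00.0: BAR 2: no space for [mem size 0x00400000 64bit pref]
[   91.944529] pci 0000:04:00.0: BAR 2: failed to assign [mem size 0x00400000 64bit pref]
[   91.947251] pci 0000:04:00.0: BAR 0: assigned [mem 0xc1000000-0xc1000fff]
"""

for device, bar, kind, size in failed_bars(log):
    print(f"{device}: BAR {bar} ({kind}, {size // 1024} KiB) was not assigned")
```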

Comment 33 Yanghang Liu 2022-02-16 04:10:55 UTC
(In reply to Yanghang Liu from comment #28)


> For vfio-pf part: This problem still existed in the latest qemu-kvm version
> 
> 
> Test env:
> host:
>   qemu-kvm-6.2.0-6.el9.x86_64
>   5.14.0-55.el9.x86_64
>   edk2-ovmf-20210527gite1999b264f1f-8.el9.noarch
> guest:
>   5.14.0-55.el9.x86_64
> 
> 
> Test device:
> Both MT2892 PF and SFC9220 PF can reproduce this problem.
> 
> 
> Test Step:
> (1) start a Q35 + OVMF domain
> # virt-install --machine=q35 --noreboot --name=rhel90 --memory=4096
> --vcpus=4 --graphics type=vnc,port=5990,listen=0.0.0.0 --import
> --noautoconsole  --network bridge=switch,model=virtio,mac=52:54:00:00:90:90
> --disk
> path=/home/images/RHEL90.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,
> size=20  --boot=uefi --boot nvram.template=/usr/share/edk2/ovmf/OVMF_VARS.fd
> # virsh start rhel90
> 
> (2) hot-plug a PF into the domain
> # virsh attach-device rhel90 0000\:3b\:00.0.xml
> 
> (3) check the PF info in the domain
> # ifconfig <-- I can not get any PF info here
> 
> # lspci
> ...
> 04:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6
> Dx]
> 
> # dmesg
> [  279.937708] pci 0000:04:00.0: [15b3:101d] type 00 class 0x020000
> [  279.941495] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x01ffffff 64bit
> pref]
> [  279.945293] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
> [  279.947345] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max
> 512)
> [  279.950637] pci 0000:04:00.0: PME# supported from D3cold
> [  279.954359] pci 0000:04:00.0: 126.016 Gb/s available PCIe bandwidth,
> limited by 8.0 GT/s PCIe x16 link at 0000:00:02.3 (capable of 252.048 Gb/s
> with 16.0 GT/s PCIe x16 link)
> [  279.964555] pci 0000:04:00.0: BAR 0: no space for [mem size 0x02000000
> 64bit pref]
> [  279.968099] pci 0000:04:00.0: BAR 0: failed to assign [mem size
> 0x02000000 64bit pref]
> [  279.971755] pci 0000:04:00.0: BAR 6: assigned [mem 0xc1000000-0xc10fffff
> pref]
> [  280.062655] mlx5_core 0000:04:00.0: Missing registers BAR, aborting
> [  280.063989] mlx5_core 0000:04:00.0: mlx5_pci_init:768:(pid 1231): error
> requesting BARs, aborting
> [  280.066176] mlx5_core 0000:04:00.0: probe_one:1480:(pid 1231):
> mlx5_pci_init failed with error code -19

Retested this scenario: the PF can be hot-plugged successfully with "-global pcie-root-port.pref64-reserve=64M" in the vm cfg:


The related device info in the vm after hot-plugging the PF into the vm:

  # lshw -c network -businfo
  Bus info          Device     Class          Description
  =======================================================
  pci@0000:04:00.0  enp4s0np0  network        MT2892 Family [ConnectX-6 Dx]


  # lspci -v -s 00:02.3
  00:02.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 22
	Memory at c1842000 (32-bit, non-prefetchable) [size=4K]
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
	I/O behind bridge: 00006000-00006fff [size=4K]
	Memory behind bridge: c1000000-c11fffff [size=2M]
	Prefetchable memory behind bridge: 000000080c000000-000000080fffffff [size=64M]
	Capabilities: [90] Vendor Specific Information: Len=20 <?>
	Capabilities: [54] Express Root Port (Slot+), MSI 00
	Capabilities: [48] MSI-X: Enable+ Count=1 Masked-
	Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Access Control Services
	Kernel driver in use: pcieport


  # dmesg
  [  543.562782] pci 0000:04:00.0: [15b3:101d] type 00 class 0x020000
  [  543.565120] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x01ffffff 64bit pref]
  [  543.567828] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
  [  543.570044] pci 0000:04:00.0: Max Payload Size set to 128 (was 256, max 512)
  [  543.573637] pci 0000:04:00.0: PME# supported from D3cold
  [  543.576084] pci 0000:04:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x16 link at 0000:00:02.3 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
  [  543.578956] pci 0000:04:00.0: BAR 0: assigned [mem 0x80c000000-0x80dffffff 64bit pref]
  [  543.580097] pci 0000:04:00.0: BAR 6: assigned [mem 0xc1000000-0xc10fffff pref]
  [  543.685000] mlx5_core 0000:04:00.0: enabling device (0000 -> 0002)
  [  543.687604] ACPI: \_SB_.GSIH: Enabled at IRQ 23
  [  543.688737] mlx5_core 0000:04:00.0: firmware version: 22.28.4000
  [  543.689436] mlx5_core 0000:04:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x16 link at 0000:00:02.3 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
  [  543.941005] mlx5_core 0000:04:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
  [  543.943951] mlx5_core 0000:04:00.0: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
  [  543.956711] mlx5_core 0000:04:00.0: Port module event: module 1, Cable plugged
  [  543.960765] mlx5_core 0000:04:00.0: mlx5_pcie_event:298:(pid 35): PCIe slot advertised sufficient power (75W).
  [  543.974441] mlx5_core 0000:04:00.0: mlx5e: IPSec ESP acceleration enabled
  [  543.976800] mlx5_core 0000:04:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
  [  544.117782] mlx5_core 0000:04:00.0: Supported tc offload range - chains: 4294967294, prios: 4294967295
  [  544.229156] mlx5_core 0000:04:00.0 enp4s0np0: renamed from eth0
  [  544.342061] mlx5_core 0000:04:00.0 enp4s0np0: Link up
  [  544.349267] IPv6: ADDRCONF(NETDEV_CHANGE): enp4s0np0: link becomes ready

Comment 34 Gerd Hoffmann 2022-02-16 08:44:10 UTC
> My test results seem to be consistent with what Gerd said.
> 
> 1.Test "Hot-plug the pci-testdev whose membar=4M into the vm" scenario
> without "-global pcie-root-port.pref64-reserve=64M" in the vm cfg:

> 2. Test "Hot-plug the pci-testdev whose membar=4M into the vm" scenario with
> "-global pcie-root-port.pref64-reserve=64M" in the vm cfg:

For completeness: with acpi hotplug turned off:

[   44.264984] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
[   44.266268] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
[   44.267733] pcieport 0000:00:02.5: Slot(0-5): Card present
[   44.268734] pcieport 0000:00:02.5: Slot(0-5): Link Up
[   44.394230] pci 0000:06:00.0: [1b36:0005] type 00 class 0x00ff00
[   44.395811] pci 0000:06:00.0: reg 0x10: [mem 0x00000000-0x00000fff]
[   44.397390] pci 0000:06:00.0: reg 0x14: [io  0x0000-0x00ff]
[   44.398833] pci 0000:06:00.0: reg 0x18: [mem 0x00000000-0x003fffff 64bit pref]
[   44.400970] pci 0000:06:00.0: BAR 2: no space for [mem size 0x00400000 64bit pref]
[   44.402746] pci 0000:06:00.0: BAR 2: failed to assign [mem size 0x00400000 64bit pref]
[   44.404673] pci 0000:06:00.0: BAR 0: assigned [mem 0xc4400000-0xc4400fff]
[   44.405960] pci 0000:06:00.0: BAR 1: assigned [io  0x8000-0x80ff]
[   44.407040] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[   44.407995] pcieport 0000:00:02.5:   bridge window [io  0x8000-0x8fff]
[   44.412885] pcieport 0000:00:02.5:   bridge window [mem 0xc4400000-0xc45fffff]
[   44.416289] pcieport 0000:00:02.5:   bridge window [mem 0x800600000-0x8007fffff 64bit pref]
[   44.422173] pcieport 0000:00:02.5: resource 15 [mem 0x800600000-0x8007fffff 64bit pref] released
[   44.423506] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[   44.428528] pcieport 0000:00:02.5: BAR 15: assigned [mem 0x800c00000-0x800ffffff 64bit pref]
[   44.429791] pci 0000:06:00.0: BAR 2: assigned [mem 0x800c00000-0x800ffffff 64bit pref]
[   44.431895] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[   44.432824] pcieport 0000:00:02.5:   bridge window [io  0x8000-0x8fff]
[   44.436173] pcieport 0000:00:02.5:   bridge window [mem 0xc4400000-0xc45fffff]
[   44.438798] pcieport 0000:00:02.5:   bridge window [mem 0x800c00000-0x800ffffff 64bit pref]

So with pcie hotplug the kernel makes the bridge window (0000:00:02.5: resource 15) larger so the bar fits, which does not happen with acpi hotplug.
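
The resize can be verified by computing the window sizes from the two log lines for resource 15. This is a small Python sketch; the helper name window_size is hypothetical and the pattern only covers the "64bit pref" ranges quoted above.

```python
import re

WINDOW = re.compile(r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+) 64bit pref\]")


def window_size(line):
    """Return the size in bytes of the inclusive address range in a log line, or None."""
    m = WINDOW.search(line)
    if m is None:
        return None
    start, end = (int(x, 16) for x in m.groups())
    return end - start + 1


before = "pcieport 0000:00:02.5:   bridge window [mem 0x800600000-0x8007fffff 64bit pref]"
after = "pcieport 0000:00:02.5: BAR 15: assigned [mem 0x800c00000-0x800ffffff 64bit pref]"

MIB = 1024 * 1024
print("before:", window_size(before) // MIB, "MiB")
print("after: ", window_size(after) // MIB, "MiB")
```

The window grows from 2 MiB to 4 MiB, which is exactly the size of the pci-testdev BAR 2 (0x00400000).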

Yes, creating a new bug for that makes sense.

Comment 35 Yanghang Liu 2022-02-16 10:57:21 UTC
(In reply to Gerd Hoffmann from comment #34)

> 
> For completeness: with acpi hotplug turned off:
> 
> [   44.264984] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
> [   44.266268] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button
> press
> [   44.267733] pcieport 0000:00:02.5: Slot(0-5): Card present
> [   44.268734] pcieport 0000:00:02.5: Slot(0-5): Link Up
> [   44.394230] pci 0000:06:00.0: [1b36:0005] type 00 class 0x00ff00
> [   44.395811] pci 0000:06:00.0: reg 0x10: [mem 0x00000000-0x00000fff]
> [   44.397390] pci 0000:06:00.0: reg 0x14: [io  0x0000-0x00ff]
> [   44.398833] pci 0000:06:00.0: reg 0x18: [mem 0x00000000-0x003fffff 64bit
> pref]
> [   44.400970] pci 0000:06:00.0: BAR 2: no space for [mem size 0x00400000
> 64bit pref]
> [   44.402746] pci 0000:06:00.0: BAR 2: failed to assign [mem size
> 0x00400000 64bit pref]
> [   44.404673] pci 0000:06:00.0: BAR 0: assigned [mem 0xc4400000-0xc4400fff]
> [   44.405960] pci 0000:06:00.0: BAR 1: assigned [io  0x8000-0x80ff]
> [   44.407040] pcieport 0000:00:02.5: PCI bridge to [bus 06]
> [   44.407995] pcieport 0000:00:02.5:   bridge window [io  0x8000-0x8fff]
> [   44.412885] pcieport 0000:00:02.5:   bridge window [mem
> 0xc4400000-0xc45fffff]
> [   44.416289] pcieport 0000:00:02.5:   bridge window [mem
> 0x800600000-0x8007fffff 64bit pref]
> [   44.422173] pcieport 0000:00:02.5: resource 15 [mem
> 0x800600000-0x8007fffff 64bit pref] released
> [   44.423506] pcieport 0000:00:02.5: PCI bridge to [bus 06]
> [   44.428528] pcieport 0000:00:02.5: BAR 15: assigned [mem
> 0x800c00000-0x800ffffff 64bit pref]
> [   44.429791] pci 0000:06:00.0: BAR 2: assigned [mem
> 0x800c00000-0x800ffffff 64bit pref]
> [   44.431895] pcieport 0000:00:02.5: PCI bridge to [bus 06]
> [   44.432824] pcieport 0000:00:02.5:   bridge window [io  0x8000-0x8fff]
> [   44.436173] pcieport 0000:00:02.5:   bridge window [mem
> 0xc4400000-0xc45fffff]
> [   44.438798] pcieport 0000:00:02.5:   bridge window [mem
> 0x800c00000-0x800ffffff 64bit pref]
> 
> So with pcie hotplug the kernel makes the bridge window (0000:00:02.5:
> resource 15) larger so the bar fits, which does not happen with acpi hotplug.
> 
> Yes, creating a new bug for that makes sense.

Thanks Gerd for the info.

I have opened a new bug to track the issue described in comments 28 to 33.

Bug 2055123 - [Q35] Failed to hot-plug a device whose membar > 2M into the vm

Comment 36 Gerd Hoffmann 2022-05-25 06:40:48 UTC
Closing this, as the original issue has been addressed on the qemu-kvm side in 8.6 (and the remaining >2M bar issue got its own bug).

