Bug 2001732
| Field | Value |
|---|---|
| Summary | [virtual network][qemu-6.1.0-1] Fail to hotplug nic with rtl8139 driver |
| Product | Red Hat Enterprise Linux 9 |
| Component | seabios |
| Version | 9.0 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Lei Yang <leiyang> |
| Assignee | Gerd Hoffmann <kraxel> |
| QA Contact | jingzhao <jinzhao> |
| CC | aadam, ailan, chayang, coli, fjin, imammedo, jinzhao, jusual, juzhang, lvivier, mrezanin, mst, ppolawsk, virt-maint, yafu, yalzhang, yanghliu, yiwei |
| Target Milestone | rc |
| Keywords | Regression, Triaged |
| Fixed In Version | seabios-1.15.0-1.el9 |
| Last Closed | 2022-05-17 12:46:40 UTC |
| Bug Depends On | 2018393 |
| Bug Blocks | 2001921 |

Description
Lei Yang
2021-09-07 03:45:49 UTC
Assigned to Ariel for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

The regression has been introduced by:

  17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")

A workaround is to add to the command line:

  -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off

The problem can be reproduced upstream too.

Julia, any idea of what happens?

Thanks

(In reply to Laurent Vivier from comment #3)
> The regression has been introduced by:
>
> 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
>
> A workaround is to add to the command line:
>
> -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
>
> The problem can be reproduced upstream too.
>
> Julia, any idea of what happens?

Yes, we had the same issue with root ports. BIOS/guest sees that native hot-plug flag is turned off, thinks that the bridge is not hot-pluggable, and does not allocate IO.

Solution (sort of) for root ports is already merged:
[PATCH] hw/pcie-root-port: Fix hotplug for PCI devices requiring IO

Best regards, Julia Suvorova.

(In reply to Julia Suvorova from comment #4)
> (In reply to Laurent Vivier from comment #3)
> > The regression has been introduced by:
> >
> > 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
> >
> > A workaround is to add to the command line:
> >
> > -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
> >
> > The problem can be reproduced upstream too.
> >
> > Julia, any idea of what happens?
>
> Yes, we had the same issue with root ports. BIOS/guest sees that native
> hot-plug flag is turned off,
> thinks that the bridge is not hot-pluggable, and does not allocate IO.
>
> Solution (sort of) for root ports is already merged:
> [PATCH] hw/pcie-root-port: Fix hotplug for PCI devices requiring IO

commit e2a6290aab578b2170c1f5909fa556385dc0d820
Author: Marcel Apfelbaum <marcel.apfelbaum>
Date: Mon Aug 2 12:00:57 2021 +0300

    hw/pcie-root-port: Fix hotplug for PCI devices requiring IO

    Q35 has now ACPI hotplug enabled by default for PCI(e) devices.
    As opposed to native PCIe hotplug, guests like Fedora 34
    will not assign IO range to pcie-root-ports not supporting
    native hotplug, resulting into a regression.

    Reproduce by:
        qemu-bin -M q35 -device pcie-root-port,id=p1 -monitor stdio
        device_add e1000,bus=p1
    In the Guest OS the respective pcie-root-port will have the IO range
    disabled.

    Fix it by setting the "reserve-io" hint capability of the
    pcie-root-ports so the firmware will allocate the IO range instead.

    Acked-by: Igor Mammedov <imammedo>
    Signed-off-by: Marcel Apfelbaum <marcel>
    Message-Id: <20210802090057.1709775-1-marcel>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

I have no idea how to do the same solution for the bridge.

Marcel, could you help?

*** Bug 2007072 has been marked as a duplicate of this bug. ***

(In reply to Laurent Vivier from comment #5)
> (In reply to Julia Suvorova from comment #4)
> > (In reply to Laurent Vivier from comment #3)
> > > The regression has been introduced by:
> > >
> > > 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
> > >
> > > A workaround is to add to the command line:
> > >
> > > -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
> > >
> > > The problem can be reproduced upstream too.
> > >
> > > Julia, any idea of what happens?
> > [...]
> > I have no idea how to do the same solution for the bridge.
> > Marcel, could you help?

I will have a look, sure.

Hi Laurent,

Could you please help check the following "hot-plug X710 PF into vm" test steps ?
Test env:
host:
  5.14.0-7.el9.x86_64
  qemu-kvm-6.1.0-5.el9.x86_64
guest:
  5.14.0-7.el9.x86_64

Test steps:

(1) start a Q35 + Seabios RHEL90 vm

virt-install --machine=q35 --noreboot --name=rhel90 --memory=4096 --vcpus=4 \
  --graphics type=vnc,port=5990,listen=0.0.0.0 \
  --network bridge=switch,model=virtio,mac=52:54:00:00:90:90 \
  --import --noautoconsole \
  --disk path=/home/images/RHEL90.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20

(2) hot-plug a X710 PF into the RHEL90 vm

# virsh attach-device rhel90 $x710_pf_device.xml

The X710 PF device xml:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x17' slot='0x00' function='0x1'/>
  </source>
  <alias name='hostdev0'/>
</hostdev>

(3) check the X710 PF info in the vm

(3.1) # dmesg
...
[ 42.451269] pci 0000:04:00.0: [8086:15ff] type 00 class 0x020000
[ 42.452530] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit pref]
[ 42.453963] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit pref]
[ 42.455244] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 42.456541] pci 0000:04:00.0: Max Payload Size set to 128 (was 512, max 2048)
[ 42.458352] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[ 42.460752] pci 0000:04:00.0: BAR 0: no space for [mem size 0x01000000 64bit pref]
[ 42.461916] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit pref]
[ 42.463075] pci 0000:04:00.0: BAR 6: assigned [mem 0xfe200000-0xfe27ffff pref]
[ 42.464230] pci 0000:04:00.0: BAR 3: assigned [mem 0xfd400000-0xfd407fff 64bit pref]
[ 42.491141] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 42.492142] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 42.493588] i40e 0000:04:00.0: enabling device (0140 -> 0142)
[ 42.496071] i40e 0000:04:00.0: Cannot map registers, bar size 0x0 too small, aborting
[ 42.497920] i40e: probe of 0000:04:00.0 failed with error -12

(3.2) # lshw -c network -businfo
Bus info          Device      Class          Description
=======================================================
pci@0000:04:00.0              network        Ethernet Controller X710 for 10GBASE

# ifconfig
The "ifconfig" does not show any info about the X710 PF device

(4) After adding the "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" parameter, the same X710 PF device can be hot-plugged into the vm successfully

(4.1) The related vm dmesg:
# dmesg
...
[ 926.277733] pcieport 0000:00:02.3: pciehp: Slot(0-3): Attention button pressed
[ 926.278398] pcieport 0000:00:02.3: pciehp: Slot(0-3) Powering on due to button press
[ 926.279185] pcieport 0000:00:02.3: pciehp: Slot(0-3): Card present
[ 926.279760] pcieport 0000:00:02.3: pciehp: Slot(0-3): Link Up
[ 926.408015] pci 0000:04:00.0: [8086:15ff] type 00 class 0x020000
[ 926.408766] pci 0000:04:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit pref]
[ 926.409434] pci 0000:04:00.0: reg 0x1c: [mem 0x00000000-0x00007fff 64bit pref]
[ 926.410066] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 926.410678] pci 0000:04:00.0: Max Payload Size set to 128 (was 512, max 2048)
[ 926.411760] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[ 926.412747] pci 0000:04:00.0: BAR 0: no space for [mem size 0x01000000 64bit pref]
[ 926.413345] pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit pref]
[ 926.413974] pci 0000:04:00.0: BAR 6: assigned [mem 0xfe200000-0xfe27ffff pref]
[ 926.414553] pci 0000:04:00.0: BAR 3: assigned [mem 0xfd400000-0xfd407fff 64bit pref]
[ 926.415229] pcieport 0000:00:02.3: PCI bridge to [bus 04]
[ 926.415671] pcieport 0000:00:02.3: bridge window [io 0x4000-0x4fff]
[ 926.417175] pcieport 0000:00:02.3: bridge window [mem 0xfe200000-0xfe3fffff]
[ 926.418616] pcieport 0000:00:02.3: bridge window [mem 0xfd400000-0xfd5fffff 64bit pref]
[ 926.420474] PCI: No. 2 try to assign unassigned res
[ 926.420476] release child resource [mem 0xfd400000-0xfd407fff 64bit pref]
[ 926.420478] pcieport 0000:00:02.3: resource 15 [mem 0xfd400000-0xfd5fffff 64bit pref] released
[ 926.421229] pcieport 0000:00:02.3: PCI bridge to [bus 04]
[ 926.423177] pcieport 0000:00:02.3: BAR 15: assigned [mem 0x180000000-0x1817fffff 64bit pref]
[ 926.423855] pci 0000:04:00.0: BAR 0: assigned [mem 0x180000000-0x180ffffff 64bit pref]
[ 926.424569] pci 0000:04:00.0: BAR 3: assigned [mem 0x181000000-0x181007fff 64bit pref]
[ 926.425282] pcieport 0000:00:02.3: PCI bridge to [bus 04]
[ 926.425726] pcieport 0000:00:02.3: bridge window [io 0x4000-0x4fff]
[ 926.427110] pcieport 0000:00:02.3: bridge window [mem 0xfe200000-0xfe3fffff]
[ 926.428663] pcieport 0000:00:02.3: bridge window [mem 0x180000000-0x1817fffff 64bit pref]
[ 926.444123] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 926.444585] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 926.445205] i40e 0000:04:00.0: enabling device (0140 -> 0142)
[ 926.461068] i40e 0000:04:00.0: fw 8.1.63299 api 1.12 nvm 8.10 0x800093ec 1.2829.0 [8086:15ff] [8086:0000]
[ 926.461731] i40e 0000:04:00.0: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
[ 926.841580] i40e 0000:04:00.0: MAC address: b4:96:91:a6:68:79
[ 926.842627] i40e 0000:04:00.0: FW LLDP is enabled
[ 926.855577] i40e 0000:04:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 926.858211] i40e 0000:04:00.0: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 4 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA

(4.2) The "ifconfig" can show the interface info about the X710 PF device

# lshw -c network -businfo
Bus info          Device      Class          Description
=======================================================
pci@0000:04:00.0  eth1        network        Ethernet Controller X710 for 10GBASE

# ifconfig eth1
eth1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        ether b4:96:91:a6:68:79 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

It seems to me that the above problem has the same root cause as this BZ.

Is my understanding correct?

(In reply to Yanghang Liu from comment #8)
> Hi Laurent,
>
> Could you please help check the following "hot-plug X710 PF into vm" test
> steps ?
...
> It seems to me that the above problem has the same root cause with this BZ.
>
> Is my understanding correct ?

If it works with "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" the root cause is the same.

Fix posted upstream:
https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/OZDGF64H6HTFKLZCJX7ISJ44SX5ZN7II/

Issue was in SeaBIOS, so I'm changing component to it.

Assigning to Igor directly since there's an ITR defined and someone will need to drive/watch getting this into downstream (and that isn't virt-maint).

Klaus - Igor contacted me directly and asked to reassign to someone in firmware:

Merging upstream was delayed to the next SeaBIOS release (so it is not getting in QEMU 6.2).
It should be assigned to SeaBIOS maintainers to backport the fix, when they see it is fine to do so (that's the reason I haven't assigned the BZ to myself).

(In reply to John Ferlan from comment #16)
> Klaus - Igor contacted me directly and asked to reassign to someone in
> firmware:
>
> Merging upstream was delayed to the next SeaBIOS release
> (so it is not getting in QEMU 6.2). It should be assigned to SeaBIOS
> maintainers to backport fix, when they see it is fine to do so
> (that the reason I haven't assigned BZ to myself)

Thanks, bouncing this to Gerd as he should have the last word here, I guess.

Depends on 2018393.

See also bug 2018392 (the RHEL 8 version of the rebase bug, with some more comments). The seabios versions for 8.6 and 9.0 should be almost identical. The plan is to jump to 1.15 now and to 1.16 in January, possibly backporting fixes in the interim.

QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Reproduced this bug with seabios-1.14.0-7.el9.x86_64; verified it with seabios-1.15.0-1.el9.x86_64.
The detailed verification steps follow.

Host version:
  kernel-5.14.0-32.el9.x86_64
  qemu-kvm-6.1.0-8.el9.x86_64
  seabios-1.15.0-1.el9.x86_64
Guest: rhel9.0.0

1. Boot a guest without a nic device

/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine q35,memory-backend=mem-machine_mem \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
  -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
  -nodefaults \
  -device VGA,bus=pcie.0,addr=0x2 \
  -m 7168 \
  -object memory-backend-ram,size=7168M,id=mem-machine_mem \
  -smp 6,maxcpus=6,cores=3,threads=1,dies=1,sockets=2 \
  -cpu 'EPYC',+kvm_pv_unhalt \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
  -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
  -vnc :0 \
  -rtc base=utc,clock=host,driftfix=slew \
  -boot menu=off,order=cdn,once=c,strict=off \
  -net none \
  -enable-kvm \
  -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=4 \
  -device pcie-root-port,id=pcie_extra_root_port_1,addr=0x3.0x1,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie_extra_root_port_2,addr=0x3.0x2,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie_extra_root_port_3,addr=0x3.0x3,bus=pcie.0,chassis=7 \
  -monitor stdio \
  -qmp tcp:0:5555,server,nowait

2. Hotplug an rtl8139 nic to the guest

{'execute': 'qmp_capabilities'}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idjeru4b'}}
{"return": {}}
{"execute": "device_add", "arguments": {"id": "idrfgIqF", "driver": "rtl8139", "netdev": "idjeru4b", "mac": "9a:45:bb:a8:b1:90", "bus": "pcie-pci-bridge-0","addr":"0x1"}}
{"return": {}}

3. Check the nic inside the guest

# lspci | grep Eth
02:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8100/8101L/8139 PCI Fast Ethernet Adapter (rev 20)

# ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.73.197.44 netmask 255.255.252.0 broadcast 10.73.199.255
        inet6 fe80::9845:bbff:fea8:b190 prefixlen 64 scopeid 0x20<link>
        inet6 2620:52:0:49c4:9845:bbff:fea8:b190 prefixlen 64 scopeid 0x0<global>
        ether 9a:45:bb:a8:b1:90 txqueuelen 1000 (Ethernet)
        RX packets 34617 bytes 1636041 (1.5 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 353 bytes 37013 (36.1 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 839

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 24 bytes 2446 (2.3 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 24 bytes 2446 (2.3 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

# ping www.baidu.com -c 5 -I eth0
PING www.a.shifen.com (182.61.200.6) from 10.73.197.44 eth0: 56(84) bytes of data.
64 bytes from 182.61.200.6 (182.61.200.6): icmp_seq=1 ttl=49 time=27.1 ms
64 bytes from 182.61.200.6 (182.61.200.6): icmp_seq=2 ttl=49 time=23.0 ms
64 bytes from 182.61.200.6 (182.61.200.6): icmp_seq=3 ttl=49 time=30.2 ms
64 bytes from 182.61.200.6 (182.61.200.6): icmp_seq=4 ttl=49 time=29.1 ms
64 bytes from 182.61.200.6 (182.61.200.6): icmp_seq=5 ttl=49 time=30.3 ms

--- www.a.shifen.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4004ms
rtt min/avg/max/mdev = 22.973/27.940/30.307/2.741 ms

4. Check dmesg in the guest

[ 331.671266] pci 0000:02:01.0: [10ec:8139] type 00 class 0x020000
[ 331.671742] pci 0000:02:01.0: reg 0x10: [io 0x0000-0x00ff]
[ 331.672111] pci 0000:02:01.0: reg 0x14: [mem 0x00000000-0x000000ff]
[ 331.672640] pci 0000:02:01.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[ 331.673762] pci 0000:02:01.0: BAR 6: assigned [mem 0xfda00000-0xfda3ffff pref]
[ 331.674196] pci 0000:02:01.0: BAR 0: assigned [io 0x7000-0x70ff]
[ 331.674595] pci 0000:02:01.0: BAR 1: assigned [mem 0xfda40000-0xfda400ff]
[ 331.775001] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[ 331.775525] 8139cp 0000:02:01.0: enabling device (0000 -> 0003)
[ 331.777168] ACPI: \_SB_.GSIG: Enabled at IRQ 22
[ 331.780154] 8139cp 0000:02:01.0 eth0: RTL-8139C+ at 0x0000000068f7b71a, 9a:45:bb:a8:b1:90, IRQ 22
[ 331.785518] 8139too: 8139too Fast Ethernet driver 0.9.28
[ 331.804364] 8139cp 0000:02:01.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1

Test results: ifconfig can get the NIC; hotplug is successful.

Verified this bug with the latest qemu-kvm-6.2.0-1.el9.x86_64 and the fixed version seabios-1.15.0-1.el9.x86_64. The steps and results of the test are the same as in comment 21.

host version:
  kernel-5.14.0-36.el9.x86_64
  qemu-kvm-6.2.0-1.el9.x86_64
  seabios-1.15.0-1.el9.x86_64
guest: rhel9.0.0 (kernel-5.14.0-36.el9.x86_64)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (new packages: seabios), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2391
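
For anyone triaging a similar report, the guest dmesg check used in the comments above can be sketched in Python. This is only a minimal sketch under stated assumptions, not a tool anyone in this thread used: the function name `unresolved_bars` and the regular expression are hypothetical, and the message wording is assumed to match the kernel logs quoted above. Because the working run in step (4.1) also logs "BAR 0: failed to assign" before a later retry succeeds, the sketch reports a BAR only when its last logged event is a failure.

```python
import re

# Assumed message shapes, taken from the dmesg excerpts in this thread:
#   "pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit pref]"
#   "pci 0000:04:00.0: BAR 0: assigned [mem 0x180000000-0x180ffffff 64bit pref]"
BAR_EVENT = re.compile(
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): (?P<what>failed to assign|assigned)\b"
)


def unresolved_bars(dmesg_text):
    """Return sorted (device, bar) pairs whose last logged event is a failure.

    Hypothetical helper: a later "assigned" message for the same device and
    BAR overrides an earlier "failed to assign", as in the working run.
    """
    last_event = {}
    for line in dmesg_text.splitlines():
        match = BAR_EVENT.search(line)
        if match:
            key = (match["dev"], int(match["bar"]))
            last_event[key] = match["what"]
    return sorted(
        key for key, what in last_event.items() if what == "failed to assign"
    )


if __name__ == "__main__":
    import sys

    # Usage (inside the guest): dmesg | python3 this_script.py
    for dev, bar in unresolved_bars(sys.stdin.read()):
        print(f"{dev}: BAR {bar} was never assigned")
```

An empty result does not prove the hotplug worked; it only means no BAR was left unassigned in the captured log, so the driver probe messages still need to be read.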