Bug 1738821
Summary: | Ping from/to guest failed when hot-plugging an e1000e nic | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Lei Yang <leiyang> |
Component: | qemu-kvm | Assignee: | Yvugenfi <yvugenfi> |
qemu-kvm sub component: | Devices | QA Contact: | Lei Yang <leiyang> |
Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | chayang, ddepaula, jasowang, jinzhao, juzhang, pezhang, virt-maint, ybendito, yvugenfi |
Version: | 8.1 | Keywords: | Triaged |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08 | Doc Type: | If docs needed, set a value |
Last Closed: | 2022-01-12 22:37:40 UTC | Type: | Bug |
Bug Blocks: | 1744438 |
Description
Lei Yang
2019-08-08 08:39:36 UTC
No problem occurs with the same test steps on a virtio-net-pci NIC: ping from/to the guest succeeds.
The same test steps also work well on a win2019 guest: ping from/to the guest succeeds.

Hi Yan,

I can reproduce this bug on a rhel8.2 host + win2019 guest (q35 + seabios + virtio-net-pci).

Host version:
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb.x86_64
kernel-4.18.0-158.el8.x86_64
virtio-win-prewhql-0.1-172.iso

Reproduce steps:
1. Boot a guest.
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-machine q35 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x1 \
-m 14336 \
-smp 16,maxcpus=16,cores=8,threads=1,sockets=2 \
-cpu 'EPYC',hv_stimer,hv_synic,hv_vpindex,hv_reset,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv-tlbflush,+kvm_pv_unhalt \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0 \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
-device scsi-cd,id=cd1,drive=drive_cd1 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-qmp tcp:0:5555,server,nowait \
-device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-device pcie-root-port,id=pcie_extra_root_port_1,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
-device pcie-root-port,id=pcie_extra_root_port_4,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
-device pcie-root-port,id=pcie_extra_root_port_5,slot=8,chassis=8,addr=0x8,bus=pcie.0 \
-monitor stdio

2. Hotplug a virtio-net NIC.
# telnet 10.73.196.43 5555
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idJxmmIZ','vhost':'on'}}
{"return": {}}
{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'idJxmmIZ', 'mac': '9a:90:e8:73:1c:72', 'id': 'idguH3SC', 'bus': 'pcie_extra_root_port_0'}}
{"return": {}}
{"timestamp": {"seconds": 1575271737, "microseconds": 753018}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "idguH3SC", "path": "/machine/peripheral/idguH3SC/virtio-backend"}}

3. Hot unplug the NIC (only one "DEVICE_DELETED" event is returned).
{'execute': 'device_del', 'arguments': {'id':'idguH3SC'}}
{"return": {}}
{"timestamp": {"seconds": 1575271884, "microseconds": 833278}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/idguH3SC/virtio-backend"}}
{'execute': 'netdev_del', 'arguments': {'id':'idJxmmIZ'}}
{"return": {}}

4. Hotplugging the NIC again fails.
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idJxmmIZ','vhost':'on'}}
{"return": {}}
{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'idJxmmIZ', 'mac': '9a:90:e8:73:1c:72', 'id': 'idguH3SC', 'bus': 'pcie_extra_root_port_0'}}
{"error": {"class": "GenericError", "desc": "Duplicate ID 'idguH3SC' for device"}}

I'm not sure whether this is a new issue. If it is, please let me know and I will file a new BZ.

Thanks & Regards
LeiYang

Let's keep it as a single issue for now.

The problem described in comment #4 for virtio-net-pci is not related to this BZ.
See https://bugzilla.redhat.com/show_bug.cgi?id=1708480 for virtio-net-pci.

Please provide more information about the problem with e1000e:
1. What delay between device_del and the following device_add reproduces the problem? Does it happen if the delay is significant (15 seconds)?
2. When the problem happened, what is the status of the NIC in the guest? Does it have an IP address?
3. After the problem happens, does device_del actually remove the device? (info network)

(In reply to ybendito from comment #7)
> Please provide more information about the problem with e1000e:
> 1. What delay between device_del and the following device_add reproduces
> the problem? Does it happen if the delay is significant (15 seconds)?

Hi,
With a delay of more than 15 seconds between device_del and device_add, the issue persists.

> 2. When the problem happened, what is the status of the NIC in the guest?
> Does it have an IP address?

When the problem happened, the NIC was in state UP in the guest, but it could not obtain an IP address after running dhclient.
==>inside guest
# ip -d link show eth0
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:1a:4a:42:0b:01 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 9212 addrgenmode none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

> 3. After the problem happens, does device_del actually remove the device?
> (info network)

1. After device_del of the NIC:
{"execute":"qmp_capabilities"}
{"return": {}}
{ "execute": "netdev_add", "arguments": {"type":"tap","id":"hostnet0","script":"/etc/qemu-ifup"}}
{"return": {}}
{"execute": "device_add", "arguments": { "driver":"e1000e","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"root.3"}}
{"return": {}}
{"execute":"device_del","arguments":{"id":"net0"}}
{"return": {}}
{"timestamp": {"seconds": 1577345622, "microseconds": 418992}, "event": "DEVICE_DELETED", "data": {"device": "net0", "path": "/machine/peripheral/net0"}}
==>inside guest
#lspci |grep Eth
(no return)
==>host hmp
(qemu) info network
hostnet0: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown

2. After hotplugging this NIC again:
{"execute": "device_add", "arguments": { "driver":"e1000e","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"root.3"}}
{"return": {}}
==>inside guest
#lspci |grep Eth
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
==>host hmp
(qemu) info network
net0: index=0,type=nic,model=e1000e,macaddr=00:1a:4a:42:0b:01
 \ hostnet0: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown

QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Tested the e1000e device on qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.x86_64 and hit the same issue.
==>Test steps
1. Boot a RHEL 8.3 guest.
qemu cli:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 7168 \
-smp 6,maxcpus=6,cores=3,threads=1,dies=1,sockets=2 \
-cpu 'Haswell-noTSX',+kvm_pv_unhalt \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kvm_autotest_root/images/rhel830-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4

2. Hot plug the NIC through QMP (no "NIC_RX_FILTER_CHANGED" event appears in QMP after plugging the e1000e NIC).
# telnet 10.73.224.38 5555
Trying 10.73.224.38...
Connected to 10.73.224.38.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 5}, "package": "qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idW9F3zA'}}
{"return": {}}
{'execute': 'device_add', 'arguments': {'driver': 'e1000e', 'netdev': 'idW9F3zA', 'mac': '9a:af:77:01:18:a2', 'id': 'idjmgn2T', 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}}
{"return": {}}

3. Hot unplug the e1000e device; only one "DEVICE_DELETED" event is returned.
{'execute': 'device_del', 'arguments': {'id':'idjmgn2T'}}
{"return": {}}
{"timestamp": {"seconds": 1590654515, "microseconds": 560138}, "event": "DEVICE_DELETED", "data": {"device": "idjmgn2T", "path": "/machine/peripheral/idjmgn2T"}}

4. Hot plug the device again; it cannot obtain an IP address.
{'execute': 'device_add', 'arguments': {'driver': 'e1000e', 'netdev': 'idW9F3zA', 'mac': '9a:af:77:01:18:a2', 'id': 'idjmgn2T', 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}}
{"return": {}}

Posted to qemu-devel
https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg04494.html

==Verified with qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a.x86_64
1. Boot the guest.
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device VGA,bus=pcie.0,addr=0x2 \
-m 6144 \
-smp 6,maxcpus=6,cores=3,threads=1,dies=1,sockets=2 \
-cpu 'Haswell-noTSX',+kvm_pv_unhalt \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio \
-qmp tcp:0:5555,server,nowait

2. Hot plug the NIC through QMP.
# telnet 10.73.224.38 5555
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 5}, "package": "qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a"}, "capabilities": ["oob"]}}
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idW9F3zA'}}
{"return": {}}
{'execute': 'device_add', 'arguments': {'driver': 'e1000e', 'netdev': 'idW9F3zA', 'mac': '9a:af:77:01:18:a2', 'id': 'idjmgn2T', 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}}
{"return": {}}

3. Hot unplug the e1000e device.
{'execute': 'device_del', 'arguments': {'id':'idjmgn2T'}}
{"return": {}}
{"timestamp": {"seconds": 1608878587, "microseconds": 832085}, "event": "DEVICE_DELETED", "data": {"device": "idjmgn2T", "path": "/machine/peripheral/idjmgn2T"}}

4. Hot plug the device again; the guest gets an IP address and can ping an external host.
{'execute': 'device_add', 'arguments': {'driver': 'e1000e', 'netdev': 'idW9F3zA', 'mac': '9a:af:77:01:18:a2', 'id': 'idjmgn2T', 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}}
{"return": {}}

5. Ping an external host from the guest.
# ping 10.73.224.38 -c 4
PING 10.73.224.38 (10.73.224.38) 56(84) bytes of data.
64 bytes from 10.73.224.38: icmp_seq=1 ttl=64 time=0.499 ms
64 bytes from 10.73.224.38: icmp_seq=2 ttl=64 time=0.229 ms
64 bytes from 10.73.224.38: icmp_seq=3 ttl=64 time=0.168 ms
64 bytes from 10.73.224.38: icmp_seq=4 ttl=64 time=0.229 ms
--- 10.73.224.38 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3062ms
rtt min/avg/max/mdev = 0.168/0.281/0.499/0.128 ms

So this bug has been fixed. Moving to 'VERIFIED'.

Hi Danilo,
Because the current deadline for the bug has passed, I reset the ITM. Please help to add this bug to the errata.

Best regards
Lei Yang

Adding TestOnly as this didn't require any code change downstream.
Moving to ON_QA so QE can verify it.
Granting devel+
Needs QA_ACK

(In reply to Danilo Cesar Lemes de Paula from comment #21)
> Adding TestOnly as this didn't require any code change downstream.
> Moving to ON_QA so QE can verify it.
> Granting devel+
> Needs QA_ACK

Hi Danilo,
I'd like to remove "TestOnly" because there actually is a code delivery in qemu (see comments 12, 13, 14), and QE confirmed the fix is in qemu-5.2 after the rebase.
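
For anyone who wants to script the plug / unplug / re-plug steps above instead of typing them into telnet, here is a minimal Python sketch. It is not part of the original report or of any test suite: the helper names (qmp_cmd, is_device_deleted, replug_commands, run_cycle) are made up for illustration, and it assumes QEMU was started with -qmp tcp:0:5555,server,nowait as in the command lines above. The sketch waits for the DEVICE_DELETED event whose "device" field names the NIC before sending the second device_add. That only takes timing out of the picture; it is not a workaround for the e1000e problem, which per the comments above persisted even with a delay above 15 seconds.

```python
import json
import socket


def qmp_cmd(name, **arguments):
    """Serialize one QMP command as a single JSON line."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg) + "\n"


def is_device_deleted(msg, device_id):
    """True only for the DEVICE_DELETED event that names the device itself.

    Events for child objects (for example .../virtio-backend) carry a
    "path" but no matching "device" field, so they are ignored.
    """
    return (msg.get("event") == "DEVICE_DELETED"
            and msg.get("data", {}).get("device") == device_id)


def replug_commands(netdev_id, device_id, mac, bus):
    """Command list for the plug / unplug / re-plug cycle in this report."""
    add = qmp_cmd("device_add", driver="e1000e", netdev=netdev_id,
                  mac=mac, id=device_id, bus=bus, addr="0x0")
    return [
        qmp_cmd("qmp_capabilities"),
        qmp_cmd("netdev_add", type="tap", id=netdev_id),
        add,
        qmp_cmd("device_del", id=device_id),
        add,  # sent only after DEVICE_DELETED for device_id was seen
    ]


def read_reply(stream, events):
    """Read lines until a command reply arrives; collect events on the way."""
    while True:
        msg = json.loads(stream.readline())
        if "event" in msg:
            events.append(msg)
        else:
            return msg  # either {"return": ...} or {"error": ...}


def wait_deleted(stream, events, device_id):
    """Block until the DEVICE_DELETED event for device_id has been seen."""
    while not any(is_device_deleted(e, device_id) for e in events):
        events.append(json.loads(stream.readline()))


def run_cycle(host, port, netdev_id, device_id, mac, bus):
    """Drive the cycle over a QMP TCP socket (needs a running QEMU)."""
    cmds = replug_commands(netdev_id, device_id, mac, bus)
    events = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        json.loads(stream.readline())  # consume the QMP greeting banner
        for index, cmd in enumerate(cmds):
            if index == len(cmds) - 1:
                wait_deleted(stream, events, device_id)
            stream.write(cmd)
            stream.flush()
            reply = read_reply(stream, events)
            if "error" in reply:
                raise RuntimeError(reply["error"]["desc"])
    return events
```

With the values from the verification steps, the call would be run_cycle("10.73.224.38", 5555, "idW9F3zA", "idjmgn2T", "9a:af:77:01:18:a2", "pcie_extra_root_port_0"). Whether the guest then obtains an IP address still has to be checked inside the guest, for example with dhclient and ping as shown above.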