Bug 2002686
| Summary: | hot unplug nic can not unplug the nic device successfully | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | leidwang <leidwang> |
| Component: | qemu-kvm | Assignee: | Virtualization Maintenance <virt-maint> |
| qemu-kvm sub component: | Networking | QA Contact: | Lei Yang <leiyang> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | jinzhao, juzhang, leiyang, lijin, qizhu, virt-maint, yanghliu, ybendito, yfu, yvugenfi |
| Version: | 9.0 | Keywords: | Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-03-02 05:38:11 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Hit the same issue on RHEL 8.8 with network function testing.

Test version:
- kernel-4.18.0-430.el8.x86_64
- qemu-kvm-6.2.0-22.module+el8.8.0+16816+1d3555ec.x86_64
- virtio-win-prewhql-0.1-227.iso

Hit the same issue.

Test version:
- kernel-5.14.0-179.el9.x86_64
- qemu-kvm-7.1.0-3.el9.x86_64
- libvirt-8.8.0-1.el9.x86_64
- swtpm-0.7.0-3.20211109gitb79fd91.el9.x86_64
- edk2-ovmf-20220826gitba0e0e4c6a-1.el9.noarch

Comment #16 (ybendito):

Note that the problem happens only under avocado in the hot-plug flow and is not reproducible with manual execution. This is due to an inconsistency in how avocado creates the TAP: when the network adapter is attached from the beginning, avocado creates it with the vnet_hdr option, but when the network adapter is hot-plugged, avocado creates it without the vnet_hdr option. See:

- https://github.com/avocado-framework/avocado-vt/blob/master/virttest/qemu_vm.py#L4232
- https://github.com/avocado-framework/avocado-vt/blob/master/virttest/qemu_vm.py#L2845

For comparison, libvirt always creates the TAP with vnet_hdr for virtio-net:

- https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_interface.c#L435
- https://github.com/libvirt/libvirt/blob/master/src/qemu/qemu_interface.c#L240

QEMU, when run from the command line, also creates the TAP with vnet_hdr by default. So, if I'm not mistaken, the mainstream flow always uses a TAP with vnet_hdr, although QEMU has an option to create it without vnet_hdr.

However, even if the TAP was created without vnet_hdr, networking should still work (only some optional virtio-net features are disabled), and there is no explanation yet for why the tests sometimes pass and sometimes fail. IMO this lowers the priority of this bug, but it does not justify the erroneous behavior of the network stack.

My suggestions (to be discussed):
1. Add a printout when the TAP is created so the logs show whether vnet_hdr was specified at creation time, i.e. here: https://github.com/avocado-framework/avocado-vt/blob/master/virttest/utils_net.py#L1325
2. Make the default mode of vnet_hdr True (at least for virtio-net) in all the tests.
3. Add a parameter that allows redefining vnet_hdr as False for testing purposes.
4. Create an additional test that verifies the network functions with vnet_hdr=no, both with and without vhost; such a test should reproduce the problem (vhost + no vnet_hdr) on a Windows VM and probably on a Linux VM as well.
5. Open a separate BZ according to the results of that test.

(In reply to ybendito from comment #16)

Thanks a lot, Yuri. I will check your suggestions one by one and update the QE automation code or test plan.

Thanks,
Leidong

Just for the record: in my smoke test with a Fedora 36 VM, the guest virtio-net with vhost=on,vnet_hdr=off does not acquire an IP address.

Discussed this BZ with Lei; this issue can be reproduced on a Linux guest, so changing the component to qemu-kvm/Networking. Thanks!

From QE's perspective, the current bug and Bug 2084003 are the same issue as Bug 1958175, so I closed it as DUPLICATE. Please correct me if I'm wrong.

Thanks,
Lei

*** This bug has been marked as a duplicate of bug 1958175 ***
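The vnet_hdr inconsistency discussed above comes down to a single flag passed when the TAP device is created. The following is an illustrative sketch (not avocado's or libvirt's actual code) of how that choice is encoded via the `TUNSETIFF` ioctl; the constants are from `<linux/if_tun.h>`, and the helper names are hypothetical:

```python
# Sketch: how the vnet_hdr choice is encoded when a TAP is created.
# Constants from <linux/if_tun.h>; helper names are illustrative only.
import fcntl
import os
import struct

TUNSETIFF    = 0x400454CA  # ioctl request to configure the TUN/TAP device
IFF_TAP      = 0x0002      # TAP (layer 2) rather than TUN (layer 3)
IFF_NO_PI    = 0x1000      # no extra packet-information header
IFF_VNET_HDR = 0x4000      # prepend a virtio-net header to each frame

def tap_flags(vnet_hdr: bool) -> int:
    """Build ifreq flags for a TAP; vnet_hdr mirrors the option that
    differs between avocado's boot-time and hot-plug TAP creation."""
    flags = IFF_TAP | IFF_NO_PI
    if vnet_hdr:
        flags |= IFF_VNET_HDR
    return flags

def open_tap(name: str, vnet_hdr: bool) -> int:
    """Actually create the TAP (requires root and /dev/net/tun)."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    ifr = struct.pack("16sH", name.encode(), tap_flags(vnet_hdr))
    fcntl.ioctl(fd, TUNSETIFF, ifr)
    return fd

print(hex(tap_flags(True)))   # → 0x5002 (vnet_hdr set)
print(hex(tap_flags(False)))  # → 0x1002 (vnet_hdr absent)
```

Suggestion 1 above amounts to logging which of these two flag values was used at creation time, so the difference between the boot-time and hot-plug paths becomes visible in the test logs.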
Description of problem:
hot unplug nic can not unplug the nic device successfully (Windows); this issue can only be reproduced when running a loop by automation.

Version-Release number of selected component (if applicable):
- kvm_version: 5.14.0-1.el9.x86_64
- qemu_version: qemu-kvm-core-6.1.0-1.el9.x86_64
- virtio-win-prewhql-0.1-207.iso

How reproducible:
1/5

Steps to Reproduce:

1. Boot up a guest:

```
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine q35,memory-backend=mem-machine_mem \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
  -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
  -nodefaults \
  -device VGA,bus=pcie.0,addr=0x2 \
  -device i6300esb,bus=pcie-pci-bridge-0,addr=0x1 \
  -watchdog-action reset \
  -m 30720 \
  -object memory-backend-ram,size=30720M,id=mem-machine_mem \
  -smp 20,maxcpus=20,cores=10,threads=1,dies=1,sockets=2 \
  -cpu 'Cascadelake-Server-noTSX',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
  -device intel-hda,bus=pcie-pci-bridge-0,addr=0x2 \
  -device hda-duplex \
  -chardev socket,server=on,wait=off,path=/tmp/avocado_r9m7yx5x/monitor-qmpmonitor1-20210903-185410-Y0bYPe9H,id=qmp_id_qmpmonitor1 \
  -mon chardev=qmp_id_qmpmonitor1,mode=control \
  -chardev socket,server=on,wait=off,path=/tmp/avocado_r9m7yx5x/monitor-catch_monitor-20210903-185410-Y0bYPe9H,id=qmp_id_catch_monitor \
  -mon chardev=qmp_id_catch_monitor,mode=control \
  -device pvpanic,ioport=0x505,id=idpxqPBw \
  -chardev socket,server=on,wait=off,path=/tmp/avocado_r9m7yx5x/serial-serial0-20210903-185410-Y0bYPe9H,id=chardev_serial0 \
  -device isa-serial,id=serial0,chardev=chardev_serial0 \
  -object rng-random,filename=/dev/random,id=passthrough-EK1tyvLX \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
  -device virtio-rng-pci,id=virtio-rng-pci-AH3whJxz,rng=passthrough-EK1tyvLX,bus=pcie-root-port-1,addr=0x0 \
  -chardev socket,id=seabioslog_id_20210903-185410-Y0bYPe9H,path=/tmp/avocado_r9m7yx5x/seabios-20210903-185410-Y0bYPe9H,server=on,wait=off \
  -device isa-debugcon,chardev=seabioslog_id_20210903-185410-Y0bYPe9H,iobase=0x402 \
  -device ich9-usb-ehci1,id=usb1,addr=0x1d.0x7,multifunction=on,bus=pcie.0 \
  -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0x0,firstport=0,bus=pcie.0 \
  -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.0x2,firstport=2,bus=pcie.0 \
  -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.0x4,firstport=4,bus=pcie.0 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
  -device qemu-xhci,id=usb2,bus=pcie-root-port-2,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb2.0,port=1 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
  -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/win2022-64-virtio-scsi_avocado-vt-vm1.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
  -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
  -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
  -vnc :0 \
  -rtc base=localtime,clock=host,driftfix=slew \
  -boot menu=off,order=cdn,once=c,strict=off \
  -net none \
  -no-hpet \
  -enable-kvm \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
  -device virtio-balloon-pci,id=balloon0,bus=pcie-root-port-4,addr=0x0 \
  -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=6 \
  -device pcie-root-port,id=pcie_extra_root_port_1,addr=0x3.0x1,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie_extra_root_port_2,addr=0x3.0x2,bus=pcie.0,chassis=8 \
  -device pcie-root-port,id=pcie_extra_root_port_3,addr=0x3.0x3,bus=pcie.0,chassis=9
```

2. Hot-plug a NIC to the guest:

```
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idZfngmG', 'fd': '91', 'vhost': True}, 'id': 'dGUZvJ53'}
{'execute': 'device_add', 'arguments': OrderedDict([('id', 'idLYu96K'), ('driver', 'virtio-net-pci'), ('netdev', 'idZfngmG'), ('mac', '9a:be:8b:7a:3b:03'), ('bus', 'pcie_extra_root_port_0'), ('addr', '0x0')]), 'id': 'zb64omOe'}
```

3. Get the IP address of the new NIC and ping the guest's new IP from the host.
4. Pause and resume the VM.
5. Ping the guest's new IP from the host.
6. Unplug the NIC from the guest:

```
{'execute': 'device_del', 'arguments': {'id': 'idLYu96K'}, 'id': 'dZGdQSCA'}
```

7. Check whether the NIC is unplugged successfully.

Actual results:
Device idLYu96K is not unplugged by the guest.

Expected results:
Device idLYu96K is unplugged by the guest.

Additional info:
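The check in step 7 can be implemented by waiting for QEMU's `DEVICE_DELETED` QMP event, which QEMU emits only after the guest has actually released the device (this is the event that never arrives in the failing runs). A minimal sketch, with the event parsing separated from socket I/O so it can be exercised on canned QMP traffic; the event payloads below are examples, not logs from this bug:

```python
# Sketch: detecting unplug completion by scanning QMP event messages
# for DEVICE_DELETED with the expected device id.
import json

def unplug_completed(event_lines, device_id):
    """Return True if a DEVICE_DELETED event for device_id appears
    among the QMP messages (one JSON document per line)."""
    for line in event_lines:
        msg = json.loads(line)
        if (msg.get("event") == "DEVICE_DELETED"
                and msg.get("data", {}).get("device") == device_id):
            return True
    return False

# Example with canned events for the NIC id used in this report:
events = [
    '{"event": "NIC_RX_FILTER_CHANGED", "data": {"name": "idLYu96K"}}',
    '{"event": "DEVICE_DELETED", "data": {"device": "idLYu96K", '
    '"path": "/machine/peripheral/idLYu96K"}}',
]
print(unplug_completed(events, "idLYu96K"))      # → True
print(unplug_completed(events[:1], "idLYu96K"))  # → False
```

In the failing case described above, a loop like this (with a timeout around the event read) would report False, matching "Device idLYu96K is not unplugged by guest".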