Created attachment 1682834
log

Description of problem:
Cannot get a valid IP address after repeated hotplug/unplug of a virtio-net device.

Version-Release number of selected component (if applicable):
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
kernel-4.18.0-193.el8.x86_64
virtio-win-prewhql-181
seabios-1.13.0-1.module+el8.2.0+5520+4e5817f3.x86_64
guest: ws2019

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest.
2. Hotplug a virtio-net device:
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idBhIYQz', 'fds': '55:53:52:54'}, 'id': 'feP61U4g'}
{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'idBhIYQz', 'mac': '9a:8d:ad:f2:92:6b', 'id': 'idR7WrN8', 'mq': 'on', 'vectors': 10, 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}, 'id': 'NyasKAz8'}
3. Check whether the new interface gets an IP address.
4. Ping the guest's new IP from the host.
5. Unplug the virtio-net device. Send command:
{'execute': 'device_del', 'arguments': {'id': 'idR7WrN8'}, 'id': '73wMToOK'}
6. Repeat steps 2-5 about 30 times.
7. Hotplug a new virtio-net device:
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'id9zVCba', 'fds': '62:61:64:63'}, 'id': 'HxdwgeYH'}
{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'id9zVCba', 'mac': '9a:8d:ad:f2:92:6c', 'id': 'idU220t9', 'mq': 'on', 'vectors': 10, 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}, 'id': 'pvSsPQtf'}
8. Check whether the new interface gets an IP address.

Actual results:
The new interface cannot get a valid IP address.

Expected results:
The new interface gets an IP address and ping succeeds.

Additional info:
1. Reproducible with both Windows Server 2019 and RHEL 8.2.0 guests.
2. Not reproducible with qemu-kvm-4.2.0-11.module+el8.2.0+5837+4c1442ec.x86_64 or qemu-kvm-4.2.0-17.module+el8.2.0+6141+0f540f16.x86_64, so it is a regression.
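The reproduction steps above can be sketched as a small generator of QMP commands. This is only an illustration under stated assumptions: the ids `netdev<N>` and `nic<N>` and the helper names are hypothetical (the report uses random ids), the `fds` value is a placeholder because a real harness has to open the tap file descriptors and hand them to QEMU itself, and the IP check and ping from steps 3 and 4 are left out.

```python
import json


def hotplug_commands(netdev_id, device_id, mac, fds,
                     bus="pcie_extra_root_port_0"):
    """Build the two QMP commands that hot-plug one virtio-net device."""
    return [
        {"execute": "netdev_add",
         "arguments": {"type": "tap", "id": netdev_id, "fds": fds}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-net-pci", "netdev": netdev_id,
                       "mac": mac, "id": device_id, "mq": "on",
                       "vectors": 10, "bus": bus, "addr": "0x0"}},
    ]


def unplug_command(device_id):
    """Build the QMP command that hot-unplugs the device."""
    return {"execute": "device_del", "arguments": {"id": device_id}}


def reproduction_sequence(cycles, fds, mac="9a:8d:ad:f2:92:6b"):
    """Return `cycles` plug/unplug rounds followed by one final hotplug."""
    commands = []
    for i in range(cycles):
        # Hypothetical ids; the report uses random ones such as 'idBhIYQz'.
        commands += hotplug_commands(f"netdev{i}", f"nic{i}", mac, fds)
        commands.append(unplug_command(f"nic{i}"))
    # Step 7: the final hotplug whose interface fails to get an address.
    commands += hotplug_commands(f"netdev{cycles}", f"nic{cycles}", mac, fds)
    return commands


if __name__ == "__main__":
    # Placeholder fds string copied from the report; not usable as-is.
    for command in reproduction_sequence(cycles=30, fds="55:53:52:54"):
        print(json.dumps(command))
```

Each printed line is one JSON command that a test harness could send over the QMP socket; checking the guest for an IP address between rounds is left to the harness.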
Can you please post host network configuration and configuration scripts? Thanks.
(In reply to Yan Vugenfirer from comment #3)
> Can you please post host network configuration and configuration scripts?
>
> Thanks.

switch: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.72.188  netmask 255.255.252.0  broadcast 10.73.75.255
        inet6 fe80::c3a6:1270:5932:4fe6  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:4948:e5c7:bf8b:8a1d:5be0  prefixlen 64  scopeid 0x0<global>
        ether 06:02:4c:a2:8d:08  txqueuelen 1000  (Ethernet)
        RX packets 17994668  bytes 26137817803 (24.3 GiB)
        RX errors 0  dropped 220  overruns 0  frame 0
        TX packets 6412642  bytes 2118625406 (1.9 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@dell-per440-04 network-scripts]# cat ifcfg-eno1
# Generated by parse-kickstart
TYPE=Ethernet
DEVICE=eno1
UUID=a3fac8a2-0945-4476-b028-733526ba8dcf
ONBOOT=yes
NAME="System eno1"
BRIDGE=switch

[root@dell-per440-04 network-scripts]# cat ifcfg-switch
STP=no
TYPE=Bridge
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=switch
UUID=927cf127-8b55-4b16-a5f7-f10599cf5093
DEVICE=switch
ONBOOT=yes

[root@dell-per440-04 network-scripts]# cat /etc/qemu-ifup
#!/bin/sh
switch=switch
/usr/sbin/ip link set $1 up
/usr/sbin/ip link set dev $1 master ${switch}
/usr/sbin/ip link set ${switch} type bridge forward_delay 0
/usr/sbin/ip link set ${switch} type bridge stp_state 0
Hi,

I hit a similar issue. After disabling and re-enabling the virtio-net device in the guest several times, the guest cannot get a valid IP address either. Could you check whether it is the same issue as this one?

==============================================================================================
03:39:36 DEBUG| Sending command: netsh interface set interface name="Ethernet" admin=DISABLED
03:39:36 DEBUG| Sending command: echo %errorlevel%
03:39:47 DEBUG| Sending command: netsh interface set interface name="Ethernet" admin=ENABLED
03:40:20 DEBUG| Sending command: echo %errorlevel%
03:42:30 INFO | Context: Start ping test
03:42:30 INFO | The command of Ping is: ping 10.73.73.197 -c 10
03:42:30 DEBUG| PING 10.73.73.197 (10.73.73.197) 56(84) bytes of data.
03:42:43 DEBUG| From 10.73.72.188 icmp_seq=10 Destination Host Unreachable
03:42:45 INFO | Running 'kill -19 290870'
03:42:46 INFO | Command 'kill -19 290870' finished with 0 after 0.00079345703125s
03:42:46 INFO | Running 'kill -19 290872'
03:42:46 INFO | Command 'kill -19 290872' finished with 0 after 0.0006191730499267578s
03:42:46 INFO | Running 'kill -2 290872'
03:42:46 INFO | Command 'kill -2 290872' finished with 0 after 0.0007576942443847656s
03:42:46 INFO | Running 'kill -2 290870'
03:42:46 INFO | Command 'kill -2 290870' finished with 0 after 0.0005733966827392578s
03:42:46 INFO | Running 'kill -18 290870'
03:42:46 INFO | Command 'kill -18 290870' finished with 0 after 0.0006244182586669922s
03:42:46 INFO | Running 'kill -18 290872'
03:42:46 DEBUG|
03:42:46 INFO | Command 'kill -18 290872' finished with 0 after 0.0009965896606445312s
================================================================================================

For the detailed log, please refer to the attachment: disable-enable

4.18.0-193.3.1.el8_2.x86_64
qemu-kvm-4.2.0-21.module+el8.2.1+6586+8b7713b9.x86_64
seabios-bin-1.13.0-1.module+el8.2.0+5520+4e5817f3.noarch
virtio-win-prewhql-178/184/185
Guest: win8-32

Boot cmd:
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-net-pci,mac=9a:18:cb:d5:92:4e,id=idliQoji,mq=on,vectors=14,netdev=idbR6Qlz,bus=pcie-root-port-3,addr=0x0 \
-netdev tap,id=idbR6Qlz,vhost=on,vhostfds=27:28:29:30:31:32,fds=14:21:22:24:25:26 \

Thanks
Yu Wang
There are several hotplug/unplug symptoms that we see:
1. No ping to the guest after a sequence of hotplug/unplug
2. QEMU crash
3. Guest crash in pci.sys (PCI bus driver) on Windows Server 2019

I think we can treat them, at least for now, as the same general hotplug/unplug issue.
Ran ws2019 on a RHEL 8.2.0 AV host and hit a similar issue (build 185): after rebooting the guest several times (about 50-100 times), it cannot get an IP address.

Thanks
Yu Wang
Ran win2012r2 on a RHEL8.2.03.0 AV host and hit a similar issue (build 189).

{'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':true,'script':'/etc/qemu-ifup','queues':32}}
{"return": {}}
{'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0','mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':'pcie_extra_root_port_0'}}
{"return": {}}

----Checked the guest: the virtio-net device is shown in Device Manager, but it can't get an IP address.
----After about 30s, the following message is returned, and then the guest gets an IP address.
{"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path": "/machine/peripheral/net0/virtio-backend"}}
(In reply to xiagao from comment #10)
> Run win2012r2 on RHEL8.2.03.0 AV host, hit the similar issue(build 189)
>
>
> {'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':
> true,'script':'/etc/qemu-ifup','queues':32}}
> {"return": {}}
> {'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0',
> 'mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':
> 'pcie_extra_root_port_0'}}
> {"return": {}}
>
> ----checked the guest, virtio-net device is shown in device manager, but
> can't get ip address.
> ----after about 30s, return the following message, then guest get ip address.
> {"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event":
> "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path":
> "/machine/peripheral/net0/virtio-backend"}}

pkg:
kernel-4.18.0-232.el8.x86_64
qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901.x86_64
virtio-win-prewhql-189
edk2-ovmf-20200602gitca407c7246bf-2.el8.noarch
We sometimes hit a similar issue when booting a Windows guest with q35. With vhost=off, the issue reproduces more often:
Tried 20 times with vhost=on by automation, hit once.
Tried 20 times with vhost=off by automation, hit five times.

22:42:45 DEBUG| Sending command: wmic nicconfig where IPEnabled=True get ipaddress, macaddress
22:42:46 WARNI| No VM's NIC got IP address
22:43:59 DEBUG| Sending command: ipconfig || ifconfig
22:43:59 DEBUG| Sending command: ip route || route print
22:43:59 ERROR| Guest network status:

Windows IP Configuration

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : lab.eng.pek2.redhat.com
   Link-local IPv6 Address . . . . . : fe80::4920:c31:aaf7:73d7%6
   Default Gateway . . . . . . . . . :

C:\Windows\system32>

Guest route table:
'ip' is not recognized as an internal or external command,
operable program or batch file.
===========================================================================
Interface List
  6...9a a6 cb 0e eb f9 ......Red Hat VirtIO Ethernet Adapter
  1...........................Software Loopback Interface 1
===========================================================================

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    331
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    331
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    331
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    331
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    331
===========================================================================
Persistent Routes:
  None

IPv6 Route Table
===========================================================================
Active Routes:
 If Metric Network Destination      Gateway
  1    331 ::1/128                  On-link
  6    271 fe80::/64                On-link
  6    271 fe80::4920:c31:aaf7:73d7/128
                                    On-link
  1    331 ff00::/8                 On-link
  6    271 ff00::/8                 On-link
===========================================================================
Persistent Routes:
  None

C:\Windows\system32>
(In reply to xiagao from comment #10)
> Run win2012r2 on RHEL8.2.03.0 AV host, hit the similar issue(build 189)
>
>
> {'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':
> true,'script':'/etc/qemu-ifup','queues':32}}
> {"return": {}}
> {'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0',
> 'mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':
> 'pcie_extra_root_port_0'}}
> {"return": {}}
>
> ----checked the guest, virtio-net device is shown in device manager, but
> can't get ip address.
> ----after about 30s, return the following message, then guest get ip address.
> {"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event":
> "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path":
> "/machine/peripheral/net0/virtio-backend"}}

Hi,

Let's bring some order to this BZ. I think the above is not related to the original bug.
First of all, the number of vectors should be at least (number of queues * 2) + 1 (the +1 is for the control queue).
If there is a problem with this scenario, I don't think it is related to hotplug, and I suggest opening a separate bug.

Please test the original steps with the latest QEMU and check whether the issue is reproduced.

Thanks!
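The vector rule stated in the comment above can be checked with a few lines of Python. This is a minimal sketch that encodes only the rule as written there, (number of queues * 2) + 1; the function names are hypothetical, and treating each tap fd in the original report as one queue is an assumption.

```python
def min_vectors(queues):
    """Minimum vector count per the rule in the comment: queues * 2 + 1."""
    return queues * 2 + 1


def vectors_sufficient(queues, vectors):
    """Return True when the configured vector count satisfies that rule."""
    return vectors >= min_vectors(queues)


if __name__ == "__main__":
    # Comment #10: queues=32 with vectors=64.
    print(min_vectors(32))             # 65
    print(vectors_sufficient(32, 64))  # False
    # Original report: four tap fds (assumed to mean four queues), vectors=10.
    print(vectors_sufficient(4, 10))   # True
```

By this rule the configuration from comment #10 is one vector short, while the configuration in the original report is not, which fits the suggestion to track the two scenarios separately.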
Posted to upstream https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg04494.html
Hi, Yan

I used qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a.x86_64 and kernel-4.18.0-262.el8.x86_64 to verify the problem.

1. A RHEL guest now works normally, but a Windows guest still cannot get a valid IP address after hotplug/unplug of an rtl8139 NIC. How should this problem be handled?
2. virtio-net and e1000e work normally on the Windows guest.

Best Regards
Lei Yang
I've checked it with upstream and 8.4.0 - the device does not disappear after hot unplug and does not work after hotplug.
If I use 'scan for hardware changes' - the device disappears and after hotplug works as expected.

Let's check the problem step by step:
1. Start a q35 VM with rtl8139 - the device works properly.
2. Hot-unplug the device. Does it disappear from Device Manager?
3. If it does not disappear within a reasonable time, use 'scan for hardware changes'.
4. After the device has disappeared, hot-plug it.
5. Does the device work properly now?
(In reply to ybendito from comment #22)
> I've checked it with upstream and 8.4.0 - the device does not disappear
> after hot unplug and does not work after hotplug
> If I use 'scan for hardware changes' - the device disappears and after hot
> plug works as expected
>
> Let's check the problem step-by-step:

Hi, Yan

> 1. start q35 vm with rtl8139 - the device works properly

Yes, a Windows guest booted with an rtl8139 NIC works normally.

> 2. hot unplug the device. Does it disappear from the device manager?

No, the device does not disappear from Device Manager.

> 3. If during reasonable time it does not disappear: use 'scan for hardware changes'

The device disappears from Device Manager after using 'scan for hardware changes'.

> 4. After the device disappeared, hot-plug it.
> 5. Does the device work properly now?

Yes. Once the device has disappeared and is hot-plugged again, it works normally.

Best Regards
Lei Yang
It is possible that this behavior happens because the rtl8139 is not a PCIe device.
This is a different problem (or not a problem at all), so please open a separate BZ for rtl8139 on q35.
Please mark the new BZ as a blocker of https://bugzilla.redhat.com/show_bug.cgi?id=1744438 (tracker).
==Steps
1. Boot the guest.

==Reproduced with qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
2. After hot-plugging/unplugging the NIC many times with the commands below, the guest cannot get a valid IP address.
Hot plug:
{ "execute": "netdev_add", "arguments": { "type":"tap","id":"hostnet0"}}
{"execute": "device_add", "arguments": { "driver":"virtio-net-pci","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"pcie-root-port-3"}}
Hot unplug:
{"execute": "device_del", "arguments": {"id": "net0"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}

So this bug has been reproduced.

==Verified with qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a.x86_64
1. The guest can get a valid IP address after hot-plugging/unplugging the NIC many times.
Hot plug:
{ "execute": "netdev_add", "arguments": { "type":"tap","id":"hostnet0"}}
{"execute": "device_add", "arguments": { "driver":"virtio-net-pci","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"pcie-root-port-3"}}
Hot unplug:
{"execute": "device_del", "arguments": {"id": "net0"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}

So this bug has been fixed. Moving to 'VERIFIED'.

Additional info:
This verification covers only the PCIe device (virtio-net-pci). rtl8139 still has problems, which are tracked separately in Bug 1908633.
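When scripting the hot-unplug sequence above, it can help to confirm that the device is really gone before sending netdev_del, because QMP's device_del only requests removal and QEMU reports completion later with a DEVICE_DELETED event. The helper below is a minimal sketch, not part of the verification that was run: the function name is hypothetical, the DEVICE_DELETED sample message is made up for illustration, and reading messages from the QMP socket is left to the harness.

```python
import json


def device_deleted(messages, device_id):
    """Return True if a DEVICE_DELETED event for device_id is in messages.

    `messages` is an iterable of raw JSON strings as read from QMP.
    """
    for raw in messages:
        msg = json.loads(raw)
        if (msg.get("event") == "DEVICE_DELETED"
                and msg.get("data", {}).get("device") == device_id):
            return True
    return False


if __name__ == "__main__":
    # Illustrative input: a command reply, the NIC_RX_FILTER_CHANGED event
    # quoted earlier in this bug, and a made-up DEVICE_DELETED event.
    sample = [
        '{"return": {}}',
        '{"timestamp": {"seconds": 1597828188, "microseconds": 717336}, '
        '"event": "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", '
        '"path": "/machine/peripheral/net0/virtio-backend"}}',
        '{"event": "DEVICE_DELETED", "data": {"device": "net0", '
        '"path": "/machine/peripheral/net0"}}',
    ]
    print(device_deleted(sample, "net0"))
```

A harness would call this on the messages received after device_del and send netdev_del only once it returns True.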
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2098