Bug 1829272 - [virtual network] could not get a valid ip after hotunplug/hotplug network device many times
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.3
Assignee: Yvugenfi@redhat.com
QA Contact: Lei Yang
URL:
Whiteboard:
Depends On:
Blocks: 1744438
Reported: 2020-04-29 10:32 UTC by Yu Wang
Modified: 2023-03-14 14:39 UTC (History)

Fixed In Version: qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 06:42:08 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments:
log (11.79 MB, application/zip), 2020-04-29 10:32 UTC, Yu Wang

Description Yu Wang 2020-04-29 10:32:09 UTC
Created attachment 1682834
log

Description of problem:

The guest cannot get a valid IP address on a hotplugged virtio-net device after the device has been hotplugged and unplugged repeatedly.


Version-Release number of selected component (if applicable):

    qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
    kernel-4.18.0-193.el8.x86_64
    virtio-win-prewhql-181
    seabios-1.13.0-1.module+el8.2.0+5520+4e5817f3.x86_64
    guest:ws2019

How reproducible:
100%

Steps to Reproduce:
1 boot guest

2 hotplug virtio-net device
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'idBhIYQz', 'fds': '55:53:52:54'}, 'id': 'feP61U4g'}

{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'idBhIYQz', 'mac': '9a:8d:ad:f2:92:6b', 'id': 'idR7WrN8', 'mq': 'on', 'vectors': 10, 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}, 'id': 'NyasKAz8'}

3 Check if new interface gets ip address

4 Ping guest's new ip from host

5 unplug virtio-net device
Send command: {'execute': 'device_del', 'arguments': {'id': 'idR7WrN8'}, 'id': '73wMToOK'}

6 Repeat steps 2-5 about 30 times

7 hotplug a new virtio-net device
{'execute': 'netdev_add', 'arguments': {'type': 'tap', 'id': 'id9zVCba', 'fds': '62:61:64:63'}, 'id': 'HxdwgeYH'}

{'execute': 'device_add', 'arguments': {'driver': 'virtio-net-pci', 'netdev': 'id9zVCba', 'mac': '9a:8d:ad:f2:92:6c', 'id': 'idU220t9', 'mq': 'on', 'vectors': 10, 'bus': 'pcie_extra_root_port_0', 'addr': '0x0'}, 'id': 'pvSsPQtf'}

8 Check if new interface gets ip address
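
The hotplug/unplug cycle in steps 2-6 can be sketched as a small script that only builds the QMP command sequence. This is a minimal sketch, not the automation used by the reporter: the helper names (hotplug_cmds, unplug_cmd, build_cycles) and the generated IDs are hypothetical, the 'fds' argument from the original commands is omitted because the file descriptors have to be passed to QEMU separately, and sending the lines over a QMP socket is left out.

```python
import json


def hotplug_cmds(netdev_id, dev_id, mac, bus="pcie_extra_root_port_0"):
    """Return the two QMP commands that hotplug a tap-backed virtio-net device."""
    return [
        {"execute": "netdev_add",
         "arguments": {"type": "tap", "id": netdev_id}},
        {"execute": "device_add",
         "arguments": {"driver": "virtio-net-pci", "netdev": netdev_id,
                       "mac": mac, "id": dev_id, "mq": "on",
                       "vectors": 10, "bus": bus, "addr": "0x0"}},
    ]


def unplug_cmd(dev_id):
    """Return the QMP command that requests hot unplug of the device (step 5)."""
    return {"execute": "device_del", "arguments": {"id": dev_id}}


def build_cycles(count, mac):
    """Build `count` plug/unplug cycles as JSON lines.

    Fresh IDs are generated per cycle (an assumption mirroring the distinct
    IDs seen in steps 2 and 7), since step 5 does not delete the netdev.
    """
    lines = []
    for i in range(count):
        netdev_id, dev_id = f"hostnet{i}", f"net{i}"
        for cmd in hotplug_cmds(netdev_id, dev_id, mac):
            lines.append(json.dumps(cmd))
        # device_del is asynchronous: a real harness should wait for the
        # DEVICE_DELETED event before plugging into the same slot again.
        lines.append(json.dumps(unplug_cmd(dev_id)))
    return lines


if __name__ == "__main__":
    for line in build_cycles(2, "9a:8d:ad:f2:92:6b"):
        print(line)
```

Between the device_add and device_del of each cycle, the harness would run the checks from steps 3 and 4 (interface has an IP address, ping from the host succeeds).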

Actual results:
The new interface cannot get a valid IP address

Expected results:
The new interface gets an IP address and ping succeeds


Additional info:
1 Can reproduce on Windows 2019 and RHEL8.2.0 guests
2 Cannot reproduce with qemu-kvm-4.2.0-11.module+el8.2.0+5837+4c1442ec.x86_64
  or qemu-kvm-4.2.0-17.module+el8.2.0+6141+0f540f16.x86_64,
  so it is a regression

Comment 3 Yvugenfi@redhat.com 2020-05-11 07:56:53 UTC
Can you please post host network configuration and configuration scripts?

Thanks.

Comment 4 Yu Wang 2020-05-12 01:13:00 UTC
(In reply to Yan Vugenfirer from comment #3)
> Can you please post host network configuration and configuration scripts?
> 
> Thanks.

switch: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.72.188  netmask 255.255.252.0  broadcast 10.73.75.255
        inet6 fe80::c3a6:1270:5932:4fe6  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:4948:e5c7:bf8b:8a1d:5be0  prefixlen 64  scopeid 0x0<global>
        ether 06:02:4c:a2:8d:08  txqueuelen 1000  (Ethernet)
        RX packets 17994668  bytes 26137817803 (24.3 GiB)
        RX errors 0  dropped 220  overruns 0  frame 0
        TX packets 6412642  bytes 2118625406 (1.9 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


[root@dell-per440-04 network-scripts]# cat ifcfg-eno1 
# Generated by parse-kickstart
TYPE=Ethernet
DEVICE=eno1
UUID=a3fac8a2-0945-4476-b028-733526ba8dcf
ONBOOT=yes
NAME="System eno1"
BRIDGE=switch


[root@dell-per440-04 network-scripts]# cat ifcfg-switch
STP=no
TYPE=Bridge
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=switch
UUID=927cf127-8b55-4b16-a5f7-f10599cf5093
DEVICE=switch
ONBOOT=yes


[root@dell-per440-04 network-scripts]# cat /etc/qemu-ifup
#!/bin/sh
switch=switch
/usr/sbin/ip link set $1 up
/usr/sbin/ip link set dev $1 master ${switch}
/usr/sbin/ip link set ${switch} type bridge forward_delay 0
/usr/sbin/ip link set ${switch} type bridge stp_state 0

Comment 5 Yu Wang 2020-05-25 09:54:50 UTC
Hi, 

I hit a similar issue.

After disabling and re-enabling the virtio-net device in the guest several times, the guest cannot get a valid IP address either.
Could you check whether this is the same issue?

==============================================================================================
03:39:36 DEBUG| Sending command: netsh interface set interface name="Ethernet" admin=DISABLED
03:39:36 DEBUG| Sending command: echo %errorlevel%
03:39:47 DEBUG| Sending command: netsh interface set interface name="Ethernet" admin=ENABLED
03:40:20 DEBUG| Sending command: echo %errorlevel%
03:42:30 INFO | Context: Start ping test
03:42:30 INFO | The command of Ping is: ping 10.73.73.197  -c 10
03:42:30 DEBUG| PING 10.73.73.197 (10.73.73.197) 56(84) bytes of data.
03:42:43 DEBUG| From 10.73.72.188 icmp_seq=10 Destination Host Unreachable
03:42:45 INFO | Running 'kill -19 290870'
03:42:46 INFO | Command 'kill -19 290870' finished with 0 after 0.00079345703125s
03:42:46 INFO | Running 'kill -19 290872'
03:42:46 INFO | Command 'kill -19 290872' finished with 0 after 0.0006191730499267578s
03:42:46 INFO | Running 'kill -2 290872'
03:42:46 INFO | Command 'kill -2 290872' finished with 0 after 0.0007576942443847656s
03:42:46 INFO | Running 'kill -2 290870'
03:42:46 INFO | Command 'kill -2 290870' finished with 0 after 0.0005733966827392578s
03:42:46 INFO | Running 'kill -18 290870'
03:42:46 INFO | Command 'kill -18 290870' finished with 0 after 0.0006244182586669922s
03:42:46 INFO | Running 'kill -18 290872'
03:42:46 DEBUG| 
03:42:46 INFO | Command 'kill -18 290872' finished with 0 after 0.0009965896606445312s
================================================================================================

For the detailed log, please refer to the attachment: disable-enable


4.18.0-193.3.1.el8_2.x86_64
qemu-kvm-4.2.0-21.module+el8.2.1+6586+8b7713b9.x86_64
seabios-bin-1.13.0-1.module+el8.2.0+5520+4e5817f3.noarch
virtio-win-prewhql-178/184/185
Guest:win8-32

Boot cmd:

    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:18:cb:d5:92:4e,id=idliQoji,mq=on,vectors=14,netdev=idbR6Qlz,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idbR6Qlz,vhost=on,vhostfds=27:28:29:30:31:32,fds=14:21:22:24:25:26 \


Thanks
Yu Wang

Comment 6 Yvugenfi@redhat.com 2020-05-25 09:59:45 UTC
There are several hotplug\unplug symptoms that we see:

1. No ping to the guest after a sequence of hotplug\unplug

2. QEMU crash

3. Guest crash in pci.sys (pci bus driver) on Windows Server 2019

I think we can treat them, at least for now, as the same general hotplug\unplug issue.

Comment 7 Yu Wang 2020-06-05 02:15:38 UTC
Ran ws2019 on a RHEL8.2.0 AV host and hit a similar issue (build 185).

After rebooting several times (about 50-100 times), the guest cannot get an IP address.

Thanks
Yu Wang

Comment 10 xiagao 2020-08-19 09:19:08 UTC
Run win2012r2 on RHEL8.2.03.0 AV host, hit the similar issue(build 189)


{'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':true,'script':'/etc/qemu-ifup','queues':32}}
{"return": {}}
{'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0','mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':'pcie_extra_root_port_0'}}
{"return": {}}

----checked the guest, virtio-net device is shown in device manager, but can't get ip address. 
----after about 30s, return the following message, then guest get ip address.
{"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path": "/machine/peripheral/net0/virtio-backend"}}
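
A test harness that wants to measure the delay described above could scan the QMP output for the event. The following is a rough sketch under assumptions: the function name find_event is hypothetical, the input is assumed to be one QMP message per line, and lines that are not valid JSON (such as the single-quoted commands as logged here) are simply skipped.

```python
import json


def find_event(lines, event, device):
    """Return the first QMP event named `event` whose data.name is `device`.

    Returns None if no such event is found.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            # Not a JSON message (e.g. a logged command); ignore it.
            continue
        if not isinstance(msg, dict):
            continue
        if msg.get("event") == event and msg.get("data", {}).get("name") == device:
            return msg
    return None
```

The "timestamp" field of the returned message can then be compared with the time the device_add command was sent.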

Comment 11 xiagao 2020-08-19 09:23:39 UTC
(In reply to xiagao from comment #10)
> Run win2012r2 on RHEL8.2.03.0 AV host, hit the similar issue(build 189)
> 
> 
> {'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':
> true,'script':'/etc/qemu-ifup','queues':32}}
> {"return": {}}
> {'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0',
> 'mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':
> 'pcie_extra_root_port_0'}}
> {"return": {}}
> 
> ----checked the guest, virtio-net device is shown in device manager, but
> can't get ip address. 
> ----after about 30s, return the following message, then guest get ip address.
> {"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event":
> "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path":
> "/machine/peripheral/net0/virtio-backend"}}

pkg:
kernel-4.18.0-232.el8.x86_64
qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901.x86_64
virtio-win-prewhql-189
edk2-ovmf-20200602gitca407c7246bf-2.el8.noarch

Comment 12 xiagao 2020-08-25 03:13:58 UTC
A similar issue is sometimes hit when booting a Windows guest with q35. With vhost=off, the reproduction rate increases.
Tried 20 times with vhost=on by automation, hit once.
Tried 20 times with vhost=off by automation, hit five times.


22:42:45 DEBUG| Sending command: wmic nicconfig where IPEnabled=True get ipaddress, macaddress
22:42:46 WARNI| No VM's NIC got IP address
22:43:59 DEBUG| Sending command: ipconfig || ifconfig
22:43:59 DEBUG| Sending command: ip route || route print
22:43:59 ERROR| Guest network status:
 
Windows IP Configuration


Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : lab.eng.pek2.redhat.com
   Link-local IPv6 Address . . . . . : fe80::4920:c31:aaf7:73d7%6
   Default Gateway . . . . . . . . . : 

C:\Windows\system32>

Guest route table:
 'ip' is not recognized as an internal or external command,
operable program or batch file.
===========================================================================
Interface List
  6...9a a6 cb 0e eb f9 ......Red Hat VirtIO Ethernet Adapter
  1...........................Software Loopback Interface 1
===========================================================================

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    331
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    331
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    331
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    331
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    331
===========================================================================
Persistent Routes:
  None

IPv6 Route Table
===========================================================================
Active Routes:
 If Metric Network Destination      Gateway
  1    331 ::1/128                  On-link
  6    271 fe80::/64                On-link
  6    271 fe80::4920:c31:aaf7:73d7/128
                                    On-link
  1    331 ff00::/8                 On-link
  6    271 ff00::/8                 On-link
===========================================================================
Persistent Routes:
  None

C:\Windows\system32>

Comment 13 Yvugenfi@redhat.com 2020-10-05 08:07:49 UTC
(In reply to xiagao from comment #10)
> Run win2012r2 on RHEL8.2.03.0 AV host, hit the similar issue(build 189)
> 
> 
> {'execute':'netdev_add','arguments':{'type':'tap','id':'hostnet0','vhost':
> true,'script':'/etc/qemu-ifup','queues':32}}
> {"return": {}}
> {'execute':'device_add','arguments':{'driver':'virtio-net-pci','id':'net0',
> 'mac':'00:1a:4a:42:0b:01','netdev':'hostnet0','mq':'on','vectors':'64','bus':
> 'pcie_extra_root_port_0'}}
> {"return": {}}
> 
> ----checked the guest, virtio-net device is shown in device manager, but
> can't get ip address. 
> ----after about 30s, return the following message, then guest get ip address.
> {"timestamp": {"seconds": 1597828188, "microseconds": 717336}, "event":
> "NIC_RX_FILTER_CHANGED", "data": {"name": "net0", "path":
> "/machine/peripheral/net0/virtio-backend"}}

Hi,

Let's bring some order to this BZ.

I think the above is not related to the original bug. First of all, the number of vectors should be at least (number of queues * 2) + 1 (the +1 is for the control queue). If there is a problem with this scenario, I don't think it is related to hot plug, and I suggest opening a separate bug.

Please test the original steps with the latest QEMU and check whether the issue is reproduced.

Thanks!
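
The vector rule stated in comment 13 can be checked with a few lines of arithmetic. This is only an illustration of the rule as written in that comment, not something taken from the QEMU sources; the function names min_vectors and check_vectors are hypothetical.

```python
def min_vectors(queues):
    """Minimum vector count per comment 13: two per queue plus one for the control queue."""
    return queues * 2 + 1


def check_vectors(queues, vectors):
    """Return (is_enough, needed) for a given queues/vectors configuration."""
    needed = min_vectors(queues)
    return vectors >= needed, needed


if __name__ == "__main__":
    # Configuration from comment 10: 'queues': 32 with 'vectors': '64'.
    ok, needed = check_vectors(32, 64)
    print(f"enough={ok}, needed={needed}")
```

For the configuration in comment 10 (32 queues, 64 vectors) the rule asks for 65 vectors, so that setup is one vector short.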

Comment 15 ybendito 2020-11-18 07:04:50 UTC
Posted to upstream
https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg04494.html

Comment 21 Lei Yang 2020-12-16 10:21:08 UTC
Hi, Yan

I tried to use qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a.x86_64 and kernel-4.18.0-262.el8.x86_64 to verify the problem. 
1. The RHEL guest now works normally, but a Windows guest still cannot get a valid IP address after hot plug/unplug of an rtl8139 NIC. How should this be handled?
2. virtio-net and e1000e work normally on the Windows guest.

Best Regards
Lei Yang

Comment 22 ybendito 2020-12-16 14:27:36 UTC
I've checked it with upstream and 8.4.0 - the device does not disappear after hot unplug and does not work after hotplug.
If I use 'scan for hardware changes' - the device disappears and, after hot plug, works as expected.

Let's check the problem step-by-step:
1. start q35 vm with rtl8139 - the device works properly
2. hot unplug the device. Does it disappear from the device manager?
3. If during reasonable time it does not disappear: use 'scan for hardware changes'
4. After the device disappeared, hot-plug it. 
5. Does the device work properly now?

Comment 23 Lei Yang 2020-12-17 02:52:09 UTC
(In reply to ybendito from comment #22)
> I've checked it with upstream and 8.4.0 - the device does not disappear
> after hot unplug and does not work after hotplug
> If I use 'scan for hardware changes' - the device disappears and after hot
> plug works as expected
> 
> Let's check the problem step-by-step:
Hi, Yan
> 1. start q35 vm with rtl8139 - the device works properly
Yes, a Windows guest booted with an rtl8139 NIC works normally.
> 2. hot unplug the device. Does it disappear from the device manager?
No, the device does not disappear from the device manager.
> 3. If during reasonable time it does not disappear: use 'scan for hardware changes'
The device disappears from the device manager after using "scan for hardware changes".
> 4. After the device disappeared, hot-plug it. 
> 5. Does the device work properly now?
Once the device has disappeared and is hot-plugged again, it works normally.

Best Regards
Lei Yang

Comment 24 ybendito 2020-12-17 07:49:03 UTC
It is possible that this behavior happens because the rtl8139 is not a PCIe device.
This is a different problem (or not a problem), so please open a separate BZ for rtl8139 on q35.
Please mark the new BZ as a blocker of https://bugzilla.redhat.com/show_bug.cgi?id=1744438 (tracker).

Comment 25 Lei Yang 2020-12-17 08:55:47 UTC
==Steps
1.Boot guest
==Reproduced with qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
2. After hot plugging/unplugging the NIC many times with the commands below, the guest cannot get a valid IP address.
Hot plug:
{ "execute": "netdev_add", "arguments": { "type":"tap","id":"hostnet0"}}
{"execute": "device_add", "arguments": { "driver":"virtio-net-pci","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"pcie-root-port-3"}}
Hot unplug:
{"execute": "device_del", "arguments": {"id": "net0"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}

So this bug has been reproduced.

==Verified with qemu-kvm-5.2.0-1.module+el8.4.0+9091+650b220a.x86_64

1. The guest can get a valid IP address after hot plugging/unplugging the NIC many times.
Hot plug:
{ "execute": "netdev_add", "arguments": { "type":"tap","id":"hostnet0"}}
{"execute": "device_add", "arguments": { "driver":"virtio-net-pci","netdev":"hostnet0","mac":"00:1a:4a:42:0b:01","id": "net0","bus":"pcie-root-port-3"}}
Hot unplug:
{"execute": "device_del", "arguments": {"id": "net0"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}

So this bug has been fixed. Moving to 'VERIFIED'.

Additional info:
This verification only covers PCIe devices. rtl8139 still has problems, which are tracked separately in Bug 1908633.

Comment 27 errata-xmlrpc 2021-05-25 06:42:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098

