Bug 1693587 - RFE: support for net failover devices in libvirt
Summary: RFE: support for net failover devices in libvirt
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Assignee: Laine Stump
QA Contact: Luyao Huang
URL:
Whiteboard:
Depends On: 1718673 1757796
Blocks: 1688177 1760395 1848983
 
Reported: 2019-03-28 09:21 UTC by Jens Freimann
Modified: 2020-06-19 12:36 UTC
CC: 22 users

Fixed In Version: libvirt-6.0.0-3.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1718673 1760395
Environment:
Last Closed: 2020-05-05 09:45:09 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2020:2017 (last updated 2020-05-05 09:47:03 UTC)

Description Jens Freimann 2019-03-28 09:21:19 UTC
This BZ is for the libvirt side of network failover support.

The general idea is that we have a pair of devices: a vfio-pci device and an
emulated device. Before migration, the vfio-pci networking device is unplugged
and data flows through the emulated device; on the target side another vfio-pci
device is plugged in to take over the data path. In the guest, the net_failover
module pairs net devices that have the same MAC address.

The guest part is already upstream and described in https://www.kernel.org/doc/html/latest/networking/net_failover.html
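
(For illustration: the net_failover model documented there is a three-netdev setup. The interface names below are hypothetical, though they match the ones that show up in the test results later in this bug:

  enp1s0      - failover master (virtio_net), owns the IP configuration
  enp1s0nsby  - standby slave (virtio_net), backup datapath
  eth0        - primary slave (VF), fast datapath, unplugged before migration
)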

The QEMU part is still a work in progress. It is expected to require new parameters on the virtio-net and vfio-pci devices. More details will follow.

Current QEMU RFC: https://www.mail-archive.com/qemu-devel@nongnu.org/msg606906.html

Comment 9 Laine Stump 2019-10-08 01:17:46 UTC
I had previously decided on the following XML for libvirt:

  <interface type='hostdev'>
    <mac address='blah'/>
    <driver backup='ua-muhbackup'/>
  </interface>

  <interface type='bridge'>
    <mac address='blah'/>
    <model type='virtio'/>
    <alias name='ua-muhbackup'/>
    <driver failover='on'/>
  </interface>

Although it's not a libvirt requirement (or qemu requirement, afaik), the *guest* virtio-net driver requires the MAC addresses of the two devices to match, as that is how it matches the master (hostdev) and backup (virtio-net) devices.

However, I heard from oVirt/RHV (Hi Dan!) that they can't have two network interfaces with the same MAC address. We then had a short discussion with mtsirkin where a couple of ideas were proposed for getting around this restriction, but I'm not really comfortable with either of them.

So this afternoon I hit on another idea that can be simply implemented within libvirt, but with the loss of one feature:

Instead of using <interface type='hostdev'> for the hostdev device, we can use plain <hostdev>. Since the only "interfacey" thing we're using from the interface config is the MAC address anyway, this won't be a problem. So we'll end up with this:

  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0' bus='0x2' slot='0x10' function='0x5'/>
    </source>
    <driver backup='ua-muhbackup'/>
  </hostdev>

  <interface type='bridge'>
    <mac address='blah'/>
    <model type='virtio'/>
    <alias name='ua-muhbackup'/>
    <driver failover='on'/>
  </interface>

The result will be that there will only be a single interface with the given MAC address. When plugging in the hostdev device, libvirt will grab the MAC from the virtio device and set it just as it normally would for <interface type='hostdev'>.

The functionality that is lost in doing this is that we can no longer take advantage of hostdev device pools in libvirt networks (e.g. <interface type='network'>). So if oVirt is relying on that functionality for the pools of VFs to use, then this won't work for them (at least not immediately). If oVirt, like OpenStack, manages the VF pools itself, then this won't be a problem.
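
(For reference, a hostdev pool network of the kind meant here looks roughly like this sketch - the network name and PF device are placeholders, but the same construct appears in the test setup in comment 29:

  <network>
    <name>hostdev-pool</name>
    <forward mode='hostdev' managed='yes'>
      <pf dev='enp2s0f1'/>
    </forward>
  </network>
)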


Dan - will the above work for you?

Comment 10 Dan Kenigsberg 2019-10-10 09:13:24 UTC
> Dan - will the above work for you?

Not easily, I'm afraid. In RHV we already have a mechanism to unplug an SR-IOV <interface> on the migration source, and then choose an <interface> to plug at the destination. It works fine, but puts a burden on the user to create a proper bonding inside the guest. The only problem I'd like to fix, and the only change I wish to make, is to have this in-guest bonding created automatically by the guest OS.

Moving RHVM to <hostdev> does not seem trivial at all. Selecting an available <interface> took months to code in ovirt-engine, and would not be easy to extend.

I'm looking for an implementation that changes as little as possible. Ideally, I'd like to see two <interfaces> each with its own "persistent mac", with only <driver backup='ua-muhbackup'/> added. I *think* that it is possible to make this happen both in libvirt and in virtio. libvirt can pass the pci address of the "master" device (or even its mac address) to the guest virtio driver, which should be able to create the bond based on it.

Comment 12 Laine Stump 2019-10-21 01:28:48 UTC
Another possibility (although I hesitate to mention it because it seems very ugly) would be to allow the MAC addresses of the two devices to be different in the config, but when we know that the guest requires them to match, we just ignore the MAC address from the hostdev's config, and set it up with the MAC of the virtio device.

The problem is that I wouldn't want this to be a permanent thing, as the time may come when that isn't necessary, desired, or even permissible. But since the only entity that *really* knows whether or not matching MACs are required is the guest virtio driver, there's really no way for libvirt to do this automatically.

If it becomes necessary, maybe we could have it settable with an option in the hostdev device? e.g.:

  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0' bus='0x2' slot='0x10' function='0x5'/>
    </source>
    <driver backup='ua-muhbackup' usebackupmac='yes'/> (or some other appropriate name)
  </hostdev>

Comment 13 Laine Stump 2019-10-22 15:30:54 UTC
Sorry, what I *meant* to put as an example was this:

  <interface type='hostdev'>
    <mac address='blah'/>
    <driver backup='ua-muhbackup' usebackupmac='yes'/>
  </interface>

Comment 14 Dan Kenigsberg 2019-10-23 08:47:30 UTC
(In reply to Laine Stump from comment #13)
> Sorry, what I *meant* to put as an example was this:
> 
>   <interface type='hostdev'>
>     <mac address='blah'/>
>     <driver backup='ua-muhbackup' usebackupmac='yes'/>
>   </interface>

Yes, I believe that this would work for ovirt.

Comment 20 Daniel Berrangé 2019-12-24 14:08:00 UTC
(In reply to Laine Stump from comment #12)
> Another possibility (although I hesitate to mention it because it seems very
> ugly) would be to allow the MAC addresses of the two devices to be different
> in the config, but when we know that the guest requires them to match, we
> just ignore the MAC address from the hostdev's config, and set it up with
> the MAC of the virtio device.
> 
> The problem is that I wouldn't want this to be a permanent thing, as the
> time may come when that isn't necessary, desired, or even permitable. But
> since the only entity who *really* knows whether or not matching MACs is a
> requirement is the guest virtio driver, there's really no way for libvirt to
> do this automatically.

I don't think libvirt should be in the business of playing games
with MAC addresses. From the QEMU configuration POV, all libvirt
needs to do is be able to set failover=on on the virtio-net device
and set failover_pair_id=net1 on the vfio-pci device.
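
(For concreteness, a minimal sketch of the qemu options being described - the netdev, device ids, and host address are hypothetical:

  -netdev tap,id=hostnet0 \
  -device virtio-net-pci,netdev=hostnet0,id=net1,mac=00:11:22:33:44:55,failover=on \
  -device vfio-pci,host=0000:02:10.5,id=hostdev0,failover_pair_id=net1
)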

The Linux driver is documented as requiring that *both* these
devices have the same MAC address configured, but this is not
something libvirt needs to have any knowledge of as it is a
guest OS requirement, not a QEMU requirement.

Libvirt simply needs to provide a way to set the MAC address
for both NICs. The user or mgmt app can then set both NICs to
the same MAC address if this is mandatory for the guest OS that
is being deployed.

Libvirt must not enforce identical MAC addresses, unless this
is mandated by QEMU itself. Nor must libvirt silently ignore
the MAC address configured in the XML to force both MAC
addresses to be the same.

IOW, if identical MAC addresses are required by the guest OS,
then the XML given to libvirt by RHEV must honour this when
creating its XML doc for that guest.

Comment 22 Laine Stump 2020-01-17 15:53:59 UTC
BTW, for anyone following along from the peanut gallery, I do have patches that are working, and am planning to post them upstream today. (I had delayed because a qemu crash showed up during testing, but we identified that crash as an already known, and fixed upstream, unrelated bug in qemu.)

Comment 23 Laine Stump 2020-01-20 17:58:26 UTC
Patches upstream:

https://www.redhat.com/archives/libvir-list/2020-January/msg00813.html

Patch 12 implements the "useBackupMAC" attribute I mentioned in Comment 12. See the commit log for the reasons I'm not really happy with it. Could the problem instead be solved by vdsm modifying the MAC address of the hostdev NIC (to match the MAC address of the virtio NIC) before passing the XML on to libvirt? This would allow RHEV to use different MAC addresses for the two interfaces, but they would be identical by the time they reached the guest driver.

Comment 24 Dan Kenigsberg 2020-01-21 10:26:56 UTC
Small correction: it has been years since ovirt-engine became the oVirt entity that generates the domxml. Dominik may take another look to see whether our recent-ish use of aliases allows us to accept two libvirt interfaces with the same MAC address (or respond on https://www.redhat.com/archives/libvir-list/2020-January/msg00825.html ). But I must restate my opinion that useBackupMAC is a nice API, similar to what in-guest bonding has.

Comment 25 Laine Stump 2020-01-25 17:39:07 UTC
V2 posted upstream: https://www.redhat.com/archives/libvir-list/2020-January/msg01025.html

After discussion on the mailing list and in IRC, we decided it was better to implement this as a new subelement called <teaming>, e.g.:

  <interface type='bridge'>
    <source bridge='br0'/>
    <model type='virtio'/>
    <mac address='00:11:22:33:44:55'/>
    <alias name='ua-backup0'/>
    <teaming type='persistent'/>
  </interface>
  <interface type='network'>
    <source network='hostdev-pool'/>
    <mac address='00:11:22:33:44:55'/>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>
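
(Presumably this maps onto the qemu options from comment 20 roughly as follows, with the alias serving as the device id - a sketch, not verified against the final patches:

  -device virtio-net-pci,...,id=ua-backup0,failover=on \
  -device vfio-pci,...,failover_pair_id=ua-backup0
)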

Comment 26 Laine Stump 2020-01-30 19:17:23 UTC
Pushed upstream:

commit cad65f222f29dffd4e91d43b230665aca813c7a6
Author: Laine Stump <laine>
Date:   Sun Dec 8 14:22:34 2019 -0500

    qemu: add capabilities flag for failover feature

commit fb0509d06ac57434c2edbd81ee63deb32a0e598a
Author: Laine Stump <laine>
Date:   Wed Jan 22 16:24:10 2020 -0500

    conf: parse/format <teaming> subelement of <interface>
    
commit eb9f6cc4b3464707cf689fda9812e5129003bf27
Author: Laine Stump <laine>
Date:   Thu Jan 23 15:34:53 2020 -0500

    qemu: support interface <teaming> functionality
    
commit 2758f680b7d586baf084f340b153d7706b8ce12b
Author: Laine Stump <laine>
Date:   Thu Jan 9 19:39:47 2020 -0500

    qemu: allow migration with assigned PCI hostdev if <teaming> is set
    
commit 8a226ddb3602586a2ba2359afc4448c02f566a0e
Author: Laine Stump <laine>
Date:   Wed Jan 15 16:38:57 2020 -0500

    qemu: add wait-unplug to qemu migration status enum
    
commit f0f34056ab26eaa9f903a51cd1fa155088fd640f
Author: Laine Stump <laine>
Date:   Thu Jan 23 21:34:01 2020 -0500

    docs: document <interface> subelement <teaming>

Comment 29 yalzhang@redhat.com 2020-03-25 03:03:32 UTC
Tested with the versions and configuration below, on Intel X520/82599ES hardware: the migration succeeds, but the standby bridge virtio-net device does not work. This may be the same issue as "Bug 1789206 - ping can not always work during live migration of vm with VF". I will confirm and retest once that bug progresses.

Host:
libvirt-libs-6.0.0-13.module+el8.2.0+6048+0fa476b4.x86_64
qemu-kvm-4.2.0-15.module+el8.2.0+6029+618ef2ec.x86_64
kernel-4.18.0-189.el8.x86_64

VM:
kernel-4.18.0-187.el8.x86_64

On the source and target hosts, prepare two networks: one is a hostdev interface pool, the other is connected to a host bridge, and the bridge is connected to the same PF as the hostdev network.

1.
# virsh net-dumpxml hostdevnet --inactive
<network>
  <name>hostdevnet</name>
  <uuid>d1d21152-ab38-4b66-9ddd-13d788b727fc</uuid>
  <forward mode='hostdev' managed='yes'>
    <pf dev='enp130s0f1'/>
  </forward>
</network>

# virsh net-dumpxml host-bridge
<network>
  <name>host-bridge</name>
  <uuid>75b161a9-79c9-4090-b961-10d8b19aff7e</uuid>
  <forward mode='bridge'/>
  <bridge name='br0'/>  
</network>

# nmcli con show enp130s0f1  | grep connection.master -A1
connection.master:                      br0
connection.slave-type:                  bridge

VM's configuration:
# virsh dumpxml rhel | grep /interface -B12
...
<interface type='network'>
      <mac address='fe:54:00:d2:24:55'/>
      <source network='host-bridge'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='fe:54:00:d2:24:55'/>
      <source network='hostdevnet'/>
      <model type='virtio'/>
      <teaming type='transient' persistent='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>


2. Start the vm, and check the interface status:
[src host]# virsh start rhel --console
...
[vm]# ifconfig
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.*.*  netmask 255.255.254.0  broadcast 10.73.*.*
        inet6 2620:52:0:4920:fc54:ff:fed2:2455  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::fc54:ff:fed2:2455  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 151  bytes 22983 (22.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 132  bytes 14919 (14.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp1s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::b68f:2db9:bd33:c4d6  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 89  bytes 8866 (8.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11  bytes 1790 (1.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.*.*  netmask 255.255.254.0  broadcast 10.73.*.*           ------> same IP as enp1s0
        inet6 fe80::560a:f55b:a6a2:c5c4  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 62  bytes 14117 (13.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 121  bytes 13129 (12.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[vm]# ping 10.72.*.*  (my laptop)
PING 10.72.13.8 (10.72.*.*) 56(84) bytes of data.
64 bytes from 10.72.*.*: icmp_seq=1 ttl=61 time=7.02 ms
64 bytes from 10.72.*.*: icmp_seq=2 ttl=61 time=6.48 ms
....


3. In another terminal, run the migration:
[src host]# virsh migrate  rhel qemu+ssh://per59/system --live --verbose --bandwidth 6

and check the ping status on the vm:
[vm on src host]
......
64 bytes from 10.72.*.*: icmp_seq=139 ttl=61 time=46.7 ms
[  239.548356] pcieport 0000:00:02.6: Slot(0-6): Attention button pressed
[  239.549633] pcieport 0000:00:02.6: Slot(0-6): Powering off due to button press
64 bytes from 10.72.*.*: icmp_seq=140 ttl=61 time=70.4 ms
64 bytes from 10.72.*.*: icmp_seq=141 ttl=61 time=91.2 ms
64 bytes from 10.72.*.*: icmp_seq=142 ttl=61 time=6.99 ms
64 bytes from 10.72.*.*: icmp_seq=143 ttl=61 time=6.44 ms
64 bytes from 10.72.*.*: icmp_seq=144 ttl=61 time=6.56 ms
64 bytes from 10.72.*.*: icmp_seq=145 ttl=61 time=6.85 ms
[  244.867432] virtio_net virtio0 enp1s0: failover primary slave:enp7s0 unregistered
From 10.73.*.* icmp_seq=161 Destination Host Unreachable
From 10.73.*.* icmp_seq=162 Destination Host Unreachable
...
(after the unregister action, the VM's network never works again)


4. The migration succeeds. Check on the target host: the VM network works well and the ping recovers:
[vm on target host]
64 bytes from 10.72.*.*: icmp_seq=435 ttl=61 time=6.92 ms
64 bytes from 10.72.*.*: icmp_seq=436 ttl=61 time=114 ms
...
--- 10.72.*.* ping statistics ---
491 packets transmitted, 406 received, +51 errors, 17.3116% packet loss, time 1171ms
rtt min/avg/max/mdev = 5.398/42.727/120.439/37.327 ms, pipe 4

# ifconfig 
(the output is the same as on the src host before migration)

5. Migrating back succeeds, and again, after the hostdev is unregistered, the standby does not work either.

Comment 30 yalzhang@redhat.com 2020-03-25 05:33:05 UTC
Same configuration and versions as in comment 29 above; test hotplugging a hostdev interface:

1. Start the VM with only the bridge interface, as below:

# virsh dumpxml rhel --inactive| grep /interface -B7
    <interface type='network'>
      <mac address='fe:54:00:d2:24:55'/>
      <source network='host-bridge'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>

2. Check the VM network status: the master interface and standby interface get different IP addresses, and the network works well:

# ifconfig
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.19*  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::fc54:ff:fed2:2455  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:4920:fc54:ff:fed2:2455  prefixlen 64  scopeid 0x0<global>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 387  bytes 41165 (40.2 KiB)
        RX errors 0  dropped 99  overruns 0  frame 0
        TX packets 294  bytes 28585 (27.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp1s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.17*  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::b68f:2db9:bd33:c4d6  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 387  bytes 41165 (40.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 294  bytes 28585 (27.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
# ping -4 www.baidu.com
PING www.wshifen.com (104.193.88.123) 56(84) bytes of data.
64 bytes from 104.193.88.123 (104.193.88.123): icmp_seq=1 ttl=47 time=189 ms
64 bytes from 104.193.88.123 (104.193.88.123): icmp_seq=2 ttl=47 time=189 ms

--- www.wshifen.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 188.846/188.869/188.893/0.435 ms

3. Prepare the hostdev interface XML and hotplug it:

# cat hostdev_interface.xml 
<interface type='network'>
  <mac address='fe:54:00:d2:24:55'/>
  <source network='hostdevnet'/>
  <model type='virtio'/>
  <teaming type='transient' persistent='ua-backup0'/>
</interface>

# virsh attach-device rhel hostdev_interface.xml 
Device attached successfully

4. Check on the VM; the network works well after the hotplug:

......
64 bytes from 45.113.192.101 (45.113.192.101): icmp_seq=18 ttl=43 time=207 ms
64 bytes from 45.113.192.101 (45.113.192.101): icmp_seq=19 ttl=43 time=207 ms
[   66.805818] pcieport 0000:00:02.6: Slot(0-6): Attention button pressed
[   66.807413] pcieport 0000:00:02.6: Slot(0-6) Powering on due to button press
[   66.808616] pcieport 0000:00:02.6: Slot(0-6): Card present
[   66.809399] pcieport 0000:00:02.6: Slot(0-6): Link Up
[   66.936301] pci 0000:07:00.0: [8086:10ed] type 00 class 0x020000
[   66.937425] pci 0000:07:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[   66.938612] pci 0000:07:00.0: reg 0x1c: [mem 0x00000000-0x00003fff 64bit]
[   66.940314] pci 0000:07:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[   66.941899] pci 0000:07:00.0: BAR 0: assigned [mem 0xfc200000-0xfc203fff 64bit]
[   66.943088] pci 0000:07:00.0: BAR 3: assigned [mem 0xfc204000-0xfc207fff 64bit]
[   66.944273] pcieport 0000:00:02.6: PCI bridge to [bus 07]
[   66.945128] pcieport 0000:00:02.6:   bridge window [io  0x7000-0x7fff]
[   66.948195] pcieport 0000:00:02.6:   bridge window [mem 0xfc200000-0xfc3fffff]
[   66.950456] pcieport 0000:00:02.6:   bridge window [mem 0xfde00000-0xfdffffff 64bit pref]
[   67.176392] ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 4.1.0-k-rh8.2.0
[   67.177955] ixgbevf: Copyright (c) 2009 - 2018 Intel Corporation.
[   67.179308] ixgbevf 0000:07:00.0: enabling device (0000 -> 0002)
[   67.210175] ixgbevf 0000:07:00.0: NIC Link is Up 10 Gbps
[   67.211704] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   67.216605] virtio_net virtio0 enp1s0: failover primary slave:eth0 registered
[   67.218086] ixgbevf 0000:07:00.0: fe:54:00:d2:24:55
[   67.219094] ixgbevf 0000:07:00.0: MAC: 1
[   67.219909] ixgbevf 0000:07:00.0: Intel(R) 82599 Virtual Function
[   67.260233] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   67.268520] ixgbevf 0000:07:00.0: NIC Link is Up 10 Gbps
[   67.269758] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
64 bytes from 45.113.192.101 (45.113.192.101): icmp_seq=21 ttl=43 time=208 ms
64 bytes from 45.113.192.101 (45.113.192.101): icmp_seq=22 ttl=43 time=207 ms
.....
--- www.wshifen.com ping statistics ---
26 packets transmitted, 19 received, 26.9231% packet loss, time 362ms
rtt min/avg/max/mdev = 206.778/207.408/207.767/0.319 ms

[root@bootp-73-33-198 ~]# ifconfig
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.198  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 2620:52:0:4920:fc54:ff:fed2:2455  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::fc54:ff:fed2:2455  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 215  bytes 26687 (26.0 KiB)
        RX errors 0  dropped 40  overruns 0  frame 0
        TX packets 156  bytes 15655 (15.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp1s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.170  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::b68f:2db9:bd33:c4d6  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 199  bytes 24176 (23.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 138  bytes 13506 (13.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.198  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::ad64:6233:a21:4f76  prefixlen 64  scopeid 0x20<link>
        ether fe:54:00:d2:24:55  txqueuelen 1000  (Ethernet)
        RX packets 16  bytes 2511 (2.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18  bytes 2149 (2.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Comment 31 Laine Stump 2020-03-30 16:49:07 UTC
yalzhang - I meant to respond to this last week, but it got lost in the clutter of other emails :-/


The fact that the "enp1s0nsby" device shows up in the guest, and that migration is allowed, shows that the libvirt part of this feature is working. (libvirt's job is to add the proper options to the qemu command line - if that hadn't happened, you wouldn't see the enp1s0nsby device in the guest - and to allow migration when there is a hostdev device, as long as it has a designated virtio "failover" device associated with it.)

Once this basic functionality is working in libvirt, the actual runtime operation is taken care of by qemu on the host, and the kernel driver in the guest. 

The problems you've discovered are likely in one of those two areas, and should have a separate BZ filed for them. I'm not sure whether it is better to file first against the guest virtio driver for triage, or against qemu - Jens or Juan: do you have an opinion on that?

In the meantime, based on the information I've seen from your testing, in my opinion this BZ can be marked as VERIFIED.
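
(One quick sanity check, assuming the default libvirt log location: the failover options should be visible in the generated qemu command line, e.g.

  [host]# grep -o 'failover[^,]*' /var/log/libvirt/qemu/rhel.log
  failover=on
  failover_pair_id=ua-backup0

The exact output shown here is illustrative.)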

Comment 32 Laine Stump 2020-03-30 16:51:59 UTC
(Also, although my testing has shown that the igb driver works properly with this feature, I think I recall being told by Jens that the ixgbe driver *doesn't* work properly.)

Comment 33 yalzhang@redhat.com 2020-04-01 07:30:37 UTC
Hi Laine, thank you very much! Moving this bug to VERIFIED.

Comment 34 Jens Freimann 2020-04-01 07:46:29 UTC
Yes, I think the libvirt part works as it's supposed to, and we'll look into the other issue. Regarding ixgbe, I think it doesn't work because the driver sets up the MAC filter to direct packets to the VF before the migrated-to VM is up.

Comment 35 Juan Quintela 2020-04-07 07:46:32 UTC
I think it is better that we file it as a virtio-net device bugzilla.

Thanks Laine.

Comment 37 errata-xmlrpc 2020-05-05 09:45:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

