Bug 1789206 - ping does not always work during live migration of a VM with a VF
Summary: ping does not always work during live migration of a VM with a VF
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Laurent Vivier
QA Contact: Yanhui Ma
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-09 03:31 UTC by Yanghang Liu
Modified: 2023-09-22 16:14 UTC
CC List: 15 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Host network cannot ping VMs with VFs during live migration
When live migrating a virtual machine (VM) with a configured virtual function (VF), such as a VM that uses SR-IOV, the network of the VM is not visible to other devices, and the VM cannot be reached by commands such as `ping`. After the migration is finished, however, the problem no longer occurs.
Clone Of:
Environment:
Last Closed: 2023-09-22 16:14:28 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
Red Hat Issue Tracker RHEL-7336 (Status: Migrated, Last Updated: 2023-09-22 16:14:22 UTC)

Description Yanghang Liu 2020-01-09 03:31:05 UTC
Description of problem:
During live migration of a VM with a VF, I stop getting ping replies from the guest as soon as the VF is hot-unplugged (the VM has not reached its downtime phase at that point).

Version-Release number of selected component (if applicable):
Host:
4.18.0-147.3.1.el8_1.x86_64
qemu-kvm-4.1.0-20.module+el8.1.1+5309+6d656f05.x86_64
Guest:
4.18.0-147.3.1.el8_1.x86_64

How reproducible:
10/10

Steps to Reproduce:
1. On the source host, create an 82599ES VF and set the MAC address of the VF:
ip link set enp6s0f0  vf 0  mac 22:2b:62:bb:a9:82
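
(For completeness, a minimal sketch of the full VF setup; the sriov_numvfs step is implicit in this report and shown explicitly in comment 11, and the PCI address 0000:06:00.0 is taken from the lspci output in "Additional info" below:)
# echo 1 > /sys/bus/pci/devices/0000\:06\:00.0/sriov_numvfs
# ip link set enp6s0f0 vf 0 mac 22:2b:62:bb:a9:82
# ip link show enp6s0f0       <--- the "vf 0" line should now list the new MAC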

2. Start a source guest with the 82599ES VF and failover enabled:
/usr/libexec/qemu-kvm -name rhel811 -M q35 -enable-kvm \
-monitor stdio \
-nodefaults \
-m 4G \
-boot menu=on \
-cpu Haswell-noTSX-IBRS \
-device pcie-root-port,id=root.1,chassis=1,addr=0x2.0,multifunction=on \
-device pcie-root-port,id=root.2,chassis=2,addr=0x2.1 \
-device pcie-root-port,id=root.3,chassis=3,addr=0x2.2 \
-device pcie-root-port,id=root.4,chassis=4,addr=0x2.3 \
-device pcie-root-port,id=root.5,chassis=5,addr=0x2.4 \
-device pcie-root-port,id=root.6,chassis=6,addr=0x2.5 \
-device pcie-root-port,id=root.7,chassis=7,addr=0x2.6 \
-device pcie-root-port,id=root.8,chassis=8,addr=0x2.7 \
-smp 2,sockets=1,cores=2,threads=2,maxcpus=4 \
-qmp tcp:0:6666,server,nowait \
-blockdev node-name=back_image,driver=file,cache.direct=on,cache.no-flush=off,filename=/nfsmount/migra_test/rhel811_q35.qcow2,aio=threads \
-blockdev node-name=drive-virtio-disk0,driver=qcow2,cache.direct=on,cache.no-flush=off,file=back_image \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=disk0,bus=root.1 \
-device VGA,id=video1,bus=root.2  \
-vnc :0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \
-device vfio-pci,host=0000:06:10.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \


3. Check the network info in the source guest:
# ifconfig 
enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.236  netmask 255.255.254.0  broadcast 10.73.33.255
        ether 22:2b:62:bb:a9:82  txqueuelen 1000  (Ethernet)
        RX packets 28683  bytes 1961744 (1.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 93  bytes 13770 (13.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp3s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 22:2b:62:bb:a9:82  txqueuelen 1000  (Ethernet)
        RX packets 28345  bytes 1924974 (1.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 22:2b:62:bb:a9:82  txqueuelen 1000  (Ethernet)
        RX packets 339  bytes 36836 (35.9 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 95  bytes 14406 (14.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# ip link show 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff
3: enp3s0nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master enp3s0 state UP mode DEFAULT group default qlen 1000
    link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff
4: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master enp3s0 state UP mode DEFAULT group default qlen 1000
    link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff


4. On the target host, create a NetXtreme BCM57810 VF and set the MAC address of the VF:
ip link set enp131s0f0 vf 0  mac 22:2b:62:bb:a9:82


5. Start a target guest in listening mode to wait for the migration from the source guest (same command line as step 2, plus):
...
-incoming tcp:0:5800 \

6. Keep pinging the VM during the migration:
# ping 10.73.33.236

7. Migrate the guest from the source host to the target host:
(qemu) migrate -d tcp:10.73.73.73:5800
The guest migrates successfully.
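
(To correlate the ping gap with the migration phases, the migration status can be polled from the same HMP monitor; a minimal sketch:)
(qemu) info migrate
Per [1]/[2] below, the ping gap should start at VF unplug and end when the "Migration status:" line reaches completed.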

8. Check the ping output:
# ping 10.73.33.236 
64 bytes from 10.73.33.236: icmp_seq=59 ttl=61 time=3.07 ms
64 bytes from 10.73.33.236: icmp_seq=60 ttl=61 time=4.35 ms
64 bytes from 10.73.33.236: icmp_seq=61 ttl=61 time=2.10 ms
64 bytes from 10.73.33.236: icmp_seq=62 ttl=61 time=4.53 ms[1]
64 bytes from 10.73.33.236: icmp_seq=88 ttl=61 time=7.39 ms[2]  
64 bytes from 10.73.33.236: icmp_seq=89 ttl=61 time=4.35 ms
64 bytes from 10.73.33.236: icmp_seq=90 ttl=61 time=5.82 ms
64 bytes from 10.73.33.236: icmp_seq=91 ttl=61 time=4.39 ms

[1]
When "virtio_net virtio1 enp3s0: failover primary slave:enp4s0 unregistered" appears in the source guest's dmesg, ping stops working until the migration is completed.
[2]
When the migration is completed, ping works again.
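
(A simple way to line these two events up is to follow dmesg inside the source guest while the external ping runs; a minimal sketch:)
# dmesg -w | grep -i failover
The "failover primary slave ... unregistered" line should appear at the moment the ping replies stop.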

Actual results:
When "virtio_net virtio1 enp3s0: failover primary slave:enp4s0 unregistered" appears in the source guest's dmesg, ping stops working until the migration is completed.

Expected results:
ping should keep working throughout the migration, because the hypervisor fails over to the virtio-net datapath when the VF is unplugged.
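
(One way to check whether the failover to the virtio datapath actually carries traffic is to capture on the standby interface inside the guest while the VF is unplugged; a sketch using the standby name from step 3:)
# tcpdump -i enp3s0nsby -n icmp
If the failover works as expected, the ICMP echo requests should keep arriving on the standby interface after the VF is unplugged.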

Additional info:
(1)
# lspci | grep -i 82599
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
(2)
This problem can also be reproduced with a NetXtreme II BCM57810.
(3)
This problem can also be reproduced on RHEL 8.2 AV.
The test env info is as follows:
host:
qemu-kvm-4.2.0-4.module+el8.2.0+5220+e82621dc.x86_64
4.18.0-167.el8.x86_64
guest:
4.18.0-167.el8.x86_64

Comment 2 Ademar Reis 2020-02-05 23:12:14 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 3 Jens Freimann 2020-02-07 13:01:02 UTC
The problem could be that the driver of the NIC updates the MAC filter too soon. I have a tool
to test this here: https://github.com/jensfr/netfailover_driver_detect, with a description of the problem.
I can help with the test, or do it myself given access to your system.

Comment 5 Jens Freimann 2020-03-10 14:09:37 UTC
Re-assigning to Juan

Comment 6 Juan Quintela 2020-06-17 07:50:57 UTC
We are out of time.
Moving to the next version.

Comment 7 Yanghang Liu 2020-09-25 04:57:53 UTC
This bug can be reproduced with 82599ES in
(1) host env:
qemu-kvm version : qemu-kvm-5.1.0-8.module+el8.3.0+8141+3cd9cd43.x86_64
kernel version : 4.18.0-238.el8.x86_64
(2) vm env:
kernel version : 4.18.0-238.el8.x86_64

Comment 10 Juan Quintela 2020-11-02 13:13:34 UTC
Hi
Can you test what is the result of this test:
https://github.com/jensfr/netfailover_driver_detect

Just to be sure: is the problem that it enables the destination link too soon?

Thanks, Juan.

Comment 11 Yanghang Liu 2020-11-09 06:45:41 UTC
Hi Juan,

Very sorry for the late reply.

> Can you test what is the result of this test:
> https://github.com/jensfr/netfailover_driver_detect

I have tried to test this problem with the tool in the link you provided.

My test steps are as follows:

(1) Start a host named HOST_A with 82599ES PF

The IP address of 82599ES PF : 10.73.33.142
The PCI address of 82599ES PF:0000:05:00.0
The Interface name of 82599ES PF : enp5s0f0
The mac address of 82599ES PF : 00:1b:21:c3:d0:3c

# ifconfig
...
enp5s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.142  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::aa6d:d0a0:3531:a542  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:4920:f167:beaf:81b8:fd6  prefixlen 64  scopeid 0x0<global>
        ether 00:1b:21:c3:d0:3c  txqueuelen 1000  (Ethernet)
        RX packets 1885137  bytes 1288877814 (1.2 GiB)
        RX errors 0  dropped 468  overruns 0  frame 0
        TX packets 2445247  bytes 3114882646 (2.9 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
# lshw -c network -businfo
Bus info          Device      Class          Description
========================================================
pci@0000:05:00.0  enp5s0f0    network        82599ES 10-Gigabit SFI/SFP+ Network


(2) Start another host named HOST_B  with BCM57810 PF

The IP address of BCM57810 PF: 10.73.33.244
The PCI address of BCM57810 PF:0000:82:00.0
The Interface name of BCM57810 PF: enp130s0f0
The mac address of BCM57810 PF: 00:0a:f7:05:82:c0


# ifconfig
...
enp130s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.244  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::20a:f7ff:fe05:82c0  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:4920:20a:f7ff:fe05:82c0  prefixlen 64  scopeid 0x0<global>
        ether 00:0a:f7:05:82:c0  txqueuelen 1000  (Ethernet)
        RX packets 5449  bytes 853348 (833.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1537  bytes 172913 (168.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 58  memory 0xc9000000-c97fffff
# lshw -c network -businfo
Bus info          Device        Class          Description
==========================================================
pci@0000:82:00.0  enp130s0f0    network        NetXtreme II BCM57810 10 Gigabit 


(3) Check if two hosts can communicate with each other
HOST_A(10.73.33.142) can ping  HOST_B(10.73.33.244) successfully
HOST_B(10.73.33.244) can ping HOST_A(10.73.33.142) successfully


(4) Create a VF based on 82599ES PF/BCM57810 PF

BCM57810 PF:
# echo 1 > /sys/bus/pci/devices/0000\:82\:00.0/sriov_numvfs
# ip link set enp130s0f0 vf 0  mac 22:2b:62:bb:a9:82
# ip link show enp130s0f0
3: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:f7:05:82:c0 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff, tx rate 10000 (Mbps), max_tx_rate 10000Mbps, spoof checking on, link-state auto

82599ES PF:
# echo 1 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs
# ip link show enp5s0f0
9: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:c3:d0:3c brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether ce:6e:d6:ae:c9:93 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off


(5) Run the tool provided in the link https://github.com/jensfr/netfailover_driver_detect


> The Interface name of 82599ES PF : enp5s0f0
> The mac address of 82599ES PF : 00:1b:21:c3:d0:3c

> The Interface name of BCM57810 PF: enp130s0f0
> The mac address of BCM57810 PF: 00:0a:f7:05:82:c0

On HOST_A:
#  ./is_legacy -d enp5s0f0 -t 20 

On HOST_B:
#  ./send_packet -d enp130s0f0 -A 00:0a:f7:05:82:c0 -B 00:1b:21:c3:d0:3c 




(6) Check the output after running the tool:

On HOST_B:
#  ./send_packet -d enp130s0f0 -A 00:0a:f7:05:82:c0 -B 00:1b:21:c3:d0:3c
..........

On HOST_A:
#  ./is_legacy -d enp5s0f0 -t 20
timed out



As shown in step 6, I was not able to run this tool successfully.

Juan, could you please help check my test steps provided above?

Feel free to tell me if I have not used this tool correctly or have missed any operations.

Thanks for your help in advance.

Comment 13 Juan Quintela 2021-01-04 20:00:43 UTC
Hi Yanghang

Just back from holidays, will try to "explain" your failure.

Later, Juan.

Comment 14 Juan Quintela 2021-03-11 07:20:45 UTC
Hi

Can I get access to the two machines that show this bug?

Comment 15 Juan Quintela 2021-03-11 07:32:15 UTC
Once there, some more questions:

- Can you send me the whole network configuration of both the source and destination hosts?
  Specifically, where is the virtio-net device communicating vs the SR-IOV one?
- What networking are you really using? As far as I can understand,
  you are mixing an Intel card on one side and a Broadcom card on the other,
  and comment 11 seems to indicate that you are trying to migrate from Intel to Broadcom.
- On the source alone, without doing a migration: if you unplug the SR-IOV network card from the source, does ping continue working? (It appears that it does not, so this points to a network configuration issue.)

Comment 17 Yanghang Liu 2021-03-16 04:04:52 UTC
(In reply to Juan Quintela from comment #15)


In order to rule out problems caused by different/incorrect network configurations on the qemu side,
I have tried to use libvirt to reproduce this problem:


Test env:
source host:dell-per730-27.lab.eng.pek2.redhat.com
target host:dell-per730-28.lab.eng.pek2.redhat.com



The test steps:
1. Create a bridge based on a PF:

# lspci -v -s 06:00.0  
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
...
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)

# nmcli connection add type bridge ifname br0 con-name br0 stp off autoconnect yes
# nmcli connection add type bridge-slave ifname enp6s0f0 con-name enp6s0f0 master br0 autoconnect yes
# systemctl restart NetworkManager


note:
We need to create the bridge on both the source host and the target host.
The PF on the source host and the target host can be different.
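
(A quick sanity check that the PF actually joined the bridge; a minimal sketch:)
# ip link show master br0       <--- enp6s0f0 should be listed as a bridge port
# nmcli connection show --active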


2. Set up a bridge network for the VM based on the bridge:

# vim failover_bridge_network.xml 
<network> 
  <name>failover-bridge</name> 
  <forward mode='bridge'/> 
  <bridge name='br0'/> 
</network> 

# virsh net-define failover_bridge_network.xml
Network failover-bridge defined from failover_bridge_network.xml

# virsh net-autostart failover-bridge
Network failover-bridge marked as autostarted

# virsh net-start failover-bridge
Network failover-bridge started

# virsh net-list 
 Name                  State    Autostart   Persistent
-------------------------------------------------------------
...
 failover-bridge       active   yes             yes



note:
We need to create the bridge network on both the source host and the target host.
Make sure the bridge network name is the same on the source host and the target host.
Make sure the bridge network is active when doing the failover VF migration.


3. Create a VF:
# echo 1 > /sys/bus/pci/devices/0000\:06\:00.0/sriov_numvfs


note:
We need to create the VF on both the source host and the target host.
The PF on the source host and the target host can be different.


4. Set up a hostdev network based on the PF:
# vim failover_vf_network.xml 
<network>
  <name>failover-vf</name>
  <forward mode='hostdev' managed='yes'>
    <pf dev='enp6s0f0'/>
  </forward>
</network>

# virsh net-define failover_vf_network.xml
Network failover-vf defined from failover_vf_network.xml

# virsh net-autostart failover-vf  
Network failover-vf marked as autostarted

# virsh net-start failover-vf 
Network failover-vf started

# virsh net-list 
 Name                  State    Autostart   Persistent
-------------------------------------------------------------
...
 failover-vf       active   yes             yes


note:
We need to create the hostdev network on both the source host and the target host.
Make sure the hostdev network name is the same on the source host and the target host.
Make sure the hostdev network is active when doing the failover VF migration.


5. Use virt-install or virt-manager to install a domain.

The following is an example virt-install command line:
# virt-install --machine=q35 --noreboot --name=Bug1789206  --memory=4096 --vcpus=4 --disk path=/nfs/images/RHEL84.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20 --graphics type=vnc,port=5900,listen=0.0.0.0  --network none  -l http://download.eng.pek2.redhat.com/nightly/rhel-8/RHEL-8/latest-RHEL-8.4.0/compose/BaseOS/x86_64/os/

note:
We only need to create the domain on the source host.
Make sure there is no domain with the same name on the target host.
Make sure both the source host and the target host can access the same NFS shared dir.


6. Add the failover VF and the failover virtio-net device to the domain.

failover virtio-net device xml:
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='failover-bridge'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-test'/>
    </interface>

failover vf xml:
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='failover-vf'/>
      <teaming type='transient' persistent='ua-test'/>
    </interface>

#  virsh edit failover_vm
Add the failover virtio-net device XML and the failover VF XML to the domain XML.


note:
Make sure the MAC addresses of the two interfaces are the same.
Make sure the model of the bridge-type interface is virtio.
Make sure the 'persistent' attribute of the hostdev interface points to the persistent bridge interface.
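
(As an alternative to virsh edit, the same two XML snippets can be attached from files; a minimal sketch, assuming they are saved as failover_virtio.xml and failover_vf.xml, which are hypothetical file names:)
# virsh attach-device Bug1789206 failover_virtio.xml --config
# virsh attach-device Bug1789206 failover_vf.xml --config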


7. Start the domain:
#  virsh start Bug1789206

# ps -ef | grep -i qemu-kvm

-netdev tap,fd=39,id=hostua-test,vhost=on,vhostfd=40 
-device virtio-net-pci,failover=on,netdev=hostua-test,id=ua-test,mac=52:54:00:aa:1c:ef,bus=pci.1,addr=0x0
-device vfio-pci,host=0000:06:10.0,id=hostdev0,bus=pci.7,addr=0x0,failover_pair_id=ua-test

8. Check the failover device status in the VM:

# ifconfig 
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.134  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 2620:52:0:4920:4f2a:c7ed:de11:b8db  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::85b0:7444:2539:538f  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 267  bytes 37547 (36.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 117  bytes 18422 (17.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp1s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::df09:886b:dce5:a748  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 179  bytes 15531 (15.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 30  bytes 5920 (5.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.33.134  netmask 255.255.254.0  broadcast 10.73.33.255
        inet6 fe80::aca9:c6bc:3119:7818  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 88  bytes 22016 (21.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 87  bytes 12502 (12.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# dmesg | grep -i failover
[    7.679714] virtio_net virtio0 eth0: failover master:eth0 registered
[    7.685373] virtio_net virtio0 eth0: failover standby slave:eth1 registered
[   13.482718] virtio_net virtio0 enp1s0: failover primary slave:eth0 registered


9. Ping the virtual machine from a third host:
# ping 10.73.33.134

10. Migrate the VM from the source host to the target host:
# virsh migrate  Bug1789206 --live --verbose qemu+ssh://10.73.73.73/system

11. Check the ping status:
...
64 bytes from 10.73.33.188: icmp_seq=63 ttl=61 time=52.1 ms
64 bytes from 10.73.33.188: icmp_seq=64 ttl=61 time=55.1 ms 
64 bytes from 10.73.33.188: icmp_seq=75 ttl=61 time=52.8 ms <--- when the migration is completed, ping works again
64 bytes from 10.73.33.188: icmp_seq=76 ttl=61 time=51.4 ms
64 bytes from 10.73.33.188: icmp_seq=77 ttl=61 time=49.8 ms
64 bytes from 10.73.33.188: icmp_seq=78 ttl=61 time=50.9 ms
64 bytes from 10.73.33.188: icmp_seq=79 ttl=61 time=50.0 ms
64 bytes from 10.73.33.188: icmp_seq=80 ttl=61 time=79.0 ms

Comment 18 Yanghang Liu 2021-03-16 08:03:00 UTC
Hi Juan, 

Could we move the "Internal Target Release" of this bug to RHEL 8.5?

It seems that we need more time to confirm this ping problem, so this BZ cannot be fixed in RHEL 8.4.

Comment 20 Juan Quintela 2021-03-16 09:50:31 UTC
Hi Yanghang

your setup seems correct. Nothing looks obviously wrong to me.
I am going to access your machines and see what is going on.

Thanks for doing the setup.

Later, Juan.

Comment 21 Juan Quintela 2021-03-16 10:35:28 UTC
hi

Just a heads-up.

I am inside the machine now.
How do you configure them?

I see this:

[root@bootp-73-33-188 network-scripts]# ip route
default via 10.73.33.254 dev enp1s0 proto dhcp metric 100 
default via 10.73.33.254 dev enp1s0nsby proto dhcp metric 101 
10.73.32.0/23 dev enp1s0 proto kernel scope link src 10.73.33.188 metric 100 
10.73.32.0/23 dev enp1s0nsby proto kernel scope link src 10.73.33.188 metric 101

And that configuration is wrong.

2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:aa:1c:ef brd ff:ff:ff:ff:ff:ff
    inet 10.73.33.188/23 brd 10.73.33.255 scope global dynamic noprefixroute enp1s0
       valid_lft 41265sec preferred_lft 41265sec
    inet6 2620:52:0:4920:fe22:38bb:73a4:c9a5/64 scope global dynamic noprefixroute 
       valid_lft 2591514sec preferred_lft 604314sec
    inet6 fe80::6b98:4a67:ddc:66d/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: enp1s0nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master enp1s0 state UP group default qlen 1000
    link/ether 52:54:00:aa:1c:ef brd ff:ff:ff:ff:ff:ff
    inet 10.73.33.188/23 brd 10.73.33.255 scope global dynamic noprefixroute enp1s0nsby
       valid_lft 41265sec preferred_lft 41265sec
    inet6 fe80::f306:608:a04f:fd6f/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

Both the bonding (failover) device (enp1s0) and the virtio standby device (enp1s0nsby) have an assigned address (the same one).
Only enp1s0 should have it assigned.
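
(A minimal sketch of forcing the intended layout inside the guest, assuming NetworkManager is what hands the standby its address; marking the standby unmanaged is one option, not a confirmed fix:)
# nmcli device set enp1s0nsby managed no       <--- NetworkManager stops configuring the standby
# ip addr flush dev enp1s0nsby                 <--- drop the duplicate address from the standby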

I am trying to tame the NetworkManager configuration, but the problem is there.

(Yes, in the current configuration the VF device is no longer assigned, but that is how I found the machine.) I am planning to play with this machine until I can understand why it is behaving this way, as I don't know where the misconfiguration is either.

My initial conclusion is that the problem is a configuration problem, not a failover problem, but I can't yet explain where the misconfiguration is.

Later, Juan.

Comment 23 Juan Quintela 2021-06-15 10:58:18 UTC
Hi

I think this should already be fixed by the backport of fixes that Laurent did.

Could you re-try, and close it if it works?

Later, Juan.

Comment 24 Laurent Vivier 2021-06-18 14:13:59 UTC
Could you re-test with RHEL-AV-8.5.0 to see if the problem has been fixed by the rebase?

Thanks

Comment 25 Yanghang Liu 2021-06-23 06:54:14 UTC
Hi 

It seems that this problem still exists in the following test environment:

Test Version:
host:
4.18.0-316.el8.x86_64
qemu-kvm-6.0.0-20.module+el8.5.0+11499+199527ef.x86_64
guest:
4.18.0-314.el8.x86_64

Test Device:
on the source host: 82599ES
on the target host: BCM57810


Related log:
# ping 10.73.33.153
PING 10.73.33.153 (10.73.33.153) 56(84) bytes of data.
64 bytes from 10.73.33.153: icmp_seq=1 ttl=58 time=0.776 ms
64 bytes from 10.73.33.153: icmp_seq=2 ttl=58 time=0.805 ms
64 bytes from 10.73.33.153: icmp_seq=3 ttl=58 time=0.810 ms
64 bytes from 10.73.33.153: icmp_seq=4 ttl=58 time=0.775 ms
64 bytes from 10.73.33.153: icmp_seq=5 ttl=58 time=0.808 ms
64 bytes from 10.73.33.153: icmp_seq=6 ttl=58 time=0.803 ms
64 bytes from 10.73.33.153: icmp_seq=7 ttl=58 time=0.803 ms
64 bytes from 10.73.33.153: icmp_seq=8 ttl=58 time=0.837 ms
64 bytes from 10.73.33.153: icmp_seq=9 ttl=58 time=0.799 ms
64 bytes from 10.73.33.153: icmp_seq=10 ttl=58 time=0.793 ms
64 bytes from 10.73.33.153: icmp_seq=11 ttl=58 time=0.789 ms
64 bytes from 10.73.33.153: icmp_seq=12 ttl=58 time=0.796 ms
64 bytes from 10.73.33.153: icmp_seq=13 ttl=58 time=1.02 ms
64 bytes from 10.73.33.153: icmp_seq=14 ttl=58 time=0.793 ms
64 bytes from 10.73.33.153: icmp_seq=15 ttl=58 time=0.822 ms
64 bytes from 10.73.33.153: icmp_seq=16 ttl=58 time=0.791 ms
64 bytes from 10.73.33.153: icmp_seq=17 ttl=58 time=0.832 ms <--- when "virtio_net virtio2 enp4s0: failover primary slave:enp5s0 unregistered" appears in the source guest's dmesg, ping stops working until the migration is completed
64 bytes from 10.73.33.153: icmp_seq=29 ttl=58 time=0.889 ms <--- when the migration is completed, ping works again
64 bytes from 10.73.33.153: icmp_seq=30 ttl=58 time=0.794 ms
64 bytes from 10.73.33.153: icmp_seq=31 ttl=58 time=0.796 ms
^C
--- 10.73.33.153 ping statistics ---
31 packets transmitted, 20 received, 35.4839% packet loss, time 747ms
rtt min/avg/max/mdev = 0.775/0.816/1.019/0.059 ms

Comment 26 RHEL Program Management 2021-07-09 07:30:24 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 27 Yanghang Liu 2021-07-22 03:04:24 UTC
Hi Laurent,


Do you plan to fix this bug ?

If yes, could we re-open this bug ?

Comment 28 Laurent Vivier 2021-07-22 08:57:02 UTC
(In reply to Yanghang Liu from comment #27)
> Hi Laurent,
> 
> 
> Do you plan to fix this bug ?
> 
> If yes, could we re-open this bug ?

Yes, it needs at least further analysis

Comment 30 yalzhang@redhat.com 2021-08-11 05:48:02 UTC
Hi Laurent, do you think the scenario can be simplified to "hot-unplug the hostdev interface, and check the network status on the VM with the backup bridge-type interface"? I remember encountering this issue when testing unplug on our 82599ES system when the feature was introduced a long time ago, but it seems fixed now: ping keeps working after the hostdev interface is unregistered. Check the steps below:
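
(A minimal sketch of that simplified hot-unplug, assuming the hostdev interface XML from comment 17 is saved as vf.xml, a hypothetical file name, and DOMAIN is the guest's name:)
# virsh detach-device DOMAIN vf.xml --live
Then watch the guest's dmesg for the "failover primary slave ... unregistered" message and check whether ping keeps working.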

[on host]
# rpm -q libvirt qemu-kvm
libvirt-7.6.0-1.module+el8.5.0+12097+2c77910b.x86_64
qemu-kvm-6.0.0-26.module+el8.5.0+12044+525f0ebc.x86_64
# lspci | grep Eth
...
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

[on guest]
[root@vm-179-142 ~]# ifconfig -a
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.179.142  netmask 255.255.254.0  broadcast 10.73.179.255
        inet6 fe80::8e2a:7cb5:c796:3c38  prefixlen 64  scopeid 0x20<link>
        inet6 2620:52:0:49b2:b1f:5bf7:68f9:4de0  prefixlen 64  scopeid 0x0<global>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 8859  bytes 573031 (559.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 264  bytes 37312 (36.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 4523  bytes 285469 (278.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 47  bytes 10190 (9.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp5s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.179.142  netmask 255.255.254.0  broadcast 10.73.179.255
        inet6 fe80::d9bc:7e3e:4179:6e03  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 4336  bytes 287562 (280.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 217  bytes 27122 (26.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
...
(keep pinging from the guest, then hot-unplug the hostdev interface and check the ping status)
[root@vm-179-142 ~]# ping www.baidu.com
PING www.a.shifen.com (182.61.200.7) 56(84) bytes of data.
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=1 ttl=48 time=3.79 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=2 ttl=48 time=6.53 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=3 ttl=48 time=6.17 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=4 ttl=48 time=3.15 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=5 ttl=48 time=3.13 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=6 ttl=48 time=3.22 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=7 ttl=48 time=3.12 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=8 ttl=48 time=3.21 ms
[  369.112558] pcieport 0000:00:02.4: pciehp: Slot(0-4): Attention button pressed
[  369.113922] pcieport 0000:00:02.4: pciehp: Slot(0-4): Powering off due to button press
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=9 ttl=48 time=5.32 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=10 ttl=48 time=3.09 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=11 ttl=48 time=3.13 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=12 ttl=48 time=13.7 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=13 ttl=48 time=4.61 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=14 ttl=48 time=3.39 ms
[  374.642245] virtio_net virtio2 enp4s0: failover primary slave:enp5s0 unregistered
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=15 ttl=48 time=3.33 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=16 ttl=48 time=4.24 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=17 ttl=48 time=3.20 ms
....

--- www.a.shifen.com ping statistics ---
123 packets transmitted, 123 received, 0% packet loss, time 128408ms
rtt min/avg/max/mdev = 3.087/3.731/17.630/1.852 ms

I will test migration once I get another system.

Comment 31 Laurent Vivier 2021-08-16 06:38:24 UTC
(In reply to yalzhang from comment #30)
> Hi Laurent, do you think the scenario can be simplified to "hotunplug the
> hostdev interface, and check the network status on the vm with the backup
> bridge type interface"? I remembered I have encountered this issue when test

Yes, I think...

> unplug on our 82599ES system when the feature was introduced long time ago.
> but it seems fixed now, the ping can keep working after the hostdev
> interface unregistered, check the steps below:

I can see this problem when the host configuration doesn't disable "spoof checking".
(Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+)

In fact, it depends on whether the card has an internal switch or relies on an external switch for the VF.

Some configurations don't allow several NICs to have the same MAC (the virtio-net one and the VFIO one).
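
(If spoof checking is the suspect, it can be disabled per VF on the host; a minimal sketch, using the PF name from the host listing in comment 39:)
# ip link set dev enp59s0f1 vf 0 spoofchk off
# ip link show enp59s0f1       <--- the "vf 0" line should now report "spoof checking off"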

Comment 32 John Ferlan 2021-09-08 21:21:21 UTC
Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.  Removed the ITR from all bugs as part of the change.

Comment 34 RHEL Program Management 2022-01-22 07:27:04 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 37 Juan Quintela 2022-02-08 18:35:23 UTC
Can you show your network configuration:

ifconfig -a inside the guest

and ip link

on the host. In comment 21 I pointed out that the configuration shown in comment 18 is wrong: you can't have both the vfio and the virtio devices with an IP address. Only the "failover (bonding)" device should have an IP configured.

Later, Juan.

Comment 38 Yanhui Ma 2022-02-17 06:24:51 UTC
I tried to reproduce the bug on the following network cards and machines with RHEL 9, but I didn't reproduce it.

[root@dell-per440-25 home]# lspci -s 0000:3b:00.1
3b:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

[root@dell-per730-28 tests]# lspci -s 0000:06:00.1
06:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

[root@dell-per440-22 ~]# lspci -s 0000:3b:00.1
3b:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

[root@dell-per440-25 home]# rpm -q qemu-kvm
qemu-kvm-6.2.0-8.el9.x86_64
[root@dell-per440-25 home]# uname -r
5.14.0-57.kpq0.el9.x86_64
[root@dell-per440-25 home]# rpm -q libvirt
libvirt-8.0.0-3.el9.x86_64


Hot-unplug the VF while keeping a ping to the VM running from before the hot-unplug:
# virsh qemu-monitor-command rhel90 '{"execute":"device_del","arguments":{"id":"hostdev0"}}'

[root@vm-74-194 ~]# ifconfig 
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.74.194  netmask 255.255.252.0  broadcast 10.73.75.255
        inet6 2620:52:0:4948:f68e:38ff:fec3:9090  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::f68e:38ff:fec3:9090  prefixlen 64  scopeid 0x20<link>
        ether f4:8e:38:c3:90:90  txqueuelen 1000  (Ethernet)
        RX packets 174477  bytes 15284807 (14.5 MiB)
        RX errors 0  dropped 4922  overruns 0  frame 0
        TX packets 1299  bytes 109928 (107.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.200.132  netmask 255.255.255.0  broadcast 192.168.200.255
        inet6 fe80::7629:599b:a503:e9df  prefixlen 64  scopeid 0x20<link>
        inet6 2001::ca9a:3558:8328:e3e0  prefixlen 64  scopeid 0x0<global>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 22650  bytes 2265604 (2.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3374  bytes 328154 (320.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.200.132  netmask 255.255.255.0  broadcast 192.168.200.255
        inet6 fe80::17be:e17c:345e:a239  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 22754  bytes 2276133 (2.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3325  bytes 325312 (317.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ping works well both before and after the hot-unplug.
# ping 192.168.200.132
PING 192.168.200.132 (192.168.200.132) 56(84) bytes of data.
64 bytes from 192.168.200.132: icmp_seq=1 ttl=64 time=0.229 ms
64 bytes from 192.168.200.132: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.200.132: icmp_seq=3 ttl=64 time=0.136 ms
64 bytes from 192.168.200.132: icmp_seq=4 ttl=64 time=0.205 ms
64 bytes from 192.168.200.132: icmp_seq=5 ttl=64 time=0.203 ms
64 bytes from 192.168.200.132: icmp_seq=6 ttl=64 time=0.203 ms
64 bytes from 192.168.200.132: icmp_seq=7 ttl=64 time=0.199 ms
64 bytes from 192.168.200.132: icmp_seq=8 ttl=64 time=0.202 ms
64 bytes from 192.168.200.132: icmp_seq=9 ttl=64 time=0.204 ms

Comment 39 Yanhui Ma 2022-03-01 05:47:33 UTC
(In reply to Juan Quintela from comment #37)
> Can you show your network configuration:
> 
> ifconfig -a inside the guest

Finally, I was able to reproduce the issue with live migration.

[root@localhost ~]# ping 192.168.200.79
PING 192.168.200.79 (192.168.200.79) 56(84) bytes of data.
64 bytes from 192.168.200.79: icmp_seq=1 ttl=64 time=0.105 ms
64 bytes from 192.168.200.79: icmp_seq=2 ttl=64 time=0.112 ms
64 bytes from 192.168.200.79: icmp_seq=3 ttl=64 time=0.091 ms
64 bytes from 192.168.200.79: icmp_seq=4 ttl=64 time=0.092 ms
[  396.306453] virtio_net virtio2 enp4s0: failover primary slave:enp5s0 unregistered
From 192.168.200.132 icmp_seq=150 Destination Host Unreachable
From 192.168.200.132 icmp_seq=151 Destination Host Unreachable
From 192.168.200.132 icmp_seq=152 Destination Host Unreachable


Here is ifconfig -a in the guest:
[root@localhost ~]# ifconfig -a
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.200.132  netmask 255.255.255.0  broadcast 192.168.200.255
        inet6 fe80::7629:599b:a503:e9df  prefixlen 64  scopeid 0x20<link>
        inet6 2001::ca9a:3558:8328:e3e0  prefixlen 64  scopeid 0x0<global>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 1162  bytes 115157 (112.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 535  bytes 45966 (44.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 1239  bytes 126915 (123.9 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 236  bytes 26624 (26.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp5s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.200.132  netmask 255.255.255.0  broadcast 192.168.200.255
        inet6 fe80::6564:75b3:1b28:8516  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:aa:1c:ef  txqueuelen 1000  (Ethernet)
        RX packets 210  bytes 23262 (22.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 195  bytes 14986 (14.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 186  bytes 20648 (20.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 186  bytes 20648 (20.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

> 
> and ip link
> 

Here is ip link on the host:
[root@dell-per440-25 ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master switch state UP mode DEFAULT group default qlen 1000
    link/ether f4:ee:08:0d:6e:bf brd ff:ff:ff:ff:ff:ff
    altname enp4s0f0
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether f4:ee:08:0d:6e:c0 brd ff:ff:ff:ff:ff:ff
    altname enp4s0f1
4: enp59s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:05:63:5e brd ff:ff:ff:ff:ff:ff
5: enp59s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:05:63:5f brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether e2:94:f9:9e:1e:f8 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
6: switch: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether f4:ee:08:0d:6e:bf brd ff:ff:ff:ff:ff:ff
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:05:63:5f brd ff:ff:ff:ff:ff:ff
8: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:4f:51:aa brd ff:ff:ff:ff:ff:ff
49: enp59s0f1v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether e2:94:f9:9e:1e:f8 brd ff:ff:ff:ff:ff:ff

> on the host.  On #comment 21 I addressed that configuration shown on comment
> 18 is wrong.  It can't be that you have both vfio and virtio devices with ip
> address.  Only the "failover(bonding)" device should have configured one IP.
> 

In our test, it seems we always have both the vfio and virtio devices with the same IP address.
> Later, Juan.

Comment 41 Juan Quintela 2022-05-31 17:28:33 UTC
Laurent Vivier is the maintainer of VF failover, so I am leaving this to him.
I still think that the network configuration is wrong, but I don't have hardware to check anymore.

Could you post the configuration (xml and qemu command line when running) of the guest?

Thanks, Juan.

Comment 43 RHEL Program Management 2022-08-02 07:28:09 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 52 RHEL Program Management 2023-02-02 07:27:43 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 55 RHEL Program Management 2023-08-02 07:28:19 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 57 RHEL Program Management 2023-09-22 16:11:38 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 58 RHEL Program Management 2023-09-22 16:14:28 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

