This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2091528 - the network in win2016/win2022 guest can't work after failover vf migration between MT2892 network cards
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: virtio-win
Version: 9.1
Hardware: x86_64
OS: Windows
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: ybendito
QA Contact: Yanhui Ma
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-30 08:36 UTC by Yanhui Ma
Modified: 2023-09-23 17:54 UTC (History)
CC List: 11 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-01 08:08:35 UTC
Type: ---
Target Upstream Version:
Embargoed:


Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHEL-925         0        None      None    None     2023-08-01 07:49:45 UTC
Red Hat Issue Tracker  RHELPLAN-123628  0        None      None    None     2023-08-01 08:08:34 UTC

Description Yanhui Ma 2022-05-30 08:36:03 UTC
Description of problem:

After live migration of a win2016/win2022 guest with a failover VF between src and dst hosts that both use MT2892 network cards, ping fails in the Windows guest and the "Network and Sharing Center" can't be opened. The IP address may also be lost after a while. The guest can't be rebooted either; attempting a reboot results in a black screen.

Version-Release number of selected component (if applicable):

# rpm -q qemu-kvm
qemu-kvm-7.0.0-3.el9.x86_64
# uname -r
5.14.0-92.el9.x86_64
host nic info:
Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

How reproducible:
100%

Steps to Reproduce:
1. Create a VF on both the src host and the dst host

echo 1 > /sys/bus/pci/devices/0000\:1a\:00.1/sriov_numvfs

2. Create the failover-vf and failover-bridge networks on both the src and dst hosts

# virsh net-dumpxml failover-bridge 
<network connections='1'>
  <name>failover-bridge</name>
  <uuid>1943a508-b0b7-4274-be5a-6f0143d10f40</uuid>
  <forward mode='bridge'/>
  <bridge name='br0'/>
</network>

# virsh net-dumpxml failover-vf
<network connections='1'>
  <name>failover-vf</name>
  <uuid>4319b666-8f4b-410a-886f-17b6df772224</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x1a' slot='0x08' function='0x2'/>
  </forward>
</network>

3. Boot the win2016/win2022 guest with the failover VF on the src host
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='failover-bridge'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-test'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='failover-vf'/>
      <teaming type='transient' persistent='ua-test'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </interface>

4. Live migrate the guest
5. After migration, check the network in the guest
6. Reboot the guest or scan for hardware changes via Device Manager in the guest
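
Steps 1 and 4 above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the report gives the sysfs write for step 1 but not the migration command, so the use of `virsh migrate`, the guest name `win2022`, and the destination host `dst.example.com` are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch of steps 1 and 4. The guest name, destination host and the choice
# of "virsh migrate" are assumptions; the report only shows the sysfs write.

# Step 1: enable N VFs on a physical function after checking the limit the
# kernel advertises in sriov_totalvfs.
enable_vfs() {
    pf_dir=$1   # e.g. /sys/bus/pci/devices/0000:1a:00.1
    want=$2
    total=$(cat "$pf_dir/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports $total" >&2
        return 1
    fi
    # The kernel refuses to change a non-zero VF count directly, so reset
    # to 0 first. Note that this removes any VFs that are currently in use.
    echo 0 > "$pf_dir/sriov_numvfs"
    echo "$want" > "$pf_dir/sriov_numvfs"
}

# Step 4: print (do not run) a live-migration command so it can be reviewed.
migrate_cmd() {
    guest=$1
    dst=$2
    echo "virsh migrate --live --verbose $guest qemu+ssh://$dst/system"
}

# Hypothetical guest and destination names.
migrate_cmd win2022 dst.example.com
```

On the hosts from this report, step 1 would correspond to `enable_vfs /sys/bus/pci/devices/0000:1a:00.1 1`, run as root on both src and dst.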


Actual results:

After step 5, ping fails and the "Network and Sharing Center" can't be opened; see the attachment.
# ipconfig /all
Windows IP Configuration

   Host Name . . . . . . . . . . . . : WIN-A1AR6C3G7HJ
   Primary Dns Suffix  . . . . . . . : 
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : lab.eng.pek2.redhat.com

Ethernet adapter Ethernet Instance 0 9:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . : lab.eng.pek2.redhat.com
   Description . . . . . . . . . . . : Red Hat VirtIO Ethernet Adapter #3
   Physical Address. . . . . . . . . : 52-54-00-AA-1C-EF
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Ethernet adapter Ethernet Instance 0 13:

   Connection-specific DNS Suffix  . : 
   Description . . . . . . . . . . . : ConnectX Family mlx5Gen Virtual Function #3
   Physical Address. . . . . . . . . : 52-54-00-AA-1C-EF
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::a5f1:2227:8917:709f%17(Preferred) 
   IPv4 Address. . . . . . . . . . . : 192.168.43.200(Preferred) 
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : Monday, May 30, 2022 3:58:52 AM
   Lease Expires . . . . . . . . . . : Monday, May 30, 2022 4:08:52 AM
   Default Gateway . . . . . . . . . : 192.168.43.2
   DHCP Server . . . . . . . . . . . : 192.168.43.6
   DHCPv6 IAID . . . . . . . . . . . : 853941549
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-29-3A-24-B9-9A-9B-AB-AE-4E-E3
   DNS Servers . . . . . . . . . . . : 192.168.43.2
   NetBIOS over Tcpip. . . . . . . . : Enabled

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : lab.eng.pek2.redhat.com
   Description . . . . . . . . . . . : Red Hat VirtIO Ethernet Adapter #2
   Physical Address. . . . . . . . . : 52-54-00-01-22-22
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2620:52:0:49d2:78cd:cab3:51da:bf42(Preferred) 
   Link-local IPv6 Address . . . . . : fe80::78cd:cab3:51da:bf42%23(Preferred) 
   IPv4 Address. . . . . . . . . . . : 10.73.211.223(Preferred) 
   Subnet Mask . . . . . . . . . . . : 255.255.254.0
   Lease Obtained. . . . . . . . . . : Monday, May 30, 2022 3:58:36 AM
   Lease Expires . . . . . . . . . . : Tuesday, May 31, 2022 3:58:35 AM
   Default Gateway . . . . . . . . . : fe80::52c7:903:533b:88e1%23
                                       10.73.211.254
   DHCP Server . . . . . . . . . . . : 10.73.2.108
   DHCPv6 IAID . . . . . . . . . . . : 122835968
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-29-3A-24-B9-9A-9B-AB-AE-4E-E3
   DNS Servers . . . . . . . . . . . : 10.73.2.107
                                       10.73.2.108
                                       10.66.127.10
   NetBIOS over Tcpip. . . . . . . . : Enabled


# ping 192.168.43.6
Pinging 192.168.43.6 with 32 bytes of data:
Reply from 192.168.43.200: Destination host unreachable.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 192.168.43.6:
    Packets: Sent = 4, Received = 1, Lost = 3 (75% loss)

# ping 192.168.43.101
Pinging 192.168.43.101 with 32 bytes of data:
Request timed out.
Request timed out.
Reply from 192.168.43.200: Destination host unreachable.
Reply from 192.168.43.200: Destination host unreachable.

Ping statistics for 192.168.43.101:
    Packets: Sent = 4, Received = 2, Lost = 2 (50% loss)
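
Note that the loss percentages printed above understate the failure: the "Destination host unreachable" lines are counted under "Received", although no echo reply came back from the target. A minimal sketch for counting only genuine replies in saved ping output follows; the function name `real_replies` is made up for this example, and it assumes the usual Windows reply format (`Reply from <ip>: bytes=...`).

```shell
#!/bin/sh
# Count only genuine echo replies in Windows ping output read from stdin.
# "Destination host unreachable" lines also start with "Reply from", but
# they carry no "bytes=" field, so they are not counted here.
real_replies() {
    awk '/Reply from .*: bytes=/ { n++ } END { print n + 0 }'
}

# Usage with the first ping run from this report:
real_replies <<'EOF'
Pinging 192.168.43.6 with 32 bytes of data:
Reply from 192.168.43.200: Destination host unreachable.
Request timed out.
Request timed out.
Request timed out.
EOF
# prints 0
```

By this count both ping runs above received no real replies.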

After step 6, the guest can't be rebooted; it shows a black screen.

Expected results:

After migration, pinging the src host IP and the dst host IP works.
Additional info:

RHEL9.1 guest doesn't have the issue.

Comment 2 Laurent Vivier 2022-05-31 07:26:46 UTC
Moving to the sst_virtualization_windows pool, as the problem occurs only with Windows guests.

Comment 3 Laurent Vivier 2022-05-31 15:13:50 UTC
Perhaps the problem is related to the one seen with BZ 2090712?

Comment 4 Yvugenfi@redhat.com 2022-06-23 09:36:14 UTC
(In reply to Laurent Vivier from comment #3)
> Perhaps the problem is related to the one seen with BZ 2090712?

Yes, the BZs are related.
The way failover works now, the installation of the protocol driver that facilitates the binding in the Windows guest needs to know the exact PNP ID of the card to bind to, and this list of IDs is part of the installation.
As a first stage we need to add additional NICs to the list, and open a new BZ to work on a generic mechanism for identifying the card that should be bound to the virtio-net device.
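
To compare a host-side VF with such a list, it can help to convert the PCI vendor:device pair shown by `lspci -nn` into the Windows hardware-ID form. The sketch below is an illustration only: the function name `to_hwid` is made up, and the input `15b3:101e` is an assumed example value (15b3 is the Mellanox vendor ID); the actual device ID of the VF should be read from `lspci -nn` on the host.

```shell
#!/bin/sh
# Convert a PCI "vendor:device" pair, as printed by `lspci -nn`, into the
# Windows hardware-ID prefix form PCI\VEN_xxxx&DEV_xxxx.
to_hwid() {
    vendor=$(printf '%s' "${1%%:*}" | tr 'a-f' 'A-F')
    device=$(printf '%s' "${1##*:}" | tr 'a-f' 'A-F')
    printf 'PCI\\VEN_%s&DEV_%s\n' "$vendor" "$device"
}

# Assumed example input, not taken from the report.
to_hwid 15b3:101e
```

The resulting string can then be checked against the hardware IDs that Device Manager shows for the VF in the guest.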

Comment 5 Yanhui Ma 2022-06-23 09:51:30 UTC
(In reply to Yvugenfi from comment #4)
> (In reply to Laurent Vivier from comment #3)
> > Perhaps the problem is related to the one seen with BZ 2090712?
> 
> Yes, the BZs are related.
> The way failover works now, protocol driver installation that is used to
> facilitate the binding in Windows guest needs to know exact PNP ID of the
> card to bind to. And this is a list that is part of the installation. 
> We need to add additional NICs to the list on the first stage and open new
> BZ to work on generic mechanism to identify the card that should binded to
> virtio-net device

Thanks for your explanation. Shall I assign the bug to you?

Comment 6 Yvugenfi@redhat.com 2022-06-25 07:55:59 UTC
(In reply to Yanhui Ma from comment #5)
> (In reply to Yvugenfi from comment #4)
> > (In reply to Laurent Vivier from comment #3)
> > > Perhaps the problem is related to the one seen with BZ 2090712?
> > 
> > Yes, the BZs are related.
> > The way failover works now, protocol driver installation that is used to
> > facilitate the binding in Windows guest needs to know exact PNP ID of the
> > card to bind to. And this is a list that is part of the installation. 
> > We need to add additional NICs to the list on the first stage and open new
> > BZ to work on generic mechanism to identify the card that should binded to
> > virtio-net device
> 
> Thanks for your explanation. Shall I assign the bug to you?

Assigning to Yuri, he is a feature owner.

Comment 11 Yanhui Ma 2023-03-24 04:37:27 UTC
Failover VF migration is only supported in RHV, where it is a Technology Preview. It is not supported in OSP or CNV, so I am setting the priority to medium.
If anything is wrong, please correct me.

Comment 12 ybendito 2023-07-13 09:56:12 UTC
Should be fixed in build 239 https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=53700760

Comment 13 Yanhui Ma 2023-07-18 07:28:35 UTC
Hi Yuri,

It seems I can still reproduce the bug with the following package versions:
qemu-kvm-7.2.0-14.el9_2.x86_64
virtio-win driver:
100.93.104.23900

After migration, ping fails, the "Network and Sharing Center" can't be opened, and the failover VF device is disabled and can't be enabled. Please see the attachment.


C:\Windows\system32>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : WIN-5UFQ492T212
   Primary Dns Suffix  . . . . . . . : 
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : lab.eng.pek2.redhat.com

Ethernet adapter Ethernet 30:

   Connection-specific DNS Suffix  . : 
   Description . . . . . . . . . . . : ConnectX Family mlx5Gen Virtual Function
   Physical Address. . . . . . . . . : 52-54-00-AA-1C-EF
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::3893:79f2:7b37:e13%41(Preferred) 
   IPv4 Address. . . . . . . . . . . : 192.168.43.200(Preferred) 
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : Tuesday, July 18, 2023 5:57:22 AM
   Lease Expires . . . . . . . . . . : Tuesday, July 18, 2023 6:07:21 AM
   Default Gateway . . . . . . . . . : 192.168.43.2
   DHCP Server . . . . . . . . . . . : 192.168.43.6
   DHCPv6 IAID . . . . . . . . . . . : 693261312
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-2C-45-FB-94-9A-E9-2D-4B-32-11
   DNS Servers . . . . . . . . . . . : 192.168.43.2
   NetBIOS over Tcpip. . . . . . . . : Enabled

Ethernet adapter Ethernet 28:

   Connection-specific DNS Suffix  . : lab.eng.pek2.redhat.com
   Description . . . . . . . . . . . : Red Hat VirtIO Ethernet Adapter #28
   Physical Address. . . . . . . . . : 52-54-00-01-22-22
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2620:52:0:49d2:f895:47e1:a96a:6b53(Preferred) 
   Link-local IPv6 Address . . . . . : fe80::f895:47e1:a96a:6b53%18(Preferred) 
   IPv4 Address. . . . . . . . . . . : 10.73.210.159(Preferred) 
   Subnet Mask . . . . . . . . . . . : 255.255.254.0
   Lease Obtained. . . . . . . . . . : Tuesday, July 18, 2023 5:51:48 AM
   Lease Expires . . . . . . . . . . : Tuesday, July 18, 2023 5:51:48 PM
   Default Gateway . . . . . . . . . : fe80::52c7:903:533b:88e1%18
                                       10.73.211.254
   DHCP Server . . . . . . . . . . . : 10.73.2.108
   DHCPv6 IAID . . . . . . . . . . . : 559043584
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-2C-45-FB-94-9A-E9-2D-4B-32-11
   DNS Servers . . . . . . . . . . . : 10.72.17.5
                                       10.68.5.26
   NetBIOS over Tcpip. . . . . . . . : Enabled

Ethernet adapter Ethernet 29:

   Connection-specific DNS Suffix  . : 
   Description . . . . . . . . . . . : Red Hat VirtIO Ethernet Adapter #29
   Physical Address. . . . . . . . . : 52-54-00-AA-1C-EF
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::d1da:916:56dd:5440%29(Preferred) 
   Autoconfiguration IPv4 Address. . : 169.254.84.64(Preferred) 
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   Default Gateway . . . . . . . . . : 
   DHCPv6 IAID . . . . . . . . . . . : 491934720
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-2C-45-FB-94-9A-E9-2D-4B-32-11
   DNS Servers . . . . . . . . . . . : fec0:0:0:ffff::1%1
                                       fec0:0:0:ffff::2%1
                                       fec0:0:0:ffff::3%1
   NetBIOS over Tcpip. . . . . . . . : Enabled

C:\Windows\system32>ping 192.168.43.6

Pinging 192.168.43.6 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 192.168.43.6:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

Comment 19 Yanhui Ma 2023-07-21 09:11:08 UTC
Here is a bug that may be related to 'ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'; it was fixed in qemu-kvm-8.0.0-7.el9. For failover VF migration, however, there is a qemu-kvm crash bug in qemu-kvm-8.0.0-7.el9.
https://issues.redhat.com/browse/RHEL-832


Bug 2128929 - [rhel9.2] hotplug/hotunplug mlx vdpa device to the occupied addr port, then qemu core dump occurs after shutdown guest

Comment 20 ybendito 2023-07-21 11:13:33 UTC
(In reply to Yanhui Ma from comment #19)
> Here is a bug maybe related with
> 'ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off', and the bug was fixed
> in qemu-kvm-8.0.0-7.el9. But for failover vf migration, there is one
> qemu-kvm crash bug in qemu-kvm-8.0.0-7.el9.
> https://issues.redhat.com/browse/RHEL-832
> 
> 
> Bug 2128929 - [rhel9.2] hotplug/hotunplug mlx vdpa device to the occupied
> addr port, then qemu core dump occurs after shutdown guest

I think BZ https://bugzilla.redhat.com/show_bug.cgi?id=2128929 is not related to the failover problem.
The mentioned BZ is about a _plug_ problem into a wrong/occupied address.
_Our_ problem is about the _unplug_ of the VF during migration with failover.
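
One way to watch the unplug side on the source host is to filter the QEMU monitor events for the failover-related ones. This is a sketch under assumptions, not something used in this report: it relies on the QMP events `UNPLUG_PRIMARY` and `FAILOVER_NEGOTIATED` and on `virsh qemu-monitor-event`, and the guest name `win2022` is a placeholder.

```shell
#!/bin/sh
# Keep only failover-related QMP events from a monitor event log on stdin.
# "|| true" keeps the exit status at 0 when no event matches.
failover_events() {
    grep -E 'UNPLUG_PRIMARY|FAILOVER_NEGOTIATED' || true
}

# Possible usage on the source host during migration (guest name is
# hypothetical):
#   virsh qemu-monitor-event win2022 --loop | failover_events
```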

