Bug 1830754

Summary: [RFE][netkvm] Add support for backup device of SRIOV VF
Product: Red Hat Enterprise Linux 8
Reporter: ybendito
Component: virtio-win
Assignee: ybendito
virtio-win sub component: virtio-win-prewhql
QA Contact: Yanhui Ma <yama>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: ailan, chayang, coli, lijin, lmiksik, mdean, phou, vrozenfe, wyu, xiagao, yanghliu, yvugenfi, zhguo
Version: 8.3
Keywords: FutureFeature
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2021-02-16 14:24:38 UTC
Type: Feature Request
Bug Blocks: 1865778
Attachments:
Driver fix 1 (flags: none)
Driver fix 2 (flags: none)

Description ybendito 2020-05-03 16:59:17 UTC
Feature request: virtio-net acts as backup device for SRIOV VF in Windows

Comment 21 xiagao 2020-08-17 07:16:14 UTC
Hi Yuri,
For netkvm SRIOV, there are new files in virtio-win-prewhql-189: vioprot.inf and vioprot.cat.
I'd like to confirm two things with you.

1. Do we need to do WHQL certification against these drivers?
2. How are these two files used?

Thanks,
Xiaoling

Comment 22 xiagao 2020-08-26 07:28:46 UTC
Test version: virtio-win-prewhql-189
The netkvm WHQL tests and functional tests all passed.

So I am changing the status to verified.

Comment 24 ybendito 2020-09-08 07:02:53 UTC
The ISO layout looks OK

Comment 27 Peixiu Hou 2020-10-16 02:31:15 UTC
(In reply to ybendito from comment #0)
> Feature request: virtio-net acts as backup device for SRIOV VF in Windows

Hi Yuri,

About this new feature, I have the following questions:

1. For comment#0 "virtio-net acts as backup device for SRIOV VF in Windows", how do I test it in a Windows guest?

I tried booting a guest with a virtio-net-pci device and a vfio-pci (SRIOV VF) device, tested on a win10 guest.
The QEMU command line was:
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:84,bus=root.3,failover=on \
-device vfio-pci,host=0000:04:0a.1,id=hostdev1,bus=root.4,failover_pair_id=net0 \

And in the guest, the "netcfg -v -l vioprot.inf -c p -i VIOPROT" command ran successfully, using the virtio-win-prewhql-189 version.

Checking the network adapters, both can obtain an IP address and can ping an external IP address.

But what needs to be done next (if my steps above are correct)? Sorry, I don't know how to verify that virtio-net acts as a backup device for the SRIOV VF in Windows.

2. What is the function of vioprot? Does it work together with the netkvm driver?

3. For this new feature, which test points need to be covered in my future tests?

Thanks a lot~
Peixiu

Comment 28 Peixiu Hou 2020-10-16 02:44:56 UTC
Tested with Intel Corporation Ethernet Controller XL710 device:

[root@dell-per730-28 ~]# lspci | grep -i xl710
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

[root@dell-per730-28 ~]# echo 2 > /sys/bus/pci/devices/0000\:04\:00.1/sriov_numvfs

[root@dell-per730-28 ~]# lspci | grep 04:
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
04:0a.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
04:0a.1 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)

Used versions:
kernel-4.18.0-240.el8.x86_64
qemu-kvm-5.1.0-12.module+el8.3.0+8338+cbcb1a4b.x86_64
virtio-win-prewhql-189
seabios-bin-1.13.0-2.module+el8.3.0+7353+9de0a3cc.noarch

qemu cli:
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:84,bus=root.3,failover=on \
-device vfio-pci,host=0000:04:0a.1,id=hostdev1,bus=root.4,failover_pair_id=net0 \

Network adapters shown in guest:
   Intel(R) XL710/X710 Virtual Function
   Red Hat VirtIO Ethernet Adapter #3
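
The failover pairing in the command line above can be sketched as a small shell helper. This is only an illustration, not part of the original report: the function name failover_args is hypothetical, and the ids (hostnet0, net0, hostdev1) and bus names (root.3, root.4) are copied from the command line above, assuming those buses are defined elsewhere on the QEMU command line. The virtio-net device carries failover=on, and the VF points back at it through failover_pair_id.

```shell
#!/bin/sh
# Print the QEMU arguments for a virtio-net/VF failover pair.
# Arguments: $1 = MAC address of the virtio-net device,
#            $2 = host PCI address of the VF.
# The ids and bus names are examples taken from the report above.
failover_args() {
    mac=$1
    vf=$2
    printf '%s\n' \
        "-netdev tap,id=hostnet0,vhost=on" \
        "-device virtio-net-pci,netdev=hostnet0,id=net0,mac=${mac},bus=root.3,failover=on" \
        "-device vfio-pci,host=${vf},id=hostdev1,bus=root.4,failover_pair_id=net0"
}

# Example: print the three arguments for the pair used in this comment.
failover_args 22:2b:62:bb:a9:84 0000:04:0a.1
```

The helper only prints the arguments; it does not start QEMU, so the output can be reviewed before being appended to a real command line.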

Comment 29 ybendito 2020-10-19 10:24:30 UTC
(In reply to Peixiu Hou from comment #27)
> (In reply to ybendito from comment #0)
> > Feature request: virtio-net acts as backup device for SRIOV VF in Windows
> 
> Hi Yuri,
> 
> About this new feature, I have the following questions:
> 
> 1. For comment#0 "virtio-net acts as backup device for SRIOV VF in Windows",
> how do I test it in a Windows guest?
> 
> I tried booting a guest with a virtio-net-pci device and a vfio-pci (SRIOV
> VF) device, tested on a win10 guest.
> The QEMU command line was:
> -netdev tap,id=hostnet0,vhost=on \
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:84,bus=root.3,failover=on \
> -device vfio-pci,host=0000:04:0a.1,id=hostdev1,bus=root.4,failover_pair_id=net0 \
> 
> And in the guest, the "netcfg -v -l vioprot.inf -c p -i VIOPROT" command
> ran successfully, using the virtio-win-prewhql-189 version.
> 
> Checking the network adapters, both can obtain an IP address and can ping
> an external IP address.

Do both adapters (netkvm and VF) have the same MAC address?
The expected behavior after protocol installation is that only netkvm has an IP address.
The VF adapter should have all protocols unbound (except the virtio protocol).

> 
> But what needs to be done next (if my steps above are correct)? Sorry, I
> don't know how to verify that virtio-net acts as a backup device for the
> SRIOV VF in Windows.
> 

The first step (build 189) covers the functionality when you unplug/plug the VF.
The purpose of the entire feature (as with failover in Linux) is to let a VM with a VFIO device survive migration.

Later changes (not in the RH build yet) address the performance of the fallback.
That part is covered by bug 1865778.

> 2. What is the function of vioprot? Does it work together with the netkvm
> driver?

Yes, netkvm does all the work, while vioprot takes care of unbinding the VF card from all protocols (including TCP/IP) and binding it only to vioprot.

> 
> 3. For this new feature, which test points need to be covered in my future
> tests?
> 

Let's review together how you test the SRIOV failover function with a Linux VM.
The procedure for Windows should be similar.


> Thanks a lot~
> Peixiu
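
The first question in the reply above (do both adapters have the same MAC address?) can be checked mechanically once the two addresses have been read from the guest. The sketch below is an illustration only: the function names norm_mac and same_mac are hypothetical, and it assumes the addresses are pasted in by hand, for example from `ipconfig /all` in the Windows guest, which prints MAC addresses with dashes and upper-case digits.

```shell
#!/bin/sh
# Normalize a MAC address: lower-case it and use ':' as the separator,
# so 22-2B-62-BB-A9-84 (Windows style) and 22:2b:62:bb:a9:84 compare equal.
norm_mac() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr '-' ':'
}

# Succeed (exit status 0) only if both arguments are the same MAC address.
same_mac() {
    [ "$(norm_mac "$1")" = "$(norm_mac "$2")" ]
}

# Example: compare the address seen in the guest with the one on the QEMU command line.
if same_mac 22-2B-62-BB-A9-84 22:2b:62:bb:a9:84; then
    echo "netkvm and VF MAC addresses match"
else
    echo "MAC addresses differ"
fi
```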

Comment 30 Peixiu Hou 2020-10-28 01:55:07 UTC
Tested migration on a win10-64 guest with an SR-IOV Virtual Function device passed through as a vfio-pci device, and hit a BSOD after the migration completed.

My test steps were as follows:
1. Prepare 2 machines with SRIOV-capable Ethernet adapters.

Machine 1: an SRIOV-capable device as follows:
05:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

Machine 2: an SRIOV-capable device as follows:
82:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
82:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

2. Create a bridge "switch" with the 05:00.0 port on machine 1.
    Create a bridge "switch" with the 82:00.0 port on machine 2.

3. Add "intel_iommu=on" to the kernel command line on the Intel host and reboot for it to take effect.
4. Load the vfio module: # modprobe vfio-pci
5. Create a VF on machine 1: echo 1 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs
Create a VF on machine 2: echo 1 > /sys/bus/pci/devices/0000\:82\:00.0/sriov_numvfs

6. Check the newly created VFs:
On machine 1: # lspci | grep 05:
05:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
On machine 2: # lspci | grep 82:
82:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
82:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
82:01.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function

7. Check the vendor ID and device ID of each VF device.

On machine 1: # lspci -n -s 05:10.0

05:10.0 0200: 8086:10ed (rev 01)

On machine 2: # lspci -n -s 82:01.0
82:01.0 0200: 14e4:16af


8. Rebind each VF to the vfio-pci driver:
On machine 1: 
# echo 0000:05:10.0 > /sys/bus/pci/devices/0000\:05\:10.0/driver/unbind

# echo "8086 10ed" > /sys/bus/pci/drivers/vfio-pci/new_id

# echo "8086 10ed" > /sys/bus/pci/drivers/vfio-pci/remove_id

# readlink -f /sys/bus/pci/devices/0000\:05\:10.0/driver

/sys/bus/pci/drivers/vfio-pci  

On machine 2:

# echo 0000:82:01.0 > /sys/bus/pci/devices/0000\:82\:01.0/driver/unbind

# echo "14e4 16af" > /sys/bus/pci/drivers/vfio-pci/new_id

# echo "14e4 16af" > /sys/bus/pci/drivers/vfio-pci/remove_id

# readlink -f /sys/bus/pci/devices/0000\:82\:01.0/driver

/sys/bus/pci/drivers/vfio-pci  


9. Set the same MAC address on the VF on machine 1 and machine 2.
On machine 1: # ip link set enp5s0f0 vf 0 mac 22:2b:62:bb:a9:82
# ip link show enp5s0f0
4: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master switch state UP mode DEFAULT group default qlen 1000
    link/ether 00:1b:21:c3:d0:3c brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off
On machine 2: # ip link set enp130s0f0 vf 0 mac 22:2b:62:bb:a9:82
# ip link show enp130s0f0
2: enp130s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master switch state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:f7:05:82:c0 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff, tx rate 10000 (Mbps), max_tx_rate 10000Mbps, spoof checking on, link-state auto

10. On host 1, boot the guest with virtio-net and the VF, enabling failover with the MAC address above. Boot virtio-net-pci and vfio-pci together:
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.2,failover=on \
-device vfio-pci,host=0000:05:10.0,id=hostdev0,bus=root.3,failover_pair_id=net0 \

11. Check the failover virtio-net and VF interfaces in the win10-64 guest.
1). Load the virtio-net device's driver (virtio-win-prewhql-189) and the VF device's driver (downloaded from https://downloadcenter.intel.com/zh-cn/downloads/eula/22283/-Ethernet-Adapter-?httpDown=https%3A%2F%2Fdownloadmirror.intel.com%2F22283%2Feng%2F25_4.zip).
2). After the drivers load successfully:
# ipconfig
Both the virtio-net and virtual function adapters are shown.
3). Run the "netcfg -v -l vioprot.inf -c p -i VIOPROT" command in the guest.
# ipconfig
Only the virtio-net adapter is shown; the virtual function adapter has disappeared.
Pinging an external address is successful.

12. On host 2, boot the guest with virtio-net and the VF, enabling failover with the MAC address above. Boot virtio-net-pci and vfio-pci together and add the incoming option:
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.2,failover=on \
-device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.3,failover_pair_id=net0 \
-incoming tcp:10.73.33.244:5888 \

13. Migrate from host 1 to host 2 using the HMP command:
(qemu) migrate -d tcp:10.73.33.244:5888
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: completed
total time: 38688 ms
downtime: 17 ms
setup: 6253 ms
transferred ram: 4263426 kbytes
throughput: 1076.86 mbps
remaining ram: 0 kbytes
total ram: 4211544 kbytes
duplicate: 27472 pages
skipped: 0 pages
normal: 1063718 pages
normal bytes: 4254872 kbytes
dirty sync count: 6
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 32760

14. Check the guest status on host 2: the win10-64 guest hits a BSOD. Stop code: KERNEL SECURITY CHECK FAILURE
And QEMU reports a warning:
QEMU 5.1.0 monitor - type 'help' for more information
(qemu) qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (1899997 kHz), and TSC scaling unavailable
qemu-kvm: warning: TSC frequency mismatch between VM (2299997 kHz) and host (1899997 kHz), and TSC scaling unavailable

The Memory.dmp is uploaded to the following location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/bug1830754/Memory.dmp.zip

Used versions:
kernel-4.18.0-240.el8.x86_64
qemu-kvm-5.1.0-12.module+el8.3.0+8338+cbcb1a4b.x86_64
virtio-win-prewhql-189
seabios-bin-1.13.0-2.module+el8.3.0+7353+9de0a3cc.noarch

Thanks a lot~
Peixiu
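
Steps 7 and 8 above (read the vendor and device IDs with `lspci -n`, then rebind the VF to vfio-pci) can be sketched as a dry-run helper. This is only a sketch under stated assumptions: the function names vf_ids and print_vfio_bind are hypothetical, the `lspci -n` line is passed in as text rather than queried, and the helper prints the commands from step 8 instead of running them, so nothing is written to sysfs.

```shell
#!/bin/sh
# Extract "vendor device" from one line of `lspci -n` output, e.g.
#   05:10.0 0200: 8086:10ed (rev 01)
# The third whitespace-separated field holds vendor:device.
vf_ids() {
    printf '%s\n' "$1" | awk '{ split($3, id, ":"); print id[1], id[2] }'
}

# Print (do not run) the rebind commands from step 8 for one VF.
# Arguments: $1 = full PCI address of the VF (e.g. 0000:05:10.0),
#            $2 = the matching `lspci -n` output line.
print_vfio_bind() {
    addr=$1
    ids=$(vf_ids "$2")
    printf '%s\n' \
        "echo $addr > /sys/bus/pci/devices/$addr/driver/unbind" \
        "echo \"$ids\" > /sys/bus/pci/drivers/vfio-pci/new_id" \
        "echo \"$ids\" > /sys/bus/pci/drivers/vfio-pci/remove_id" \
        "readlink -f /sys/bus/pci/devices/$addr/driver"
}

# Example with the machine 1 VF from step 7.
vf_ids '05:10.0 0200: 8086:10ed (rev 01)'    # prints: 8086 10ed
print_vfio_bind 0000:05:10.0 '05:10.0 0200: 8086:10ed (rev 01)'
```

Printing the commands first keeps the sketch safe to try on any machine; the printed lines match the ones run by hand in step 8.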

Comment 31 ybendito 2020-10-28 12:30:00 UTC
Created attachment 1724778 [details]
Driver fix 1

Comment 44 ybendito 2020-11-01 10:07:02 UTC
Created attachment 1725559 [details]
Driver fix 2

Comment 47 ybendito 2020-11-03 13:30:51 UTC
Merged to upstream
https://github.com/virtio-win/kvm-guest-drivers-windows/pull/515

Comment 54 Vadim Rozenfeld 2020-11-12 05:09:42 UTC
Please check with the latest drivers from build 190
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1383208

Comment 57 Peixiu Hou 2020-11-17 09:36:05 UTC
I also tested migration (SRIOV failover) from host 1 to host 2 with virtio-win-prewhql-190.
Following the steps in comment#30, the migration succeeded, and after migration pinging the external IP succeeded.

Used versions:
kernel-4.18.0-248.el8.x86_64
qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64
seabios-bin-1.14.0-1.module+el8.3.0+7638+07cf13d2.noarch
virtio-win-prewhql-190
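
The pass criterion used here (migration completed, as shown by `info migrate` in step 13 of comment#30) can be checked from saved monitor output. The sketch below is an assumption-laden illustration: the function names migration_completed and migration_downtime_ms are hypothetical, and it assumes the `info migrate` text has already been captured by whatever means the test harness uses and is fed in on standard input.

```shell
#!/bin/sh
# Succeed (exit status 0) only if the `info migrate` text on stdin
# contains the line "Migration status: completed".
migration_completed() {
    grep -q '^Migration status: completed'
}

# Print the downtime value in milliseconds from `info migrate` text on stdin,
# taken from a line of the form "downtime: 17 ms".
migration_downtime_ms() {
    awk '$1 == "downtime:" { print $2 }'
}

# Example using three lines copied from the output in comment#30.
sample='Migration status: completed
total time: 38688 ms
downtime: 17 ms'

if printf '%s\n' "$sample" | migration_completed; then
    echo "migration completed"
fi
printf '%s\n' "$sample" | migration_downtime_ms
```

A completed migration does not by itself show that failover worked; the ping check after migration is still needed.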

Comment 68 Peixiu Hou 2020-11-30 07:01:27 UTC
As discussed in the meeting and based on the comment#57 test results, I am changing this bug to verified status.

For the issues hit in comment#57 and comment#49, I will file a new bug to track them.

Thanks~
Peixiu

Comment 69 Yvugenfi@redhat.com 2020-12-28 09:44:45 UTC
(In reply to Peixiu Hou from comment #68)
> As discussed in the meeting and based on the comment#57 test results, I am
> changing this bug to verified status.
> 
> For the issues hit in comment#57 and comment#49, I will file a new bug to
> track them.
> 
> Thanks~
> Peixiu

Hi Peixiu,

Can you please add the new BZ number for those issues here?

Thanks,
Yan.

Comment 71 Peixiu Hou 2021-01-04 04:00:40 UTC
(In reply to Yan Vugenfirer from comment #69)
> (In reply to Peixiu Hou from comment #68)
> > As discussed in the meeting and based on the comment#57 test results, I am
> > changing this bug to verified status.
> > 
> > For the issues hit in comment#57 and comment#49, I will file a new bug to
> > track them.
> > 
> > Thanks~
> > Peixiu
> 
> Hi Peixiu,
> 
> Can you please add the new BZ number for those issues here?
> 

Hi Yan, 

This BZ hasn't been filed yet. There are also some other related issues that need to be tried. Yanghang Liu will handle this: he will test them together and update the BZ number here~

Thanks~
Peixiu

> Thanks,
> Yan.

Comment 72 Yanghang Liu 2021-01-26 07:12:19 UTC
Some BZs about Win10 vm + failover vf migration that have been opened:
Bug 1907144 - [failover vf migration][windows vm] After migrating the vm, the info of the failover VF in the dst Win10 vm is displayed incorrectly
Bug 1912088 - [netkvm][failover] netkvm adapter should have valid IP address after installation of VF bonding protocol
Bug 1918594 - [failover vf] In the vm, the failover vf and failover virtio nic with the same MAC address have different valid IP addresses
Bug 1918309 - [failover vf migration][win10 vm] After migrating the source guest with 82599ES, the target guest could not ping the source host successfully

Comment 74 errata-xmlrpc 2021-02-16 14:24:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virtio-win bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:0535