This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 1865778 - Performance optimization of SRIOV VF in bonding with virtio-net on Windows
Summary: Performance optimization of SRIOV VF in bonding with virtio-net on Windows
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: virtio-win
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 9.0
Assignee: ybendito
QA Contact: Quan Wenli
URL:
Whiteboard:
Depends On: 1830754
Blocks:
 
Reported: 2020-08-04 07:05 UTC by ybendito
Modified: 2023-08-15 15:34 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-15 15:34:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System: Red Hat Issue Tracker
ID: RHEL-1298
Private: 0
Priority: None
Status: Migrated
Summary: None
Last Updated: 2023-08-15 15:29:02 UTC

Description ybendito 2020-08-04 07:05:22 UTC
Description of problem:
The bonding feature of SRIOV VF with virtio-net for Windows is coming. It should provide temporary connectivity for the guest during the migration flow when connectivity before and after migration uses a VF of an SRIOV card (as is done for Linux). It is still unclear:
1) whether the solution provides better throughput than a virtio-net-only device over the SRIOV PF, and
2) whether the performance of the bonded VF is comparable to the original performance of the VF without bonding.



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Quan Wenli 2020-08-11 03:30:26 UTC
Hi ybendito,


Regarding what you mentioned in comment #0 about "like it is done for Linux", do you mean "Bug 1718673 - RFE: support for net failover devices in qemu"?

Comment 2 ybendito 2020-08-11 08:59:37 UTC
(In reply to Quan Wenli from comment #1)
> hi, ybendito
> 
> 
> For your mentioned in comment #0 about "like it is done for Linux", do you
> mean "Bug 1718673 - RFE: support for net failover devices in qemu"?

Yes, in terms of host setup and the final goal of the feature:
an SRIOV adapter on the host, a VF dedicated to the guest, identical MAC addresses for the VF and virtio-net, all of this to provide seamless migration when network connectivity is provided by the SRIOV adapter.

The Windows solution is different in terms of guest preparation: an additional protocol driver needs to be installed in the guest.
The final result is similar: we have a virtio-net device in the guest which works standalone whenever the VF is not present, and uses the VF (under the hood) when the VF is attached.
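
For illustration, one way the "identical MAC addresses" part can be arranged on the host is to assign the virtio-net device's MAC to the VF before passing it through. A minimal sketch, assuming a hypothetical PF interface name and VF index (the MAC is the one used in the command lines quoted later in this bug):

  # Hypothetical PF name (enp130s0f0) and VF index (1); give the VF the same MAC
  # that the virtio-net standby device will be started with:
  ip link set enp130s0f0 vf 1 mac 22:2b:62:bb:a9:82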

Comment 5 Yvugenfi@redhat.com 2020-11-12 09:05:58 UTC
The changes are in virtio-win-prewql 190: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1383208

Comment 6 Quan Wenli 2020-12-07 08:15:46 UTC
Hi Yuri,

Regarding comment #0, about "whether the performance of bonded VF": does "bonded VF" mean a virtio-net failover VF as in bug 1830754? And do we still need to check the performance part from comment #0?

Comment 7 Quan Wenli 2020-12-09 04:43:23 UTC
(In reply to Quan Wenli from comment #6)
> Hi Yuri,
> 
> For comment #0, about "whether the performance of bonded VF", does "boned
> VF" mean Virtio-net failover VF like bug 1830754 ? if we still need check
> the performance part in comment #0.

What is “the original performance of VF without bonding“ in comment #0? Does it mean a command line like:


-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=off \
-device vfio-pci,host=82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \

Comment 8 ybendito 2020-12-09 10:11:50 UTC
Yes, a bonded VF is exactly the setup with two adapters (VF and virtio-net) with identical MAC addresses and the protocol installed.
There is some difference between a Windows guest and a Linux guest:
On a Linux guest the bonding depends on failover=off.
On a Windows guest failover=on is important for migration etc., but the bonding depends on the installed protocol (vioprot).
So, in the case of Windows, the original performance of the VF without bonding is the performance of the VF without the protocol installed (and probably without a virtio-net adapter at all, as it is not needed).
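
As a sketch only (device options and addresses reused from the command lines quoted elsewhere in this bug), the two Windows-guest configurations being compared would look like:

  # Bonded VF: virtio-net standby + VF with failover pairing; vioprot installed in the guest
  -netdev tap,id=hostnet0,vhost=on \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \
  -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \

  # Baseline ("VF without bonding"): the VF alone, no vioprot and no virtio-net device
  -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4 \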

Comment 9 Yanghang Liu 2020-12-10 11:29:40 UTC
Hi Yuri,

I want to double check with you about how QE can test this bug.

Could you take a look at the following two test scenarios and see which test scenario should be tested to verify this bug?


Test scenario 1: Compare the NIC performance of the following three different Windows VMs.

  Windows VM1 with only one virtio net device:
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \

  Windows VM2 with only one VF device:
    -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4 \


  Windows VM3 with both failover virtio net device and failover VF device:
  (The vioprot protocol should be installed in the Windows VM3)
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \
    -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \


Test scenario 2: Migrate the following two different Windows VMs and compare the migration performance.

  Windows VM1 with only one virtio net device:
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \


  Windows VM2 with both failover virtio net device and failover VF device:
  (The vioprot protocol should be installed in the Windows VM2, otherwise the migration will fail)
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \
    -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \



If I have any misunderstandings or you have any concerns/suggestions, please let me know.

Thanks in advance.
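
One possible way to collect the scenario 1 numbers, assuming a netperf server is running in each guest (the tool choice, duration and peer IP below are assumptions, not something mandated by this bug):

  # Run from the traffic peer against each of VM1/VM2/VM3 in turn:
  netperf -H 192.168.100.2 -t TCP_STREAM -l 60   # peer -> guest (guest receive direction)
  netperf -H 192.168.100.2 -t TCP_MAERTS -l 60   # guest -> peer (guest transmit direction)
  netperf -H 192.168.100.2 -t TCP_RR -l 60       # request/response (latency proxy)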

Comment 10 ybendito 2020-12-13 11:15:56 UTC
(In reply to Yanghang Liu from comment #9)
> Hi,Yuri
> 
> I want to double check with you about how QE can test this bug.
> 
> Could you take a look at the following two test scenarios and see which test
> scenario should be tested to verify this bug?
> 
> 
> Test scenario 1: Compare the nic performance in the following three
> different Windows vms.
> 
>   Windows VM1 with only one virtio net device:
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
> 
>   Windows VM2 with only one VF device:
>     -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4 \
> 
> 
>   Windows VM3 with both failover virtio net device and failover VF device:
>   (The vioprot protocol should be installed in the Windows VM3)
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
>     -device
> vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \
> 
> 

I think the important part is to compare the network performance of VM2 and VM3.
It would also be good to have the network performance of VM1 for reference.


> Test scenario 2: migrate the following two different Windows vm and compare
> the migration performance
> 
>   Windows VM1 with only one virtio net device:
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
> 
> 
>   Windows VM2 with both failover virtio net device and failover VF device:
>   (The vioprot protocol should be installed in the Windows VM2, otherwise
> the migration will fail)
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
>     -device
> vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \
> 
> 
> 
> If I have any misunderstandings or you have any concerns/suggestions, please
> let me know.
> 
> Thanks in advance.

Comment 11 Yanghang Liu 2020-12-14 03:08:45 UTC
Thanks for Yuri's answer


Hi Wenli,

Could you help test scenario 1?

> Test scenario 1: Compare the nic performance in the following three
> different Windows vms.
> 
>   Windows VM1 with only one virtio net device:
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
> 
>   Windows VM2 with only one VF device:
>     -device vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4 \
> 
> 
>   Windows VM3 with both failover virtio net device and failover VF device:
>   (The vioprot protocol should be installed in the Windows VM3)
>     -netdev tap,id=hostnet0,vhost=on \
>     -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,
> failover=on \
>     -device
> vfio-pci,host=0000:82:01.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \

> I think the important part is to compare network performance of VM2 and VM3
> It would be good also have network performance of VM1 for the information

Thanks in advance.

Comment 13 Yanghang Liu 2020-12-16 07:54:01 UTC
Hi Yuri,

Could you help check the test results in comment 12?

> VF vs bonded VF
> - tcp_stream tx: no significant difference
> - tcp_stream rx: around 30% degradation compared with VF
> - tcp_rr: no significant difference


Is this test result expected, or should QE open a new bug to track the "tcp_stream rx" problem?

If you have any concerns or need QE to do more tests to verify this bug, please feel free to let me know.

Comment 16 Yanghang Liu 2020-12-17 05:18:15 UTC
Hi Wenli,

If you want to test the performance of a failover VF in a Linux VM to compare with the failover VF in Windows, you can refer to the following bug:
    Bug 1718673 - RFE: support for net failover devices in qemu


Please feel free to ping me if you have any questions about how to start a Linux VM with a failover VF.

Comment 17 Yanghang Liu 2020-12-17 06:13:17 UTC
Hi Yuri,

If you need QE to do more testing or want to know any testing details, please feel free to let QE know.

QE will do the related tests as soon as possible.

> Internal Target Release: 8.3.1

QE is currently unclear about the following:
(1) What are the expected results of the performance test for the failover VF in a Windows VM?
(2) Are any other tests needed for this bug, and if so, which ones?

Could you please help with the above?


Thanks in advance for your help.

Comment 18 ybendito 2021-01-03 09:38:35 UTC
(In reply to Yanghang Liu from comment #17)
> QE is currently unclear about the following parts: 
> (1)What is the expected results of the performance test about the failover VF in Windows vm ?

In general, we want to see throughput numbers for the failover setup that are similar to the throughput of the VF.
But I'm afraid host-to-guest throughput results with a 10G adapter do not exactly reflect a real use case of SRIOV failover.
Is it possible to run throughput tests on faster SRIOV adapters (40G or 100G) with transfers between physical machines?
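
For what it's worth, on 40G/100G links a single TCP stream often cannot saturate the adapter, so a multi-stream run between the external physical host and the guest may be closer to the real use case. A possible sketch (tool, stream count and address are assumptions):

  # On the guest:           iperf3 -s
  # On the external host:
  iperf3 -c 192.168.100.2 -P 8 -t 60      # 8 parallel streams for 60 seconds, host -> guest
  iperf3 -c 192.168.100.2 -P 8 -t 60 -R   # reverse direction (guest transmits)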

Comment 19 Quan Wenli 2021-01-04 07:15:34 UTC
(In reply to ybendito from comment #18)
> (In reply to Yanghang Liu from comment #17)
> > QE is currently unclear about the following parts: 
> > (1)What is the expected results of the performance test about the failover VF in Windows vm ?
> 
> In general, we want to see throughput numbers for failover setup that
> similar to throughput of VF.
> But I'm afraid host-to-guest throughput results with 10G adapter do not
> exactly reflect a real use case of SRIOV failover.

In my test environment there are two 10G adapters connected back to back. I set private 192.x.x.x addresses on the host, the guest and the external host; the guest can ping the host (PF) successfully but cannot ping the external host, which is why I tested between the host and the guest. Is that an issue? I was using the steps from https://bugzilla.redhat.com/show_bug.cgi?id=1830754#c30

 
> Is it possible to make throughput tests on faster SRIOV adapters (40G or
> 100G) with transfer between physical machines?

Comment 20 Yanghang Liu 2021-01-06 02:18:05 UTC
Wenli is mainly responsible for the performance part and will handle the performance testing for this bug.

Please feel free to let me know if there is anything I can help with regarding the failover VF + migration test.

Reassigning the QA Contact to Wenli.

Comment 33 Quan Wenli 2021-10-27 06:22:37 UTC
Hi Yuri,

The DTM is overdue; could you help reset it? Thanks in advance!

Comment 36 RHEL Program Management 2022-02-04 07:27:14 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 37 Quan Wenli 2022-02-10 06:12:26 UTC
Hello Yuri

Could you help review this bug? It should either be closed as WONTFIX or reopened.

Thanks, Wenli

Comment 40 Quan Wenli 2022-07-21 07:12:00 UTC
Hello Yuri

Do we plan to fix this within 9.1.0? If not, could you reset the ITR? Thanks.

Comment 49 RHEL Program Management 2023-08-15 15:25:07 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 50 RHEL Program Management 2023-08-15 15:34:07 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues.

