Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
The FDP team is no longer accepting new bugs in Bugzilla. Please report your issues under FDP project in Jira. Thanks.

Bug 1984461

Summary: [RFE] Add rxq to pmd assignment with rxq grouping
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Kevin Traynor <ktraynor>
Component: openvswitch2.16
Assignee: Kevin Traynor <ktraynor>
Status: CLOSED ERRATA
QA Contact: liting <tli>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: FDP 21.G
CC: ctrautma, dmarchan, hewang, jhsiao, kfida, ralongi
Target Milestone: ---
Target Release: FDP 21.I
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openvswitch2.16-2.16.0-1.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-09 15:37:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1984474

Description Kevin Traynor 2021-07-21 13:53:40 UTC
Add an rxq scheduling option that allows rxqs to be grouped
on a pmd based purely on their load.

The current default 'cycles' assignment sorts rxqs by measured
processing load and then assigns them to a list of PMDs in round
robin order. This helps to keep the rxqs that require the most
processing on different cores, but because it selects the PMDs in
round robin order, it distributes rxqs equally across PMDs.

'cycles' assignment has the advantage that it keeps the most
loaded rxqs off the same core, while still spreading rxqs across a
broad range of PMDs to mitigate against changes in traffic
patterns.

'cycles' assignment has the disadvantage that, in order to trade
off optimising for the current traffic load against mitigating
future changes, it assigns an equal number of rxqs per PMD in a
round robin manner, and this can lead to a less than optimal
balance of the processing load.

Now that PMD auto load balance can help mitigate future changes in
traffic patterns, a 'group' assignment can be used to assign rxqs based
on their measured cycles and the estimated running totals of the PMDs.

In this case, there is no restriction about keeping equal number of
rxqs per PMD as it is purely load based.

This means that one PMD may have a group of low load rxqs assigned to it
while another PMD has one high load rxq assigned to it, as that is the
best balance of their measured loads across the PMDs.
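The load-based grouping described above can be sketched as follows. This is an illustrative Python sketch, not the actual OVS C implementation; the rxq names and cycle values are made up for the example.

```python
# Sketch of 'group' assignment: rxqs are taken in order of measured load,
# and each is placed on the PMD with the lowest estimated running total.

def group_assign(rxq_loads, pmds):
    """rxq_loads: dict of rxq name -> measured cycles; pmds: list of core ids."""
    totals = {pmd: 0 for pmd in pmds}      # estimated running load per PMD
    assignment = {}
    # Most expensive rxqs are placed first.
    for rxq, load in sorted(rxq_loads.items(), key=lambda kv: -kv[1]):
        pmd = min(totals, key=totals.get)  # least-loaded PMD so far
        assignment[rxq] = pmd
        totals[pmd] += load                # add this rxq's load to the estimate
    return assignment, totals

# Example: one high-load rxq ends up alone on one PMD while the other PMD
# takes a group of lower-load rxqs -- no equal-count restriction.
assignment, totals = group_assign(
    {"vhost1:0": 90, "dpdk0:2": 40, "dpdk0:4": 40, "dpdk0:0": 30}, [19, 23])
```

Here the 90-cycle rxq is assigned alone to one PMD while the three lighter rxqs group on the other, matching the behaviour described above.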

Comment 1 Kevin Traynor 2021-07-21 13:58:51 UTC
The 'group' rxq->pmd assignment algorithm has been merged upstream.
https://github.com/openvswitch/ovs/commit/3dd050909a74229f024da30ae4800ede79883248

This means that, when enabled, OVS is no longer limited to distributing the
*number* of rxqs equally across the available PMDs.

It can be enabled with:
ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group

Instead it will assign the next rxq to the pmd that is estimated to have
the lowest load. The load estimate for that pmd will be updated each
time an rxq is assigned to it. (i.e. the additional load from the rxq is
added to the current pmd load).

There is an exception for rxqs which have no measured load. They might
have no measured load because they have just been added, or because they
have been present but carried no traffic.

These rxqs are distributed to the pmds with the lowest number of rxqs.
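The exception above can be sketched as a count-based fallback. Again this is an illustrative sketch, not OVS source; names and counts are hypothetical.

```python
# Sketch of the fallback for unmeasured rxqs: each one goes to the PMD
# that currently holds the fewest rxqs, spreading them by count.

def assign_unmeasured(unmeasured_rxqs, rxq_counts):
    """unmeasured_rxqs: list of rxq names; rxq_counts: dict pmd -> current rxq count."""
    assignment = {}
    for rxq in unmeasured_rxqs:
        pmd = min(rxq_counts, key=rxq_counts.get)  # fewest rxqs so far
        assignment[rxq] = pmd
        rxq_counts[pmd] += 1                       # account for the new rxq
    return assignment

# Two rxqs with no load history land on the emptiest PMDs.
placed = assign_unmeasured(["vhost0:0", "dpdk1:0"], {19: 2, 23: 0, 47: 0})
```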

Comment 5 liting 2021-10-25 07:58:21 UTC
Test steps and results:
[root@dell-per730-56 ~]# rpm -qa|grep openvswitch2.16-2
openvswitch2.16-2.16.0-16.el8fdp.x86_64
python3-openvswitch2.16-2.16.0-16.el8fdp.x86_64

Added dpdk0, dpdk1, vhost0, and vhost1 to ovsbr0.
Configured dpdk0 with 5 queues; each other port is configured with one queue.
There are 3 PMD threads: core 19, core 23, and core 47.
Inside the guest, testpmd forwards the packets.
The following TRex command sends unidirectional traffic:
./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1024 --max-loss-pct=0 --search-runtime=10 --validation-runtime=60 --rate-tolerance=10 --runtime-tolerance=10 --rate=25 --rate-unit=% --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --rate=25 --rate-unit=% --one-shot=0 --use-src-ip-flows=1 --use-dst-ip-flows=1 --use-src-mac-flows=1 --use-dst-mac-flows=1 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic --teaching-warmup-packet-rate=1000 --use-src-ip-flows=1 --use-dst-ip-flows=1 --use-src-mac-flows=1 --use-dst-mac-flows=1 --use-device-stats --warmup-trial --warmup-trial-runtime=10 --warmup-trial-rate=1 --traffic-direction=unidirectional
 
[root@dell-per730-56 ~]# ovs-vsctl show
504c077f-91e8-4840-85f8-4ae29571e1ac
    Bridge ovsbr0
        datapath_type: netdev
        Port vhost0
            Interface vhost0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhost0"}
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:82:00.0", n_rxq="5", n_rxq_desc="1024", n_txq_desc="1024"}
        Port dpdk1
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:82:00.1", n_rxq="1", n_rxq_desc="1024", n_txq_desc="1024"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port vhost1
            Interface vhost1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhost1"}
    ovs_version: "2.16.1"

[root@dell-per730-56 ~]# ovs-ofctl dump-flows ovsbr0
 cookie=0x0, duration=21571.617s, table=0, n_packets=10512319025, n_bytes=630739141500, in_port=dpdk0 actions=output:vhost0
 cookie=0x0, duration=21571.613s, table=0, n_packets=1186344304, n_bytes=71180658240, in_port=dpdk1 actions=output:vhost1
 cookie=0x0, duration=21571.609s, table=0, n_packets=1186344214, n_bytes=71180656052, in_port=vhost0 actions=output:dpdk0
 cookie=0x0, duration=21571.605s, table=0, n_packets=9849533257, n_bytes=590971998632, in_port=vhost1 actions=output:dpdk1

Enable automatic load balancing of PMDs and check the output of ovs-appctl dpif-netdev/pmd-rxq-show. We can see that the total load on the three cores is almost the same, and that when any PMD is overloaded, a PMD rebalance is triggered.
[root@netqe23 perf]# ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group
[root@dell-per730-56 ~]# ovs-vsctl set open_vswitch . other_config:pmd-auto-lb=true
[root@dell-per730-56 ~]# ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-load-threshold=40

[root@dell-per730-56 ~]# ovs-vsctl get Open_vSwitch . other_config
{dpdk-init="true", dpdk-lcore-mask="0x1", dpdk-socket-mem="0,4096", pmd-auto-lb="true", pmd-auto-lb-load-threshold="30", pmd-cpu-mask="800000880000", pmd-rxq-assign=group, userspace-tso-enable="false", vhost-iommu-support="true"}
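The overload trigger being exercised here can be sketched as follows. This is an assumption-laden illustration, not OVS source: the requirement of several consecutive overloaded intervals (6 is assumed from upstream defaults) and the default 95% threshold are taken from upstream documentation, and the real implementation also checks an improvement threshold before reassigning.

```python
# Sketch: a rebalance is considered once some PMD's measured load stays at
# or above pmd-auto-lb-load-threshold for several consecutive intervals.
OVERLOAD_INTERVALS = 6  # assumed number of consecutive 1-minute samples

def should_consider_rebalance(pmd_load_history, load_threshold=95):
    """pmd_load_history: dict pmd -> list of recent load samples (percent)."""
    for loads in pmd_load_history.values():
        recent = loads[-OVERLOAD_INTERVALS:]
        if len(recent) == OVERLOAD_INTERVALS and all(l >= load_threshold
                                                     for l in recent):
            return True  # this PMD has been persistently overloaded
    return False

# With the threshold lowered to 40 (as set in the test above), a PMD
# holding steady around 42% load makes a rebalance worth considering.
busy = {19: [26] * 6, 47: [42, 43, 41, 42, 44, 42]}
```

Lowering the threshold in the test makes the trigger fire at the moderate loads seen in the pmd-rxq-show output, which is why rebalances are observed below.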

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 26 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 26 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  2 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 20 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 21 %
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 34 %
  overhead:  2 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 42 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  3 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 21 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 21 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  1 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 18 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 19 %
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 37 %
  overhead:  2 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 33 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  2 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 16 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 16 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  1 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 16 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 17 %
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 41 %
  overhead:  2 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 24 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  2 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 13 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 13 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  1 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 14 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 14 %
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 46 %
  overhead:  3 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 20 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  2 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 12 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 12 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  1 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 14 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 15 %
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 51 %
  overhead:  3 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 19 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  2 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: vhost1            queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  overhead: NOT AVAIL
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: NOT AVAIL
  port: dpdk1             queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  port: vhost0            queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  overhead: NOT AVAIL
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: NOT AVAIL
  overhead: NOT AVAIL

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 43 %
  overhead:  0 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 16 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 11 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  0 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 10 %
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 12 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 13 %
  overhead:  0 %

Comment 6 liting 2021-10-26 03:56:37 UTC
The following tests a manual rebalance; verification passed.
[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 73 %
  overhead:  3 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 39 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 39 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  4 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 23 %
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 23 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 23 %
  overhead:  2 %

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-rebalance
pmd rxq rebalance requested.

[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 73 %
  overhead:  3 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: NOT AVAIL
  port: dpdk1             queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  port: vhost0            queue-id:  0 (enabled)   pmd usage: NOT AVAIL
  overhead: NOT AVAIL
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: NOT AVAIL
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: NOT AVAIL
  overhead: NOT AVAIL


[root@dell-per730-56 ~]# ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 1 core_id 19:
  isolated : false
  port: vhost1            queue-id:  0 (enabled)   pmd usage: 75 %
  overhead:  3 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage: 41 %
  port: dpdk0             queue-id:  2 (enabled)   pmd usage: 42 %
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  overhead:  5 %
pmd thread numa_id 1 core_id 47:
  isolated : false
  port: dpdk0             queue-id:  1 (enabled)   pmd usage: 32 %
  port: dpdk0             queue-id:  3 (enabled)   pmd usage: 30 %
  port: dpdk0             queue-id:  4 (enabled)   pmd usage: 31 %
  overhead:  2 %

/var/log/openvswitch/ovs-vswitchd.log
2021-10-26T03:35:41.478Z|00290|dpif_netdev|INFO|Core 23 on numa node 1 assigned port 'vhost0' rx queue 0 (measured processing cycles 51968336).
2021-10-26T03:40:45.976Z|00291|dpif_netdev|INFO|Performing pmd to rx queue assignment using group algorithm.
2021-10-26T03:40:45.976Z|00292|dpif_netdev|INFO|There's no available (non-isolated) pmd thread on numa node 0. Port 'vhost1' rx queue 0 will be assigned to a pmd on numa node 1. This may lead to reduced performance.
2021-10-26T03:40:45.976Z|00293|dpif_netdev|INFO|Core 19 on numa node 1 assigned port 'vhost1' rx queue 0 (measured processing cycles 96805701965).
2021-10-26T03:40:45.976Z|00294|dpif_netdev|INFO|Core 23 on numa node 1 assigned port 'dpdk0' rx queue 2 (measured processing cycles 52714018263).
2021-10-26T03:40:45.976Z|00295|dpif_netdev|INFO|Core 47 on numa node 1 assigned port 'dpdk0' rx queue 4 (measured processing cycles 52317893031).
2021-10-26T03:40:45.976Z|00296|dpif_netdev|INFO|Core 47 on numa node 1 assigned port 'dpdk0' rx queue 1 (measured processing cycles 38170039705).
2021-10-26T03:40:45.976Z|00297|dpif_netdev|INFO|Core 23 on numa node 1 assigned port 'dpdk0' rx queue 0 (measured processing cycles 37802447463).
2021-10-26T03:40:45.976Z|00298|dpif_netdev|INFO|Core 47 on numa node 1 assigned port 'dpdk0' rx queue 3 (measured processing cycles 37721545825).
2021-10-26T03:40:45.976Z|00299|dpif_netdev|INFO|Core 23 on numa node 1 assigned port 'dpdk1' rx queue 0 (measured processing cycles 45343403).

Comment 8 errata-xmlrpc 2021-12-09 15:37:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (openvswitch2.16 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:5056