Description of problem:

I see max QoS on a switchdev port applied through OpenFlow using meter actions. This flow doesn't get offloaded to the hardware and remains in the OVS kernel datapath. As a result, any max QoS value configured above the handling capacity of the kernel datapath cannot be reached.

14: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:d9:c0:48 brd ff:ff:ff:ff:ff:ff
    vf 1     link/ether f8:f2:1e:03:bf:f3 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

Created a max QoS policy:

(overcloud) [stack@undercloud-0 ~]$ openstack network qos rule list QP1
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+
| ID                                   | QoS Policy ID                        | Type            | Max Kbps | Max Burst Kbits | Min Kbps | DSCP mark | Direction |
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+
| e5bbab12-b185-449c-a30d-c0273a19ba5b | c28f27d3-e2b0-4644-8d41-d98e7572d927 | bandwidth_limit | 8000000  | 8000000         |          |           | egress    |
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+

Applied the above policy on the switchdev port:
openstack port set --qos-policy QP1 vlanport2

The meter table is configured on Open vSwitch:

[root@computesriovoffload-0 openvswitch]# ovs-ofctl dump-meters br-int -O OpenFlow15
OFPST_METER_CONFIG reply (OF1.5) (xid=0x2):
meter=1 kbps burst stats bands=
type=drop rate=8000000 burst_size=8000000

[root@computesriovoffload-0 openvswitch]# ovs-ofctl meter-features br-int -O OpenFlow15
OFPST_METER_FEATURES reply (OF1.5) (xid=0x2):
max_meter:200000 max_bands:1 max_color:0
band_types: drop
capabilities: kbps pktps burst stats

Starting the traffic:

[root@vm2 ~]# iperf3 -c 6.6.6.6
Connecting to host 6.6.6.6, port 5201
[  4] local 6.6.6.102 port 50638 connected to 6.6.6.6 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   617 MBytes  5.17 Gbits/sec    2    889 KBytes
[  4]   1.00-2.00   sec   590 MBytes  4.95 Gbits/sec    0   1.01 MBytes
[  4]   2.00-3.00   sec   615 MBytes  5.16 Gbits/sec    0   1.38 MBytes
[  4]   3.00-4.00   sec   616 MBytes  5.17 Gbits/sec    0   1.67 MBytes
[  4]   4.00-5.00   sec   616 MBytes  5.16 Gbits/sec    0   1.91 MBytes
[  4]   5.00-6.00   sec   600 MBytes  5.03 Gbits/sec    0   2.13 MBytes
[  4]   6.00-7.00   sec   598 MBytes  5.02 Gbits/sec    0   2.32 MBytes
[  4]   7.00-8.00   sec   419 MBytes  3.51 Gbits/sec 1008    167 KBytes
[  4]   8.00-9.00   sec   548 MBytes  4.59 Gbits/sec  825    119 KBytes
[  4]   9.00-10.00  sec   554 MBytes  4.64 Gbits/sec  671    161 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  5.64 GBytes  4.84 Gbits/sec  2506  sender
[  4]   0.00-10.00  sec  5.63 GBytes  4.84 Gbits/sec        receiver

iperf Done.
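As a quick sanity check on the numbers above: the meter is configured in kbit/s, so the configured ceiling is well above what iperf actually achieved. A small illustrative calculation (variable names are mine; the values are copied from the output above):

```python
# Meter rate as configured via ovs-ofctl (kbit/s) vs. observed iperf3 average.
meter_rate_kbps = 8_000_000              # rate=8000000 from dump-meters
configured_gbps = meter_rate_kbps / 1_000_000
observed_gbps = 4.84                     # iperf3 10-second average above

print(configured_gbps)                   # 8.0 -> the meter allows 8 Gbit/s
print(round(observed_gbps / configured_gbps, 3))  # 0.605 -> well under the ceiling
```

So the ~4.84 Gbit/s observed is the kernel datapath's forwarding limit, not the meter's.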
[root@vm2 ~]#

Flow table:

[root@computesriovoffload-0 heat-admin]# ovs-appctl dpctl/dump-flows -m
ufid:2bffccbc-4a7d-4e7f-8f6a-a5dddd2b4a91, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(mx-bond),packet_type(ns=0/0,id=0/0),eth(src=c2:64:5c:2b:8f:75,dst=f8:f2:1e:03:bf:f3),eth_type(0x8100),vlan(vid=415,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:2119586, bytes:146158126, used:0.000s, offloaded:yes, dp:tc, actions:pop_vlan,enp4s0f0_1
ufid:0ddb2f7d-5bec-42e5-b8ee-705b7df49e73, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(enp4s0f0_1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=f8:f2:1e:03:bf:f3,dst=c2:64:5c:2b:8f:75),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=6.6.6.4/255.255.255.252,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:47116, bytes:2707896952, used:0.026s, flags:SP., dp:ovs, actions:meter(0),push_vlan(vid=415,pcp=0),mx-bond

We can see that the ingressing traffic is offloaded (dp:tc) while the egressing traffic, which carries the meter action, is not (dp:ovs).

Version-Release number of selected component (if applicable):
Kernel: 4.18.0-305.10.2.el8_4.x86_64
ovn: ovn-2021-21.03.0-40.el8fdp.x86_64
driver: mlx5e_rep
driver version: 4.18.0-305.17.1.el8_4.x86_64
firmware-version: 16.26.6000 (DEL0000000015)

How reproducible:
Always

Steps to Reproduce:
1. Configure max QoS on a switchdev port (more than 5 Gbps in my case)
2. Run iperf from this switchdev instance
3. Measure bandwidth

Actual results:
Throughput is limited to ~5 Gbps, even though the maximum allowed is 10 Gbps, due to the OVS kernel datapath.

Expected results:
The flow should get offloaded so the NIC can better handle the traffic engineering.

Additional info:
kernel-modules-extra-4.18.0-305.17.1.el8_4.x86_64 is present on the compute node. This may be related to an mlx driver issue as well.
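The `dp:` field in the `dpctl/dump-flows -m` output above is the quickest way to spot which flows stayed in the kernel datapath. A minimal parsing sketch (the helper `flow_offload_status` and the abbreviated sample lines are my own, not an OVS tool):

```python
# Classify flows from `ovs-appctl dpctl/dump-flows -m` output by datapath:
# "dp:tc" means the flow went through TC (and can be hardware-offloaded),
# "dp:ovs" means it stayed in the OVS kernel datapath (not offloaded).

def flow_offload_status(line: str) -> str:
    """Return the datapath ('tc' or 'ovs') for one dump-flows -m line."""
    for field in line.split(", "):
        if field.startswith("dp:"):
            return field.split(":", 1)[1]
    return "unknown"

# Abbreviated versions of the two flows shown above.
flows = [
    "ufid:2bffccbc, in_port(mx-bond), offloaded:yes, dp:tc, actions:pop_vlan,enp4s0f0_1",
    "ufid:0ddb2f7d, in_port(enp4s0f0_1), flags:SP., dp:ovs, actions:meter(0),push_vlan(vid=415,pcp=0),mx-bond",
]
for f in flows:
    print(flow_offload_status(f))   # prints "tc" then "ovs"
```

Note that the non-offloaded flow is exactly the one whose actions start with `meter(0)`.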
If I unset the neutron QoS policy and leave the VF with max QoS 0 (disabled):

openstack port unset --qos-policy vlanport2

vf 1     link/ether f8:f2:1e:03:bf:f3 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off

[root@vm2 ~]# iperf3 -c 6.6.6.6    <<<<<<<<<<<< I could achieve ~9.39 Gbps
Connecting to host 6.6.6.6, port 5201
[  4] local 6.6.6.102 port 35472 connected to 6.6.6.6 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.10 GBytes  9.42 Gbits/sec   49    915 KBytes
[  4]   1.00-2.00   sec  1.09 GBytes  9.38 Gbits/sec    7    963 KBytes
[  4]   2.00-3.00   sec  1.09 GBytes  9.39 Gbits/sec    9   1014 KBytes
[  4]   3.00-4.00   sec  1.09 GBytes  9.38 Gbits/sec   21   1.01 MBytes
[  4]   4.00-5.00   sec  1.09 GBytes  9.38 Gbits/sec   29    848 KBytes
[  4]   5.00-6.00   sec  1.09 GBytes  9.39 Gbits/sec   21    911 KBytes
[  4]   6.00-7.00   sec  1.09 GBytes  9.39 Gbits/sec   10    963 KBytes
[  4]   7.00-8.00   sec  1.09 GBytes  9.38 Gbits/sec    9   1020 KBytes
[  4]   8.00-9.00   sec  1.09 GBytes  9.38 Gbits/sec   35   1.03 MBytes
[  4]   9.00-10.00  sec  1.09 GBytes  9.40 Gbits/sec   10   1.06 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  10.9 GBytes  9.39 Gbits/sec  200   sender
[  4]   0.00-10.00  sec  10.9 GBytes  9.39 Gbits/sec        receiver

iperf Done.

As the flows are offloaded:
ufid:1fc13947-5277-4950-a931-9b2c565a803a, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_1),packet_type(ns=0/0,id=0/0),eth(src=f8:f2:1e:03:bf:f3,dst=c2:64:5c:2b:8f:75),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=6.6.6.4/255.255.255.252,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:7815200, bytes:11863449200, used:0.610s, offloaded:yes, dp:tc, actions:push_vlan(vid=415,pcp=0),mx-bond
ufid:2d65df8b-0c67-467d-8f98-3a57ed71b7da, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(mx-bond),packet_type(ns=0/0,id=0/0),eth(src=c2:64:5c:2b:8f:75,dst=f8:f2:1e:03:bf:f3),eth_type(0x8100),vlan(vid=415,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:242339, bytes:15749194, used:0.610s, offloaded:yes, dp:tc, actions:pop_vlan,enp4s0f0_1
Hi, Haresh. Offloading of meters is not supported by OVS. There is ongoing work in this direction, but it's at an early stage:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=260741&state=*

Even if this patch set is accepted, the functionality will not be available until the OVS 2.17 release in February.

For the inconsistency of kernel meters, I think we at least need to backport the following patches from the upstream kernel, if they are not there already:

e4df1b0c2435 ("openvswitch: meter: fix race when getting now_ms.")
7d742b509dd7 ("openvswitch: meter: remove rate from the bucket size calculation")

This should fix some problems with inadequate rates.

Marcelo, what do you think?
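For context, kernel meters implement a token bucket: tokens accrue at the configured rate up to a burst cap, and packets pass only while enough tokens remain; the second patch above changes how that cap is computed. A simplified illustration of the general mechanism (my own sketch, not the kernel code):

```python
class TokenBucket:
    """Simplified token-bucket rate limiter: `rate` tokens/sec, capped at `burst`."""
    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # bucket starts full
        self.last = 0.0

    def allow(self, now: float, cost: float) -> bool:
        # Refill proportionally to elapsed time, capped at the bucket size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

tb = TokenBucket(rate=1000, burst=500)   # 1000 tokens/s, 500-token bucket
print(tb.allow(0.0, 400))   # True: bucket starts full
print(tb.allow(0.0, 400))   # False: only 100 tokens left
print(tb.allow(1.0, 400))   # True: refilled (capped at 500) after 1 second
```

Mis-sizing the bucket relative to the rate is exactly the kind of thing that produces the inadequate effective rates mentioned above.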
One more thing: we need those kernel patches as well, to improve the rate-limiting mechanism. I did a small test. I have a VF with a 2 Gbps max QoS rate limit applied, and I am sending 2 traffic streams, so the overall rate limit for this VF remains 2 Gbps.

With max QoS applied directly on the VF, I get 905 Mbits/sec for the 1st stream and 1.005 Gbits/sec for the 2nd stream. On average, this comes to ~2 Gbps.

With max QoS applied via metering flows, I get 977 Mbits/sec for the 1st stream and 844 Mbits/sec for the 2nd stream. On average, this comes to ~1.821 Gbps.
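The aggregate figures above can be checked directly (per-stream rates copied from the test, with 1.005 Gbits/sec written as 1005 Mbit/s; variable names are mine):

```python
# Per-stream rates observed in the two-stream test (Mbit/s).
vf_direct = [905, 1005]   # max QoS set directly on the VF
via_meter = [977, 844]    # max QoS applied via OVS metering flows

total_direct_gbps = sum(vf_direct) / 1000
total_meter_gbps = sum(via_meter) / 1000

print(total_direct_gbps)  # 1.91  -> close to the 2 Gbps limit
print(total_meter_gbps)   # 1.821 -> noticeably under the limit
```

The ~0.09 Gbps gap between the two totals is the inaccuracy the kernel meter patches are meant to reduce.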
This is filed under an OVN component currently. Should this be moved to OVS instead?
Oh. Yes.
I suppose I can take this bz for now while we understand all the dependencies it may have (OVS, kernel TC, driver). There are patches posted upstream already that could use review from the OVS team as well, though:

[PATCH v2 0/6] Add support for ovs metering with tc offload

I'll leave it up to the OVS team to reassign it to me or keep it.
Oops, wrong button ;-) reassigning it back.
List of patches in U/S: https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=80786
This effort is driven by Nvidia; should we assign this BZ to them?
Hi Amir. Parking this one with you until the patchset is accepted upstream. Then we'll take it back and handle the downstream portion. Thanks.
vswitchd patches:
https://patchwork.ozlabs.org/project/openvswitch/cover/20220708095533.32489-1-jianbol@nvidia.com/

mlx5 driver patches:
https://lore.kernel.org/netdev/20220702190213.80858-1-saeed%40kernel.org/T/

flow_offload patches:
https://lore.kernel.org/netdev/20220224102908.5255-1-jianbol%40nvidia.com/T/

My understanding, then, is that this is done upstream. Amir, can you please confirm?
We confirmed on the call today that this is considered a done feature by Nvidia.
So this is part of OVS 3.0 and will not be backported. So from an OVS perspective, this is done. Marcelo, I guess we can close this BZ, and a new (clone) can be opened for the kernel part.
(In reply to Marcelo Ricardo Leitner from comment #18)
> vswitchd patches:
> https://patchwork.ozlabs.org/project/openvswitch/cover/20220708095533.32489-
> 1-jianbol/

In OVS v3.0 per comment above.

> mlx5 driver patches:
> https://lore.kernel.org/netdev/20220702190213.80858-1-saeed%40kernel.org/T/

Landed in v6.0 upstream.
Targeted for 9.2: https://bugzilla.redhat.com/show_bug.cgi?id=2049629#c2
Targeted for 8.8: https://bugzilla.redhat.com/show_bug.cgi?id=2049622#c5

> flow_offload patches:
> https://lore.kernel.org/netdev/20220224102908.5255-1-jianbol%40nvidia.com/T/

In 9.2 via https://bugzilla.redhat.com/show_bug.cgi?id=2128185
In 8.7 via https://bugzilla.redhat.com/show_bug.cgi?id=2106271

Haresh, how do these versions work for OSP?
(In reply to Eelco Chaudron from comment #20)
> So this is part of OVS 3.0 and will not be backported. So from an OVS
> perspective, this is done.
>
> Marcelo, I guess we can close this BZ, and a new (clone) can be opened for
> the kernel part.

Perhaps by now we can convert this bz into a TestOnly one, so that we're sure we have a test at the OVS level for it.
There is a test case shared by Amir on the driver rebases, but I can't tell if QE would be using the right FDP OVS.
Jianlin, thoughts?
(In reply to Marcelo Ricardo Leitner from comment #22)
> (In reply to Eelco Chaudron from comment #20)
> > So this is part of OVS 3.0 and will not be backported. So from an OVS
> > perspective, this is done.
> >
> > Marcelo, I guess we can close this BZ, and a new (clone) can be opened for
> > the kernel part.
>
> Perhaps by now we can convert this bz into a TestOnly one, so that we're
> sure we have a test at OVS level for it.
> There is a test case shared by Amir on the driver rebases but I can't tell
> if QE would be using the right FDP OVS.
> Jianlin, thoughts?

qijun, would the ovs team help to add a case to cover this feature?
(In reply to Jianlin Shi from comment #23) > > qijun, would ovs team help to add case to cover this feature? When the feature is available to test I will add a test case to cover the feature. Thank you.
(In reply to Marcelo Ricardo Leitner from comment #21)
> (In reply to Marcelo Ricardo Leitner from comment #18)
> > vswitchd patches:
> > https://patchwork.ozlabs.org/project/openvswitch/cover/20220708095533.32489-
> > 1-jianbol/
>
> In OVS v3.0 per comment above.
>
> > mlx5 driver patches:
> > https://lore.kernel.org/netdev/20220702190213.80858-1-saeed%40kernel.org/T/
>
> Landed in v6.0 upstream.
> targeted for 9.2: https://bugzilla.redhat.com/show_bug.cgi?id=2049629#c2
> targeted for 8.8: https://bugzilla.redhat.com/show_bug.cgi?id=2049622#c5
>
> > flow_offload patches:
> > https://lore.kernel.org/netdev/20220224102908.5255-1-jianbol%40nvidia.com/T/
>
> in 9.2 via https://bugzilla.redhat.com/show_bug.cgi?id=2128185
> in 8.7 via https://bugzilla.redhat.com/show_bug.cgi?id=2106271
>
> Haresh, how do these versions work for OSP?

OSP 17.1 will have the fixed OVS and kernel then. Thanks.
Making this bz a TestOnly one per above.
Eelco, parking this with you. This just needs to be verified. fbl
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openvswitch3.1 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:4685
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days