Bug 2136716
| Summary: | ovn qos_max_rate can't be set to bigger than 200M when using a 25G mlx5 NIC with localnet port | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | LiLiang <liali> |
| Component: | ovn22.03 | Assignee: | Ilya Maximets <i.maximets> |
| Status: | CLOSED ERRATA | QA Contact: | LiLiang <liali> |
| Severity: | medium | Priority: | medium |
| Version: | FDP 22.J | CC: | ctrautma, fleitner, i.maximets, jhsiao, jiji, mmichels, qding, ralongi |
| Hardware: | x86_64 | OS: | Linux |
| Fixed In Version: | ovn22.03-22.03.0-120.el8fdp | Last Closed: | 2022-12-15 15:26:33 UTC |
| Type: | Bug | | |
Description
LiLiang
2022-10-21 03:31:42 UTC
This issue happens when using the NIC described in comment #0, but doesn't happen when using the NIC below:

    [root@dell-per740-23 ~]# ethtool -i ens3f0np0
    driver: mlx5_core
    version: 5.14.0-70.30.1.rt21.102.el9_0.x
    firmware-version: 22.34.4000 (MT_0000000359)
    expansion-rom-version:
    bus-info: 0000:5e:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: no
    supports-register-dump: no
    supports-priv-flags: yes

    [root@dell-per740-23 ~]# lspci -s 0000:5e:00.0 -v
    5e:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
        Subsystem: Mellanox Technologies Device 0016
        Flags: bus master, fast devsel, latency 0, IRQ 89, NUMA node 0, IOMMU group 81
        Memory at bc000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at b8800000 [disabled] [size=1M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [40] Power Management version 3
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [180] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1c0] Secondary PCI Express
        Capabilities: [230] Access Control Services
        Capabilities: [320] Lane Margining at the Receiver <?>
        Capabilities: [370] Physical Layer 16.0 GT/s <?>
        Capabilities: [420] Data Link Feature <?>
        Kernel driver in use: mlx5_core
        Kernel modules: mlx5_core

Using the tc command directly, I can set the NIC QoS rate to a value above 200M.

    # tc qdisc show dev ens1f1np1
    qdisc htb 1: root refcnt 641 r2q 10 default 0x1 direct_packets_stat 6 direct_qlen 1000
    # tc class add dev ens1f1np1 parent root classid 1:fffe htb prio 0 rate 200000000 ceil 3000000000
    # tc class show dev ens1f1np1
    class htb 1:fffe root prio 0 rate 200Mbit ceil 3Gbit burst 1600b cburst 1125b

This is an interesting case.
Since you showed that the same settings in OVN didn't work with one NIC (NIC A) but did work with another (NIC B), I immediately thought that this must be a tc issue, since you've proven that OVN is capable of configuring a NIC with the desired QoS. But then you showed that you can get the QoS settings to work properly by using tc directly. So tc is apparently capable of setting QoS properly on NIC A.

OVN always uses the same API calls to set QoS regardless of the NIC, and it worked properly for NIC B. Since this worked properly with NIC B, it should also mean that OVS has no issues getting these values passed to tc. I can only assume, then, that OVS is passing the values properly to tc with NIC A as well. So to me this still points to something being wrong with tc or the NIC driver itself.

I'm going to pass this issue over to the OVS team to begin with so that they can analyze the OVS code to ensure my assumptions are correct. If they are, then this likely needs to be punted down another layer.

OVN is basically using 2 functions to set up QoS:
1. netdev_set_qos(netdev, type, NULL);
2. netdev_set_queue(netdev_phy, sb_info->queue_id, &queue_details);

The first one configures the parent QoS class and the second configures each queue within that class. The key point here is the 'NULL' in the first call. It means that no configuration is provided for the class itself. Since there is no configuration provided, netdev_set_qos() will try to determine the max-rate on its own. The logic for that is to get the link speed of the port and use it as a global max-rate that caps the total of the max-rates of the individual queues. Sounds reasonable.

Here is the issue: OVS can't determine the link speed of the 25 Gbps link. There is no support for that. And it's not a bug. This may also happen for other reasons or with more exotic link speeds for which there are no known enumerations.
According to the documentation, OVS uses 100 Mbps instead when it is unable to determine the speed. The queue rates that OVN is trying to set are capped by that value, since the rate of an individual queue cannot be larger than the global rate for the class.

I have a patch in the works to increase the default value to 10 Gbps, and we can rework the link speed detection and add more enum items for other link speeds. But that doesn't mean that OVN is configuring QoS correctly. There are cases where the link speed cannot actually be determined; for example, tap interfaces always report a speed of 10 Mbps. So OVN will not be able to configure meaningful QoS values for such ports.

One way to fix this would be to take the sum of all the individual queue rates and use that value as the max-rate for the class, perhaps choosing a slightly higher value so that the class does not need to be reconfigured every time.

I'll move this back to OVN for now. I will also open an RFE for OVS to better support link speeds, though this will likely be OVS 3.1 material as it will require some internal rework and a move to new ethtool APIs.

(In reply to Ilya Maximets from comment #4)
> One way to fix will be to get the sum of all the individual queue
> rates and use that value as a max-rate for the class. Maybe
> choose a bit higher value to not re-configure the class every time.

That will probably not work, because not all the traffic should be limited. I suppose something like UINT64_MAX will be a better solution. Or whatever the maximum value tc will accept.

Based on Ilya's comment, I am re-classifying this issue as being an openvswitch issue.

(In reply to Mark Michelson from comment #6)
> Based on Ilya's comment, I am re-classifying this issue as being an
> openvswitch issue.

I would disagree with that. Yes, there are things we can change in OVS to make this work, but ultimately OVN doesn't configure QoS correctly in the first place.
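The capping behaviour described in comment #4 can be sketched roughly as follows. This is a simplified model and not OVS code: the function name `effective_queue_rate` and the constant name are hypothetical, and the only figure taken from the thread is the 100 Mbps fallback used when the link speed is unknown. It does not attempt to reproduce the exact 200M limit from the bug summary.

```python
# Simplified model of the behaviour described in the thread (not OVS code).
# Names below are hypothetical and exist only for illustration.

# Fallback class max-rate when link speed cannot be determined,
# as stated in the thread ("OVS is using 100 Mbps instead").
FALLBACK_CLASS_MAX_RATE_BPS = 100_000_000


def effective_queue_rate(requested_bps, link_speed_bps=None):
    """Return the queue rate after capping by the parent class max-rate.

    link_speed_bps=None models a link whose speed OVS cannot detect,
    such as the 25 Gbps link in this report.
    """
    if link_speed_bps is None:
        class_max = FALLBACK_CLASS_MAX_RATE_BPS
    else:
        class_max = link_speed_bps
    # A queue's rate cannot exceed the global rate of its class.
    return min(requested_bps, class_max)


# Unknown link speed: a 3 Gbps request is capped by the fallback.
capped = effective_queue_rate(3_000_000_000)
# Known 10 Gbps link: the same request passes through unchanged.
uncapped = effective_queue_rate(3_000_000_000, link_speed_bps=10_000_000_000)
print(capped, uncapped)
```

In this model, passing an explicit, very large class max-rate instead of relying on link speed detection removes the cap, which is the direction the follow-up comments suggest.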
I created an RFE for OVS to detect more link speeds here: BZ 2137567

I suggest moving the current BZ back to OVN, so this bug can be fixed without waiting for OVS 3.1+.

Moving back to OVN as suggested by Ilya.

Posted an OVN fix for review:
https://patchwork.ozlabs.org/project/ovn/patch/20221101140032.734440-1-i.maximets@ovn.org/

Ilya,
Is (2^32 - 1) * 8 equal to 32Gbps? Is this enough? Because there are 100Gbps NICs.
Liang.

(In reply to LiLiang from comment #11)
> Is (2^32 - 1) * 8 equal to 32Gbps ?
> Is this enough? Because there are 100Gbps NICs.

It's 34 Gbps, but yes, that might not be enough. Users are typically not using values that high though. There is an RFE to start using 64-bit netlink attributes that will allow configuring higher values: BZ 2137619. At the same time, OVN is currently limited to just 4 Gbps, not even 34. See BZ 2139100.

ovn22.03 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2144963
ovn22.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2144964
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2144965
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2144966
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2144967

I guess ovn22.03-22.03.0-120.el8fdp has not been built yet because I can't find it at https://download.eng.bos.redhat.com/brewroot/packages/ovn22.03/22.03.0/ ?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (ovn22.03 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:9059
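As a check on the figures in the exchange about (2^32 - 1) * 8, the arithmetic can be worked through as below. The sketch assumes, based on the multiplication by 8 in the question, that the 32-bit value counts bytes per second; the variable names are illustrative only.

```python
# Largest value that fits in an unsigned 32-bit field.
MAX_U32 = 2**32 - 1

# Assumption: the 32-bit rate is in bytes per second, so multiply by 8
# to express the ceiling in bits per second, as in the thread.
max_rate_bps = MAX_U32 * 8
max_rate_gbps = max_rate_bps / 1e9

print(f"{max_rate_bps} bit/s is about {max_rate_gbps:.2f} Gbps")
```

The result is roughly 34.36 Gbps, which matches the "34 Gbps" given in the reply and is well below the 100 Gbps NICs mentioned in the question.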