Bug 2119876 - [ENIC] Can't add a 25 Gb enic to OVS-DPDK
Summary: [ENIC] Can't add a 25 Gb enic to OVS-DPDK
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Kevin Traynor
QA Contact: Jean-Tsung Hsiao
URL:
Whiteboard:
Depends On:
Blocks: 2173805
Reported: 2022-08-19 18:05 UTC by Jean-Tsung Hsiao
Modified: 2024-01-17 10:07 UTC (History)

Fixed In Version: openvswitch3.2-3.2.0-10.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2236 0 None None None 2022-08-19 18:11:35 UTC

Description Jean-Tsung Hsiao 2022-08-19 18:05:20 UTC
Description of problem: [ENIC] Can't add a 25 Gb enic to OVS-DPDK

egrep 'opened|ERR|WARN' /var/log/openvswitch/ovs-vswitchd.log
2022-08-19T16:29:19.034Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2022-08-19T16:29:29.204Z|00023|dpdk|WARN|EAL: No available hugepages reported in hugepages-2048kB
2022-08-19T16:29:32.205Z|00029|dpdk|ERR|PMD: rte_enic_pmd: Devcmd 88 failed with error code -1
2022-08-19T16:29:32.442Z|00036|dpdk|ERR|PMD: rte_enic_pmd: Devcmd 88 failed with error code -1
2022-08-19T16:29:32.447Z|00043|timeval|WARN|Unreasonably long 3275ms poll interval (8ms user, 2695ms system)
2022-08-19T16:29:32.447Z|00044|timeval|WARN|faults: 7671 minor, 0 major
2022-08-19T16:29:32.447Z|00045|timeval|WARN|context switches: 144 voluntary, 40 involuntary
2022-08-19T16:29:37.685Z|00100|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.685Z|00102|netdev_dpdk|ERR|Interface dpdk-10(rxq:1 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.685Z|00103|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.685Z|00104|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:29:37.685Z|00105|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:29:37.910Z|00108|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.910Z|00110|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.910Z|00112|netdev_dpdk|ERR|Interface dpdk-10(rxq:1 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.910Z|00113|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.910Z|00114|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:29:37.910Z|00115|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:29:37.921Z|00118|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.921Z|00120|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.921Z|00122|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.921Z|00123|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.921Z|00124|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:29:37.921Z|00125|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:29:37.937Z|00128|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.937Z|00130|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.937Z|00132|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.937Z|00133|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.937Z|00134|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:29:37.937Z|00135|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:29:37.960Z|00147|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.960Z|00149|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.960Z|00151|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.960Z|00152|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.960Z|00153|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:29:37.963Z|00156|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.964Z|00158|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.964Z|00160|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.964Z|00161|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.980Z|00174|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:37.980Z|00176|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:37.980Z|00178|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:37.980Z|00179|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:37.982Z|00185|netdev_dpdk|WARN|dpdkvhostuser ports are considered deprecated;  please migrate to dpdkvhostuserclient ports.
2022-08-19T16:29:38.547Z|00190|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:38.547Z|00192|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:38.547Z|00194|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:38.547Z|00195|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:29:38.547Z|00197|dpif|WARN|Dropped 3 log messages in last 0 seconds (most recently, 0 seconds ago) due to excessive rate
2022-08-19T16:29:38.547Z|00198|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:29:38.555Z|00202|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:29:38.555Z|00204|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:29:38.555Z|00206|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:29:38.555Z|00207|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:10.879Z|00214|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:10.879Z|00216|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:10.879Z|00218|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:10.879Z|00219|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:10.879Z|00221|dpif|WARN|Dropped 1 log messages in last 93 seconds (most recently, 93 seconds ago) due to excessive rate
2022-08-19T16:31:10.879Z|00222|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:10.879Z|00223|bridge|WARN|Dropped 4 log messages in last 93 seconds (most recently, 93 seconds ago) due to excessive rate
2022-08-19T16:31:10.879Z|00224|bridge|WARN|could not add network device dpdk-10 to ofproto (Invalid argument)
2022-08-19T16:31:10.912Z|00228|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:10.912Z|00230|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:10.912Z|00232|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:10.912Z|00233|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:10.912Z|00235|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:11.014Z|00239|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:11.014Z|00241|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:11.014Z|00243|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:11.014Z|00244|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:11.014Z|00246|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:12.968Z|00250|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:12.968Z|00252|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:12.968Z|00254|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:12.968Z|00255|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:12.968Z|00257|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:15.023Z|00261|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:15.023Z|00263|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:15.023Z|00265|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:15.023Z|00266|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:15.023Z|00268|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:23.742Z|00276|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:23.742Z|00278|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:23.742Z|00280|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:23.742Z|00281|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:23.742Z|00286|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument
2022-08-19T16:31:23.747Z|00289|dpdk|WARN|PMD: rte_enic_pmd: MTU (9000) is greater than value configured in NIC (1500)
2022-08-19T16:31:23.747Z|00291|dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
2022-08-19T16:31:23.747Z|00293|netdev_dpdk|ERR|Interface dpdk-10(rxq:2 txq:1 lsc interrupt mode:false) configure error: Invalid argument
2022-08-19T16:31:23.747Z|00294|dpif_netdev|ERR|Failed to set interface dpdk-10 new configuration
2022-08-19T16:31:23.747Z|00299|dpif|WARN|netdev@ovs-netdev: failed to add dpdk-10 as port: Invalid argument

https://beaker.engineering.redhat.com/jobs/6927998
https://beaker.engineering.redhat.com/jobs/6927671

Version-Release number of selected component (if applicable):
FDP 22.G:

Rhel8/OVS 2.15
Rhel9/OVS 2.17


How reproducible: Reproducible


Steps to Reproduce: Add 25 Gb ENIC to OVS-DPDK

Actual results:


Expected results:


Additional info:

Comment 2 Flavio Leitner 2023-02-03 18:14:02 UTC
Re: "dpdk|ERR|PMD: rte_enic_pmd: Devcmd 88 failed with error code -1"

It affects only 2.15. The 2.17 dpdk contains the fix below:

https://inbox.dpdk.org/dev/20211026000418.13540-1-hyonkim@cisco.com/T/

Probing the availability of Flow Manager API may print the following
error log.

PMD: rte_enic_pmd: Devcmd 88 failed with error code -1

The error indicates a flow manager operation failed and happens when
advanced filtering is disabled on vNIC. It is harmless but confusing
to the user. Since advanced filtering is a prerequisite, check first
if it is available and avoid the error message altogether.

Fixes: ea7768b5bba8 ("net/enic: add flow implementation based on Flow Manager API")
Cc: stable

----

Re: dpdk|ERR|Invalid value for nb_tx_desc(=2048), should be: <= 256, >= 64, and a product of 32
Seems to come from:
https://github.com/DPDK/dpdk/blob/bc1db4f45af35c87e7d97db7a24d479674aa8a43/lib/ethdev/rte_ethdev.c#L1960
We need to find out why it is 2048.


fbl

Comment 3 Kevin Traynor 2023-02-07 18:03:06 UTC
Regarding "dpdk|ERR|PMD: rte_enic_pmd: Devcmd 88 failed with error code -1"

The fix for this was backported upstream to DPDK 20.11.4.
https://git.dpdk.org/dpdk-stable/commit/?h=20.11&id=4c5c31b120f1113de28ed53eff31cfd5651e8c94

For rhel8/2.15 the fix is available in FDP 22.I with openvswitch2.15-2.15.0-122.el8fdp which contains DPDK v20.11.6.

For rhel9/2.17, the fix has been included since the initial version of OVS 2.17, so the error should not appear there.

----

Regarding the 2048 issue: that is the default tx queue size in OVS since OVS 2.7. Either something changed in the enic driver/firmware that reduced what is allowed, or the driver has only recently started checking for a too-large value. A possible fix would be for OVS to read the max size from the driver and limit what it configures based on this.

For now, a workaround for enic is to set the tx queue size to a suitable value when adding the enic port, e.g.:
$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=<enic pci> -- set Interface dpdk0 options:n_txq_desc=256
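
To see why 256 passes where the 2048 default fails, the check behind the "Invalid value for nb_tx_desc" message can be sketched in Python. This is an illustration only, not the DPDK code: the function name `desc_count_is_valid` is made up here, and the limits (max 256, min 64, multiple of 32) are taken from the error log in the description.

```python
def desc_count_is_valid(nb_desc, nb_min, nb_max, nb_align):
    """Return True if nb_desc lies in [nb_min, nb_max] and is a multiple of nb_align.

    Hypothetical helper for illustration; nb_align is assumed to be >= 1.
    """
    return nb_min <= nb_desc <= nb_max and nb_desc % nb_align == 0


# Limits reported for this enic port in the error log (assumed to apply to tx).
ENIC_LIMITS = dict(nb_min=64, nb_max=256, nb_align=32)

for candidate in (2048, 256, 100):
    ok = desc_count_is_valid(candidate, **ENIC_LIMITS)
    print(f"n_txq_desc={candidate}: {'accepted' if ok else 'rejected'}")
```

With these limits, 2048 is rejected for exceeding the maximum, 100 is rejected because it is not a multiple of 32, and 256 is accepted.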

Comment 4 Jean-Tsung Hsiao 2023-02-08 18:01:32 UTC
Hi Kevin,
Yes, the "Devcmd 88 failed with error code -1" issue is gone.
And, the desc workaround works.
See log below.
Thanks!
Jean

[root@netqe37 jhsiao]# ovs-vsctl show
f4711e8a-cbea-4112-bf6b-a585bb878ddc
    Bridge ovsbr0
        datapath_type: netdev
        Port dpdk-10
            Interface dpdk-10
                type: dpdk
                options: {dpdk-devargs="0000:1d:00.0", n_rxq="1", n_rxq_desc="256", n_txq_desc="256"}
        Port dpdk-11
            Interface dpdk-11
                type: dpdk
                options: {dpdk-devargs="0000:1d:00.2", n_rxq="1", n_rxq_desc="256", n_txq_desc="256"}
        Port ovsbr0
            Interface ovsbr0
                type: internal
    ovs_version: "2.15.8"
[root@netqe37 jhsiao]# uname -r
4.18.0-372.41.1.el8_6.x86_64
[root@netqe37 jhsiao]# rpm -q openvswitch2.15
openvswitch2.15-2.15.0-133.el8fdp.x86_64
[root@netqe37 jhsiao]#

Comment 5 Kevin Traynor 2023-02-10 11:27:26 UTC
Patch [1] sent to adjust number of descriptors based on device limits where required:

    netdev-dpdk: Check rx/tx descriptor sizes for device.
    
    By default OVS configures 2048 descriptors for tx and rx queues
    on DPDK devices. It also allows the user to configure those values.
    
    If the values used are not acceptable to the device then queue setup
    will fail.
    
    The device exposes its max/min/alignment requirements, so use those
    to ensure that an acceptable value is used during queue setup.
    
    If the default or user value is not acceptable, adjust to a suitable
    value.


[1] https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/401935.html
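
The adjustment the patch describes can be sketched roughly as follows. This is a hypothetical Python illustration, not the OVS code: `adjust_desc_count` and its rounding policy (clamp to the range, then round down to the alignment) are assumptions, and the example limits come from the error log in the description.

```python
def adjust_desc_count(requested, nb_min, nb_max, nb_align):
    """Fit a requested descriptor count to a device's min/max/alignment limits.

    Hypothetical helper for illustration; nb_align is assumed to be >= 1.
    """
    # Clamp the requested value into the device's [nb_min, nb_max] range.
    adjusted = max(nb_min, min(requested, nb_max))
    # Round down to a multiple of nb_align.
    adjusted -= adjusted % nb_align
    # Rounding down may have dropped below nb_min; step up one alignment unit.
    if adjusted < nb_min:
        adjusted += nb_align
    # Inconsistent limits can leave no valid value at all.
    if adjusted > nb_max:
        raise ValueError("device limits leave no valid descriptor count")
    return adjusted


# OVS default of 2048 against the limits from the error log.
print(adjust_desc_count(2048, nb_min=64, nb_max=256, nb_align=32))
```

Rounding down rather than up is a design choice made for this sketch so that the result never exceeds what was requested unless the device minimum forces it.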

Comment 8 Kevin Traynor 2023-03-15 17:56:49 UTC
There are some changes [0] on DPDK main branch that might be needed to get the correct max value for the devices. I have asked upstream maintainers to comment [1].

Can you paste the error logs you see for Rx so we can show upstream maintainers?

Thanks.

[0]
commit 22572e84fbda2c195707ffbb0dd6af4433d7a219
Author: John Daley <johndale>
Date:   Fri Jan 28 09:58:13 2022 -0800

    net/enic: support max descriptors allowed by adapter

[1] https://bugs.dpdk.org/show_bug.cgi?id=1185

Comment 9 John Daley 2023-03-15 18:29:34 UTC
If this is a new Cisco 25GB adapter, it probably needs to be configured to increase the number of descriptors and the number of descriptor queues. Please see https://doc.dpdk.org/guides/nics/enic.html#configuration-information. You will need to use the Cisco UCS manager to change the configuration (either UCSM or CIMC, depending on whether you are using a blade or a rack server). You can overprovision the adapter with more queues and descriptors than you think you will need, e.g.:

RQs: 64
WQs: 32
CQs: 64
Interrupts: 33
RQ descriptors: 4096
WQ descriptors: 4096

You should also enable advanced filters. There will be a radio button to enable it. After making the configuration changes you will need to reboot the host.

Comment 10 Kevin Traynor 2023-05-16 14:02:35 UTC
Submitted v4 patch upstream

https://mail.openvswitch.org/pipermail/ovs-dev/2023-May/404590.html

Comment 12 Kevin Traynor 2023-06-16 09:26:18 UTC
Patch to check device for descriptor limits has merged upstream [0] and will be part of OVS 3.2.

This should prevent queue setup failures on the NIC (or any NIC) because they are configured with a max tx/rx queue size of <2048.

It may still be the case that NICs configured with low max queue sizes are more susceptible to dropping packets because of lack of buffering, and the NIC will have to be configured to have a larger max.

[0] https://github.com/openvswitch/ovs/commit/9dad8dfd1ed9e1c4629b584b477114e11f3556b7

   netdev-dpdk: Check rx/tx descriptor sizes for device.

   By default OVS configures 2048 descriptors for tx and rx queues
   on DPDK devices. It also allows the user to configure those values.

   If the values used are not acceptable to the device then queue setup
   would fail.

   The device exposes its max/min/alignment requirements and OVS
   applies some limits also. Use these to ensure an acceptable value
   is used for the number of descriptors on a device tx/rx.

   If the default or user value is not acceptable, adjust to a suitable
   value and log.

   Reported-at: https://bugzilla.redhat.com/2119876
   Reviewed-by: David Marchand <david.marchand>
   Reviewed-by: Simon Horman <simon.horman>
   Signed-off-by: Kevin Traynor <ktraynor>
   Signed-off-by: Ilya Maximets <i.maximets>

Comment 13 Kevin Traynor 2024-01-17 10:07:13 UTC
The enic maintainer confirmed in comment #9 that the Cisco 25GB adapter may have default max queue sizes smaller than the default value that OVS attempts to configure. To increase the NIC max size values, Cisco UCS manager must be used.

To accommodate NICs that have a lower max queue size or other requirements, OVS now reads the descriptor size requirements from the NIC and adapts the value it configures based on this. This merged as part of OVS 3.2.

