Description of problem:
Adding a bnxt_en DPDK port to an OVS-DPDK bridge fails with openvswitch2.17 on RHEL 9.

Version-Release number of selected component (if applicable):
[root@netqe22 perf]# rpm -qa|grep openvs
kernel-kernel-networking-openvswitch-perf-1.0-235.noarch
openvswitch-selinux-extra-policy-1.0-29.el9fdp.noarch
openvswitch2.17-2.17.0-0.2.el9fdp.x86_64

[root@netqe22 perf]# ethtool -i enp130s0f0np0
driver: bnxt_en
version: 5.14.0-52.el9.x86_64
firmware-version: 20.6.143.0/pkg 20.06.04.06
expansion-rom-version:
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

[root@netqe22 perf]# uname -r
5.14.0-52.el9.x86_64

How reproducible:

Steps to Reproduce:
1. Add the OVS bridge:
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="0,4098"
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=800000800000
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev

2. Bind the bnxt_en card to DPDK:
[root@netqe22 perf]# driverctl -v set-override 0000:82:00.0 vfio-pci
driverctl: setting driver override for 0000:82:00.0: vfio-pci
driverctl: loading driver vfio-pci
driverctl: unbinding previous driver bnxt_en
driverctl: reprobing driver for 0000:82:00.0
driverctl: saving driver override for 0000:82:00.0

3. Add the DPDK port to ovsbr0:
[root@netqe22 perf]# ovs-vsctl add-port ovsbr0 dpdk0 -- set Interface dpdk0 type=dpdk type=dpdk options:dpdk-devargs=0000:82:00.0
ovs-vsctl: Error detected while setting up 'dpdk0': Error attaching device '0000:82:00.0' to DPDK. See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".
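One detail in the reproduction steps worth decoding: pmd-cpu-mask is a hexadecimal bitmask of CPU core IDs for the OVS PMD threads. A small sketch of mine (not part of the original report) showing which cores the mask above selects:

```python
# Decode an OVS pmd-cpu-mask hex bitmask into the CPU core IDs it selects.
def pmd_mask_to_cpus(mask_hex):
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]

# The mask from the reproduction steps pins PMD threads to cores 23 and 47,
# one core per NUMA node on this dual-socket machine.
print(pmd_mask_to_cpus("800000800000"))  # [23, 47]
```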
Actual results:
Adding the DPDK port to the OVS bridge failed.

[root@netqe22 ~]# tail -f /var/log/openvswitch/ovs-vswitchd.log
2022-02-17T07:16:14.737Z|00083|dpdk|INFO|EAL: Using IOMMU type 1 (Type 1)
2022-02-17T07:16:14.998Z|00084|dpdk|INFO|EAL: Probe PCI driver: net_bnxt (14e4:16d7) device: 0000:82:00.0 (socket 1)
2022-02-17T07:16:14.998Z|00085|dpdk|ERR|ethdev initialisation failed
2022-02-17T07:16:14.998Z|00086|dpdk|INFO|EAL: Releasing PCI mapped resource for 0000:82:00.0
2022-02-17T07:16:14.998Z|00087|dpdk|INFO|EAL: Calling pci_unmap_resource for 0000:82:00.0 at 0x4201000000
2022-02-17T07:16:14.998Z|00088|dpdk|INFO|EAL: Calling pci_unmap_resource for 0000:82:00.0 at 0x4201010000
2022-02-17T07:16:14.998Z|00089|dpdk|INFO|EAL: Calling pci_unmap_resource for 0000:82:00.0 at 0x4201110000
2022-02-17T07:16:15.198Z|00090|dpdk|ERR|EAL: Driver cannot attach the device (0000:82:00.0)
2022-02-17T07:16:15.198Z|00091|dpdk|ERR|EAL: Failed to attach device on primary process
2022-02-17T07:16:15.198Z|00092|netdev_dpdk|WARN|Error attaching device '0000:82:00.0' to DPDK
2022-02-17T07:16:15.198Z|00093|netdev|WARN|dpdk0: could not set configuration (Invalid argument)
2022-02-17T07:16:15.198Z|00094|dpdk|ERR|Invalid port_id=128

Expected results:
The DPDK port is added to the OVS bridge successfully.

Additional info:
https://beaker.engineering.redhat.com/jobs/6311938
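For quick triage it helps to filter the ovs-vswitchd log down to its ERR entries, since they form the failure chain (ethdev init fails, so EAL cannot attach the device). A small illustrative parser I wrote, operating on a few of the excerpted lines (not part of the original report):

```python
import re

# A short sample from the ovs-vswitchd log excerpt above.
LOG = """\
2022-02-17T07:16:14.737Z|00083|dpdk|INFO|EAL: Using IOMMU type 1 (Type 1)
2022-02-17T07:16:14.998Z|00085|dpdk|ERR|ethdev initialisation failed
2022-02-17T07:16:15.198Z|00090|dpdk|ERR|EAL: Driver cannot attach the device (0000:82:00.0)
2022-02-17T07:16:15.198Z|00092|netdev_dpdk|WARN|Error attaching device '0000:82:00.0' to DPDK
2022-02-17T07:16:15.198Z|00094|dpdk|ERR|Invalid port_id=128
"""

# ovs-vswitchd log lines look like: timestamp|sequence|module|LEVEL|message
LINE_RE = re.compile(r"^(?P<ts>\S+)\|\d+\|(?P<mod>\w+)\|(?P<lvl>\w+)\|(?P<msg>.*)$")

def errors(log):
    """Yield (module, message) pairs for ERR-level lines only."""
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("lvl") == "ERR":
            yield m.group("mod"), m.group("msg")

for mod, msg in errors(LOG):
    print(f"{mod}: {msg}")
```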
Thanks. This looks like a regression in DPDK v21.11, introduced by:
3972281f47b2 ("net/bnxt: fix device readiness check")
Reverting this patch makes init pass fine for me. I scheduled a scratch build for you to test (but brew seems to be taking a long time...).
Has this problem been reproduced on RHEL 8?
I updated the firmware, and it works well with openvswitch2.17-2.17.0-0.2.el9fdp.x86_64.

[root@netqe22 ~]# ethtool -i enp130s0f0np0
driver: bnxt_en
version: 4.18.0-369.el8.x86_64
firmware-version: 220.0.59.0/pkg 220.0.83.0
expansion-rom-version:
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
After updating the firmware, I ran the OVS-DPDK PVP performance job for ovs2.17 again. It still got no result: https://beaker.engineering.redhat.com/jobs/6418733
From the test log, the DPDK port seems unable to receive any packets.
I ran the OVS-DPDK PVP performance job with openvswitch2.17-2.17.0-0.4.el9fdp. It also got a 0 result: https://beaker.engineering.redhat.com/jobs/6421187
It works well with openvswitch2.15-2.15.0-42.el9fdp: https://beaker.engineering.redhat.com/jobs/6421135
I can't do anything with the Beaker logs: I see no DPDK or OVS commands, so it's hard to tell what is what. There are no usable logs. To save time, please provide an environment and a way to reproduce the issue, and I will have a look.
I upgraded the i40e firmware of netqe22; the case still does not work well. You can continue to access the machines for debugging. Thanks.

[root@netqe32 ~]# ethtool -i ens3f0
driver: i40e
version: 4.18.0-305.el8.x86_64
firmware-version: 8.50 0x8000b6ed 1.3082.0
expansion-rom-version:
bus-info: 0000:5e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
Links are not even going up with the kernel netdevs. I logged on to the systems and I see the following. On the DUT, the kernel netdev won't stay up:

[ 6402.875392] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[ 6402.886050] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 6404.113651] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[ 6404.116904] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[ 6404.127553] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 9932.291741] bnxt_en 0000:82:00.0 enp130s0f0np0: unsupported speed!
[11172.415616] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
[11172.428238] bnxt_en 0000:82:00.1 enp130s0f1np1: FEC autoneg off encoding: None
[11172.436318] IPv6: ADDRCONF(NETDEV_CHANGE): enp130s0f1np1: link becomes ready
[11225.435240] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Down
[11267.953958] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
[11617.480252] bnxt_en 0000:82:00.0 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca210000, node addr 00:0a:f7:b7:09:50
[11617.483667] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[11617.494142] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[11617.525693] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[11617.529025] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[11617.539590] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[11734.009748] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
[14286.494915] bnxt_en 0000:82:00.0 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca210000, node addr 00:0a:f7:b7:09:50
[14286.497770] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[14286.508805] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[14286.540300] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[14286.543631] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[14286.554194] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)

On the tester side, there are warnings:

[ 9.565143] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 9.565144] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 9.586715] i40e 0000:5e:00.0: fw 8.5.67516 api 1.15 nvm 8.50 0x8000b6ed 1.3082.0 [8086:1572] [8086:0007]
[ 9.596288] i40e 0000:5e:00.0: The driver for the device detected a newer version of the NVM image v1.15 than expected v1.9. Please install the most recent version of the network driver.
[ 9.923572] i40e 0000:5e:00.0: MAC address: 3c:fd:fe:ad:7b:4c
[ 9.929457] i40e 0000:5e:00.0: FW LLDP is enabled
[ 9.936525] i40e 0000:5e:00.0: Query for DCB configuration failed, err I40E_ERR_NOT_READY aq_err OK
[ 9.945565] i40e 0000:5e:00.0: DCB init failed -63, disabled
[ 10.012278] i40e 0000:5e:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 10.018876] i40e 0000:5e:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 56 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
[ 10.311717] i40e 0000:5e:00.1: fw 8.5.67516 api 1.15 nvm 8.50 0x8000b6ed 1.3082.0 [8086:1572] [8086:0000]
[ 10.311719] i40e 0000:5e:00.1: The driver for the device detected a newer version of the NVM image v1.15 than expected v1.9. Please install the most recent version of the network driver.
[ 10.879484] i40e 0000:5e:00.1: MAC address: 3c:fd:fe:ad:7b:4d
[ 10.879744] i40e 0000:5e:00.1: FW LLDP is enabled
[ 10.880783] i40e 0000:5e:00.1: Query for DCB configuration failed, err I40E_ERR_NOT_READY aq_err OK
[ 10.880784] i40e 0000:5e:00.1: DCB init failed -63, disabled
[ 10.898577] i40e 0000:5e:00.1: PCI-Express: Speed 8.0GT/s Width x8
[ 10.898996] i40e 0000:5e:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 56 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA

I logged into the trex console, and I see:

owner | root | root |
link  | DOWN | DOWN |

You mentioned that no cable was changed, OK, but the issue could be an SFP or a cable that broke recently.
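To make the link flapping easier to see in the DUT dmesg above, here is a small tally script I wrote for triage (illustrative only, operating on a few of the excerpted lines; not part of the original report):

```python
import re
from collections import Counter

# A few representative lines from the DUT dmesg excerpt above.
DMESG = """\
[ 9932.291741] bnxt_en 0000:82:00.0 enp130s0f0np0: unsupported speed!
[11172.415616] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
[11225.435240] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Down
[11267.953958] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
[11734.009748] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
"""

# Count link-related events per interface.
EVENT_RE = re.compile(
    r"bnxt_en \S+ (?P<ifc>\S+): "
    r"(?P<event>NIC Link is Up|NIC Link is Down|unsupported speed!)"
)

counts = Counter()
for m in EVENT_RE.finditer(DMESG):
    counts[(m.group("ifc"), m.group("event"))] += 1

for (ifc, event), n in sorted(counts.items()):
    print(f"{ifc}: {event} x{n}")
```

The repeated "unsupported speed!" entries on both ports stand out immediately with a tally like this.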
That's strange, since I didn't make any changes to the cable. I will try running the ovs2.15 job.
I ran the ovs2.15 and ovs2.17 jobs again on netqe22/netqe32. The only difference between the two jobs is the OVS version. ovs2.15 works well, and ovs2.17 does not.

ovs2.15 job:
https://beaker.engineering.redhat.com/jobs/6447129
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/03/64471/6447129/11700996/141928037/bnxt_10.html

ovs2.17 job:
https://beaker.engineering.redhat.com/jobs/6447257
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/03/64472/6447257/11701198/141930043/bnxt_10.html

I also changed the traffic sender from T-Rex to Xena, and it still got no result: https://beaker.engineering.redhat.com/jobs/6448932

Do you still need the test environment? I'll prepare it if necessary.
For fdp22c (openvswitch2.17-2.17.0-7.el9fdp), this issue still exists: https://beaker.engineering.redhat.com/jobs/6490673
For fdp22d (openvswitch2.17-2.17.0-15.el9fdp.x86_64.rpm), after changing the traffic sender to Xena, this issue still exists: https://beaker.engineering.redhat.com/jobs/6614285
For openvswitch2.17-2.17.0-18.el9fdp.x86_64.rpm, this issue still exists: https://beaker.engineering.redhat.com/jobs/6743770
For fdp22f (openvswitch2.17-2.17.0-30.el9fdp.x86_64), this issue still exists: https://beaker.engineering.redhat.com/jobs/6871236
The issue does not exist on the 25g bnxt_en card: it works well on the 730-52 25g bnxt_en card and only occurs on the netqe22 10g bnxt_en card. The firmware is the same on the 10g and 25g cards.
https://beaker.engineering.redhat.com/jobs/6875119

The 10g bnxt_en card also works well with ovs2.15 and ovs2.16; only ovs2.17 fails on it.
https://beaker.engineering.redhat.com/jobs/6875032
https://beaker.engineering.redhat.com/jobs/6872604

I also checked that "LLDP nearest bridge" is disabled in the BIOS settings on the netqe22 10g card.
This issue no longer occurs with openvswitch2.17-2.17.0-70.el9fdp.x86_64.rpm. The following is the job on another 25g bnxt_en card, so I am closing this bug.
https://beaker.engineering.redhat.com/jobs/8328149
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days