Bug 2055531
Summary: | bnxt_en card: add dpdk port to ovs bridge failed with ovs2.17 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | liting <tli> |
Component: | openvswitch2.17 | Assignee: | David Marchand <dmarchan> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | liting <tli> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | FDP 22.A | CC: | ansaini, cgoncalves, ctrautma, dmarchan, jhsiao, jpradhan, ralongi |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-10-11 07:42:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
liting
2022-02-17 07:26:33 UTC
Thanks, this seems like a regression in DPDK v21.11, introduced by: 3972281f47b2 ("net/bnxt: fix device readiness check"). Reverting this patch makes init pass fine for me. I scheduled a scratch build for you to test (but brew seems to be taking a long time).

Has this problem been reproduced on RHEL 8?

I updated the firmware, and it works well with openvswitch2.17-2.17.0-0.2.el9fdp.x86_64.

[root@netqe22 ~]# ethtool -i enp130s0f0np0
driver: bnxt_en
version: 4.18.0-369.el8.x86_64
firmware-version: 220.0.59.0/pkg 220.0.83.0
expansion-rom-version:
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

After updating the firmware, I ran the OVS-DPDK PVP performance job of ovs2.17 again. It still got no result: https://beaker.engineering.redhat.com/jobs/6418733 From the test log, the dpdk port seems unable to receive any packets.

I ran the OVS-DPDK PVP performance job of openvswitch2.17-2.17.0-0.4.el9fdp. It also got a 0 result: https://beaker.engineering.redhat.com/jobs/6421187 It works well with openvswitch2.15-2.15.0-42.el9fdp: https://beaker.engineering.redhat.com/jobs/6421135

I can do nothing with the beaker logs: I can see no DPDK or OVS commands, so it's hard to tell what is what. There are no usable logs. To save time, please provide an environment and a way to reproduce the issue, and I will have a look.

I upgraded the i40e firmware of netqe22; the case still does not work well. You can continue to access the machines for debugging, thanks.

[root@netqe32 ~]# ethtool -i ens3f0
driver: i40e
version: 4.18.0-305.el8.x86_64
firmware-version: 8.50 0x8000b6ed 1.3082.0
expansion-rom-version:
bus-info: 0000:5e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

Links are not even going up with kernel netdevs. I logged on to the systems and I see the following.
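For reference, the failing operation from the summary (adding a DPDK port to an OVS bridge, as exercised by the PVP job) boils down to a sequence like the following. This is a generic configuration sketch, not the exact Beaker test script; the bridge name, port name, and PCI address are placeholders (the PCI address matches the bnxt_en device reported above):

```shell
# Generic OVS-DPDK port setup sketch (placeholder names), not the exact
# commands used by the Beaker PVP job.
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
    options:dpdk-devargs=0000:82:00.0
# If the bnxt PMD fails to initialize the device, the error surfaces here:
ovs-vsctl get Interface dpdk0 error
```

With the v21.11 regression above, the `add-port` step is where device initialization fails.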
On the DUT, I see that the kernel netdev won't stay up:

[ 6402.875392] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[ 6402.886050] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 6404.113651] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[ 6404.116904] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[ 6404.127553] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 9932.291741] bnxt_en 0000:82:00.0 enp130s0f0np0: unsupported speed!
[11172.415616] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
[11172.428238] bnxt_en 0000:82:00.1 enp130s0f1np1: FEC autoneg off encoding: None
[11172.436318] IPv6: ADDRCONF(NETDEV_CHANGE): enp130s0f1np1: link becomes ready
[11225.435240] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Down
[11267.953958] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
[11617.480252] bnxt_en 0000:82:00.0 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca210000, node addr 00:0a:f7:b7:09:50
[11617.483667] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[11617.494142] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[11617.525693] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[11617.529025] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[11617.539590] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[11734.009748] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
[14286.494915] bnxt_en 0000:82:00.0 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca210000, node addr 00:0a:f7:b7:09:50
[14286.497770] bnxt_en 0000:82:00.0 enp130s0f0np0: renamed from eth0
[14286.508805] bnxt_en 0000:82:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[14286.540300] bnxt_en 0000:82:00.1 eth0: Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet found at mem ca200000, node addr 00:0a:f7:b7:09:51
[14286.543631] bnxt_en 0000:82:00.1 enp130s0f1np1: renamed from eth0
[14286.554194] bnxt_en 0000:82:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)

On the tester side, there are warnings:

[    9.565143] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[    9.565144] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[    9.586715] i40e 0000:5e:00.0: fw 8.5.67516 api 1.15 nvm 8.50 0x8000b6ed 1.3082.0 [8086:1572] [8086:0007]
[    9.596288] i40e 0000:5e:00.0: The driver for the device detected a newer version of the NVM image v1.15 than expected v1.9. Please install the most recent version of the network driver.
[    9.923572] i40e 0000:5e:00.0: MAC address: 3c:fd:fe:ad:7b:4c
[    9.929457] i40e 0000:5e:00.0: FW LLDP is enabled
[    9.936525] i40e 0000:5e:00.0: Query for DCB configuration failed, err I40E_ERR_NOT_READY aq_err OK
[    9.945565] i40e 0000:5e:00.0: DCB init failed -63, disabled
[   10.012278] i40e 0000:5e:00.0: PCI-Express: Speed 8.0GT/s Width x8
[   10.018876] i40e 0000:5e:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 56 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
[   10.311717] i40e 0000:5e:00.1: fw 8.5.67516 api 1.15 nvm 8.50 0x8000b6ed 1.3082.0 [8086:1572] [8086:0000]
[   10.311719] i40e 0000:5e:00.1: The driver for the device detected a newer version of the NVM image v1.15 than expected v1.9. Please install the most recent version of the network driver.
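The link instability on the DUT can be spotted quickly by grepping the kernel log for the bnxt_en error patterns. A minimal sketch, using sample lines copied from the DUT dmesg output above (on a live system you would pipe `dmesg` instead of using the heredoc):

```shell
# Count bnxt_en link problems; the sample lines below are copied from the
# DUT dmesg output in this report. On a live system, pipe dmesg into grep.
cat <<'EOF' > /tmp/bnxt_dmesg.txt
[ 9932.291741] bnxt_en 0000:82:00.0 enp130s0f0np0: unsupported speed!
[11172.415616] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
[11225.435240] bnxt_en 0000:82:00.1 enp130s0f1np1: NIC Link is Down
[11267.953958] bnxt_en 0000:82:00.1 enp130s0f1np1: unsupported speed!
EOF
grep -c 'unsupported speed!' /tmp/bnxt_dmesg.txt   # -> 2
grep -c 'NIC Link is Down' /tmp/bnxt_dmesg.txt     # -> 1
```

A nonzero "unsupported speed!" count on an otherwise idle port is the signature seen here, distinguishing a link/firmware problem from a pure OVS/DPDK configuration failure.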
[   10.879484] i40e 0000:5e:00.1: MAC address: 3c:fd:fe:ad:7b:4d
[   10.879744] i40e 0000:5e:00.1: FW LLDP is enabled
[   10.880783] i40e 0000:5e:00.1: Query for DCB configuration failed, err I40E_ERR_NOT_READY aq_err OK
[   10.880784] i40e 0000:5e:00.1: DCB init failed -63, disabled
[   10.898577] i40e 0000:5e:00.1: PCI-Express: Speed 8.0GT/s Width x8
[   10.898996] i40e 0000:5e:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 56 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA

I logged into the trex console, and I see:

owner | root | root
link | DOWN | DOWN

You mentioned that no cable was changed, OK, but the issue could be an SFP or a cable that broke recently.

That's strange, since I didn't make any changes to the cable. I will try running the ovs2.15 job.

I ran the ovs2.15 and ovs2.17 jobs again on netqe22/netqe32. The only difference between the two jobs is the OVS version: ovs2.15 works well, and ovs2.17 does not work.

ovs2.15 job:
https://beaker.engineering.redhat.com/jobs/6447129
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/03/64471/6447129/11700996/141928037/bnxt_10.html

ovs2.17 job:
https://beaker.engineering.redhat.com/jobs/6447257
https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2022/03/64472/6447257/11701198/141930043/bnxt_10.html

I also changed the traffic sender from T-Rex to Xena, and it still could not get a result: https://beaker.engineering.redhat.com/jobs/6448932 Do you still need the test environment? I'll prepare it if necessary.

For fdp22c (openvswitch2.17-2.17.0-7.el9fdp), the issue is still present: https://beaker.engineering.redhat.com/jobs/6490673

For fdp22d (openvswitch2.17-2.17.0-15.el9fdp.x86_64.rpm), with the traffic sender changed to Xena, the issue is still present: https://beaker.engineering.redhat.com/jobs/6614285

For openvswitch2.17-2.17.0-18.el9fdp.x86_64.rpm, the issue still exists: https://beaker.engineering.redhat.com/jobs/6743770

For fdp22f (openvswitch2.17-2.17.0-30.el9fdp.x86_64), the issue still exists:
https://beaker.engineering.redhat.com/jobs/6871236

The issue does not exist on the 25G bnxt_en card: it works well on the 730-52 25G bnxt_en card and only exists on the netqe22 10G bnxt_en card. The firmware is the same between the 10G and 25G cards. https://beaker.engineering.redhat.com/jobs/6875119

It also works well with ovs2.15 and ovs2.16 on the 10G bnxt_en card; only ovs2.17 fails on the 10G bnxt_en card.
https://beaker.engineering.redhat.com/jobs/6875032
https://beaker.engineering.redhat.com/jobs/6872604
I also checked that the netqe22 10G card has "LLDP nearest bridge" disabled in the BIOS settings.

There is no issue with openvswitch2.17-2.17.0-70.el9fdp.x86_64.rpm. The following is the job on another 25G bnxt_en card, so closing this bug: https://beaker.engineering.redhat.com/jobs/8328149

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days