Bug 2217867
| Summary: | [OSP-17.1][Mellanox-Cx6] OVN-HWOL: LLDP flows cause performance regression | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Pradipta Kumar Sahoo <psahoo> |
| Component: | openvswitch | Assignee: | RHOSP:NFV_Eng <rhosp-nfv-int> |
| Status: | ASSIGNED --- | QA Contact: | Eran Kuris <ekuris> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 17.1 (Wallaby) | CC: | apevec, atzin, bnemeth, chrisw, dasmith, eglynn, erpeters, gurpsing, hakhande, jhakimra, jraju, kchamart, mhazan, mkabat, mleitner, mnietoji, pgrist, rhosp-nfv-int, rjarry, sbauza, sgordon, supadhya, vcandapp, vromanso, wizhao |
| Target Milestone: | z3 | Keywords: | Performance, Triaged |
| Target Release: | 17.1 | Flags: | rjarry: needinfo? (bnemeth), ifrangs: needinfo? (rhosp-nfv-int) |
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: | There is currently a known issue on Nvidia ConnectX-5 and ConnectX-6 NICs, when using hardware offload, where some offloaded flows on a PF can cause transient performance issues on the associated VFs. This issue is specifically observed with LLDP and VRRP traffic. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2172622 | ||
|
Description
Pradipta Kumar Sahoo
2023-06-27 10:13:08 UTC
Pradipta, hmm. I thought HWOL performance would at best match SR-IOV, but in general be lower than SR-IOV. Do you see an anomaly here in terms of results compared to the results with 16.2? Regards, Gurpreet

Gurpreet, I didn't observe any major performance gap in the 16.2 HWOL test on the same hardware.
Result sheet: https://docs.google.com/spreadsheets/d/1GF1fPqcxjQGCnyY6qmtZ-ILngQkLLZOYpO2aUDpAZmw/edit?usp=sharing

Traffic profile details used in Tgen:
---------------------------------------------
Protocol: UDP
Flow Modification: src-ip, dst-ip
Number of flows: 1024
Search-time and final-validation-time: 30sec & 300sec
Loss Scenarios: 0.002
Traffic direction: Bidirectional
Frame Sizes: 64, 128, 256, 512, 1024, 1500, 9000

Can you show the flows?

ovs-appctl dpctl/dump-flows -m type=offloaded

I have seen that any drop flow there decreases performance. In an ml2-ovs scenario, I had some VRRP packets arriving at the compute node and being dropped; after disabling HA on the router, I stopped receiving those VRRP packets and performance was good. I opened this bz: https://bugzilla.redhat.com/show_bug.cgi?id=2221922

ufid:39c661b9-2bf6-428e-9bdc-1974e90acbce, skb_priority(0/0),tunnel(tun_id=0xb70e,src=10.10.141.157,dst=10.10.141.136,ttl=0/0,tp_dst=4789,flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(vxlan_sys_4789),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:148, bytes:7992, used:0.680s, offloaded:yes, dp:tc, actions:drop

In an OVN scenario, I had some LLDP packets arriving at the compute node and being dropped.
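The drop flows called out above can be pulled out of the offloaded dump directly. A minimal sketch; the two sample lines are abridged versions of flows from this bug's dumps, standing in for live `ovs-appctl dpctl/dump-flows -m type=offloaded` output:

```shell
# List hardware-offloaded flows whose action is a drop.
# On a live compute node the input would come from:
#   ovs-appctl dpctl/dump-flows -m type=offloaded
# Two shortened sample lines stand in for that output here.
dump='ufid:39c661b9-2bf6-428e-9bdc-1974e90acbce, in_port(vxlan_sys_4789),eth_type(0x0800), packets:148, bytes:7992, offloaded:yes, dp:tc, actions:drop
ufid:4e38023e-37d6-48b3-9429-cf2ed7a5a35e, in_port(ens1f1np1_2),eth_type(0x0800), packets:16273681, offloaded:yes, dp:tc, actions:push_vlan(vid=178,pcp=0),ens1f1np1'

# Keep only the drop flows; the push_vlan forwarding flow is filtered out.
printf '%s\n' "$dump" | grep 'actions:drop'
```

Piping the real dump through the same `grep` makes the stray VRRP/LLDP drop entries easy to spot among thousands of forwarding flows.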
After disabling LLDP on the switch, I stopped receiving those LLDP packets and the performance was good.

ufid:00b82f41-503f-4a47-8f41-9dd0197f18f9, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(mx-bond),packet_type(ns=0/0,id=0/0),eth(src=f4:52:14:25:28:74,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, offloaded:yes, dp:tc, actions:drop

For reference, all the test logs are shared in comment #1.
OVN-HWOL Test log: http://storage.scalelab.redhat.com/psahoo/PerfTaskLog/OSP17.1/nfv_hwol/trafficgen--2023-06-21_11%3A41%3A28_UTC--8291a5c3-740d-4ec8-a437-81a9c02265be.tar.xz

A Neutron router is used in my test topology. The test covers P-V-P scenarios on a VLAN provider network. Sample datapath flow output during the test:

ufid:4e38023e-37d6-48b3-9429-cf2ed7a5a35e, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1_2),packet_type(ns=0/0,id=0/0),eth(src=3c:fd:fe:ee:4a:2c,dst=3c:fd:fe:ee:4a:2d),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16273681, bytes:24407582184, used:0.220s, offloaded:yes, dp:tc, actions:push_vlan(vid=178,pcp=0),ens1f1np1
ufid:9e8eaf02-64d7-4f1e-a498-27107f4f0223, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1_2),packet_type(ns=0/0,id=0/0),eth(src=3c:fd:fe:ee:47:0c,dst=3c:fd:fe:ee:47:0d),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16264148, bytes:24393281302, used:0.220s, offloaded:yes, dp:tc, actions:push_vlan(vid=178,pcp=0),ens1f1np1
ufid:39364176-9664-4061-8529-114e4d81b114, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1_2),packet_type(ns
=0/0,id=0/0),eth(src=3c:fd:fe:ee:4c:50,dst=3c:fd:fe:ee:4c:51),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16256199, bytes:243813 59310, used:0.220s, offloaded:yes, dp:tc, actions:push_vlan(vid=178,pcp=0),ens1f1np1 ufid:e2af17d5-4539-4064-bf38-c4cd9638b9c8, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1_2),packet_type(ns =0/0,id=0/0),eth(src=3c:fd:fe:ee:4c:34,dst=3c:fd:fe:ee:4c:35),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16249698, bytes:243716 07882, used:0.220s, offloaded:yes, dp:tc, actions:push_vlan(vid=178,pcp=0),ens1f1np1 ufid:3745c18c-a008-44d9-bddc-ed6ffc5b09dd, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0_3),packet_type(ns =0/0,id=0/0),eth(src=3c:fd:fe:ee:4a:2d,dst=3c:fd:fe:ee:4a:2c),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16657233, bytes:249829 09990, used:0.030s, offloaded:yes, dp:tc, actions:push_vlan(vid=177,pcp=0),ens1f0np0 ufid:c01e044d-429f-4f70-b4ba-b0c4ceef5e81, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0_3),packet_type(ns =0/0,id=0/0),eth(src=3c:fd:fe:ee:47:0d,dst=3c:fd:fe:ee:47:0c),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16647589, bytes:249684 44008, used:0.030s, offloaded:yes, dp:tc, actions:push_vlan(vid=177,pcp=0),ens1f0np0 ufid:4c032c08-01e1-409d-87be-8391fdb5e960, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0_3),packet_type(ns 
=0/0,id=0/0),eth(src=3c:fd:fe:ee:4c:51,dst=3c:fd:fe:ee:4c:50),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16640747, bytes:249581 82444, used:0.030s, offloaded:yes, dp:tc, actions:push_vlan(vid=177,pcp=0),ens1f0np0 ufid:3ad1be3d-8ae5-4cb7-87bd-4520736fba99, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0_3),packet_type(ns=0/0,id=0/0),eth(src=3c:fd:fe:ee:4c:35,dst=3c:fd:fe:ee:4c:34),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:16630009, bytes:24942075444, used:0.030s, offloaded:yes, dp:tc, actions:push_vlan(vid=177,pcp=0),ens1f0np0 ... ufid:c1ec11dd-7005-43b7-ab9a-b69071814698, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1),packet_type(ns=0 /0,id=0/0),eth(src=3c:fd:fe:ee:4a:2d,dst=3c:fd:fe:ee:4a:2c),eth_type(0x8100),vlan(vid=178,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:16270689, bytes:24338011238, used:0.220s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f1np1_2 ufid:09135c27-ee57-4dba-b39d-f5d9d9bc2922, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1),packet_type(ns=0 /0,id=0/0),eth(src=3c:fd:fe:ee:47:0d,dst=3c:fd:fe:ee:47:0c),eth_type(0x8100),vlan(vid=178,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0 ,frag=no)), packets:16261071, bytes:24323622724, used:0.220s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f1np1_2 ufid:3214a6d1-021d-4fff-b477-55b3bc7a2ab5, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1),packet_type(ns=0 
/0,id=0/0),eth(src=3c:fd:fe:ee:4c:51,dst=3c:fd:fe:ee:4c:50),eth_type(0x8100),vlan(vid=178,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:16254223, bytes:24313379552, used:0.220s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f1np1_2 ufid:ff221669-2619-427c-bc3b-6c64173ba9ec, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f1np1),packet_type(ns=0 /0,id=0/0),eth(src=3c:fd:fe:ee:4c:35,dst=3c:fd:fe:ee:4c:34),eth_type(0x8100),vlan(vid=178,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:16243467, bytes:24297288576, used:0.220s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f1np1_2 ufid:56f2f088-eea1-4603-91be-bbf1a87489d4, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0/0,id=0/0),eth(src=c8:fe:6a:f1:d6:5b,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, offloaded:yes, dp:tc, actions:drop ufid:3114804a-6a37-4a26-affe-9522ac3471ac, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0/0,id=0/0),eth(src=3c:fd:fe:ee:4a:2c,dst=3c:fd:fe:ee:4a:2d),eth_type(0x8100),vlan(vid=177,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0 ,frag=no)), packets:16660338, bytes:24920926612, used:0.030s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f0np0_3 ufid:56a9b2ea-b9a8-455f-97b3-9bbcb446c463, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0 
/0,id=0/0),eth(src=3c:fd:fe:ee:47:0c,dst=3c:fd:fe:ee:47:0d),eth_type(0x8100),vlan(vid=177,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no)), packets:16650780, bytes:24906627886, used:0.030s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f0np0_3 ufid:28331545-925d-4e3e-8908-7d8276aa4038, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0 /0,id=0/0),eth(src=3c:fd:fe:ee:4c:50,dst=3c:fd:fe:ee:4c:51),eth_type(0x8100),vlan(vid=177,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0 ,frag=no)), packets:16642836, bytes:24894743718, used:0.030s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f0np0_3 ufid:380fb46d-52e3-46d5-b49f-c8bd748e3d1a, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0 /0,id=0/0),eth(src=3c:fd:fe:ee:4c:34,dst=3c:fd:fe:ee:4c:35),eth_type(0x8100),vlan(vid=177,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0 ,frag=no)), packets:16636355, bytes:24885048198, used:0.030s, offloaded:yes, dp:tc, actions:pop_vlan,ens1f0np0_3 I think this flow is causing performance drop. It may be lldp. I would try to stop that traffic and check ufid:56f2f088-eea1-4603-91be-bbf1a87489d4, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(ens1f0np0),packet_type(ns=0/0,id=0/0),eth(src=c8:fe:6a:f1:d6:5b,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, offloaded:yes, dp:tc, actions:drop As discussed with Andrew and the net-perf team, I wanted to share some insights regarding LLDP traffic and its interaction with OVS. LLDP traffic typically has very low bandwidth. 
However, when an LLDP packet arrives on a host running OVS, it will likely result in a 'miss' in OVS unless the same packet (with the same header) arrives more frequently than the datapath-flow idle-time expiration. During my test scenarios, I didn't observe frequent LLDP flows in the datapath layer. That said, enabling LLDP is standard practice in customer environments, so it is enabled on the switch:

root@juniper-nfv1> show lldp
LLDP                      : Enabled
Advertisement interval    : 30 seconds
Transmit delay            : 2 seconds
Hold timer                : 120 seconds
Notification interval     : 5 Second(s)
Config Trap Interval      : 0 seconds
Connection Hold timer     : 300 seconds
LLDP MED                  : Enabled
MED fast start count      : 3 Packets
Port ID TLV subtype       : interface-name
Port Description TLV type : interface-alias (ifAlias)
Interface    Parent Interface    LLDP      LLDP-MED    Power Negotiation
all          -                   Enabled   Enabled     Enabled

Pradipta, is the miss for LLDP traffic impacting the performance of other workload traffic? I assume TRex is not generating any LLDP traffic and the performance results are not based on non-TRex-generated traffic. Regards, Gurpreet

Yes, in my test I saw that LLDP traffic generated by the switch reduces the throughput of data traffic by around 10% to 12%.

Miguel, it looks like bz 2221922 is related to the same issue. After some debugging, I found a direct correlation between LLDP packets received and rx_discards_phy increases on the PF interface. The VFs are assigned to a VM running testpmd, and when an LLDP packet arrives on the host, the RX rate drops (the stable rate is 14 Mpps per port; after receiving an LLDP packet, it drops to 13 Mpps per port for a few seconds before returning to normal).
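The correlation can be checked by sampling the PF discard counter and flagging increments between samples. A minimal sketch: the values below are copied from the `ethtool -S ens6f0np0` trace captured in this bug, and the `awk` extraction in the comment is an assumed way to read the counter live:

```shell
# Flag increases in the PF's rx_discards_phy counter between samples.
# On the host, each sample would come from something like:
#   ethtool -S ens6f0np0 | awk '/rx_discards_phy/ {print $2}'
# The values below are from the 10:26:04-10:26:28 window in the trace.
samples="3817918797 3817918797 3821575293 3826320710 3831025468 3834404055"
prev=""
for cur in $samples; do
  if [ -n "$prev" ] && [ "$cur" -gt "$prev" ]; then
    # A jump here lines up with an LLDP packet seen by tcpdump.
    echo "rx_discards_phy jumped by $((cur - prev))"
  fi
  prev="$cur"
done
```

Running the same loop against live counter reads, next to a `tcpdump ... ether proto 0x88cc` in another terminal, shows each discard burst starting right after an LLDP frame arrives.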
See below: [root@computehwoffload-r740 ~]# for i in mx-bond ens6f0np0 ens6f1np1 ens6f1np1_1 ens6f1np1_8; do ip -d link show $i; done 20: mx-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000 link/ether 98:03:9b:9d:73:00 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535 bond mode active-backup active_slave ens6f0np0 miimon 0 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable tlb_dynamic_lb 1 openvswitch_slave addrgenmode eui64 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 6: ens6f0np0: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000 link/ether 98:03:9b:9d:73:00 brd ff:ff:ff:ff:ff:ff promiscuity 2 minmtu 68 maxmtu 9978 bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 98:03:9b:9d:73:00 queue_id 0 addrgenmode eui64 numtxqueues 576 numrxqueues 80 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 portname p0 switchid 00739d00039b0398 parentbus pci parentdev 0000:18:00.0 vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 2 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 4 link/ether 
00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 5 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 6 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 7 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 8 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 9 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off altname enp24s0f0np0 8: ens6f1np1: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000 link/ether 98:03:9b:9d:73:00 brd ff:ff:ff:ff:ff:ff permaddr 98:03:9b:9d:73:01 promiscuity 1 minmtu 68 maxmtu 9978 bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr 98:03:9b:9d:73:01 queue_id 0 addrgenmode eui64 numtxqueues 576 numrxqueues 80 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 portname p1 switchid 00739d00039b0398 parentbus pci parentdev 0000:18:00.1 vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 1 link/ether fa:16:3e:95:28:f9 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 2 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 3 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 4 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 5 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, 
link-state disable, trust off, query_rss off vf 6 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 7 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 8 link/ether fa:16:3e:0e:a3:7a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off vf 9 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off altname enp24s0f1np1 52: ens6f1np1_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000 link/ether 12:6f:37:7e:71:48 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 9978 openvswitch_slave addrgenmode eui64 numtxqueues 40 numrxqueues 40 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 portname pf1vf1 switchid 00739d00039b0398 parentbus pci parentdev 0000:18:00.1 altname enp24s0f1npf1vf1 altname ens6f1npf1vf1 59: ens6f1np1_8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000 link/ether d2:f0:2d:d7:68:a4 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 9978 openvswitch_slave addrgenmode eui64 numtxqueues 40 numrxqueues 40 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 portname pf1vf8 switchid 00739d00039b0398 parentbus pci parentdev 0000:18:00.1 altname enp24s0f1npf1vf8 altname ens6f1npf1vf8 [root@computehwoffload-r740 ~]# cat /proc/net/bonding/mx-bond Ethernet Channel Bonding Driver: v5.14.0-284.23.1.el9_2.x86_64 Bonding Mode: fault-tolerance (active-backup) Primary Slave: None Currently Active Slave: ens6f0np0 MII Status: up MII Polling Interval (ms): 0 Up Delay (ms): 0 Down Delay (ms): 0 Peer Notification Delay (ms): 0 Slave Interface: ens6f0np0 MII Status: up Speed: 40000 Mbps Duplex: full Link Failure Count: 0 
Permanent HW addr: 98:03:9b:9d:73:00 Slave queue ID: 0 Slave Interface: ens6f1np1 MII Status: up Speed: 40000 Mbps Duplex: full Link Failure Count: 0 Permanent HW addr: 98:03:9b:9d:73:01 Slave queue ID: 0 [root@computehwoffload-r740 ~]# tcpdump -nnei mx-bond ether proto 0x88cc 2>/dev/null & while true; do echo $(date +%H:%M:%S.%N) ens6f0np0 $(ethtool -S ens6f0np0 | grep -e rx_discards_phy); sleep 3; done [1] 355250 10:26:04.115486390 ens6f0np0 rx_discards_phy: 3817918797 10:26:07.128211256 ens6f0np0 rx_discards_phy: 3817918797 10:26:10.138754931 ens6f0np0 rx_discards_phy: 3817918797 10:26:13.149529768 ens6f0np0 rx_discards_phy: 3817918797 10:26:16.159410485 ens6f0np0 rx_discards_phy: 3817918797 10:26:16.851660 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05 10:26:19.169313826 ens6f0np0 rx_discards_phy: 3821575293 10:26:22.179908957 ens6f0np0 rx_discards_phy: 3826320710 10:26:25.188932723 ens6f0np0 rx_discards_phy: 3831025468 10:26:28.197987965 ens6f0np0 rx_discards_phy: 3834404055 10:26:31.207463084 ens6f0np0 rx_discards_phy: 3834404055 10:26:34.217173200 ens6f0np0 rx_discards_phy: 3834404055 10:26:37.226648204 ens6f0np0 rx_discards_phy: 3834404055 10:26:40.236542661 ens6f0np0 rx_discards_phy: 3834404055 10:26:43.245958602 ens6f0np0 rx_discards_phy: 3834404055 10:26:46.257063966 ens6f0np0 rx_discards_phy: 3834404055 10:26:46.917407 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05 10:26:49.266472441 ens6f0np0 rx_discards_phy: 3838136857 10:26:52.276672696 ens6f0np0 rx_discards_phy: 3842907826 10:26:55.287084080 ens6f0np0 rx_discards_phy: 3847663971 10:26:58.296209675 ens6f0np0 rx_discards_phy: 3850935565 10:27:01.305549687 ens6f0np0 rx_discards_phy: 3850935565 10:27:04.316570355 ens6f0np0 rx_discards_phy: 3850935565 10:27:07.325294825 ens6f0np0 rx_discards_phy: 3850935565 10:27:10.462213557 ens6f0np0 rx_discards_phy: 3850935565 
10:27:13.472317477 ens6f0np0 rx_discards_phy: 3850935565 10:27:16.551827288 ens6f0np0 rx_discards_phy: 3850935565 10:27:16.963882 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05 10:27:19.565699568 ens6f0np0 rx_discards_phy: 3855069645 10:27:22.575045649 ens6f0np0 rx_discards_phy: 3859840105 10:27:25.585629566 ens6f0np0 rx_discards_phy: 3864608229 10:27:28.596472921 ens6f0np0 rx_discards_phy: 3866940652 10:27:31.605664812 ens6f0np0 rx_discards_phy: 3866940652 10:27:34.614802869 ens6f0np0 rx_discards_phy: 3866940652 10:27:37.624192348 ens6f0np0 rx_discards_phy: 3866940652 10:27:40.633373542 ens6f0np0 rx_discards_phy: 3866940652 10:27:43.642362008 ens6f0np0 rx_discards_phy: 3866940652 ^C [root@computehwoffload-r740 ~]# tcpdump -nnei mx-bond ether proto 0x88cc 2>/dev/null & while true; do date +%H:%M:%S.%N; ovs-appctl dpctl/dump-flows --names type=offloaded | sort; sleep 5; done [1] 379393 10:31:27.365671792 ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74609919722, bytes:4775033718524, used:0.850s, actions:push_vlan(vid=148,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:74785507282, bytes:4786271309440, used:0.850s, actions:push_vlan(vid=149,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74785616824, bytes:4487136539530, used:0.850s, actions:pop_vlan,ens6f1np1_1 ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74610329061, bytes:4476619309814, used:0.060s, actions:pop_vlan,ens6f1np1_8 10:31:32.385077338 
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74676944871, bytes:4779323328060, used:0.990s, actions:push_vlan(vid=148,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:74852628463, bytes:4790567065024, used:0.990s, actions:push_vlan(vid=149,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74852738028, bytes:4491163811770, used:0.990s, actions:pop_vlan,ens6f1np1_1 ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74677354242, bytes:4480640820674, used:0.990s, actions:pop_vlan,ens6f1np1_8 10:31:37.395068949 ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74748609952, bytes:4783909893244, used:0.880s, actions:push_vlan(vid=148,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:74924293720, bytes:4795153641472, used:0.880s, actions:push_vlan(vid=149,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74924403300, bytes:4495463728090, used:0.880s, actions:pop_vlan,ens6f1np1_1 ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74749019328, bytes:4484940725834, used:0.880s, actions:pop_vlan,ens6f1np1_8 10:31:42.404690194 
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74820293001, bytes:4788497608380, used:0.770s, actions:push_vlan(vid=148,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:74995976783, bytes:4799741357504, used:0.770s, actions:push_vlan(vid=149,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74996086316, bytes:4499764709050, used:0.770s, actions:pop_vlan,ens6f1np1_1 ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74820702361, bytes:4489241707814, used:0.770s, actions:pop_vlan,ens6f1np1_8 10:31:47.415092157 ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74891948649, bytes:4793083569852, used:0.660s, actions:push_vlan(vid=148,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75067632619, bytes:4804327331008, used:0.660s, actions:push_vlan(vid=149,pcp=0),mx-bond ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75067742169, bytes:4504064060230, used:0.660s, actions:pop_vlan,ens6f1np1_1 ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74892358011, bytes:4493541046814, used:0.660s, actions:pop_vlan,ens6f1np1_8 
recirc_id(0),in_port(mx-bond),eth(src=f4:52:14:25:28:7a,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, actions:drop
10:31:47.352709 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
10:31:52.425718380
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:74953495176, bytes:4797022547580, used:0.870s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75129682080, bytes:4808298496512, used:0.870s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75129791603, bytes:4507787026270, used:0.870s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:74953904510, bytes:4497233836754, used:0.870s, actions:pop_vlan,ens6f1np1_8
recirc_id(0),in_port(mx-bond),eth(src=f4:52:14:25:28:7a,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, actions:drop
10:31:57.435854645
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:75024789018, bytes:4801585353468, used:0.250s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75201652911, bytes:4812904629696, used:0.250s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75201762440, bytes:4512105276490, used:0.250s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75025198372, bytes:4501511468474, used:0.250s, actions:pop_vlan,ens6f1np1_8
recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=148,pcp=0),mx-bond
recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=149,pcp=0),mx-bond
recirc_id(0),in_port(mx-bond),eth(src=f4:52:14:25:28:7a,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:10.080s, actions:drop
10:32:02.447256298
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:75090206558, bytes:4805772076028, used:0.530s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75267148882, bytes:4817096371840, used:0.530s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75267258435, bytes:4516035036190, used:0.530s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75090615916, bytes:4505436521114, used:0.530s, actions:pop_vlan,ens6f1np1_8
recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=148,pcp=0),mx-bond
recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=149,pcp=0),mx-bond
10:32:07.457799792
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:75167237138, bytes:4810702033148, used:0.030s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75344179568, bytes:4822026335744, used:0.030s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75344289085, bytes:4520656875190, used:0.030s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75167646446, bytes:4510058352914, used:0.030s, actions:pop_vlan,ens6f1np1_8
10:32:12.485211959
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:75224579847, bytes:4814371966524, used:0.980s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75401522303, bytes:4825696270784, used:0.980s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75401631879, bytes:4524097442830, used:0.980s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75224989222, bytes:4513498919474, used:0.980s, actions:pop_vlan,ens6f1np1_8
10:32:17.408220 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
10:32:17.517194539
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:75296249192, bytes:4818958804604, used:0.880s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:75473191844, bytes:4830283121408, used:0.880s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75473301401, bytes:4528397614150, used:0.880s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:75296658539, bytes:4517799078494, used:0.880s, actions:pop_vlan,ens6f1np1_8
recirc_id(0),in_port(mx-bond),eth(src=f4:52:14:25:28:7a,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, actions:drop
^C
[root@computehwoffload-r740 ~]# tcpdump -nnei mx-bond ether proto 0x88cc 2>/dev/null & ssh -i test_keypair.key cloud-user.228.38 sudo python3 /root/dpdk-port-stats.py -s /run/dpdk/rte/dpdk_telemetry.v2 -t 5
[1] 441749
The authenticity of host '10.46.228.38 (10.46.228.38)' can't be established.
ED25519 key fingerprint is SHA256:BQhMC4Vzv4ygXHu13liV98FHEGOC5YZlgs63zEpCo0E.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.46.228.38' (ED25519) to the list of known hosts.
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
10:44:48.492633 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
---
0: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
1: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=13.5M pkt/s DROP=0.0 pkt/s TX=13.4M pkt/s
1: RX=13.4M pkt/s DROP=0.0 pkt/s TX=13.5M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
10:45:18.545548 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
---
0: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
1: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
1: RX=13.3M pkt/s DROP=0.0 pkt/s TX=13.3M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
Additional info:
1) Increasing other_config:max-idle to 60s, so that the offloaded LLDP drop flow remains installed, makes the packet drop constant. Tcpdump no longer sees the LLDP packets because they are dropped in hardware by the offloaded flow.
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.0 name flow_steering_mode
pci/0000:18:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.1 name flow_steering_mode
pci/0000:18:00.1:
name flow_steering_mode type driver-specific
values:
cmode runtime value smfs
[root@computehwoffload-r740 ~]# ovs-vsctl set o . other_config:max-idle=60000
[root@computehwoffload-r740 ~]# ovs-appctl dpctl/dump-flows --names type=offloaded
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0800),ipv4(frag=no), packets:11852789113, bytes:758577366370, used:0.610s, actions:push_vlan(vid=148,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0800),ipv4(frag=no), packets:11852870204, bytes:758582539004, used:0.610s, actions:push_vlan(vid=149,pcp=0),mx-bond
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c4,dst=fa:16:3e:95:28:f9),eth_type(0x8100),vlan(vid=148,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:11853410853, bytes:711204182404, used:0.610s, actions:pop_vlan,ens6f1np1_1
ct_mark(0/0x2),recirc_id(0),in_port(mx-bond),eth(src=f8:f2:1e:03:c8:c6,dst=fa:16:3e:0e:a3:7a),eth_type(0x8100),vlan(vid=149,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:11853294719, bytes:711197251940, used:0.611s, actions:pop_vlan,ens6f1np1_8
recirc_id(0),in_port(ens6f1np1_1),eth(src=fa:16:3e:95:28:f9,dst=f8:f2:1e:03:c8:c4),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=148,pcp=0),mx-bond
recirc_id(0),in_port(ens6f1np1_8),eth(src=fa:16:3e:0e:a3:7a,dst=f8:f2:1e:03:c8:c6),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=149,pcp=0),mx-bond
recirc_id(0),in_port(mx-bond),eth(src=f4:52:14:25:28:7a,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:3, bytes:633, used:18.010s, actions:drop
[root@computehwoffload-r740 ~]# ssh -i test_keypair.key cloud-user.228.38 sudo python3 /root/dpdk-port-stats.py -s /run/dpdk/rte/dpdk_telemetry.v2 -t 5
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
---
0: RX=12.8M pkt/s DROP=0.0 pkt/s TX=12.7M pkt/s
1: RX=12.7M pkt/s DROP=0.0 pkt/s TX=12.8M pkt/s
2) Changing flow_steering_mode to dmfs fixes the packet drop.
[root@computehwoffload-r740 ~]# ovs-vsctl set o . other_config:max-idle=10000
[root@computehwoffload-r740 ~]# devlink dev param set pci/0000:18:00.0 name flow_steering_mode value dmfs cmode runtime
[root@computehwoffload-r740 ~]# devlink dev param set pci/0000:18:00.1 name flow_steering_mode value dmfs cmode runtime
[root@computehwoffload-r740 ~]# systemctl restart openvswitch
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.0 name flow_steering_mode
pci/0000:18:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value dmfs
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.1 name flow_steering_mode
pci/0000:18:00.1:
name flow_steering_mode type driver-specific
values:
cmode runtime value dmfs
[root@computehwoffload-r740 ~]# ssh -i test_keypair.key cloud-user.228.38 sudo python3 /root/dpdk-port-stats.py -s /run/dpdk/rte/dpdk_telemetry.v2 -t 5
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
11:53:24.801714 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
11:53:54.825009 f4:52:14:25:28:7a > 01:80:c2:00:00:0e, ethertype LLDP (0x88cc), length 211: LLDP, length 197: nfv-private-sw05
---
0: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
1: RX=14.0M pkt/s DROP=0.0 pkt/s TX=14.0M pkt/s
Robin, I'm not sure if you're planning to post a new comment or what, but this bz could use a fresh summary. It's not clear from the comments above where the performance issue lies, especially after comment #12 compared smfs and dmfs with no recorded drops. Please also mention the VM's forced context switches triggered by the LLDP packets. That's key to this bz, actually.

Hi Marcelo, I don't have access to the platform anymore and cannot provide traces. However, here are my observations:

With comment #12 step 1 (smfs + other_config:max-idle=60000), the rx_discards_phy counter was increasing constantly, which means that the packets are dropped at reception by the CX-5 ports; testpmd running in the VM only sees a lower rx rate.

With comment #12 step 2 (dmfs + other_config:max-idle=10000), the rx_discards_phy counter remains constant and testpmd running in the VM sees the rate at which the traffic generator is sending.

I think this should be easy to reproduce without openstack. I have updated the summary. I'm not certain about the formulation. Will try to refine it later on.

@bnemeth @wizhao do you have any idea what could be causing this perf regression?

(In reply to Robin Jarry from comment #16)
> @bnemeth @wizhao do you have any idea what could be
> causing this perf regression?

I have sent it to NVIDIA to take a look. We have also seen regressions with SMFS when compared with DMFS. NVIDIA was looking into this a few months back, but maybe they were chasing a red herring in their setup. I have told them that we run LLDP in our labs and this might be a big hint for them to reproduce it on their end. Unfortunately DMFS/SMFS is NVIDIA proprietary so we can't open it up to see what might be going wrong. However NVIDIA has strongly urged us to switch to SMFS since this is the only mode they would be supporting moving forward. I see that you are invited to Thursday's 8AM EST meeting with NVIDIA. We need to bring this topic up with them.
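The rx_discards_phy observation above can be quantified by sampling the counter twice and computing a per-second rate. The sketch below is a hypothetical illustration, not tooling from this bz: it assumes "name: value" lines as produced by `ethtool -S <iface>`, from which one would confirm whether packets are being dropped at the NIC ports themselves.

```python
def counter(stats_text, name):
    """Extract one integer counter from ethtool-style 'name: value' output."""
    for line in stats_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key.strip() == name:
            return int(value)
    raise KeyError(name)


def rate(sample0, sample1, dt, name="rx_discards_phy"):
    """Per-second rate of a counter between two samples taken dt seconds apart."""
    return (counter(sample1, name) - counter(sample0, name)) / dt


if __name__ == "__main__":
    # Fabricated samples for illustration, taken 2 seconds apart
    s0 = "rx_packets: 1000\nrx_discards_phy: 500\n"
    s1 = "rx_packets: 2000\nrx_discards_phy: 2500\n"
    print(f"{rate(s0, s1, 2.0):.0f} drops/s")  # prints: 1000 drops/s
```

A constantly increasing rate would match the smfs + max-idle=60000 case described above; a zero rate would match the dmfs case.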