Description of problem:
On an ovs-dpdk bond port with LACP enabled and the dpdkbond port in the active role, heavy traffic causes LACP PDUs to be lost on the switch side. When LACP PDUs are not received within a certain period of time, different switches reset the port state according to their own default behavior.

Version-Release number of selected component (if applicable):
All openvswitch versions (including ovs2.13, ovs2.15, ovs2.16, ovs2.17)

How reproducible:
100%

Steps to Reproduce:
1. Configure LACP passive mode on the Cisco 9364 switch. To prevent port flapping caused by LACP PDU packet loss, set the "no lacp suspend-individual" and "lacp graceful-convergence" properties.
Reference: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/93x/interfaces/configuration/guide/b-cisco-nexus-9000-nx-os-interfaces-configuration-guide-93x/b-cisco-nexus-9000-nx-os-interfaces-configuration-guide-93x_chapter_010000.html#task_26B7966FF2CF44B39532F5D70FDC4DB3

# show port-channel summary
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        b - BFD Session Wait
        S - Switched    R - Routed
        U - Up (port-channel)
        p - Up in delay-lacp mode (member)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
48    Po48(SU)    Eth      LACP      Eth1/3/1(P)  Eth1/3/2(P)

# show port-channel internal info interface port-channel 48
port-channel48
channel    : 48
bundle     : 65535
ifindex    : 0x1600002f
admin mode : passive
oper mode  : passive
nports     : 2
active     : 2
pre cfg    : 0
ltl        : 0x2 (2)
lif        : 0x1045
iod        : 0x46 (70)
global id  : 1
flag       : 0
lock count : 0
num. of SIs: 0
ac mbrs    : 0 0
lacp graceful conv disable : 0
lacp suspend indiv disable : 1
pc min-links : 1
pc max-bundle : 32
pc max active members : 32
pc is-suspend-minlinks : 0
port load defer enable : 0
port-channel bfd config enabled : 0
port-channel bfd config complete: 0
port-channel bfd destination: null
port-channel bfd start timeout: 0
lacp fast-select-hot-standby disable : 0
port-channel port hash-distribution : none
ethpm bundle lock count : 0
lacp delayed-enable fop Ethernet1/3/2 0x38015000
lacp delayed-enable : 0
lacp delayed-enable cfg-port none
lacp delayed-enable oper-port none
lacp delayed-enable local best priority : 0xffffffff
lacp delayed-enable remote best priority : 0xffffffff
lacp vpc conv enabled : 0
gir conv enabled : 0
bundle number map: 1-2
Members:
Ethernet1/3/1 [bundle_no = 1] is_ltl_programmed = 1
Port BFD session state: 5 (none)
Ethernet1/3/2 [bundle_no = 0] is_ltl_programmed = 1
Port BFD session state: 5 (none)
port-channel external lock:
Lock Info: resource [eth-port-channel 48]
  type[0] p_gwrap[(nil)]
      FREE @ 28099 usecs after Mon Aug 22 05:05:30 2022
  type[1] p_gwrap[(nil)]
      FREE @ 505551 usecs after Mon Aug 22 05:05:40 2022
  type[2] p_gwrap[(nil)]
      FREE @ 739313 usecs after Mon Aug 22 05:05:38 2022
0x1600002f
internal (ethpm bundle) lock:
Lock Info: resource [eth-port-channel 48]
  type[0] p_gwrap[(nil)]
      FREE @ 28085 usecs after Mon Aug 22 05:05:30 2022
  type[1] p_gwrap[(nil)]
      FREE @ 824981 usecs after Mon Aug 22 07:45:41 2022
  type[2] p_gwrap[(nil)]
      FREE @ 823693 usecs after Mon Aug 22 07:45:41 2022
0x1600002f

Check the interface reset counters:

sw-cisco9364(config)# show interface eth1/3/2 | grep "interface resets"
  4436 interface resets
sw-cisco9364(config)# show interface eth1/3/1 | grep "interface resets"
  4528 interface resets
sw-cisco9364(config)# show interface po48 | grep "interface resets"
  1 interface resets

2. Build an active-backup dpdkbond port on the DUT side:

ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,4096"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="0xc0000"
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-bond ovsbr0 dpdkbond dpdk0 dpdk1 "bond_mode=active-backup" -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:af:00.0 options:n_rxq=1 mtu_request=9120 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:af:00.1 options:n_rxq=1 mtu_request=9120
ovs-vsctl set interface dpdk0 options:dpdk-lsc-interrupt=true
ovs-vsctl set interface dpdk1 options:dpdk-lsc-interrupt=true
ovs-vsctl set Port dpdkbond vlan_mode=trunk
ovs-vsctl set Port dpdkbond bond_updelay=0
ovs-vsctl set Port dpdkbond bond_downdelay=0
ovs-vsctl set Port dpdkbond other_config:bond-rebalance-interval=10000
ovs-vsctl add-port ovsbr0 vhost0 -- set interface vhost0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhost0 options:n_rxq=1
ovs-vsctl set Interface vhost0 mtu_request=9120
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 actions=NORMAL
ovs-vsctl set port dpdkbond lacp=active
ovs-vsctl set port dpdkbond other_config:lacp-time=slow
ovs-vsctl set port dpdkbond other_config:bond-detect-mode=carrier

# ovs-vsctl list port dpdkbond
_uuid               : 183cc9ba-d89b-454a-8830-e6ec4be33637
bond_active_slave   : "04:3f:72:b0:35:33"
bond_downdelay      : 0
bond_fake_iface     : false
bond_mode           : active-backup
bond_updelay        : 0
cvlans              : []
external_ids        : {}
fake_bridge         : false
interfaces          : [3b862d36-6201-49aa-a441-c26a87951981, 44e85a72-c17c-4a19-921d-d280285b7c7e]
lacp                : active
mac                 : []
name                : dpdkbond
other_config        : {bond-detect-mode=carrier, lacp-time=slow, lb-output-action="false"}
protected           : false
qos                 : []
rstp_statistics     : {}
rstp_status         : {}
statistics          : {}
status              : {}
tag                 : []
trunks              : []
vlan_mode           : trunk

# ovs-appctl bond/show
---- dpdkbond ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 04:3f:72:b0:35:33(dpdk1)

member dpdk0: enabled
  may_enable: true

member dpdk1: enabled
  active member
  may_enable: true

# ovs-appctl lacp/show-stats
---- dpdkbond statistics ----

member: dpdk0:
  TX PDUs: 38
  RX PDUs: 30
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 1
  Carrier Status Changed: 1

member: dpdk1:
  TX PDUs: 38
  RX PDUs: 30
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 1
  Carrier Status Changed: 1

3. Traffic flows in from the Eth1/7/3 port of the 9364 switch and reaches the dpdkbond port of the DUT through the po48 port.

sw-cisco9364(config-if)# show interface eth1/7/3
Ethernet1/7/3 is up
admin state is up, Dedicated Interface
  Hardware: 25000 Ethernet, address: b0c5.3cf6.36d6 (bia b0c5.3cf6.36d6)
  MTU 9216 bytes, BW 25000000 Kbit , DLY 10 usec
  reliability 255/255, txload 9/255, rxload 152/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  full-duplex, 25 Gb/s, media type is 100G
  Beacon is turned off
  Auto-Negotiation is turned off  FEC mode is Auto
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Rate mode is dedicated
  Switchport monitor is off
  EtherType is 0x8100
  EEE (efficient-ethernet) : n/a
    admin fec state is auto, oper fec state is Fc-fec
  Last link flapped 07:28:57
  Last clearing of "show interface" counters 31w5d
  5934 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 14982412288 bits/sec, 27541193 packets/sec
    30 seconds output rate 903541720 bits/sec, 1660914 packets/sec
    input rate 14.98 Gbps, 27.54 Mpps; output rate 903.54 Mbps, 1.66 Mpps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 2557908344 bits/sec, 4702013 packets/sec
    300 seconds output rate 120199752 bits/sec, 220939 packets/sec
    input rate 2.56 Gbps, 4.70 Mpps; output rate 120.20 Mbps, 220.94 Kpps
  RX
    25473914277495 unicast packets  24798739588 multicast packets  34175 broadcast packets
    25498713051251 input packets  5426349482871866 bytes
    126981761196 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun  0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    14405876978993 unicast packets  516940646 multicast packets  162408927 broadcast packets
    14406556328566 output packets  4372264666645070 bytes
    116807377983 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  665728362 output discard
    0 Tx pause

sw-cisco9364(config-if)# show interface po48
port-channel48 is up
admin state is up,
  Hardware: Port-Channel, address: b0c5.3cf6.36cd (bia b0c5.3cf6.36cd)
  MTU 9216 bytes, BW 50000000 Kbit , DLY 10 usec
  reliability 255/255, txload 48/255, rxload 8/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  full-duplex, 25 Gb/s
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Switchport monitor is off
  EtherType is 0x8100
  Members in this channel: Eth1/3/1, Eth1/3/2
  Last clearing of "show interface" counters never
  19 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 1673082512 bits/sec, 3075512 packets/sec
    30 seconds output rate 9586691656 bits/sec, 17622574 packets/sec
    input rate 1.67 Gbps, 3.08 Mpps; output rate 9.59 Gbps, 17.62 Mpps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 299868160 bits/sec, 551213 packets/sec
    300 seconds output rate 2013170352 bits/sec, 3700644 packets/sec
    input rate 299.87 Mbps, 551.21 Kpps; output rate 2.01 Gbps, 3.70 Mpps
  RX
    3981154743 unicast packets  1084 multicast packets  13 broadcast packets
    3981155840 input packets  1596705130904 bytes
    79892543 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun  0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    19623333237 unicast packets  549465 multicast packets  0 broadcast packets
    19623882702 output packets  3530787644005 bytes
    126411484 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  8525 output discard
    0 Tx pause

4. Check the openvswitch log and the LACP port stats:

tail -f /var/log/openvswitch/ovs-vswitchd.log
2022-08-22T09:13:17.125Z|00173|bond|INFO|member dpdk1: link state down
2022-08-22T09:13:17.125Z|00174|bond|INFO|member dpdk1: disabled
2022-08-22T09:13:17.125Z|00175|bond|INFO|bond dpdkbond: active member is now dpdk0
2022-08-22T09:13:22.277Z|00176|bond|INFO|member dpdk1: link state up
2022-08-22T09:13:22.277Z|00177|bond|INFO|member dpdk1: enabled

# ovs-appctl lacp/show-stats
---- dpdkbond statistics ----

member: dpdk0:
  TX PDUs: 70
  RX PDUs: 57
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 2
  Carrier Status Changed: 3

member: dpdk1:
  TX PDUs: 69
  RX PDUs: 57
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 1
  Link Defaulted: 2
  Carrier Status Changed: 3

# ovs-appctl bond/show
---- dpdkbond ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 04:3f:72:b0:35:33(dpdk1)

member dpdk0: enabled
  may_enable: true

member dpdk1: enabled
  active member
  may_enable: true

sw-cisco9364(config-if)# show lacp counters interface po48
NOTE: Clear lacp counters to get accurate statistics
------------------------------------------------------------------------------
                         LACPDUs          Markers/Resp LACPDUs
Port                Sent      Recv        Recv  Sent   Pkts Err
------------------------------------------------------------------------------
port-channel48
Ethernet1/3/1       159       152         0     0      0
Ethernet1/3/2       158       149         0     0      0

sw-cisco9364(config)# show interface po48 | grep "interface resets"
  1 interface resets
sw-cisco9364(config)# show interface eth1/3/1 | grep "interface resets"
  4528 interface resets
sw-cisco9364(config)# show interface eth1/3/2 | grep "interface resets"
  4437 interface resets

Actual results:
1. In step 4, dpdk1, a member of dpdkbond, changed state (link down, then up about 5 seconds later). The "interface resets" counter of the corresponding eth1/3/2 port also increased by 1 (4436 to 4437).

Expected results:
The LACP configuration can be disabled in active-backup/balance-slb mode, if possible.

Additional info:
If dpdkbond has LACP enabled and set to active mode, loss of LACP PDUs seems to be inevitable under heavy traffic. Different switches have different default behaviors when LACP PDUs are lost. Should we add a warning to the documentation?
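
For anyone trying to reproduce this, a small helper can make the expiry from step 4 easier to spot while traffic is running. The sketch below is not part of OVS: the function name lacp_expired_members is hypothetical, and it assumes the "member: <name>:" and "Link Expired: <n>" layout printed by ovs-appctl lacp/show-stats in the output above. It reads the text on stdin and prints each member whose Link Expired counter is non-zero.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): print "<member> <count>" for every
# bond member whose "Link Expired" counter is non-zero.
# Assumes the layout of `ovs-appctl lacp/show-stats` shown in this report:
#   member: dpdk1:
#     Link Expired: 1
# Reads the command output on stdin so it can be tried against saved text.
lacp_expired_members() {
    awk '
        $1 == "member:" {
            name = $2            # e.g. "dpdk1:"
            sub(/:$/, "", name)  # strip the trailing colon
        }
        /Link Expired:/ {
            count = $NF          # last field is the counter value
            if (count + 0 > 0) print name, count
        }
    '
}
```

Usage on the DUT would look like: ovs-appctl lacp/show-stats | lacp_expired_members. Against the step 4 output above it would report dpdk1 with a count of 1 and nothing for dpdk0.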