Bug 2120247 - [dpdkbond] LACP PDU packets lost when sending high-rate traffic to dpdkbond port
Summary: [dpdkbond] LACP PDU packets lost when sending high-rate traffic to dpdkbond port
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 22.G
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Robin Jarry
QA Contact: mhou
Blocks: 2149301
Reported: 2022-08-22 10:14 UTC by mhou
Modified: 2023-07-13 07:25 UTC (History)
Clones: 2149301

Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2243 0 None None None 2022-08-22 10:19:14 UTC

Description mhou 2022-08-22 10:14:38 UTC
Description of problem:
On an OVS-DPDK bond port with LACP enabled and the dpdkbond port in the active role, a high traffic rate causes LACP PDUs to be lost on the switch side. When the switch does not receive an LACP PDU within its timeout period, it resets the port state according to its default behavior, which differs from switch to switch.
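
To illustrate the timeout mechanism described above, here is a minimal Python sketch. It assumes the standard IEEE 802.1AX timer values (90 s timeout in slow mode, 3 s timeout in fast mode). The function name and the sample arrival times are hypothetical, and real switches may apply vendor-specific behavior once the timer expires.

```python
# Sketch: decide when an LACP receiver would declare a link expired,
# assuming IEEE 802.1AX default timers (not vendor-specific behavior).
LACP_TIMEOUT_S = {"slow": 90.0, "fast": 3.0}  # long / short timeout


def first_expiry(arrivals, mode="slow", end=None):
    """Return the time at which the receiver's timer expires, or None.

    arrivals: sorted times (seconds) at which LACP PDUs were received.
    end: time at which observation stops (defaults to the last arrival).
    """
    arrivals = list(arrivals)
    timeout = LACP_TIMEOUT_S[mode]
    if not arrivals:
        return None
    end = arrivals[-1] if end is None else end
    # Compare each arrival (and the end of observation) with the previous one.
    for prev, cur in zip(arrivals, arrivals[1:] + [end]):
        if cur - prev > timeout:
            return prev + timeout
    return None


# Hypothetical example: PDUs arrive every 30 s, then three in a row are lost.
print(first_expiry([0, 30, 60, 180], mode="slow"))  # 150.0
```

With lacp-time=slow, as configured in this report, three consecutive lost PDUs are enough for the timer to expire.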

Version-Release number of selected component (if applicable):
All openvswitch versions (including ovs2.13, ovs2.15, ovs2.16, ovs2.17)

How reproducible: 100%


Steps to Reproduce:
1. Configure LACP passive mode on the Cisco 9364 switch. To prevent port flapping caused by LACP PDU loss, set the "no lacp suspend-individual" and "lacp graceful-convergence" properties.

Reference: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/93x/interfaces/configuration/guide/b-cisco-nexus-9000-nx-os-interfaces-configuration-guide-93x/b-cisco-nexus-9000-nx-os-interfaces-configuration-guide-93x_chapter_010000.html#task_26B7966FF2CF44B39532F5D70FDC4DB3

# show port-channel summary 
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        b - BFD Session Wait
        S - Switched    R - Routed
        U - Up (port-channel)
        p - Up in delay-lacp mode (member)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
48    Po48(SU)    Eth      LACP      Eth1/3/1(P)  Eth1/3/2(P)  

# show port-channel internal info interface port-channel 48

port-channel48
channel    : 48
bundle     : 65535
ifindex    : 0x1600002f
admin mode : passive
oper mode  : passive
nports     : 2
active     : 2
pre cfg    : 0
ltl        : 0x2 (2)
lif        : 0x1045
iod        : 0x46 (70)
global id  : 1
flag       : 0
lock count : 0
num. of SIs: 0
ac mbrs    : 0 0
lacp graceful conv disable   : 0 
lacp suspend indiv disable   : 1 
pc min-links                 : 1 
pc max-bundle                : 32 
pc max active members        : 32 
pc is-suspend-minlinks       : 0 
port load defer enable       : 0 
port-channel bfd config enabled     : 0 
port-channel bfd config complete: 0 
port-channel bfd destination: null 
port-channel bfd start timeout: 0 
lacp fast-select-hot-standby disable   : 0 
port-channel port hash-distribution    : none
ethpm bundle lock count : 0
lacp delayed-enable fop Ethernet1/3/2 0x38015000 
lacp delayed-enable : 0 
lacp delayed-enable cfg-port none
lacp delayed-enable oper-port none
lacp delayed-enable local best priority : 0xffffffff 
lacp delayed-enable remote best priority : 0xffffffff 
lacp vpc conv enabled   : 0 
gir conv enabled   : 0 
bundle number map:
1-2Members: 
Ethernet1/3/1 [bundle_no = 1]     is_ltl_programmed = 1
Port BFD session state: 5 (none)
Ethernet1/3/2 [bundle_no = 0]     is_ltl_programmed = 1
Port BFD session state: 5 (none)
port-channel external lock: 
Lock Info: resource [eth-port-channel 48] 
  type[0] p_gwrap[(nil)]
      FREE @ 28099 usecs after Mon Aug 22 05:05:30 2022
  type[1] p_gwrap[(nil)]
      FREE @ 505551 usecs after Mon Aug 22 05:05:40 2022
  type[2] p_gwrap[(nil)]
      FREE @ 739313 usecs after Mon Aug 22 05:05:38 2022
0x1600002f
internal (ethpm bundle) lock: 
Lock Info: resource [eth-port-channel 48] 
  type[0] p_gwrap[(nil)]
      FREE @ 28085 usecs after Mon Aug 22 05:05:30 2022
  type[1] p_gwrap[(nil)]
      FREE @ 824981 usecs after Mon Aug 22 07:45:41 2022
  type[2] p_gwrap[(nil)]
      FREE @ 823693 usecs after Mon Aug 22 07:45:41 2022
0x1600002f

Check the interface reset counters:
sw-cisco9364(config)# show interface eth1/3/2 | grep "interface resets"
  4436 interface resets
sw-cisco9364(config)# show interface eth1/3/1 | grep "interface resets"
  4528 interface resets
sw-cisco9364(config)# show interface po48 | grep "interface resets"
  1 interface resets


2. Build an active-backup dpdkbond port on the DUT side:
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,4096"
ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-iommu-support=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="0xc0000"
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev 
ovs-vsctl add-bond ovsbr0 dpdkbond dpdk0 dpdk1 "bond_mode=active-backup" -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:af:00.0 options:n_rxq=1 mtu_request=9120 -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:af:00.1 options:n_rxq=1 mtu_request=9120
ovs-vsctl set interface dpdk0 options:dpdk-lsc-interrupt=true
ovs-vsctl set interface dpdk1 options:dpdk-lsc-interrupt=true
ovs-vsctl set Port dpdkbond vlan_mode=trunk
ovs-vsctl set Port dpdkbond bond_updelay=0
ovs-vsctl set Port dpdkbond bond_downdelay=0

ovs-vsctl set Port dpdkbond other_config:bond-rebalance-interval=10000

ovs-vsctl add-port ovsbr0 vhost0 -- set interface vhost0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhost0 options:n_rxq=1
ovs-vsctl set Interface vhost0 mtu_request=9120
ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 actions=NORMAL
ovs-vsctl set port dpdkbond lacp=active
ovs-vsctl set port dpdkbond other_config:lacp-time=slow
ovs-vsctl set port dpdkbond other_config:bond-detect-mode=carrier
# ovs-vsctl list port dpdkbond
_uuid               : 183cc9ba-d89b-454a-8830-e6ec4be33637
bond_active_slave   : "04:3f:72:b0:35:33"
bond_downdelay      : 0
bond_fake_iface     : false
bond_mode           : active-backup
bond_updelay        : 0
cvlans              : []
external_ids        : {}
fake_bridge         : false
interfaces          : [3b862d36-6201-49aa-a441-c26a87951981, 44e85a72-c17c-4a19-921d-d280285b7c7e]
lacp                : active
mac                 : []
name                : dpdkbond
other_config        : {bond-detect-mode=carrier, lacp-time=slow, lb-output-action="false"}
protected           : false
qos                 : []
rstp_statistics     : {}
rstp_status         : {}
statistics          : {}
status              : {}
tag                 : []
trunks              : []
vlan_mode           : trunk
# ovs-appctl bond/show
---- dpdkbond ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 04:3f:72:b0:35:33(dpdk1)

member dpdk0: enabled
  may_enable: true

member dpdk1: enabled
  active member
  may_enable: true
# ovs-appctl lacp/show-stats
---- dpdkbond statistics ----

member: dpdk0:
  TX PDUs: 38
  RX PDUs: 30
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 1
  Carrier Status Changed: 1

member: dpdk1:
  TX PDUs: 38
  RX PDUs: 30
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 1
  Carrier Status Changed: 1


3. Traffic enters the 9364 switch on port Eth1/7/3 and reaches the dpdkbond port of the DUT through port-channel po48.
sw-cisco9364(config-if)# show interface eth1/7/3
Ethernet1/7/3 is up
admin state is up, Dedicated Interface
  Hardware: 25000 Ethernet, address: b0c5.3cf6.36d6 (bia b0c5.3cf6.36d6)
  MTU 9216 bytes, BW 25000000 Kbit , DLY 10 usec
  reliability 255/255, txload 9/255, rxload 152/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  full-duplex, 25 Gb/s, media type is 100G
  Beacon is turned off
  Auto-Negotiation is turned off  FEC mode is Auto
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Rate mode is dedicated
  Switchport monitor is off 
  EtherType is 0x8100 
  EEE (efficient-ethernet) : n/a
    admin fec state is auto, oper fec state is Fc-fec
  Last link flapped 07:28:57
  Last clearing of "show interface" counters 31w5d
  5934 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 14982412288 bits/sec, 27541193 packets/sec
    30 seconds output rate 903541720 bits/sec, 1660914 packets/sec
    input rate 14.98 Gbps, 27.54 Mpps; output rate 903.54 Mbps, 1.66 Mpps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 2557908344 bits/sec, 4702013 packets/sec
    300 seconds output rate 120199752 bits/sec, 220939 packets/sec
    input rate 2.56 Gbps, 4.70 Mpps; output rate 120.20 Mbps, 220.94 Kpps
  RX
    25473914277495 unicast packets  24798739588 multicast packets  34175 broadcast packets
    25498713051251 input packets  5426349482871866 bytes
    126981761196 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    14405876978993 unicast packets  516940646 multicast packets  162408927 broadcast packets
    14406556328566 output packets  4372264666645070 bytes
    116807377983 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  665728362 output discard
    0 Tx pause

sw-cisco9364(config-if)# show interface po48
port-channel48 is up
admin state is up,
  Hardware: Port-Channel, address: b0c5.3cf6.36cd (bia b0c5.3cf6.36cd)
  MTU 9216 bytes, BW 50000000 Kbit , DLY 10 usec
  reliability 255/255, txload 48/255, rxload 8/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  full-duplex, 25 Gb/s
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Switchport monitor is off 
  EtherType is 0x8100 
  Members in this channel: Eth1/3/1, Eth1/3/2
  Last clearing of "show interface" counters never
  19 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 1673082512 bits/sec, 3075512 packets/sec
    30 seconds output rate 9586691656 bits/sec, 17622574 packets/sec
    input rate 1.67 Gbps, 3.08 Mpps; output rate 9.59 Gbps, 17.62 Mpps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 299868160 bits/sec, 551213 packets/sec
    300 seconds output rate 2013170352 bits/sec, 3700644 packets/sec
    input rate 299.87 Mbps, 551.21 Kpps; output rate 2.01 Gbps, 3.70 Mpps
  RX
    3981154743 unicast packets  1084 multicast packets  13 broadcast packets
    3981155840 input packets  1596705130904 bytes
    79892543 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    19623333237 unicast packets  549465 multicast packets  0 broadcast packets
    19623882702 output packets  3530787644005 bytes
    126411484 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  8525 output discard
    0 Tx pause
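
The rate counters above give a rough idea of the frame size in the test stream. The following sketch divides the 30-second bit rate by the packet rate. The helper name is hypothetical, and the report does not state whether the switch includes the FCS or VLAN tag in these totals, so treat the result as approximate.

```python
def avg_frame_bytes(bits_per_sec, packets_per_sec):
    """Average bytes per packet implied by a pair of rate counters."""
    return bits_per_sec / packets_per_sec / 8


# 30-second rates copied from the "show interface" output above.
eth1_7_3_in = avg_frame_bytes(14982412288, 27541193)  # Eth1/7/3 input
po48_out = avg_frame_bytes(9586691656, 17622574)      # po48 output

print(round(eth1_7_3_in), round(po48_out))  # 68 68
```

Both values come out at roughly 68 bytes, which points to a small-packet stream where the packet rate, rather than the bit rate, is the stress factor.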


4. Check the openvswitch log and the LACP port statistics:
tail -f /var/log/openvswitch/ovs-vswitchd.log
2022-08-22T09:13:17.125Z|00173|bond|INFO|member dpdk1: link state down
2022-08-22T09:13:17.125Z|00174|bond|INFO|member dpdk1: disabled
2022-08-22T09:13:17.125Z|00175|bond|INFO|bond dpdkbond: active member is now dpdk0
2022-08-22T09:13:22.277Z|00176|bond|INFO|member dpdk1: link state up
2022-08-22T09:13:22.277Z|00177|bond|INFO|member dpdk1: enabled
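
The bond events in this log can be extracted programmatically. This sketch assumes the log layout seen above (timestamp|sequence|module|level|message) and computes how long a member stayed down. The function name is hypothetical.

```python
from datetime import datetime

LOG = """\
2022-08-22T09:13:17.125Z|00173|bond|INFO|member dpdk1: link state down
2022-08-22T09:13:17.125Z|00174|bond|INFO|member dpdk1: disabled
2022-08-22T09:13:17.125Z|00175|bond|INFO|bond dpdkbond: active member is now dpdk0
2022-08-22T09:13:22.277Z|00176|bond|INFO|member dpdk1: link state up
2022-08-22T09:13:22.277Z|00177|bond|INFO|member dpdk1: enabled
"""


def down_intervals(log_text):
    """Return (member, seconds_down) for each down/up pair in the log."""
    went_down = {}
    result = []
    for line in log_text.splitlines():
        parts = line.split("|", 4)
        # Skip lines that are not bond-module messages in the assumed layout.
        if len(parts) != 5 or parts[2] != "bond":
            continue
        msg = parts[4]
        if not msg.startswith("member "):
            continue
        stamp = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S.%fZ")
        member, _, event = msg[len("member "):].partition(": ")
        if event == "link state down":
            went_down[member] = stamp
        elif event == "link state up" and member in went_down:
            delta = stamp - went_down.pop(member)
            result.append((member, delta.total_seconds()))
    return result


print(down_intervals(LOG))
```

For the excerpt above, dpdk1 is reported as down for a little over 5 seconds.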


# ovs-appctl lacp/show-stats
---- dpdkbond statistics ----

member: dpdk0:
  TX PDUs: 70
  RX PDUs: 57
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 0
  Link Defaulted: 2
  Carrier Status Changed: 3

member: dpdk1:
  TX PDUs: 69
  RX PDUs: 57
  RX Bad PDUs: 0
  RX Marker Request PDUs: 0
  Link Expired: 1
  Link Defaulted: 2
  Carrier Status Changed: 3
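
Comparing these counters with the lacp/show-stats output from step 2 makes the event visible. A minimal sketch, assuming the output layout shown in this report. The function names are hypothetical, and only a subset of the dpdk1 counters is reproduced in the sample strings.

```python
def parse_lacp_stats(text):
    """Parse `ovs-appctl lacp/show-stats` output into {member: {counter: int}}."""
    stats, member = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("member:"):
            member = line[len("member:"):].strip().rstrip(":")
            stats[member] = {}
        elif member and ":" in line:
            key, _, value = line.rpartition(":")
            if value.strip().isdigit():
                stats[member][key.strip()] = int(value)
    return stats


def changed_counters(before, after):
    """Return counters whose value changed, as {member: {counter: delta}}."""
    out = {}
    for member, counters in after.items():
        for key, value in counters.items():
            delta = value - before.get(member, {}).get(key, 0)
            if delta:
                out.setdefault(member, {})[key] = delta
    return out


# dpdk1 counters from step 2 (before traffic) and step 4 (after traffic).
BEFORE = """\
member: dpdk1:
  TX PDUs: 38
  RX PDUs: 30
  Link Expired: 0
  Link Defaulted: 1
  Carrier Status Changed: 1
"""
AFTER = """\
member: dpdk1:
  TX PDUs: 69
  RX PDUs: 57
  Link Expired: 1
  Link Defaulted: 2
  Carrier Status Changed: 3
"""

print(changed_counters(parse_lacp_stats(BEFORE), parse_lacp_stats(AFTER)))
```

For dpdk1, Link Expired goes from 0 to 1, which matches the "link state down" event in the log, while the same counter stays at 0 for dpdk0.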

# ovs-appctl bond/show
---- dpdkbond ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 04:3f:72:b0:35:33(dpdk1)

member dpdk0: enabled
  may_enable: true

member dpdk1: enabled
  active member
  may_enable: true

sw-cisco9364(config-if)# show lacp counters interface po48
NOTE: Clear lacp counters to get accurate statistics

------------------------------------------------------------------------------
                             LACPDUs                      Markers/Resp LACPDUs
Port              Sent                Recv                  Recv Sent  Pkts Err
------------------------------------------------------------------------------
port-channel48
Ethernet1/3/1      159                  152                    0      0    0      
Ethernet1/3/2      158                  149                    0      0    0   

sw-cisco9364(config)# show interface po48 | grep "interface resets"
  1 interface resets
sw-cisco9364(config)# show interface eth1/3/1 | grep "interface resets"
  4528 interface resets
sw-cisco9364(config)# show interface eth1/3/2 | grep "interface resets"
  4437 interface resets
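
The reset counters from step 1 and step 4 can be compared with a few lines of Python. This is only a sketch: the helper name is hypothetical, and the input strings are the grep results copied from this report.

```python
import re


def resets(output):
    """Extract the counter from a line such as '  4437 interface resets'."""
    match = re.search(r"(\d+) interface resets", output)
    if match is None:
        raise ValueError("no 'interface resets' counter found")
    return int(match.group(1))


before = {
    "eth1/3/1": "  4528 interface resets",
    "eth1/3/2": "  4436 interface resets",
    "po48": "  1 interface resets",
}
after = {
    "eth1/3/1": "  4528 interface resets",
    "eth1/3/2": "  4437 interface resets",
    "po48": "  1 interface resets",
}

deltas = {port: resets(after[port]) - resets(before[port]) for port in before}
print(deltas)  # {'eth1/3/1': 0, 'eth1/3/2': 1, 'po48': 0}
```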

Actual results:
1. In step 4, the status of dpdk1, a member of dpdkbond, changed (link state down, then up). The interface reset counter of the corresponding switch port eth1/3/2 also increased by 1 (from 4436 to 4437).

Expected results:
If possible, allow LACP to be disabled in active-backup/balance-slb mode.

Additional info:
If dpdkbond has LACP enabled and is set to active mode, loss of LACP PDUs seems to be inevitable under heavy traffic.

Different switches have different default behaviors when LACP PDUs are lost. Should the documentation warn about this?

