Bug 2223306 - RFE: Add more debug logs for lacp
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch3.0
Version: RHEL 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Mike Pattrick
QA Contact: Zhiqiang Fang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-07-17 11:01 UTC by Haresh Khandelwal
Modified: 2024-09-09 17:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-09-09 17:37:48 UTC
Target Upstream Version:
Embargoed:


Links
System                 ID       Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  FD-3018  0        None      None    None     2023-07-17 11:02:05 UTC

Description Haresh Khandelwal 2023-07-17 11:01:01 UTC
Description of problem:

While troubleshooting an OVS (kernel datapath, no DPDK) LACP bond issue, I enabled debug logging to file for the two modules below.

[root@computesriov-0 openvswitch]# ovs-appctl vlog/list
                 console    syslog    file
                 -------    ------    ------
bond               OFF        ERR        DBG
lacp               OFF        ERR        DBG
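
For reference, the levels shown above can be set with `ovs-appctl vlog/set`, which takes a `module:destination:level` spec. The following is only a minimal sketch: the function names and the `OVS_APPCTL` override variable are hypothetical (not from this report) and exist so the commands can be dry-run, and restoring to `info` assumes the file destination was at its usual default.

```shell
#!/bin/sh
# Sketch: toggle DBG logging to file for the "bond" and "lacp" vlog modules.
# OVS_APPCTL is a hypothetical override; set it to "echo" to dry-run.
OVS_APPCTL="${OVS_APPCTL:-ovs-appctl}"

enable_bond_lacp_debug() {
    for module in bond lacp; do
        # Spec format: module:destination:level
        "$OVS_APPCTL" vlog/set "${module}:file:dbg" || return 1
    done
}

restore_bond_lacp_logging() {
    for module in bond lacp; do
        # Assumes the file destination was at INFO before debugging.
        "$OVS_APPCTL" vlog/set "${module}:file:info" || return 1
    done
}
```

Calling `enable_bond_lacp_debug` before the test and `restore_bond_lacp_logging` afterwards keeps the extra logging limited to the troubleshooting window; `ovs-appctl vlog/list` confirms the result.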

I then simulated link failures from the uplink switch.

With both member interfaces up:

2023-07-17T10:45:18.232Z|06252|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB
2023-07-17T10:45:28.241Z|06256|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB

Brought down one member interface at the uplink switch:

2023-07-17T10:45:31.672Z|06257|bond|INFO|member enp4s0f0np0: link state down
2023-07-17T10:45:31.672Z|06258|bond|INFO|member enp4s0f0np0: disabled
2023-07-17T10:45:31.672Z|06259|bond|INFO|bond lacp-bond: active member is now enp4s0f1np1
2023-07-17T10:45:31.673Z|08614|bond(revalidator7)|DBG|bond lacp-bond: member enp4s0f0np0: main thread has not yet enabled member
2023-07-17T10:45:31.679Z|08615|bond(revalidator7)|DBG|bond lacp-bond: member enp4s0f0np0: admissibility verdict is to drop pkt, active member: false, may_enable: false, enabled: false, LACP status: negotiated
2023-07-17T10:45:38.686Z|06260|bond|DBG|bond lacp-bond: enp4s0f1np1 0kB
2023-07-17T10:45:48.696Z|06261|bond|DBG|bond lacp-bond: enp4s0f1np1 0kB


Brought down the second member interface:

2023-07-17T10:45:53.835Z|06262|bond|INFO|member enp4s0f1np1: link state down
2023-07-17T10:45:53.835Z|06263|bond|INFO|member enp4s0f1np1: disabled
2023-07-17T10:45:53.835Z|06264|bond|INFO|bond lacp-bond: all members disabled

Brought the first member interface back up:

2023-07-17T10:46:28.543Z|06271|bond|INFO|member enp4s0f0np0: link state up
2023-07-17T10:46:28.543Z|06272|bond|INFO|member enp4s0f0np0: enabled
2023-07-17T10:46:28.543Z|06273|bond|INFO|bond lacp-bond: active member is now enp4s0f0np0
2023-07-17T10:46:36.065Z|06274|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB
2023-07-17T10:46:46.075Z|06275|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB

Brought the second member interface back up:

2023-07-17T10:46:53.055Z|06276|bond|INFO|member enp4s0f1np1: link state up
2023-07-17T10:46:53.055Z|06277|bond|INFO|member enp4s0f1np1: enabled
2023-07-17T10:46:56.559Z|06278|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB
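
The log excerpts above contain only "bond" module messages; no line comes from the "lacp" module. To make the failover timeline easier to read, the INFO-level bond events can be pulled out of the log. This is a rough sketch assuming the pipe-separated format seen above (timestamp|sequence|module|level|message); the function name `bond_events` is hypothetical.

```shell
#!/bin/sh
# Sketch: print "timestamp message" for INFO-level bond events read from stdin.
# Assumes the field layout timestamp|seq|module|level|message and that the
# message itself contains no "|" character.
bond_events() {
    awk -F'|' '$3 ~ /^bond/ && $4 == "INFO" { print $1 " " $5 }'
}
```

A typical call would be `bond_events < /var/log/openvswitch/ovs-vswitchd.log`, assuming the default log location. Changing the module pattern to `/^lacp/` and dropping the level check shows what, if anything, the lacp module logged during the same window.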

LACP re-negotiated successfully.

[root@computesriov-0 tripleo-admin]# ovs-appctl lacp/show
---- lacp-bond ----
  status: active negotiated
  sys_id: 04:3f:72:d9:c0:48
  sys_priority: 65534
  aggregation key: 1
  lacp_time: fast

member: enp4s0f0np0: current attached
  port_id: 2
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 2
  actor port_priority: 65535
  actor key: 1
  actor state: activity timeout aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 5
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing

member: enp4s0f1np1: current attached
  port_id: 1
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 1
  actor port_priority: 65535
  actor key: 1
  actor state: activity timeout aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 6
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing
[root@computesriov-0 tripleo-admin]# 
[root@computesriov-0 tripleo-admin]# 
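
Since the logs do not record LACP state changes, one workaround is to sample `lacp/show` and reduce it to a per-member sync summary. The sketch below assumes the output layout shown above ("member:" lines followed by "actor state:" and "partner state:" lines); the function name and the "in-sync"/"OUT-OF-SYNC" labels are hypothetical.

```shell
#!/bin/sh
# Sketch: summarize lacp/show output (read from stdin) as
# "<member> <actor|partner> <in-sync|OUT-OF-SYNC>".
lacp_member_summary() {
    awk '
        /^member: / {
            member = $2
            sub(/:$/, "", member)      # strip the trailing colon
        }
        /^[ \t]*(actor|partner) state:/ {
            role = $1
            # A side counts as in sync when its state lists "synchronized".
            sync = ($0 ~ / synchronized/) ? "in-sync" : "OUT-OF-SYNC"
            print member, role, sync
        }
    '
}
```

For example, `ovs-appctl lacp/show lacp-bond | lacp_member_summary` run in a loop during a failover test gives a coarse view of when each side regains sync, which is the information the requested debug logs would provide directly.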

[root@computesriov-0 tripleo-admin]# ovs-appctl bond/show
---- lacp-bond ----
bond_mode: balance-slb
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
all members active: false
updelay: 0 ms
downdelay: 0 ms
next rebalance: 9098 ms
lacp_status: negotiated
lacp_fallback_ab: true
active-backup primary: <none>
active member mac: 04:3f:72:d9:c0:48(enp4s0f0np0)

member enp4s0f0np0: enabled
  active member
  may_enable: true

member enp4s0f1np1: enabled
  may_enable: true

[root@computesriov-0 tripleo-admin]# 


I expected the "lacp" module to emit more debug messages at DBG level, so that it is possible to understand what is going on in the LACP state machine.

Version-Release number of selected component (if applicable):
openvswitch3.0-3.0.0-28.el9fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure an OVS LACP bond.
2. Perform a link failover.
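
A bond matching the `bond/show` and `lacp/show` output above could be created roughly as follows. This is a sketch, not the exact configuration used in this report: the bridge name passed in, the function name, and the `OVS_VSCTL` override variable are hypothetical, and the port settings are inferred from the reported status (active LACP, balance-slb, fast timing, active-backup fallback).

```shell
#!/bin/sh
# Sketch: create a two-member LACP bond with settings inferred from the report.
# OVS_VSCTL is a hypothetical override; set it to "echo" to dry-run.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Usage: create_lacp_bond <bridge> <bond-port> <iface1> <iface2>
create_lacp_bond() {
    bridge=$1 bond=$2 iface1=$3 iface2=$4
    "$OVS_VSCTL" add-bond "$bridge" "$bond" "$iface1" "$iface2" \
        -- set port "$bond" lacp=active bond_mode=balance-slb \
           other_config:lacp-time=fast other_config:lacp-fallback-ab=true
}
```

For instance, `create_lacp_bond br-ex lacp-bond enp4s0f0np0 enp4s0f1np1`, where `br-ex` is an assumed bridge name. The link failover in step 2 is then triggered from the uplink switch, as described above.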

Actual results:
No "lacp" module log messages indicate what is happening with LACP synchronization during the failover; only "bond" messages appear.

Expected results:
The "lacp" module should log more debug messages to help with troubleshooting.

Additional info:
I performed this test with the OVS kernel datapath; however, the same should be true for the OVS-DPDK datapath as well.

Comment 3 Mike Pattrick 2024-09-09 17:37:48 UTC
Reopening as a Jira ticket: https://issues.redhat.com/browse/FDP-778

