Bug 2182549

Summary: OVS MAC tables not updating correctly upon F5 virtual Load Balancers HA switchover
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Francois Palin <fpalin>
Component: openvswitch2.13 Assignee: Mike Pattrick <mpattric>
Status: CLOSED COMPLETED QA Contact: ovs-qe
Severity: high Docs Contact:
Priority: unspecified    
Version: RHEL 8.0 CC: aconole, ctrautma, fleitner, jhsiao, ralongi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-09-09 16:13:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Francois Palin 2023-03-29 02:34:43 UTC
Description of problem:
The customer's setup uses two F5 Load Balancer instances in Active/Standby mode. The Active LB1 on Host1 is the owner of a "masquerade" MAC. Upon switchover, LB2 on Host2 becomes Active and the new owner of the "masquerade" MAC.
"Masquerade" MAC is an F5 proprietary mechanism which is somewhat the MAC equivalent of a virtual IP address (VIP).
Upon switchover, what is expected to happen is:
  1) LB2 sends a SYN to OVS2, and OVS2 adds a new entry to its MAC address table for the "masquerade" MAC (the source MAC address of the frame)
     and the corresponding switch port.
  2) OVS1 is updated so that the "masquerade" MAC is removed from its MAC table.
     We believe this is done through OVS distributing the MAC table via OpenFlow flows, but have not confirmed it.

In step 1) above, we do see that the OVS2 MAC table gets updated with the "masquerade" MAC and switch port.
In step 2), however, the "masquerade" MAC entry is still present in the OVS1 MAC table.
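
To make the symptom concrete, here is a toy model of two independent learning switches. It is only a sketch of plain source-MAC learning, not of OVS internals: the class, the port names, and the MAC value are all hypothetical, and it deliberately leaves out whatever mechanism is supposed to update OVS1 in step 2).

```python
class LearningSwitch:
    """Toy learning switch: maps a source MAC to the port it was last seen on."""

    def __init__(self, name):
        self.name = name
        self.fdb = {}  # MAC address -> port name

    def receive(self, src_mac, in_port):
        # An entry is created or moved only when a frame with this
        # source MAC actually arrives on some port of this switch.
        self.fdb[src_mac] = in_port


MASQ_MAC = "02:00:00:00:00:01"  # hypothetical masquerade MAC

ovs1 = LearningSwitch("OVS1")
ovs2 = LearningSwitch("OVS2")

# Before switchover: LB1 (on Host1) is Active and sources frames from MASQ_MAC.
ovs1.receive(MASQ_MAC, "lb1-port")

# Switchover, step 1): LB2 (on Host2) sends a frame from MASQ_MAC to OVS2.
ovs2.receive(MASQ_MAC, "lb2-port")

# Nothing in this model has told OVS1 about the move, so its entry is stale.
print(ovs2.fdb[MASQ_MAC])  # lb2-port
print(ovs1.fdb[MASQ_MAC])  # lb1-port
```

In this model the OVS1 entry only changes if a frame sourced from the "masquerade" MAC reaches OVS1 on a different port, or if the entry is removed by some other means.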

We also saw that the compute nodes hosting servers beyond the LBs were getting port updates
for LB1 and LB2 in both FDB/CAM tables.

This was observed on the system during a maintenance window, without live traffic on the LBs,
while manually forcing an LB switchover.

What was observed with live traffic (before the maintenance window debugging session) was that, after switchover to LB2, return traffic was still being sent to the standby LB1, but only for a few minutes. F5 thinks that this got "fixed" once an entry (which we did not debug) expired in the FDB/CAM table.
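
The "fixed after a few minutes" behaviour is consistent with an aged-out table entry. The sketch below models that idea only; it is not OVS code, and the class and names are hypothetical. It assumes an aging time of 300 seconds, which to our knowledge is the documented default of the OVS mac-aging-time bridge setting; the value actually in effect on the customer system, and which table held the expiring entry, were not verified.

```python
class AgingFdb:
    """Toy forwarding table whose entries expire after aging_time seconds."""

    def __init__(self, aging_time=300):
        self.aging_time = aging_time  # seconds; assumed value, see note above
        self.entries = {}  # MAC address -> (port, time last seen)

    def learn(self, mac, port, now):
        # Seeing the MAC as a source address refreshes or moves the entry.
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        entry = self.entries.get(mac)
        if entry is None:
            return None  # unknown destination: a real switch would flood
        port, last_seen = entry
        if now - last_seen >= self.aging_time:
            del self.entries[mac]  # entry has aged out
            return None
        return port


MASQ_MAC = "02:00:00:00:00:01"  # hypothetical masquerade MAC

fdb = AgingFdb()
fdb.learn(MASQ_MAC, "lb1-port", now=0)  # last frame seen from LB1 at t=0

print(fdb.lookup(MASQ_MAC, now=60))   # lb1-port (stale entry still used)
print(fdb.lookup(MASQ_MAC, now=300))  # None (entry expired)
```

Until the stale entry expires or is refreshed from the new location, return traffic in this model keeps going to the old port, which matches the few minutes of misdirected traffic described above.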

  System uses VxLAN network configuration.
  RHOSP version: 16.1
  openvswitch2.13-2.13.0-71


How reproducible:


Steps to Reproduce:


Actual results:


Expected results:


Additional info: