Bug 2217470

Summary: Routes not withdrawn when bridge-mappings are removed
Product: Red Hat OpenStack
Component: ovn-bgp-agent
Version: 17.1 (Wallaby)
Status: MODIFIED
Severity: low
Priority: medium
Target Milestone: z2
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Type: Bug
Keywords: Triaged
Reporter: Eduardo Olivares <eolivare>
Assignee: Luis Tomas Bolivar <ltomasbo>
QA Contact: Candido Campos <ccamposr>
CC: dalvarez, lmartins, ltomasbo
Flags: ifrangs: needinfo? (ltomasbo)
Fixed In Version: ovn-bgp-agent-0.4.1-17.1.20230629160950.c930d52.el9osttrunk
Doc Type: If docs needed, set a value

Description Eduardo Olivares 2023-06-26 11:13:14 UTC
Description of problem:
The bridge-mappings are removed from a controller node where the ovn-bgp-agent is exposing several IPs (e.g. FIPs, when DVR is disabled):
[root@ctrl-2-0 ~]# ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-mappings=\"\" 

The FIPs from this controller are moved to another controller because OVN moves the cr-lrp ports to a chassis where bridge-mappings are correctly configured.

The ovn-bgp-agent running on the second controller detects that the FIPs are now hosted on this new node and successfully exposes them from there.
However, the ovn-bgp-agent running on the first controller fails to withdraw those FIPs: its sync aborts with an IndexError while parsing the now-empty bridge-mappings:
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent [-] Unexpected exception while running the sync: list index out of range: IndexError: list index out of range
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent Traceback (most recent call last):
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/agent.py", line 53, in sync
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     self.agent_driver.sync()
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     return f(*args, **kwargs)
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py", line 173, in sync
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     bridge = bridge_mapping.split(":")[1]
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent IndexError: list index out of range
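
The failing statement in the traceback, bridge_mapping.split(":")[1], raises IndexError whenever an entry contains no colon, which is what an empty mapping string produces. The sketch below reproduces that failure and shows one tolerant way to parse the value. It assumes the mappings arrive as a single comma-separated string; the function names parse_bridge_mappings and crashing_lookup are hypothetical and are not taken from ovn-bgp-agent, and this is not necessarily how the actual fix was implemented.

```python
def crashing_lookup(raw):
    """Mimic the indexing pattern from the traceback (hypothetical helper).

    Assumes the raw value is a comma-separated list of network:bridge pairs.
    """
    return [entry.split(":")[1] for entry in raw.split(",")]


def parse_bridge_mappings(raw):
    """Return {network: bridge}, skipping empty or malformed entries.

    Hypothetical helper: an empty string yields an empty dict instead of
    raising, so a caller could still go on to withdraw stale routes.
    """
    mappings = {}
    for entry in raw.split(","):
        network, sep, bridge = entry.strip().partition(":")
        if not sep or not network or not bridge:
            # No colon, or one side is empty: ignore this entry.
            continue
        mappings[network] = bridge
    return mappings


if __name__ == "__main__":
    print(parse_bridge_mappings("provider1:br-ex,provider2:br-vlan"))
    print(parse_bridge_mappings(""))
    try:
        crashing_lookup("")
    except IndexError as exc:
        print("IndexError:", exc)
```

With an empty input, "".split(",") yields a list holding one empty string, so the indexing variant fails on its first entry, while the tolerant variant returns an empty mapping.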


Due to this, the route to any of those FIPs is exposed via BGP from two different nodes, causing a conflict and connectivity issues.
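
One way to spot this condition during verification is to compare the set of IPs each node currently exposes and flag any IP announced from more than one node. The sketch below is only an illustration: the input dictionary is assumed to have been collected by some other means (for example, from the routes each node advertises), and the node name ctrl-1-0 and the addresses are made-up sample data, not values from this report.

```python
from collections import defaultdict


def find_conflicting_routes(exposed_by_node):
    """Return {ip: [nodes]} for every IP exposed from more than one node.

    exposed_by_node maps a node name to the set of IPs it exposes
    (hypothetical input format).
    """
    nodes_by_ip = defaultdict(set)
    for node, ips in exposed_by_node.items():
        for ip in ips:
            nodes_by_ip[ip].add(node)
    return {
        ip: sorted(nodes)
        for ip, nodes in nodes_by_ip.items()
        if len(nodes) > 1
    }


if __name__ == "__main__":
    # Made-up sample data using documentation address space.
    sample = {
        "ctrl-2-0": {"203.0.113.10", "203.0.113.11"},
        "ctrl-1-0": {"203.0.113.10"},
    }
    print(find_conflicting_routes(sample))
```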

The issue is resolved when the bridge-mappings are correctly configured on the first controller again:
[root@ctrl-2-0 ~]# ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-mappings=\"provider1:br-ex,provider2:br-vlan\"


Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230621.n.1

How reproducible:
100%

Steps to Reproduce:
1. Remove the ovn-bridge-mappings configured on an overcloud node (see commands above).