Bug 2217470 - Routes not withdrawn when bridge-mappings are removed [NEEDINFO]
Summary: Routes not withdrawn when bridge-mappings are removed
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: ovn-bgp-agent
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: z2
Target Release: 17.1
Assignee: Luis Tomas Bolivar
QA Contact: Candido Campos
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-26 11:13 UTC by Eduardo Olivares
Modified: 2023-08-03 15:46 UTC (History)

Fixed In Version: ovn-bgp-agent-0.4.1-17.1.20230629160950.c930d52.el9osttrunk
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
ifrangs: needinfo? (ltomasbo)


Links
System ID Private Priority Status Summary Last Updated
Launchpad.net 2025057 0 None None None 2023-06-26 12:32:15 UTC
OpenStack gerrit 886617 0 None MERGED Ensure agent is protected again wrong/missing bridge mappings 2023-06-29 07:33:34 UTC
Red Hat Issue Tracker OSP-26045 0 None None None 2023-06-26 11:16:49 UTC

Description Eduardo Olivares 2023-06-26 11:13:14 UTC
Description of problem:
The bridge-mappings are removed from a controller node where the ovn-bgp-agent is exposing several IPs (e.g. FIPs, when DVR is disabled):
[root@ctrl-2-0 ~]# ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-mappings=\"\" 

The FIPs from this controller are moved to another controller because OVN moves the cr-lrp ports to a chassis where bridge-mappings are correctly configured.

The ovn-bgp-agent running on the second controller detects the FIPs have been exposed on this new node and successfully exposes them from there.
However, the ovn-bgp-agent running on the first controller cannot withdraw those FIPs because it cannot access its bridge-mappings:
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent [-] Unexpected exception while running the sync: list index out of range: IndexError: list index out of range
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent Traceback (most recent call last):
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/agent.py", line 53, in sync
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     self.agent_driver.sync()
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     return f(*args, **kwargs)
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent   File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py", line 173, in sync
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent     bridge = bridge_mapping.split(":")[1]
2023-06-26T10:53:42.772236691+00:00 stdout F 2023-06-26 10:53:42.770 670456 ERROR ovn_bgp_agent.agent IndexError: list index out of range


Due to this, the route to any of those FIPs is exposed via BGP from two different nodes, causing a conflict and connectivity issues.
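
The IndexError in the traceback above can be reproduced in isolation. The sketch below assumes the agent ends up applying the expression from ovn_bgp_driver.py line 173 to an empty mapping entry, which is what an empty ovn-bridge-mappings value would yield. The helper names naive_bridge and parse_bridge_mappings are hypothetical and are not taken from the ovn-bgp-agent code; the tolerant parser is only one possible way to guard against missing or malformed mappings, not necessarily what the merged fix does.

```python
def naive_bridge(bridge_mapping):
    """Same expression as the failing line shown in the traceback."""
    return bridge_mapping.split(":")[1]


def parse_bridge_mappings(raw):
    """Hypothetical tolerant parser for an ovn-bridge-mappings string.

    Returns a {network: bridge} dict and silently skips entries that are
    empty or that lack either side of the "network:bridge" pair.
    """
    mappings = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            # An empty value produces a single empty entry; ignore it.
            continue
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            # Malformed entry such as "provider1" or ":br-ex"; ignore it.
            continue
        mappings[network] = bridge
    return mappings


if __name__ == "__main__":
    # "".split(":") yields [""], so index 1 does not exist.
    try:
        naive_bridge("")
    except IndexError as exc:
        print("naive parsing failed:", exc)

    print(parse_bridge_mappings(""))
    print(parse_bridge_mappings("provider1:br-ex,provider2:br-vlan"))
```

With a parser of this kind, an empty mapping results in an empty dict, so a sync loop could carry on to the withdrawal logic instead of aborting.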

The issue is resolved when the bridge-mappings are correctly configured on the first controller again:
[root@ctrl-2-0 ~]# ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-mappings=\"provider1:br-ex,provider2:br-vlan\"


Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230621.n.1

How reproducible:
100%

Steps to Reproduce:
1. remove the ovn-bridge-mappings configured on an overcloud node (see commands above)

