Bug 2053026 - [ovn] Stale ports in OVN database
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: z9
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Jakub Libosvar
QA Contact: Maor
URL:
Whiteboard:
Duplicates: 2053585 2102636
Depends On: 2122791
Blocks:
 
Reported: 2022-02-10 13:38 UTC by Daniel Alvarez Sanchez
Modified: 2022-12-07 20:26 UTC

Fixed In Version: python-networking-ovn-7.3.1-1.20220212033901.4e24f4c.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2122791
Environment:
Last Closed: 2022-12-07 20:25:48 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1960006 | 0 | None | None | None | 2022-02-10 13:38:14 UTC
OpenStack gerrit | 827834 | 0 | None | MERGED | [ovn] Prevent stale ports in the OVN database | 2023-08-28 18:13:09 UTC
Red Hat Issue Tracker | OSP-12608 | 0 | None | None | None | 2022-02-10 13:50:19 UTC
Red Hat Product Errata | RHBA-2022:8795 | 0 | None | None | None | 2022-12-07 20:26:06 UTC

Description Daniel Alvarez Sanchez 2022-02-10 13:38:15 UTC
This BZ is to track the backport of: https://review.opendev.org/c/openstack/neutron/+/827834


Under heavy control plane activity, OVN ports can become stale and never get cleaned up (unless the neutron-ovn-db-sync tool is run manually).

A possible scenario for this is:

a) Port creation
  a.1) Port created in Neutron DB
  a.2) Port created in OVN Northbound (NB) database
  a.3) NB ovsdb-server notifies all connected workers of the port creation
  a.4) Each worker eventually processes this event and updates its in-memory copy of the NB database

Immediately afterwards, the port is deleted via the API, but step a.4) has not yet been completed by all workers. The port deletion API request then lands on one of the workers that has not yet updated its in-memory copy of the OVN NB database with the newly created port.

b) Port deletion
  b.1) Port deleted from Neutron DB
  b.2) Deletion of the port from OVN NB is attempted, but the lookup fails and the port's revision number is deleted [0]

At this point, the port will remain stale in the OVN database forever. This causes other issues that we have mitigated (e.g. [1]), but ultimately the number of OVN resources may grow to a point where it severely degrades overall cluster stability and performance.
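
The race in steps a) and b) above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the networking-ovn code: the Worker and Deployment classes, their attributes, and the port name are all hypothetical, and the real system uses OVSDB events rather than plain Python lists. The model only mirrors what the description says: each worker has its own in-memory copy of the NB database, and the deletion path drops the revision number even when the lookup fails.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical API worker with its own in-memory copy of the OVN NB DB."""
    name: str
    nb_copy: set = field(default_factory=set)
    pending: list = field(default_factory=list)  # NB events not yet processed

    def process_events(self):
        """Apply queued port-creation events to the in-memory copy."""
        self.nb_copy.update(self.pending)
        self.pending.clear()


class Deployment:
    """Toy model of the Neutron DB, the revision numbers and the OVN NB DB."""

    def __init__(self, worker_names):
        self.neutron_db = set()
        self.revisions = {}
        self.ovn_nb = set()
        self.workers = {name: Worker(name) for name in worker_names}

    def create_port(self, port_id):
        self.neutron_db.add(port_id)          # port created in Neutron DB
        self.revisions[port_id] = 1
        self.ovn_nb.add(port_id)              # port created in OVN NB
        for worker in self.workers.values():  # NB server notifies the workers
            worker.pending.append(port_id)

    def delete_port(self, port_id, via):
        worker = self.workers[via]
        self.neutron_db.discard(port_id)      # port deleted from Neutron DB
        if port_id in worker.nb_copy:         # lookup in this worker's copy
            self.ovn_nb.discard(port_id)
        self.revisions.pop(port_id, None)     # revision number dropped either way

    def stale_ports(self):
        """Ports still present in OVN NB but gone from the Neutron DB."""
        return self.ovn_nb - self.neutron_db


def run(process_before_delete):
    dep = Deployment(["w1", "w2"])
    dep.create_port("port-1")
    if process_before_delete:
        for worker in dep.workers.values():
            worker.process_events()
    dep.delete_port("port-1", via="w2")
    return dep


racy = run(process_before_delete=False)
print(racy.stale_ports())  # {'port-1'}
```

When every worker processes the creation event before the delete request arrives (run(process_before_delete=True)), the lookup succeeds and no stale port is left behind. In the racy case the revision number is gone as well, so in this model nothing remains that would flag the port for later cleanup.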

A potential workaround is to run the neutron-ovn-db-sync tool periodically to get rid of the stale ports, but running it while the API is operational is not recommended.
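
Conceptually, finding the stale ports comes down to comparing the ports known to Neutron with the ports present in OVN NB. The following is a minimal sketch of that comparison only. It does not reproduce how neutron-ovn-db-sync works internally; the function name and the sample port names are hypothetical, and the two input lists are assumed to have been collected by some other means.

```python
def find_stale_ovn_ports(neutron_port_ids, ovn_port_names):
    """Return the OVN port names that have no matching Neutron port, sorted."""
    return sorted(set(ovn_port_names) - set(neutron_port_ids))


# Hypothetical sample data: port-c exists only on the OVN side.
neutron_ports = ["port-a", "port-b"]
ovn_ports = ["port-a", "port-b", "port-c"]
print(find_stale_ovn_ports(neutron_ports, ovn_ports))  # ['port-c']
```

Such a comparison is only reliable when both lists are taken while no ports are being created or deleted, which is consistent with the caveat about running the sync tool while the API is operational.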

[0] https://github.com/openstack/neutron/blob/f5030b0bc25216d80b09f7ac3938c9a902b655e3/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L698
[1] https://bugs.launchpad.net/neutron/+bug/1874733

Comment 2 Jakub Libosvar 2022-02-16 14:16:39 UTC
*** Bug 2053585 has been marked as a duplicate of this bug. ***

Comment 4 Jakub Libosvar 2022-06-30 14:05:07 UTC
*** Bug 2102636 has been marked as a duplicate of this bug. ***

Comment 19 errata-xmlrpc 2022-12-07 20:25:48 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795

