Bug 2139424

Summary: data plane downtime during the first flow installation.
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Mark Michelson <mmichels>
Component: ovn22.03
Assignee: OVN Team <ovnteam>
Status: CLOSED ERRATA
QA Contact: Jianlin Shi <jishi>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: FDP 22.D
CC: ctrautma, jiji
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-21 18:25:32 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mark Michelson 2022-11-02 13:32:51 UTC
This bug was initially created as a copy of Bug #2089416

I am copying this bug because: 
This copy is made for errata purposes. The original issue was reported against ovn-2021, but this is for ovn22.03 RHEL9.


Description of problem:
During our last OpenStack update from 16.1 to 16.2, we encountered a network data plane outage on instances at step 3.3 of the documentation [2]. It was detected by pinging multiple instances and lasted 1 or 2 minutes.
We found two OVN commits that seem relevant to this behaviour:

    https://github.com/ovn-org/ovn/commit/896adfd2d8b3369110e9618bd190d190105372a9

    https://github.com/ovn-org/ovn/commit/d53c599ed05ea3c708a045a9434875458effa21e

We hope these patches will soon be backported into RHOSP OVN to avoid this issue in future upgrades.

This outage had a big impact on some of our clients, especially those running Kubernetes clusters: nodes were failing and pods were rescheduled en masse, which also led to high CPU usage on the compute nodes.

[2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#proc_updating-ovn-controller-container_updating-overcloud

Comment 3 Jianlin Shi 2022-11-03 04:49:12 UTC
The test result is shown in https://bugzilla.redhat.com/show_bug.cgi?id=2139425#c3

Comment 5 errata-xmlrpc 2022-11-21 18:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn22.03 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8571