
Bug 2077357

Summary:	[release-4.11] 200ms packet delay with OVN controller turn on
Product:	OpenShift Container Platform	Reporter:	Aaron Park <aapark>
Component:	Networking	Assignee:	mcambria <mcambria>
Networking sub component:	ovn-kubernetes	QA Contact:	Anurag saxena <anusaxen>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	urgent
Priority:	urgent	CC:	anusaxen, dcbw, ealcaniz, eglottma, jseunghw, mcambria, openshift-bugs-escalate, xmu, zzhao
Version:	4.9
Target Milestone:	---
Target Release:	4.11.0
Hardware:	Unspecified
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:		Story Points:	---
Clone Of:
Clones:	2079044		Environment:
Last Closed:	2022-08-10 11:08:03 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	2079044

Description Aaron Park 2022-04-21 07:10:10 UTC

Description of problem:

Call failures and packet drops occurred on OCP 4.8.

After upgrading the cluster to 4.9:
  - Packet drops attributed to upcalls are no longer observed.
  - Call failures occurred under all conditions until 4/8, but have not occurred since 4/11.
  - No difference between 4/8 and 4/11 has been confirmed.

Result with the OVN controller paused:
  - The 200ms delay does not occur.
  - Upcalls do not occur, but out-of-order packets still occur.

The above result was confirmed in the following two cases:
1) Pod eth0 <---> OVS interface: out-of-order packets occur when OVS receives a packet sent by the pod.
2) geneve interface <---> geneve interface: out-of-order packets occur between the two geneve interfaces.
In both cases, the 200ms delay did not occur; after five or fewer duplicate ACKs (DUP_ACK), the reordering was resolved without retransmission.
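
For anyone repeating this check on a capture, the following is a minimal sketch (not the procedure used in this report) of how out-of-order segments and 200ms stalls could be flagged for one direction of a single TCP flow. The `Segment` type, both function names, and the sample values are hypothetical; the input is assumed to be already extracted from a packet capture. Sequence-number wraparound is ignored, and retransmissions are flagged the same way as reordered segments.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    ts: float    # capture timestamp in seconds
    seq: int     # TCP sequence number of the first payload byte
    length: int  # payload length in bytes


def find_out_of_order(segments: List[Segment]) -> List[int]:
    """Return indexes of segments whose sequence number is below the
    highest end-of-data already seen (reordered or retransmitted)."""
    flagged = []
    highest_end = None
    for i, seg in enumerate(segments):
        if highest_end is not None and seg.seq < highest_end:
            flagged.append(i)
        end = seg.seq + seg.length
        if highest_end is None or end > highest_end:
            highest_end = end
    return flagged


def find_stalls(segments: List[Segment],
                threshold: float = 0.200) -> List[Tuple[int, float]]:
    """Return (index, gap) pairs where the time since the previous
    segment is at least `threshold` seconds."""
    stalls = []
    for i in range(1, len(segments)):
        gap = segments[i].ts - segments[i - 1].ts
        if gap >= threshold:
            stalls.append((i, gap))
    return stalls


# Hypothetical sample: segment 2 arrives late, segment 3 follows a long gap.
sample = [
    Segment(0.000, 1000, 100),
    Segment(0.001, 1200, 100),
    Segment(0.002, 1100, 100),
    Segment(0.210, 1300, 100),
]
out_of_order = find_out_of_order(sample)
stalls = find_stalls(sample)
```

Comparing the two result lists for captures taken with the OVN controller paused and turned on would show whether stalls follow the out-of-order segments only in the latter case.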

Result with the OVN controller turned on:
  - The '200ms delay' caused by out-of-order packets occurs again.
  - Currently, call failures do not occur whether the OVN controller is turned on or paused, so it is not possible to determine whether the two are related.

Version-Release number of selected component (if applicable):

- OCP 4.9.23(ovn-kubernetes)
- baremetal 
- IPv6

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

When the reproduction test is performed, a '200ms delay' due to out-of-order packets occurs.


Expected results:

Find out why there are so many upcalls when the OVN controller is turned on. There should be no '200ms delay'.


Additional info:

Comment 9 errata-xmlrpc 2022-08-10 11:08:03 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069