Bug 2077357 - [release-4.11] 200ms packet delay with OVN controller turn on
Summary: [release-4.11] 200ms packet delay with OVN controller turn on
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Linux
Target Milestone: ---
Target Release: 4.11.0
Assignee: mcambria@redhat.com
QA Contact: Anurag Saxena
Depends On:
Blocks: 2079044
Reported: 2022-04-21 07:10 UTC by Aaron Park
Modified: 2022-08-17 01:35 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2079044 (view as bug list)
Last Closed: 2022-08-10 11:08:03 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 1052 | 0 | None | Merged | Bug 2077357: Bump OVN to ovn22.03-22.03.0-24 | 2022-07-03 02:45:39 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:08:24 UTC

Description Aaron Park 2022-04-21 07:10:10 UTC
Description of problem:

Call failures and packet drops occurred in OCP 4.8.

After upgrading the cluster to 4.9:
  - No packet drops attributable to upcalls were found.
  - Call failures occurred under all conditions until 4/8, but have not occurred since 4/11.
  - No difference between 4/8 and 4/11 has been identified.

Result with the OVN controller paused:
  - The 200ms delay does not occur.
  - Upcalls do not occur, but out-of-order packets still occur.

The above result was confirmed under the following circumstances:
1) Pod eth0 <---> OVS interface: out of order occurs when OVS receives packets sent by the pod.
2) geneve interface <---> geneve interface: out of order occurs between the two interfaces.
In both cases, the 200ms delay did not occur, and after five or fewer DUP_ACKs the reordering was resolved without retransmission.
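
To make the distinction above concrete (reordering that resolves quickly versus reordering that stalls for 200ms), here is a minimal sketch of how a capture could be checked. It is not the analysis used for this bug. The input format, a list of (arrival time in seconds, TCP sequence number) pairs for one flow in one direction, and the names find_reorders and Reorder are assumptions made for illustration. Sequence-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple

STALL_THRESHOLD_S = 0.200  # the '200ms delay' named in this report


@dataclass
class Reorder:
    seq: int          # sequence number of the segment that arrived late
    late_by_s: float  # time since the first higher-sequenced segment arrived
    stalled: bool     # True when late_by_s reaches STALL_THRESHOLD_S


def find_reorders(segments: List[Tuple[float, int]]) -> List[Reorder]:
    """Flag segments that arrive after a segment with a higher sequence number.

    `segments` is assumed to be in arrival order for a single flow and
    direction. A sequence number that was already seen is treated as a
    duplicate or retransmission and is not reported as a reorder.
    """
    seen: List[Tuple[float, int]] = []
    events: List[Reorder] = []
    for ts, seq in segments:
        already_seen = any(s == seq for _, s in seen)
        overtakers = [t for t, s in seen if s > seq]
        if overtakers and not already_seen:
            late_by = ts - overtakers[0]  # `seen` is in arrival order
            events.append(Reorder(seq, late_by, late_by >= STALL_THRESHOLD_S))
        seen.append((ts, seq))
    return events


# Made-up sample data: seq 2000 arrives 2ms late, seq 6000 arrives 204ms late.
sample = [
    (0.000, 1000), (0.001, 3000), (0.002, 4000), (0.003, 2000),
    (0.010, 5000), (0.011, 7000), (0.215, 6000),
]
for event in find_reorders(sample):
    print(event)
```

In this made-up sample the first event corresponds to the benign case described above and the second to the '200ms delay' case.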

Result with the OVN controller turned on:
  - The '200ms delay' caused by out-of-order packets occurs again.
  - Currently, call failures do not occur with the OVN controller either turned on or paused, so it is not possible to determine whether they are related.
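
The report does not record how the controller was paused and turned back on. One possible way is the debug/pause, debug/resume and debug/status commands of ovn-controller, sent with ovn-appctl. The sketch below only builds the command lines and does not run them. The namespace, the container name, the pod name and the helper controller_cmd are assumptions for illustration and may differ on a given cluster.

```python
import shlex
from typing import List

# Assumptions for illustration; verify against the actual cluster.
NAMESPACE = "openshift-ovn-kubernetes"
CONTAINER = "ovn-controller"

ACTIONS = ("pause", "resume", "status")


def controller_cmd(pod: str, action: str) -> List[str]:
    """Build an `oc exec` command that sends debug/<action> to ovn-controller."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action {action!r}; expected one of {ACTIONS}")
    return [
        "oc", "-n", NAMESPACE, "exec", pod, "-c", CONTAINER, "--",
        "ovn-appctl", "-t", "ovn-controller", f"debug/{action}",
    ]


# Hypothetical pod name; print the commands instead of executing them.
for action in ACTIONS:
    print(shlex.join(controller_cmd("ovnkube-node-example", action)))
```

Printing rather than executing keeps the sketch safe to run anywhere; pausing the controller on a live node stops it from processing changes until it is resumed.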

Version-Release number of selected component (if applicable):

- OCP 4.9.23 (ovn-kubernetes)
- baremetal
- IPv6

How reproducible:

Steps to Reproduce:

Actual results:

When the reproduction test is performed, a '200ms delay' caused by out-of-order packets occurs.

Expected results:

Determine why there are so many upcalls when the OVN controller is turned on. There should be no '200ms delay'.

Additional info:

Comment 9 errata-xmlrpc 2022-08-10 11:08:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

