Bug 1965074 - [OVN Kubernetes] ovnkube errors observed on 100 node clusters during uperf testing Fatal error: ofport of patch-br-ex_ip-<node_ip>.us-east-2.compute.internal-to-br-int has changed from [] to 2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Mohamed Mahmoud
QA Contact: Kedar Kulkarni
URL:
Whiteboard: perfscale-ovn
Depends On:
Blocks:
Reported: 2021-05-26 18:21 UTC by Kedar Kulkarni
Modified: 2021-10-18 17:32 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:31:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github ovn-org ovn-kubernetes pull 2245 0 None closed Bug 1965074: return an error for empty openflow patch and/or phy ports. 2021-06-24 11:25:02 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:32:04 UTC

Description Kedar Kulkarni 2021-05-26 18:21:55 UTC
Description of problem:
While running uperf tests on 100-node clusters with OVN-Kubernetes networking, using service IP based testing, the errors shown in the Additional info section below are observed.

Version-Release number of selected component (if applicable):
4.8.fc.5

How reproducible:
Almost 100%

Steps to Reproduce:
1. Deploy an OCP cluster and scale it to 100 nodes.
2. Run the uperf service IP workloads from https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/network-perf/run_serviceip_network_test_fromgit.sh
3. Observe the ovnkube-node logs (for the nodes where uperf pods are running) and the uperf client logs.
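
For step 3, a small filter can help pick the failure out of long ovnkube-node logs. The following is a minimal sketch in Go, not part of the original report: the regular expression is derived only from the single fatal line quoted in this bug, and the names `fatalOfportRe`, `OfportChange`, and `parseOfportChange` are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

// fatalOfportRe matches the message shape seen in this report:
//   Fatal error: ofport of <port> has changed from <old> to <new>
// Assumption: other releases may word the message differently.
var fatalOfportRe = regexp.MustCompile(
	`Fatal error: ofport of (\S+) has changed from (\S+) to (\S+)`)

// OfportChange holds the fields extracted from one matching log line.
type OfportChange struct {
	Port, Old, New string
}

// parseOfportChange returns the parsed fields and true if the line matches.
func parseOfportChange(line string) (OfportChange, bool) {
	m := fatalOfportRe.FindStringSubmatch(line)
	if m == nil {
		return OfportChange{}, false
	}
	return OfportChange{Port: m[1], Old: m[2], New: m[3]}, true
}

func main() {
	sc := bufio.NewScanner(os.Stdin)
	// CNI result lines can be very long, so raise the scanner's line limit.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		if c, ok := parseOfportChange(sc.Text()); ok {
			fmt.Printf("port=%s old=%s new=%s\n", c.Port, c.Old, c.New)
		}
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
}
```

The program reads log text on standard input, so the saved ovnkube-node log of an affected node can be piped into it.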

Actual results:
Errors are seen with OVN, as shown below.

Expected results:
No errors with OVN, uperf tests succeed.

Additional info:
Uperf Logs Snippet:
 
05-25 12:58:15.193  Error connecting to 172.30.165.228
05-25 12:58:15.193  
05-25 12:58:15.193  ** TCP: Cannot connect to 172.30.165.228:20000 Connection refused

OCP Logs snippet:


oc describe pods/ovnkube-node-52b4m
...
State:       Running
      Started:   Wed, 26 May 2021 15:53:00 +0000        
    Last State:  Terminated                             
      Reason:    Error   
      Message:   fq8r 8d1658d06b394fe0e4816a1df7af2b22fa68529afeaede2b339327c6d4b7cce4] ADD finished CNI request [openshift-network-diagnostics/network-check-target-8fq8r 8d1658d06b394fe0e4816a1df7af2b22fa68529afeaede2b339327c6d4b7cce4], result "{\"Result\":{\"interfaces\":[{\"name\":\"8d1658d06b394fe\",\"mac\":\"92:ee:f2:7b:d5:ef\"},{\"name\":\"eth0\",\"mac\":\"0a:58:0a:82:b2:04\",\"sandbox\":\"/var/run/netns/c341fa4b-ae89-45fa-8c40-e6e18867bd2d\"}],\"ips\":[{\"version\":\"4\",\"interface\":1,\"address\":\"10.130.178.4/23\",\"gateway\":\"10.130.178.1\"}],\"dns\":{}},\"PodIFInfo\":null}", err <nil>
I0526 15:52:58.128819    3095 cni.go:223] [openshift-multus/network-metrics-daemon-fb96j 46163220accd1fca4afc9c223c93bf8d278c4b422faea9f99e70278e5bfe094c] ADD finished CNI request [openshift-multus/network-metrics-daemon-fb96j 46163220accd1fca4afc9c223c93bf8d278c4b422faea9f99e70278e5bfe094c], result "{\"Result\":{\"interfaces\":[{\"name\":\"46163220accd1fc\",\"mac\":\"46:12:55:6d:36:34\"},{\"name\":\"eth0\",\"mac\":\"0a:58:0a:82:b2:03\",\"sandbox\":\"/var/run/netns/14a8cee6-3b6e-4fbf-83b9-05549ba8d607\"}],\"ips\":[{\"version\":\"4\",\"interface\":1,\"address\":\"10.130.178.3/23\",\"gateway\":\"10.130.178.1\"}],\"dns\":{}},\"PodIFInfo\":null}", err <nil>
I0526 15:52:58.161183    3095 cni.go:223] [openshift-dns/dns-default-4hczw 8b39cd35520bded638f7dfb7214ec993750afc6c0c1999491b80305340c014ca] ADD finished CNI request [openshift-dns/dns-default-4hczw 8b39cd35520bded638f7dfb7214ec993750afc6c0c1999491b80305340c014ca], result "{\"Result\":{\"interfaces\":[{\"name\":\"8b39cd35520bded\",\"mac\":\"8e:63:2f:de:9f:04\"},{\"name\":\"eth0\",\"mac\":\"0a:58:0a:82:b2:05\",\"sandbox\":\"/var/run/netns/3afb7df8-01ec-4157-aa8e-93c7623d3c08\"}],\"ips\":[{\"version\":\"4\",\"interface\":1,\"address\":\"10.130.178.5/23\",\"gateway\":\"10.130.178.1\"}],\"dns\":{}},\"PodIFInfo\":null}", err <nil>
E0526 15:52:59.723990    3095 healthcheck.go:188] Fatal error: ofport of patch-br-ex_ip-10-0-190-217.us-east-2.compute.internal-to-br-int has changed from [] to 2

Comment 4 Kedar Kulkarni 2021-08-09 18:16:04 UTC
Hi,

I ran the same tests described in the initial bz comment on version 4.9.0-0.nightly-2021-08-07-175228, and I didn't see any of the errors originally reported.

Based on that info, I am closing the bz as Verified.

Thanks,
KK.

Comment 7 errata-xmlrpc 2021-10-18 17:31:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

