1995330 – ovn-kubernetes load-balancer operations are very expensive

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.9.0
Assignee:	Casey Callendrello
QA Contact:	Anurag saxena
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-08-18 20:21 UTC by Casey Callendrello
Modified:	2021-10-18 17:47 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-10-18 17:47:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift ovn-kubernetes pull 666	0	None	None	None	2021-08-18 20:22:29 UTC
Red Hat Product Errata	RHSA-2021:3759	0	None	None	None	2021-10-18 17:47:38 UTC

Description Casey Callendrello 2021-08-18 20:21:36 UTC

ovn-kubernetes creates a single load balancer that is shared between all services. It turns out that updating this shared LB for every service is very expensive.

So we should switch to one load balancer per service. There is an upstream fix for this, but we need to backport it.
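
To illustrate the difference between the two layouts, here is a toy model in Python. It is only a sketch under a stated assumption: that the cost of an update grows with the number of VIP entries in the load-balancer row being modified. That assumption, the service names, and the helper functions are hypothetical and are not taken from ovn-kubernetes or measured in this bug.

```python
# Toy model (assumption: update cost grows with the number of VIP entries
# in the load-balancer row that has to be rewritten).

# Hypothetical services: "namespace/name" -> list of backend endpoints.
services = {f"ns/svc-{i}": [f"10.128.0.{i}:8080"] for i in range(1, 101)}


def shared_lb(svcs):
    """Old layout: one load-balancer row holding the entries of every service."""
    return {"shared": dict(svcs)}


def per_service_lbs(svcs):
    """New layout: one load-balancer row per service."""
    return {name: {name: backends} for name, backends in svcs.items()}


def entries_in_touched_row(lbs, lb_name):
    """Number of entries in the row that an update to lb_name would touch."""
    return len(lbs[lb_name])


# Updating a single service touches a row with all 100 entries in the shared
# layout, but a row with a single entry in the per-service layout.
shared_cost = entries_in_touched_row(shared_lb(services), "shared")
per_service_cost = entries_in_touched_row(per_service_lbs(services), "ns/svc-7")
print(shared_cost, per_service_cost)
```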

Comment 2 zhaozhanqi 2021-08-23 03:30:59 UTC

Verified this bug on 4.9.0-0.nightly-2021-08-22-070405


1. Create a new project z1

2. Create test pods and a service

$ oc get pod -n z1 -o wide
NAME            READY   STATUS    RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
test-rc-7s6rp   1/1     Running   0          5m57s   10.131.0.49   ip-10-0-142-138.us-east-2.compute.internal   <none>           <none>
test-rc-vhdc2   1/1     Running   0          5m57s   10.129.2.8    ip-10-0-198-103.us-east-2.compute.internal   <none>           <none>

$ oc get svc
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
test-service   ClusterIP   172.30.61.246   <none>        27017/TCP   6m17s

3. Check the load balancer created for the service


sh-4.4# ovn-nbctl list load-balancer fecd8a73-6bb8-459f-b54b-f7e2dec8ba1e
_uuid               : fecd8a73-6bb8-459f-b54b-f7e2dec8ba1e
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="z1/test-service"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_z1/test-service_TCP_cluster"
options             : {event="false", reject="true", skip_snat="false"}
protocol            : tcp
selection_fields    : []
vips                : {"172.30.61.246:27017"="10.129.2.8:8080,10.131.0.49:8080"}
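
The manual check above (the VIP matches the service cluster IP and port, and the backends match the pod IPs) could be scripted roughly as follows. This is a minimal sketch, assuming the `vips` column is printed with quoted keys and values exactly as shown in the output above; `parse_vips` and `check_service_lb` are hypothetical helper names, not part of any OVN tooling, and the target port 8080 is read off the `vips` output rather than from the service definition.

```python
import re


def parse_vips(raw):
    """Parse a map printed as {"VIP:PORT"="IP:PORT,IP:PORT", ...} into a dict.

    Assumes every key and value is double-quoted, as in the output above.
    """
    pairs = re.findall(r'"([^"]*)"="([^"]*)"', raw)
    return {vip: backends.split(",") for vip, backends in pairs}


def check_service_lb(raw_vips, cluster_ip, port, pod_ips, target_port):
    """Return True if the VIP exists and its backends are exactly the given pods."""
    vips = parse_vips(raw_vips)
    expected_vip = f"{cluster_ip}:{port}"
    expected_backends = sorted(f"{ip}:{target_port}" for ip in pod_ips)
    return sorted(vips.get(expected_vip, [])) == expected_backends


# Values copied from the oc and ovn-nbctl output in this comment.
raw = '{"172.30.61.246:27017"="10.129.2.8:8080,10.131.0.49:8080"}'
ok = check_service_lb(
    raw,
    cluster_ip="172.30.61.246",
    port=27017,
    pod_ips=["10.131.0.49", "10.129.2.8"],
    target_port=8080,
)
print(ok)
```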

Comment 5 errata-xmlrpc 2021-10-18 17:47:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
