Bug 2078986

Summary: [OVN SCALE] Scalability issues due to arp responder logical flows
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Dumitru Ceara <dceara>
Component: OVN
Assignee: Ales Musil <amusil>
Status: CLOSED WONTFIX
QA Contact: Jianlin Shi <jishi>
Severity: high
Priority: high
Docs Contact:
Version: FDP 22.C
CC: amusil, ctrautma, dcbw, jiji, mmichels, surya
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-04 14:14:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2084668
Bug Blocks:
Attachments: density-light-120node NB DB

Description Dumitru Ceara 2022-04-26 16:09:41 UTC
Created attachment 1875125
density-light-120node NB DB

Description of problem:

In a large scale deployment, e.g., during a density-light OpenShift
scale test running a cluster of 120 nodes and 13K pods, northd spends
a large amount of time processing and generating logical flows that
are used to reply to ARP requests.

With the attached database, focusing on a single logical port that
corresponds to an OCP POD (13b39b78-node-density-20220329_node-density-8311):

    port 13b39b78-node-density-20220329_node-density-8311
        addresses: ["0a:58:0a:a8:00:4f 10.168.0.79"]

There are two types of ARP responder flows:

1. In the logical switch pipeline:

  table=18(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 10.168.0.79 && arp.op == 1 && inport == "13b39b78-node-density-20220329_node-density-8311"), action=(next;)
  table=18(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 10.168.0.79 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 0a:58:0a:a8:00:4f; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 0a:58:0a:a8:00:4f; arp.tpa = arp.spa; arp.spa = 10.168.0.79; outport = inport; flags.loopback = 1; output;)

The flows above can probably be skipped if all the VIF logical ports
on that logical switch are claimed by the same chassis.  In that case
ARP requests never leave br-int, so there is no point in optimizing the
packet path with an explicit ARP responder flow; we can just as easily
let the VIF that owns the IP answer the ARP itself.
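The skip condition above can be sketched as a simple predicate.  This is
not ovn-northd's actual code (northd is written in C); `port_chassis` is
a hypothetical mapping from VIF logical port name to the chassis that
claims it, information that in practice comes from the SB Port_Binding
table:

```python
def arp_responder_skippable(port_chassis):
    """Return True if the priority-50 ARP responder flows for a logical
    switch can be skipped, i.e. every claimed VIF port on the switch is
    bound to the same chassis, so ARP requests never leave br-int.

    port_chassis: dict mapping VIF logical port name -> chassis name
    (None for ports that are not yet claimed).
    """
    chassis = {c for c in port_chassis.values() if c is not None}
    return len(chassis) <= 1

# Example: all pods of the switch on one node vs. spread across two.
same_node = {"pod-a": "node-1", "pod-b": "node-1"}
spread = {"pod-a": "node-1", "pod-b": "node-2"}
print(arp_responder_skippable(same_node))  # True
print(arp_responder_skippable(spread))     # False
```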

2. In the logical router pipeline:

  table=15(lr_in_arp_resolve  ), priority=100  , match=(outport == "rtos-ip-10-0-177-133.us-west-2.compute.internal" && reg0 == 10.168.0.79), action=(eth.dst = 0a:58:0a:a8:00:4f; next;)

These flows can probably be skipped if the logical router is configured
to dynamically resolve unknown next-hops, i.e., if the logical router
is configured with NB.Logical_Router.options:dynamic_neigh_routers=true.

In ovn-kubernetes the ovn_cluster_router does *not* have
dynamic_neigh_routers=true set, but there should be no reason not to
enable it.
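A minimal way to try this on a running deployment (command sketch;
`ovn_cluster_router` is the router name used by ovn-kubernetes, adjust
for other setups):

```shell
# Enable dynamic neighbor resolution on the cluster router so that
# per-next-hop lr_in_arp_resolve flows are no longer generated.
ovn-nbctl set Logical_Router ovn_cluster_router options:dynamic_neigh_routers=true

# Verify the option took effect.
ovn-nbctl get Logical_Router ovn_cluster_router options:dynamic_neigh_routers
```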

All in all, measuring the impact of not generating these two types of
logical flows in ovn-northd when running with the attached database,
one ovn-northd event processing loop iteration is reduced by ~300ms
(from ~1500ms to ~1200ms, a ~20% reduction).

Comment 3 Dan Williams 2023-08-04 13:50:20 UTC
Upstream patchset for MAC binding aging: http://patchwork.ozlabs.org/project/ovn/list/?series=366554&state=%2A&archive=both