1835386 – [RFE] Remove join logical switch requirement for GW router

Bug 1835386 - [RFE] Remove join logical switch requirement for GW router

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux Fast Datapath
Classification:	Red Hat
Component:	OVN
Sub Component:
Version:	RHEL 8.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	OVN Team
QA Contact:	Jianlin Shi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-05-13 17:39 UTC by Tim Rozet
Modified:	2021-05-04 20:48 UTC
CC List:	3 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-05-04 20:48:48 UTC
Target Upstream Version:
Embargoed:

Description Tim Rozet 2020-05-13 17:39:20 UTC

Description of problem:
Today we are required to use a join logical switch between a gateway router and a distributed router to connect the two entities:

gw router =====join switch====distributed router

At scale this ends up creating tons of join switches just so we can connect the gw router to the distributed router, which are the only 2 entities on this switch. There is no networking reason why this switch should exist, and these routers ideally should be directly connected.

Connecting the routers directly would make the OVN logical networking topology less confusing and would remove many unnecessary join switch configurations.

Comment 2 Mark Michelson 2020-05-14 12:42:09 UTC

Re-asking in a public comment what I left in the previous private comment:

What is missing from OVN to allow for the join switch to be removed? What does the join switch accomplish that Logical_Router_Static_Route or Logical_Router_Policy can't?

Comment 3 Dan Williams 2020-05-22 01:04:41 UTC

AFAIK the "join" switch is used to connect the single distributed cluster router (to which all node switches are connected) to each node's gateway router. Unless we can have DR -> GR direct connections, that means we need a join switch in the middle... But Tim has thoughts here I'm sure.

Comment 4 Mark Michelson 2020-05-22 21:00:31 UTC

I guess I'm just missing why something would be preventing DR -> GR direct connections right now. AFAIK, the join switch basically does nothing special other than being a connector between the distributed router and each chassis' gateway router. It doesn't have any load balancers, ACLs, or anything else that would manipulate traffic.

If I understand requirements correctly, you can think of it as the following:

* N->S traffic that enters the gateway router from the "external" switch is load balanced (or DNATted) and then sent to the join switch. Ideally, you would want the traffic to be sent directly from the gateway router to the distributed router instead.
* S->N traffic that goes from a pod to the distributed router is sent to the join switch and then to the local gateway router to be SNATted and sent to the "external" switch. Ideally, you would want the traffic to skip the join switch and go directly to the gateway router.

The question is, have I missed some other types of traffic that the join switch handles? Have I missed any functionality that the join switch has besides connecting the two routers?

I'm curious if something like the following would work:


For S->N traffic:
* On the distributed router, add a static route between it and each gateway router. Static routes can route based on source IP address. Therefore, you could set these static routes to route based on the node switch's subnet. This would allow S->N traffic to egress out the local gateway router. For the dual-stack case, you'd need to set up one route for IPv4 and one route for IPv6.
* If you have a case where you need S->N traffic to egress out a different node's gateway router, you have a couple of options:
    1) If the choice of gateway router is based on destination IP address, then set up a static route with destination-based routing. The destination-based route will supersede the source-based route.
    2) If the choice of gateway is based on something more complicated, then set up a Logical_Router_Policy. This allows for ACL-style match statements to be used to make a determination of where to send the packet.
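
A minimal sketch of the S->N idea above, assuming a direct DR-to-GR link already exists. It only builds ovn-nbctl command lines and does not run them. The router name, subnets, next-hop addresses, priority, and match expression are hypothetical placeholders; the commands used are ovn-nbctl's lr-route-add (with --policy=src-ip for source-based routing) and lr-policy-add (with the reroute action).

```python
import shlex


def src_route_cmd(router, node_subnet, gr_ip):
    """Source-based static route: packets sourced from node_subnet go to gr_ip."""
    return ["ovn-nbctl", "--policy=src-ip", "lr-route-add",
            router, node_subnet, gr_ip]


def reroute_policy_cmd(router, priority, match, gr_ip):
    """Logical_Router_Policy that reroutes matching packets to gr_ip."""
    return ["ovn-nbctl", "lr-policy-add",
            router, str(priority), match, "reroute", gr_ip]


def build_s2n_commands(router, nodes):
    """One source-based route per (subnet, next hop) pair.

    For dual-stack, each node lists one IPv4 pair and one IPv6 pair.
    """
    cmds = []
    for pairs in nodes.values():
        for node_subnet, gr_ip in pairs:
            cmds.append(src_route_cmd(router, node_subnet, gr_ip))
    return cmds


# Hypothetical names and addresses, for illustration only.
example_nodes = {
    "node1": [("10.244.0.0/24", "100.64.0.2"),
              ("fd00:10:244::/64", "fd98::2")],
}

if __name__ == "__main__":
    for cmd in build_s2n_commands("dr0", example_nodes):
        print(shlex.join(cmd))
    # Option 2: a policy for a more specific case (placeholder match).
    print(shlex.join(reroute_policy_cmd(
        "dr0", 100,
        "ip4.src == 10.244.0.0/24 && ip4.dst == 203.0.113.0/24",
        "100.64.0.6")))
```

Building argument lists instead of shell strings avoids quoting problems with the match expression; shlex.join is used only for display.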

For N->S traffic:
I'll be honest, I'm not sure if you actually need anything here other than connecting the gateway router to the distributed router. I *think* OVN will set up the flows so that packets destined to any of the distributed router's subnets should be routed to the distributed router. But if I'm wrong, then you could set up static routes for each subnet, or set up a single Logical_Router_Policy to match all subnets (perhaps using an address set).
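
A rough sketch of the N->S side, assuming the two routers are joined by a pair of peered logical router ports and that an explicit route turns out to be needed on the gateway router. It only builds ovn-nbctl command lines. All router names, port names, MAC addresses, and subnets are hypothetical; the commands used are ovn-nbctl's lrp-add (with the peer= option) and lr-route-add.

```python
def direct_link_cmds(dr, gr, dr_port, gr_port,
                     dr_mac, gr_mac, dr_cidr, gr_cidr):
    """Two router ports that name each other as peer, with no switch between."""
    return [
        ["ovn-nbctl", "lrp-add", dr, dr_port, dr_mac, dr_cidr,
         f"peer={gr_port}"],
        ["ovn-nbctl", "lrp-add", gr, gr_port, gr_mac, gr_cidr,
         f"peer={dr_port}"],
    ]


def n2s_route_cmd(gr, cluster_subnet, dr_ip):
    """Destination-based route on the GR: cluster subnet via the DR's port IP."""
    return ["ovn-nbctl", "lr-route-add", gr, cluster_subnet, dr_ip]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmds = direct_link_cmds(
        "dr0", "gr-node1", "dr0-to-gr-node1", "gr-node1-to-dr0",
        "0a:58:64:40:00:01", "0a:58:64:40:00:02",
        "100.64.0.1/30", "100.64.0.2/30")
    cmds.append(n2s_route_cmd("gr-node1", "10.244.0.0/16", "100.64.0.1"))
    for cmd in cmds:
        print(" ".join(cmd))
```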


Is there anything I'm missing?

Comment 5 Tim Rozet 2020-05-27 13:57:01 UTC

Hi Mark,
There is no reason to have the join switch, other than:
1) because you don't want to make a bunch of small subnets for p2p connections between each GR and DR
2) because the OVN documentation tells you to use a join switch:
Gateway routers are typically used in between distributed logical
routers and physical networks. The distributed logical router and the
logical switches behind it, to which VMs and containers attach,
effectively reside on each hypervisor. The distributed router and the
gateway router are connected by another logical switch, sometimes
referred to as a join logical switch. On the other side, the gateway
router connects to another logical switch that has a localnet port
connecting to the physical network.
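
To give a sense of what point 1 involves, here is a small sketch that carves one point-to-point subnet per GR out of a single pool using Python's ipaddress module. The pool, the /30 prefix length, and the node names are assumptions chosen for illustration, not values taken from ovn-kubernetes.

```python
import ipaddress


def allocate_p2p_links(pool, node_names, prefixlen=30):
    """Assign one small subnet per node for its DR <-> GR link.

    With a /30, index 1 and 2 of each subnet are the two usable addresses;
    the DR side gets the first and the GR side gets the second.
    """
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=prefixlen)
    links = {}
    for name, subnet in zip(node_names, subnets):
        links[name] = {
            "subnet": str(subnet),
            "dr": f"{subnet[1]}/{prefixlen}",
            "gr": f"{subnet[2]}/{prefixlen}",
        }
    if len(links) < len(node_names):
        raise ValueError("pool too small for the requested number of links")
    return links


if __name__ == "__main__":
    # Hypothetical pool and node names, for illustration only.
    names = [f"node{i}" for i in range(1, 501)]
    links = allocate_p2p_links("100.64.0.0/16", names)
    print(links["node1"])
    print(links["node500"])
```

With these assumptions, 500 nodes consume 500 /30 subnets (2000 addresses) from the pool, and each one adds a port on the DR.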

A third concern from the community is that using a bunch of subnets and more ports on the DR will somehow increase the number of OpenFlow (OF) flows. It is an unknown at this point, but imho we shouldn't be adding networking workarounds to satisfy OVN OF usage...it should be the other way around. If we do find that removing the logical switch increases the number of flows by some exponential amount, we should fix that as well.

In ovn-k8s we were using a single join switch and connecting all nodes to it. So you would have like 500 GRs connected to a single join switch then connected to a DR. This causes too large of a broadcast domain and "ARP table explosion", so the solution was to use 1 join switch per node. So now if we have 500 nodes, we have 500 useless join switches in ovn-k8s architecture...

Comment 6 Dan Williams 2021-05-03 13:23:02 UTC

Note that ovn-kubernetes was able to move back to a single join switch when some of the ARP flow explosions were fixed. But I think there's still a (somewhat less critical now) case to be made for elimination of the single join switch anyway, since we don't really need/care about ARP there...

Comment 7 Dan Williams 2021-05-04 20:48:48 UTC

OCP needs the join switch for egress firewall so this is no longer relevant.
