Bug 1621919 - [Netvirt] ODL should take into account bridge/provider mappings when scheduling routing
Summary: [Netvirt] ODL should take into account bridge/provider mappings when scheduling routing
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: opendaylight
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: z4
Target Release: 13.0 (Queens)
Assignee: Aswin Suryanarayanan
QA Contact: Noam Manos
URL:
Whiteboard: Netvirt
Depends On:
Blocks:
 
Reported: 2018-08-23 21:34 UTC by Tim Rozet
Modified: 2019-01-16 17:57 UTC
CC List: 11 users

Fixed In Version: opendaylight-8.3.0-6.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-16 17:56:58 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
OpenDaylight gerrit 75571 0 None None None 2018-09-04 19:09:19 UTC
Red Hat Product Errata RHBA-2019:0093 0 None None None 2019-01-16 17:57:08 UTC

Description Tim Rozet 2018-08-23 21:34:51 UTC
ODL North/South routing works by dedicating a compute node to do SNAT/DNAT, while FIP routing is done locally on the compute node where the instance resides. The problem is that scheduling will pick any node, regardless of whether that node actually has external network access. The same is true for FIPs: ODL assumes that the node hosting the instance the FIP is associated with has external network access.

For example, take Computes A and B. Compute A has external access via an external network created for physnet datacentre, which maps to the br-ex bridge. Compute B has no br-ex bridge and no external network access.

In this example, when SNAT/DNAT is scheduled to a compute node, ODL should check whether that node is part of the external network being attached to by examining the provider mappings in hostconfig.

For FIP (static NAT), the same issue occurs. If an instance is scheduled on Compute B and a floating IP is associated with it, ODL will assume that Compute B has external network access. Instead, the flows should be installed so that the FIP is handled by a node that actually does have external access (Compute A in this example).
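
For illustration only, a minimal sketch of the check being asked for here; the class and method names are hypothetical, not actual NetVirt code:

import java.util.Map;

// Illustrative helper: a node is a valid SNAT/DNAT (or local FIP) candidate
// for an external network only if its hostconfig provider mappings contain
// that network's physical network (e.g. "datacentre" -> "br-ex" on Compute A).
public final class ExternalAccessCheck {

    public static boolean hasExternalAccess(Map<String, String> providerMappings,
                                            String physicalNetwork) {
        return providerMappings != null && providerMappings.containsKey(physicalNetwork);
    }

    public static void main(String[] args) {
        Map<String, String> computeA = Map.of("datacentre", "br-ex"); // has br-ex
        Map<String, String> computeB = Map.of();                      // no external bridge
        System.out.println(hasExternalAccess(computeA, "datacentre")); // true  -> candidate
        System.out.println(hasExternalAccess(computeB, "datacentre")); // false -> skip
    }
}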

Comment 1 Dan Sneddon 2018-08-23 22:03:47 UTC
A use case for this is a multi-site deployment with different external and provider network(s) at each edge site. Edge site A uses br-ex-a bridge, while edge site B uses br-ex-b bridge. When scheduling a compute node to do SNAT/DNAT for a particular network, ODL should check to see that the physical network used by the network exists on the compute node.

For instance, suppose deployment-wide bridge mappings are "physnet_a:br-ex-a;physnet_b:br-ex-b", and there are two networks: net_a has physical_network=physnet_a, and net_b has physical_network=physnet_b. When scheduling a SNAT/DNAT worker for net_a, ODL should include only compute nodes with bridge br-ex-a as candidates, and similarly only compute nodes with br-ex-b as candidates for net_b.
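
To make the candidate selection concrete, here is a small illustrative sketch (hypothetical names, not the real scheduler code) that parses the bridge mapping string above and filters candidates per network:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class CandidateFilter {

    // "physnet_a:br-ex-a;physnet_b:br-ex-b" -> {physnet_a=br-ex-a, physnet_b=br-ex-b}
    static Map<String, String> parseBridgeMappings(String mappings) {
        Map<String, String> parsed = new HashMap<>();
        for (String pair : mappings.split(";")) {
            String[] kv = pair.split(":", 2);
            if (kv.length == 2) {
                parsed.put(kv[0].trim(), kv[1].trim());
            }
        }
        return parsed;
    }

    // Keep only computes whose bridge mappings include the network's
    // physical_network, e.g. net_a (physnet_a) -> only computes with br-ex-a.
    static List<String> snatCandidatesFor(String physicalNetwork,
                                          Map<String, Map<String, String>> mappingsByCompute) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : mappingsByCompute.entrySet()) {
            if (entry.getValue().containsKey(physicalNetwork)) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }
}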

Comment 2 Aswin Suryanarayanan 2018-08-25 08:44:19 UTC
Currently ODL assumes all the computes have external network access. To support the topology mentioned above, we can consider the following changes.

1) For SNAT, the centralized switch is currently scheduled from a pool of available switches, and that pool contains all the switches. We can instead maintain separate pools based on the provider mappings. When selecting a NAPT switch, the provider mapping of the external network is used to select the pool, and a switch is then chosen from that pool, ensuring the selected switch has the expected bridges (see the sketch after (2)).

2) For FIP, before programming the flows on the local compute node, we can check whether that compute has the necessary bridge mappings. The data structure from (1), where nodes are divided into pools, can be reused for this. If the node has the appropriate bridge mappings, we continue to program the flows locally. If not, we can reuse the NAPT switch selected in (1) for the FIP translations. In that case no FIP flows are added to the local compute; all of them, including the ARP responder flows for the FIP, are added to the remote compute. The current pipeline is capable of carrying the packet from the local node (which would be a non-NAPT switch) to the NAPT switch to do the FIP forward and reverse translations, and the return packet will reach back to the local node.
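
A rough sketch of both ideas, assuming a per-provider-network pool keyed by DPN id; names are illustrative, not the actual NetVirt classes:

import java.math.BigInteger;
import java.util.Map;
import java.util.Set;

// (1) Keep a pool of candidate NAPT switches per provider network instead of
// one global pool. (2) Program FIP flows locally only when the local DPN is
// in that pool; otherwise fall back to the NAPT switch chosen for the router.
public final class ProviderAwareNaptSketch {

    // provider physical network -> DPN ids of switches with the matching bridge mapping
    private final Map<String, Set<BigInteger>> poolByProviderNet;

    ProviderAwareNaptSketch(Map<String, Set<BigInteger>> poolByProviderNet) {
        this.poolByProviderNet = poolByProviderNet;
    }

    // (1) Select the NAPT switch only from the pool matching the external
    // network's provider mapping; a real scheduler would also load-balance.
    BigInteger selectNaptSwitch(String physicalNetwork) {
        return poolByProviderNet.getOrDefault(physicalNetwork, Set.of())
                .stream().findFirst().orElse(null);
    }

    // (2) Decide where the FIP (and ARP responder) flows are programmed:
    // locally if the local DPN has the bridge mapping, otherwise on the NAPT switch.
    BigInteger fipFlowDpn(BigInteger localDpn, String physicalNetwork, BigInteger naptSwitchDpn) {
        Set<BigInteger> pool = poolByProviderNet.getOrDefault(physicalNetwork, Set.of());
        return pool.contains(localDpn) ? localDpn : naptSwitchDpn;
    }
}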

Comment 3 Tim Rozet 2018-08-27 16:07:26 UTC
This sounds good to me Aswin.

Comment 4 Mike Kolesnik 2018-08-28 13:33:23 UTC
(In reply to Aswin Suryanarayanan from comment #2)
> 
> 2) For FIP, before programming the flows on the local compute node, we can
> check whether that compute has the necessary bridge mappings. The data
> structure from (1), where nodes are divided into pools, can be reused for
> this. If the node has the appropriate bridge mappings, we continue to
> program the flows locally. If not, we can reuse the NAPT switch selected
> in (1) for the FIP translations. In that case no FIP flows are added to
> the local compute; all of them, including the ARP responder flows for the
> FIP, are added to the remote compute. The current pipeline is capable of
> carrying the packet from the local node (which would be a non-NAPT switch)
> to the NAPT switch to do the FIP forward and reverse translations, and the
> return packet will reach back to the local node.

In this case you'll also need to consider that these FIP flows need to move should the NAPT switch migrate for any reason or that node go down (which I'm not sure we have a good way to check). Also, what happens if the FIP flow gets left over on the old node? How will that affect the network?

To me this sounds like a different bug which should be tracked on its own (even though the root cause is the same).

Comment 5 Aswin Suryanarayanan 2018-08-29 08:17:49 UTC
(In reply to Mike Kolesnik from comment #4)
> (In reply to Aswin Suryanarayanan from comment #2)
> > 
> > 2) For FIP, before programming the flows on the local compute node, we
> > can check whether that compute has the necessary bridge mappings. The
> > data structure from (1), where nodes are divided into pools, can be
> > reused for this. If the node has the appropriate bridge mappings, we
> > continue to program the flows locally. If not, we can reuse the NAPT
> > switch selected in (1) for the FIP translations. In that case no FIP
> > flows are added to the local compute; all of them, including the ARP
> > responder flows for the FIP, are added to the remote compute. The
> > current pipeline is capable of carrying the packet from the local node
> > (which would be a non-NAPT switch) to the NAPT switch to do the FIP
> > forward and reverse translations, and the return packet will reach back
> > to the local node.
> 
> In this case you'll also need to consider that these FIP flows need to move
> should the NAPT switch migrate for any reason or that node go down (which
> I'm not sure we have a good way to check). Also, what happens if the FIP
> flow gets left over on the old node? How will that affect the network?
> 
> To me this sounds like a different bug which should be tracked on its own
> (even though the root cause is the same).

With this bug we can ensure that the flows are configured on a node which has external connectivity. As you mentioned, the fail-over of the FIP flows on the NAPT switch can be tracked as a new bug.

Comment 7 Aswin Suryanarayanan 2018-09-21 10:47:23 UTC
The patch is not yet merged upstream.

Comment 23 errata-xmlrpc 2019-01-16 17:56:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0093

