Bug 1579417 - [Netvirt][NAT] Connectivity from non-FIP instances to an external IP fails when using VLAN setup
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: opendaylight
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: 13.0 (Queens)
Assignee: Sridhar Gaddam
QA Contact: Itzik Brown
Whiteboard: odl_netvirt,odl_nat
Depends On:
Blocks: 1528948 1585449
Reported: 2018-05-17 15:21 UTC by Itzik Brown
Modified: 2018-10-18 08:07 UTC (History)

Fixed In Version: opendaylight-8.0.0-11.el7ost
Doc Type: Known Issue
Doc Text:
SNAT support requires configuring VXLAN tunnels regardless of the encapsulation used in the tenant networks. The VXLAN tunnels have to be properly configured for SNAT traffic to flow through them. It is also necessary to configure the MTU correctly when using VLAN tenant networks, because the VXLAN tunnel header is added to the payload and can cause the packet to exceed the default MTU (1500 bytes).

When using VLAN tenant networks, use one of the following methods to configure the MTU so that SNAT traffic can flow through the VXLAN tunnels:
* Configure VLAN tenant networks to use an MTU of 1450 on a per-network basis.
* Set the NeutronGlobalPhysnetMtu heat parameter to 1450. Note: this means all flat/VLAN provider networks will have a 1450 MTU, which may not be desirable (especially for external provider networks).
* Configure the tenant network underlay with an MTU of 1550 (or higher). This includes setting the MTU in the NIC templates for the tenant network NIC.
Clone Of:
: 1585449 (view as bug list)
Last Closed: 2018-06-01 20:20:08 UTC
Target Upstream Version:

Attachments
Logs capture (5.36 MB, text/plain)
2018-05-17 15:33 UTC, Itzik Brown

Description Itzik Brown 2018-05-17 15:21:10 UTC
Description of problem:
When an instance is launched on a node other than the one that holds the NAPT switch, it has no connectivity to an external IP.
The setup is bare metal with VLAN tenant networking: one controller and two compute nodes.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create an external and an internal network.
2. Create a router and attach the external and internal networks.
3. Note the compute node that the NAPT switch is on.
4. Launch an instance on a compute node other than the one in step 3.
5. Check connectivity to an external IP.

Actual results:

Expected results:

Additional info:

Comment 1 Itzik Brown 2018-05-17 15:33:50 UTC
Created attachment 1438016
Logs capture

Comment 5 Ariel Adam 2018-05-28 08:41:12 UTC
Moved to POST given the solution was merged upstream

Comment 13 Sridhar Gaddam 2018-05-31 05:57:07 UTC
On further analysis:

Currently, SNAT traffic always goes through overlay tunnels by design, even when the tenant networking is VLAN.

This is a design oversight in Netvirt; the solution is not simple and would have to be taken up as an RFE.

To get SNAT to work on VLAN setup we need to:
1. Make sure the tunnels are properly set up, so that SNAT traffic can flow through them.
2. Make sure the VMs get MTU 1450; otherwise the overlay encapsulation makes the packet bigger and it is likely to be dropped because of the default MTU of 1500.
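
The 1450 figure follows from the size of the VXLAN encapsulation. The sketch below works through that arithmetic, assuming an IPv4 underlay with no outer VLAN tag or IP options (an IPv6 underlay would add more overhead). The function names are illustrative and are not part of Netvirt or Neutron.

```python
# Bytes that VXLAN adds to each tenant packet as seen by the underlay's IP MTU.
# Assumption: IPv4 underlay, no IP options, no extra tags.
OUTER_IPV4_HEADER = 20
OUTER_UDP_HEADER = 8
VXLAN_HEADER = 8
INNER_ETHERNET_HEADER = 14  # the tenant frame's own L2 header travels inside the tunnel

VXLAN_OVERHEAD = (
    OUTER_IPV4_HEADER + OUTER_UDP_HEADER + VXLAN_HEADER + INNER_ETHERNET_HEADER
)


def max_tenant_mtu(underlay_mtu: int) -> int:
    """Largest tenant MTU whose encapsulated packets still fit the underlay."""
    return underlay_mtu - VXLAN_OVERHEAD


def min_underlay_mtu(tenant_mtu: int) -> int:
    """Smallest underlay MTU that carries full-size tenant packets unfragmented."""
    return tenant_mtu + VXLAN_OVERHEAD


def fits_in_tunnel(tenant_mtu: int, underlay_mtu: int) -> bool:
    """True if a full-size tenant packet fits in the tunnel after encapsulation."""
    return tenant_mtu + VXLAN_OVERHEAD <= underlay_mtu


if __name__ == "__main__":
    print(VXLAN_OVERHEAD)              # 50
    print(max_tenant_mtu(1500))        # 1450
    print(min_underlay_mtu(1500))      # 1550
    print(fits_in_tunnel(1500, 1500))  # False
```

With a default 1500-byte underlay, a VM that keeps MTU 1500 produces encapsulated packets of up to 1550 bytes, which is why either the VM MTU has to drop to 1450 or the underlay MTU has to rise to at least 1550.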

Given this information it seems we should:
a. Close the existing bug as VERIFIED with release notes detailing this situation.
b. Open a bug/RFE to fix this design mess in a future release.

In the setup that @Itzik was using, the issue was item 1 above. Once the tunnels were properly set up, SNAT traffic flowed correctly.

Comment 20 Sridhar Gaddam 2018-06-03 05:26:38 UTC
The following RHBZ is opened to track the RFE.

Comment 21 Ariel Adam 2018-06-03 06:28:10 UTC
This is not a bug fix; it is a design change that should be implemented in Fluorine or a later release.
Since ODL in OSP13 uses Oxygen (which predates Fluorine), we can't get this feature.

Comment 22 Aswin Suryanarayanan 2018-06-11 13:11:58 UTC
Yes, this is more a feature than a bug and needs design considerations.
