Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1579417

Summary: [Netvirt][NAT] Connectivity from non-FIP instances to an external IP fails when using VLAN setup
Product: Red Hat OpenStack
Reporter: Itzik Brown <itbrown>
Component: opendaylight
Assignee: Sridhar Gaddam <sgaddam>
Status: CLOSED NEXTRELEASE
QA Contact: Itzik Brown <itbrown>
Severity: urgent
Docs Contact:
Priority: high
Version: 13.0 (Queens)
CC: aadam, asuryana, itbrown, jamsmith, mkolesni, nyechiel, oblaut, sgaddam, takito, trozet
Target Milestone: rc
Keywords: Triaged
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Whiteboard: odl_netvirt,odl_nat
Fixed In Version: opendaylight-8.0.0-11.el7ost
Doc Type: Known Issue
Doc Text:
SNAT support requires configuring VXLAN tunnels regardless of the encapsulation used in the tenant networks. It is also necessary to configure the MTU correctly when using VLAN tenant networks, since the VXLAN tunnel header is added to the payload and this could cause the packet to exceed the default MTU (1500 bytes). The VXLAN tunnels have to be properly configured in order for the SNAT traffic to flow through them.
When using VLAN tenant networks, use one of the following methods to configure the MTU so that SNAT traffic can flow through the VXLAN tunnels:
* Configure VLAN-based tenant networks to use an MTU of 1450 on a per-network basis.
* Set the NeutronGlobalPhysnetMtu heat parameter to 1450. Note: this means all flat/VLAN provider networks will have a 1450 MTU, which may not be desirable (especially for external provider networks).
* Configure the tenant network underlay with an MTU of 1550 (or higher). This includes setting the MTU in the NIC templates for the tenant network NIC.
Story Points: ---
Clone Of:
: 1585449
Environment: N/A
Last Closed: 2018-06-01 20:20:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1528948, 1585449
Attachments:
Description Flags
Logs capture none

Description Itzik Brown 2018-05-17 15:21:10 UTC
Description of problem:
When an instance is launched on a node other than the one that holds the NAPT switch, it has no connectivity to an external IP.
The setup is a bare-metal deployment with VLAN: one controller and 2 compute nodes.

Version-Release number of selected component (if applicable):
opendaylight-8.0.0-10.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. Create an external and an internal network.
2. Create a router and attach the external and internal networks.
3. Note the compute node that the NAPT switch is on.
4. Launch an instance on a compute node other than the one in step 3.
5. Check connectivity to an external IP.

Actual results:


Expected results:


Additional info:

Comment 1 Itzik Brown 2018-05-17 15:33:50 UTC
Created attachment 1438016
Logs capture

Comment 5 Ariel Adam 2018-05-28 08:41:12 UTC
Moved to POST given the solution was merged upstream

Comment 13 Sridhar Gaddam 2018-05-31 05:57:07 UTC
On further analysis:
--------------------

Currently, SNAT traffic always goes through overlay tunnels by design, even when the tenant networking is VLAN.

This is a design oversight in Netvirt; the solution is not simple and would have to be taken up as an RFE.

To get SNAT to work on VLAN setup we need to:
1. Make sure the tunnels are properly set up, so that SNAT traffic can flow through them.
2. Make sure the VMs get an MTU of 1450; otherwise the overlay encapsulation makes the packet bigger, and it is likely to be dropped because it exceeds the default MTU of 1500.

Given this information it seems we should:
a. Close the existing bug as VERIFIED with release notes detailing this situation.
b. Open a bug/RFE to fix this design mess in a future release.

In the setup that @Itzik was using, the issue was item 1 above. After the tunnels were properly set up, SNAT traffic flowed fine.
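
For illustration only, the MTU reasoning in item 2 can be sketched as simple arithmetic. The 50-byte figure below is the commonly cited VXLAN-over-IPv4 overhead and is an assumption here; the bug report itself only gives the 1450, 1500 and 1550 values. The names VXLAN_OVERHEAD, encapsulated_size and fits_underlay are hypothetical helpers, not part of Netvirt or Neutron.

```python
# Assumed VXLAN-over-IPv4 overhead in bytes (not stated in this bug report).
VXLAN_OVERHEAD = {
    "outer_ipv4": 20,
    "outer_udp": 8,
    "vxlan_header": 8,
    "inner_ethernet": 14,
}


def encapsulated_size(inner_packet_size: int) -> int:
    """Size of the outer IP packet after VXLAN encapsulation."""
    return inner_packet_size + sum(VXLAN_OVERHEAD.values())


def fits_underlay(tenant_mtu: int, underlay_mtu: int) -> bool:
    """True if a full-size tenant packet still fits the underlay MTU."""
    return encapsulated_size(tenant_mtu) <= underlay_mtu


# (tenant MTU, underlay MTU) combinations discussed in this bug.
scenarios = {
    "default MTU everywhere": (1500, 1500),
    "tenant MTU lowered to 1450": (1450, 1500),
    "underlay MTU raised to 1550": (1500, 1550),
}

for name, (tenant_mtu, underlay_mtu) in scenarios.items():
    verdict = "fits" if fits_underlay(tenant_mtu, underlay_mtu) else "too big"
    print(f"{name}: {encapsulated_size(tenant_mtu)} bytes vs {underlay_mtu} -> {verdict}")
```

Under that assumption, only the first combination produces SNAT packets larger than the underlay MTU, which matches the drop described in item 2.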

Comment 20 Sridhar Gaddam 2018-06-03 05:26:38 UTC
The following RHBZ has been opened to track the RFE:
https://bugzilla.redhat.com/show_bug.cgi?id=1585316

Comment 21 Ariel Adam 2018-06-03 06:28:10 UTC
This is not a bug fix; it is a design change that should be implemented in Fluorine or a following release.
Since OSP13 ODL is using Oxygen (prior to Fluorine), we can't get this feature.

Comment 22 Aswin Suryanarayanan 2018-06-11 13:11:58 UTC
Yes, this is more a feature than a bug and needs design considerations.