Bug 1378052 - Unable to reach instance via floating IP due to MTU mismatch in virtual environment with network isolation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 10.0 (Newton)
Assignee: Assaf Muller
QA Contact: Toni Freger
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-21 12:15 UTC by Marius Cornea
Modified: 2017-12-27 09:23 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.2.el7ost, openstack-neutron-9.0.0-0.20160907193737.dc6508a.1.el7ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 16:03:05 UTC



Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2016:2948 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 10 enhancement update | 2016-12-14 19:55:27 UTC
OpenStack gerrit | 368233 | None | None | None | 2016-09-22 14:52:40 UTC
OpenStack gerrit | 368553 | None | None | None | 2016-09-22 14:21:03 UTC
Red Hat Bugzilla | 1376336 | None | CLOSED | Tracker: OVS agent is not removing VLAN tags before tunnels when configured with native OF interface | 2019-06-13 07:35:19 UTC

Internal Links: 1376336

Description Marius Cornea 2016-09-21 12:15:14 UTC
Description of problem:
An instance cannot be reached via its floating IP because of an MTU mismatch in a virtual environment with network isolation.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy the overcloud.
2. Create a tenant network, a router, and an external network.
3. Launch an instance on the tenant network and attach a floating IP to it.
4. Try to reach the instance via the floating IP.

Actual results:
Unable to ssh:
[stack@undercloud-0 ~]$ ssh fedora@172.16.18.146
Connection closed by 172.16.18.146

Expected results:
Successful SSH connection.  

Additional info:
Here are the created network details:
http://paste.openstack.org/show/582392/

The MTU for the external network is 1500 and the one for the tenant network is 1450.

This is a default virt environment and the deployment uses network isolation. To work around this issue, I had to pass the following parameter when deploying:

NeutronGlobalPhysnetMtu: 1496

which resulted in the external network having an MTU of 1500 and the tenant network an MTU of 1446.

Previously this used to work by default, without any adjustments.

Comment 2 Marius Cornea 2016-09-21 12:17:01 UTC
(In reply to Marius Cornea from comment #0)
> NeutronGlobalPhysnetMtu: 1496
> 
> which resulted in the external network having a MTU of 1500 and the tenant
> network an mtu of 1446

Correction: the external network ends up with an MTU of 1496.
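
The tenant MTU values reported in this bug (1450 by default, 1446 with the workaround) are consistent with subtracting a fixed 50-byte VXLAN-over-IPv4 overhead from the physical network MTU. A minimal Python sketch of that arithmetic follows. The constant and the function name tenant_mtu are illustrative assumptions, not Neutron's actual API.

```python
# Assumed overhead for VXLAN over IPv4:
# 20 (outer IPv4) + 8 (UDP) + 8 (VXLAN) + 14 (inner Ethernet) = 50 bytes.
VXLAN_IPV4_OVERHEAD = 50


def tenant_mtu(physnet_mtu: int, overhead: int = VXLAN_IPV4_OVERHEAD) -> int:
    """Return the MTU left for a tenant network after tunnel overhead."""
    return physnet_mtu - overhead


if __name__ == "__main__":
    # 1500 is the default from the report; 1496 is the workaround value.
    for physnet_mtu in (1500, 1496):
        print(f"physnet MTU {physnet_mtu} -> tenant MTU {tenant_mtu(physnet_mtu)}")
```

With a global physnet MTU of 1500 this gives 1450, and with 1496 it gives 1446, matching the values observed in the report.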

Comment 4 Ihar Hrachyshka 2016-09-22 13:32:37 UTC
External 1500 to internal 1450 should still work, because fragmentation should happen at the L3 boundary (the router) if needed. The fact that a 4-byte reduction fixes the issue suggests that it may be the upstream bug https://bugs.launchpad.net/neutron/+bug/1622017, which was fixed in RC1: https://review.openstack.org/#/c/368553/

Could you please retest with a newer python-neutron package that includes the patch mentioned above? Alternatively, you can check whether switching of_interface to ovs-ofctl helps.
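
One plausible reading of why 4 bytes matter: an 802.1Q VLAN tag is 4 bytes long, and the linked tracker (Bugzilla 1376336) describes VLAN tags not being removed before tunnels. The Python sketch below checks whether an encapsulated packet fits in the underlay when such a tag is left on the inner frame. It assumes a 1500-byte underlay MTU and VXLAN over IPv4, neither of which is stated explicitly in this report, and all names are hypothetical.

```python
# Header sizes in bytes (standard values for VXLAN over IPv4).
INNER_ETHERNET = 14
VXLAN_HEADER = 8
UDP_HEADER = 8
OUTER_IPV4 = 20
VLAN_TAG = 4  # 802.1Q tag


def outer_packet_size(inner_ip_size: int, stray_vlan_tag: bool = False) -> int:
    """Size of the outer IP packet carrying an inner IP packet of the given size."""
    size = inner_ip_size + INNER_ETHERNET + VXLAN_HEADER + UDP_HEADER + OUTER_IPV4
    if stray_vlan_tag:
        # A tag that was not stripped before the tunnel grows the inner frame.
        size += VLAN_TAG
    return size


def fits(inner_ip_size: int, underlay_mtu: int = 1500, stray_vlan_tag: bool = False) -> bool:
    """True if the encapsulated packet fits in the underlay without fragmentation."""
    return outer_packet_size(inner_ip_size, stray_vlan_tag) <= underlay_mtu


if __name__ == "__main__":
    for inner, tag in ((1450, False), (1450, True), (1446, True)):
        print(inner, tag, outer_packet_size(inner, tag), fits(inner, stray_vlan_tag=tag))
```

Under these assumptions, a full-size 1450-byte tenant packet with a leftover tag becomes 1504 bytes on the wire and no longer fits, while a 1446-byte packet with the same tag is exactly 1500 bytes.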

Comment 5 Marius Cornea 2016-09-22 14:08:39 UTC
(In reply to Ihar Hrachyshka from comment #4)
> External 1500 to internal 1450 should still work, because on L3 boundary
> (router) fragmentation should happen if needed. The fact that 4 bytes
> reduction helps to fix the issue suggests that it may be the upstream
> https://bugs.launchpad.net/neutron/+bug/1622017 that was fixed in RC1:
> https://review.openstack.org/#/c/368553/
> 
> Could you please retest with a newer python-neutron package that would
> include the patch I mentioned above? Alternatively, you can validate if
> switching to ovs-ofctl of_interface helps.

After applying https://review.openstack.org/#/c/368553/ to the overcloud image I was able to successfully SSH to the instance with the default MTU:

http://paste.openstack.org/show/582585/

Comment 6 Ihar Hrachyshka 2016-09-22 14:10:23 UTC
Comment 5 also suggests that the workaround of using an alternative of_interface driver would help too. It's up to us how we proceed. I suggest a backport for neutron.

Comment 7 Assaf Muller 2016-09-22 14:21:03 UTC
Already fixed and merged; it will be available in the OSP 10 puddle based on RC1.

Comment 9 Assaf Muller 2016-09-22 15:44:00 UTC
We ended up backporting the fix, so it's available in an RPM based on M3. The cherry-pick will no longer be required once the RPM is rebased onto RC1.

Comment 10 Alexander Chuzhoy 2016-09-22 23:11:45 UTC
Verified:
Environment:
openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.2.el7ost.noarch
openstack-neutron-9.0.0-0.20160907193737.dc6508a.1.el7ost.noarch

Was able to ssh into the launched instance.

Comment 11 Alexander Chuzhoy 2016-09-22 23:22:12 UTC
The MTU of the created VXLAN tenant network is: 1446

Comment 14 errata-xmlrpc 2016-12-14 16:03:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

