Bug 1553839 - if ovs_hybrid_plug=false for a VM instance neutron port, the MTU is not always set correctly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: z9
Target Release: 10.0 (Newton)
Assignee: Sahid Ferdjaoui
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks: 1486344 1547074 1702331 1702564
Reported: 2018-03-09 17:01 UTC by Matt Flusche
Modified: 2023-03-21 18:45 UTC (History)
CC List: 29 users

Fixed In Version: openstack-nova-14.1.0-25.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the MTU of TAP devices was not configured. As a result, the network could be configured with a different MTU than a guest TAP device. With this update, you can configure libvirt when you create the TAP device for the guest. Nova passes the correct parameter to libvirt, and the TAP device now has the same configuration as the network.
Clone Of:
Environment:
Last Closed: 2018-09-17 16:50:53 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 553072 0 None MERGED add mtu to libvirt xml for ethernet and bridge types 2021-01-06 18:44:46 UTC
Red Hat Issue Tracker OSP-4958 0 None None None 2021-12-10 15:52:08 UTC
Red Hat Product Errata RHSA-2018:2714 0 None None None 2018-09-17 16:52:15 UTC

Description Matt Flusche 2018-03-09 17:01:08 UTC
Description of problem:

Note: This is likely an issue with python-os-vif, but I wanted to get an initial review from the neutron team.

If ovs_hybrid_plug=false in the port's binding:vif_details, the port's MTU is not considered when plugging an instance's port.


Version-Release number of selected component (if applicable):
openstack-neutron-9.4.1-12.el7ost.noarch
python-os-vif-1.2.1-3.el7ost.noarch


How reproducible:
100%

Steps to Reproduce:
1. Typical OSP10 deployment with global_physnet_mtu=9000
2. Create a neutron network and deploy an instance.
3. Verify that the MTU is set correctly with ovs_hybrid_plug=true for the instance's port (default). Inspect the MTU of the tap interface and of the instance.
4. Delete the instance.
5. On the compute node, comment out the firewall driver in /etc/neutron/plugins/ml2/openvswitch_agent.ini 
[securitygroup]
#firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

6. Reboot the compute node (or delete br-int and restart the neutron-openvswitch agent) to return br-int to its default config.
7. Deploy the instance again.
8. Verify that the MTU is incorrect on the tap interface and that large frames are not allowed.
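
The MTU check in steps 3 and 8 can be sketched as a small helper run on the compute node. This is a minimal illustration, assuming Linux sysfs exposes the value at /sys/class/net/<dev>/mtu; the function names and the sys_root parameter are hypothetical and not part of any OpenStack tool.

```python
from pathlib import Path


def read_mtu(devname, sys_root="/sys/class/net"):
    """Return the MTU of a network device as reported by Linux sysfs."""
    # sys_root is a hypothetical parameter so the helper can be pointed
    # at a fake directory tree when testing.
    return int(Path(sys_root, devname, "mtu").read_text().strip())


def mtu_matches(devname, expected_mtu, sys_root="/sys/class/net"):
    """Return True if the device's MTU equals the MTU expected for the network."""
    return read_mtu(devname, sys_root) == expected_mtu
```

For example, mtu_matches("tapXXXXXXXX-XX", 9000) would be expected to return False on an affected compute node once the hybrid firewall driver is disabled, where the placeholder stands for the instance's tap device name.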

Actual results:
Broken MTU for instance

Expected results:
Correct MTU set for instance's tap interface.

Additional info:
If br-int happens to be set to a larger MTU (because OVSHybridIptablesFirewallDriver was enabled at some point and then disabled without rebooting or recreating br-int), the instance's MTU will be set correctly. So the MTU issue can be intermittent.

Comment 1 Ihar Hrachyshka 2018-03-12 18:09:37 UTC
My understanding is that the issue is that os-vif before Ocata ignored MTU as specified by Neutron. The patch to fix it is: https://review.openstack.org/#/c/370667/ Note that it changed payload format for Networks so I am not sure if it's safe to backport it as-is. I will leave this exercise to Compute team.

Comment 2 Ihar Hrachyshka 2018-03-12 18:12:44 UTC
Hm, never mind. I now see that the patch I identified was backported in a recent os-vif OSP release:

+* Mon Jul 31 2017 Sahid Orentino Ferdjaoui <sahid.ferdjaoui> 1.2.1-2
+- introduces MTU support for vhost-user (rhbz#1447081)
+- vif_plug_ovs: Always set MTU when plugging devices
+- remove use of contextlib and with nested (rhbz#1447081)
+- add support for vhost-user reconnect (rhbz#1447081)
+- Add MTU to Network model and use it in plugging (rhbz#1447081)
+- Adds Windows support for OvsPlugin (rhbz#1447081)
+- os-vif: add new port profiles to enable fast path vhostuser (rhbz#1471657)
+- os-vif: add vif_name to VIFVHostUser class (rhbz#1471657)
+

There is something else going on here then.

Comment 3 Ihar Hrachyshka 2018-03-12 18:48:42 UTC
I definitely see in one of compute logs attached to the customer case that Nova is aware of the correct MTU to use for a hybrid=off ovs port:

2018-03-07 03:27:43.827 892486 DEBUG nova.network.base_api [req-ab51dd4f-1b8a-4138-88ae-18131fa7ca69 - - - - -] [instance: 0effecd4-0825-48a4-adb9-7796d7034e49] Updating instance_info_cache with network_info: [{"profile": {}, "ovs_interfaceid": "8a965b4e-84aa-4f24-8187-76e8e22b7534", "preserve_on_delete": false, "network": {"bridge": "br-int", "subnets": [{"ips": [{"meta": {}, "version": 4, "type": "fixed", "floating_ips": [], "address": "192.168.200.6"}], "version": 4, "meta": {"dhcp_server": "192.168.200.2"}, "dns": [], "routes": [], "cidr": "192.168.200.0/24", "gateway": {"meta": {}, "version": 4, "type": "gateway", "address": "192.168.200.1"}}], "meta": {"injected": false, "tenant_id": "46de3b4a139b4a35abb2c6cd4ea65ceb", "mtu": 8500}, "id": "8655ec01-ac0e-4655-9d69-958b1b07072a", "label": "Test_Network_DELETE"}, "devname": "tap8a965b4e-84", "vnic_type": "normal", "qbh_params": null, "meta": {}, "details": {"port_filter": true, "ovs_hybrid_plug": false}, "address": "fa:16:3e:30:76:b6", "active": true, "type": "ovs", "id": "8a965b4e-84aa-4f24-8187-76e8e22b7534", "qbg_params": null}] update_instance_cache_with_nw_info /usr/lib/python2.7/site-packages/nova/network/base_api.py:43

So I believe this is a nova / os-vif issue: the MTU is not being set on the tap devices.
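
To show where the MTU sits in that payload, here is a minimal sketch that extracts the relevant fields from a network_info list. The sample is a trimmed copy of the log entry above, keeping only the fields used; the helper name summarize_vifs is hypothetical.

```python
import json

# Trimmed copy of the network_info entry from the log above
# (only the fields this sketch reads are kept).
NETWORK_INFO = json.loads("""
[{"devname": "tap8a965b4e-84",
  "type": "ovs",
  "network": {"bridge": "br-int", "meta": {"mtu": 8500}},
  "details": {"port_filter": true, "ovs_hybrid_plug": false}}]
""")


def summarize_vifs(network_info):
    """Yield (devname, bridge, ovs_hybrid_plug, mtu) for each VIF entry."""
    for vif in network_info:
        network = vif.get("network", {})
        yield (
            vif.get("devname"),
            network.get("bridge"),
            vif.get("details", {}).get("ovs_hybrid_plug"),
            network.get("meta", {}).get("mtu"),  # None if no MTU was supplied
        )


for row in summarize_vifs(NETWORK_INFO):
    print(row)
```

The MTU is carried under network.meta.mtu, so Nova has the value (8500 here) even though the port is not hybrid-plugged.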

Comment 4 Matt Flusche 2018-03-12 22:11:28 UTC
Thanks for looking and providing feedback Ihar.

What I see is that for ovs_hybrid_plug=true, os-vif sets the MTU for the qvo & qvb interfaces.  This sets the Linux bridge to the correct MTU, which is carried over to the instance's tap interface when it is attached.

When ovs_hybrid_plug=false, the tap interface is just attached to br-int without setting an MTU.  It inherits the MTU of br-int, whatever that may be.

It would seem that nova should set the MTU in the guest XML definition.
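
As a rough sketch of that idea, the snippet below builds a guest interface definition carrying an MTU. It assumes the libvirt domain XML `<mtu size='...'/>` child of `<interface>`; the function name and the exact set of child elements are illustrative and are not Nova's actual code.

```python
import xml.etree.ElementTree as ET


def build_interface_xml(devname, mac, mtu=None):
    """Build a libvirt <interface type='ethernet'> element as an XML string.

    Hypothetical helper for illustration: the <mtu> child is emitted only
    when an MTU is known, so libvirt can apply it to the tap device.
    """
    iface = ET.Element("interface", type="ethernet")
    ET.SubElement(iface, "mac", address=mac)
    ET.SubElement(iface, "model", type="virtio")
    ET.SubElement(iface, "target", dev=devname)
    if mtu:
        ET.SubElement(iface, "mtu", size=str(mtu))
    return ET.tostring(iface, encoding="unicode")


# Values taken from the network_info log entry in comment 3.
print(build_interface_xml("tap8a965b4e-84", "fa:16:3e:30:76:b6", 8500))
```

With the MTU in the guest definition, the tap device would no longer depend on whatever MTU br-int happens to have.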

Comment 5 Matt Flusche 2018-03-13 15:44:45 UTC
This is broken in OSP 12 also.

Comment 7 Siggy Sigwald 2018-05-24 19:42:36 UTC
I have another customer experiencing a similar issue. Both global_phys_network and path_mtu are set to 9000 with advertise = true, but for some reason the tap devices are set to an MTU of 1500. I've added the case to the BZ. Logs are in a private comment.

Comment 9 Sahid Ferdjaoui 2018-05-25 08:34:31 UTC
The fix is still under review upstream. I will update the BZ with any progress.

Comment 17 Alex McLeod 2018-09-03 07:58:11 UTC
Hi there,

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field.

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Thanks,
Alex

Comment 20 errata-xmlrpc 2018-09-17 16:50:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2714

