Description of problem:
Note: This is likely an issue with python-os-vif but wanted to get an initial review from the neutron team.

If ovs_hybrid_plug=false in the binding:vif_details, the port's MTU is not considered when plugging an instance's port.

Version-Release number of selected component (if applicable):
openstack-neutron-9.4.1-12.el7ost.noarch
python-os-vif-1.2.1-3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Typical OSP10 deployment with global_physnet_mtu=9000
2. Create neutron network and deploy instance.
3. Verify MTU is set correctly for instance with ovs_hybrid_plug=true for the instance's port (default). Inspect tap interface and instance's MTU.
4. Delete instance
5. On the compute node, comment out the firewall driver in /etc/neutron/plugins/ml2/openvswitch_agent.ini
   [securitygroup]
   #firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
6. Reboot the compute node (or delete br-int and restart neutron-openvswitch agent), to get br-int back to default config
7. Deploy instance again
8. Verify MTU is incorrect on tap interface and large frames are not allowed.

Actual results:
Broken MTU for instance

Expected results:
Correct MTU set for instance's tap interface.

Additional info:
If br-int happens to get set to a larger MTU (by having OVSHybridIptablesFirewallDriver enabled at some point then disabling it without rebooting or recreating br-int) the instance's MTU will be set correctly. So the MTU issue can be intermittent.
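For anyone repeating the checks in steps 3 and 8, a minimal sketch of how the MTU comparison could be scripted on the compute node. It assumes a Linux host where each interface exposes its MTU at /sys/class/net/<dev>/mtu. The helper names (read_mtu, mtu_mismatches) and the sysfs_root parameter are made up for illustration and are not part of neutron, nova, or os-vif.

```python
from pathlib import Path


def read_mtu(devname, sysfs_root="/sys/class/net"):
    """Return the MTU of a network device, or None if the device is absent.

    sysfs_root is a hypothetical parameter added so the helper can be
    pointed at a fake directory tree when testing.
    """
    mtu_file = Path(sysfs_root) / devname / "mtu"
    try:
        return int(mtu_file.read_text().strip())
    except FileNotFoundError:
        return None


def mtu_mismatches(devnames, expected_mtu, sysfs_root="/sys/class/net"):
    """Map each device whose MTU differs from expected_mtu to its actual MTU.

    A missing device is reported with a value of None.
    """
    mismatches = {}
    for dev in devnames:
        actual = read_mtu(dev, sysfs_root)
        if actual != expected_mtu:
            mismatches[dev] = actual
    return mismatches
```

Calling mtu_mismatches(["br-int", "<tap device name>"], 9000) on the compute node would return an empty dict when both interfaces carry the expected MTU, and list the offending interfaces otherwise.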
My understanding is that os-vif before Ocata ignored the MTU as specified by Neutron. The patch to fix it is: https://review.openstack.org/#/c/370667/ Note that it changed the payload format for Networks, so I am not sure if it's safe to backport it as-is. I will leave this exercise to the Compute team.
Hm, never mind, I now see that the patch I identified was backported in a recent os-vif OSP release:

+* Mon Jul 31 2017 Sahid Orentino Ferdjaoui <sahid.ferdjaoui> 1.2.1-2
+- introduces MTU support for vhost-user (rhbz#1447081)
+- vif_plug_ovs: Always set MTU when plugging devices
+- remove use of contextlib and with nested (rhbz#1447081)
+- add support for vhost-user reconnect (rhbz#1447081)
+- Add MTU to Network model and use it in plugging (rhbz#1447081)
+- Adds Windows support for OvsPlugin (rhbz#1447081)
+- os-vif: add new port profiles to enable fast path vhostuser (rhbz#1471657)
+- os-vif: add vif_name to VIFVHostUser class (rhbz#1471657)
+

There is something else going on here then.
I definitely see in one of the compute logs attached to the customer case that Nova is aware of the correct MTU to use for a hybrid=off ovs port:

2018-03-07 03:27:43.827 892486 DEBUG nova.network.base_api [req-ab51dd4f-1b8a-4138-88ae-18131fa7ca69 - - - - -] [instance: 0effecd4-0825-48a4-adb9-7796d7034e49] Updating instance_info_cache with network_info: [{"profile": {}, "ovs_interfaceid": "8a965b4e-84aa-4f24-8187-76e8e22b7534", "preserve_on_delete": false, "network": {"bridge": "br-int", "subnets": [{"ips": [{"meta": {}, "version": 4, "type": "fixed", "floating_ips": [], "address": "192.168.200.6"}], "version": 4, "meta": {"dhcp_server": "192.168.200.2"}, "dns": [], "routes": [], "cidr": "192.168.200.0/24", "gateway": {"meta": {}, "version": 4, "type": "gateway", "address": "192.168.200.1"}}], "meta": {"injected": false, "tenant_id": "46de3b4a139b4a35abb2c6cd4ea65ceb", "mtu": 8500}, "id": "8655ec01-ac0e-4655-9d69-958b1b07072a", "label": "Test_Network_DELETE"}, "devname": "tap8a965b4e-84", "vnic_type": "normal", "qbh_params": null, "meta": {}, "details": {"port_filter": true, "ovs_hybrid_plug": false}, "address": "fa:16:3e:30:76:b6", "active": true, "type": "ovs", "id": "8a965b4e-84aa-4f24-8187-76e8e22b7534", "qbg_params": null}] update_instance_cache_with_nw_info /usr/lib/python2.7/site-packages/nova/network/base_api.py:43

So I believe it's a nova / os-vif issue that the MTU is not being set on the tap devices.
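To make the relevant fields of that log entry easier to spot, here is a rough sketch that pulls the MTU and the ovs_hybrid_plug flag out of a network_info JSON list. The sample string is a trimmed copy of the entry logged above; the function name summarize_vifs is hypothetical and not a nova API.

```python
import json

# Trimmed copy of the network_info entry from the compute log above.
SAMPLE_NETWORK_INFO = """
[{"network": {"bridge": "br-int",
              "meta": {"injected": false, "mtu": 8500},
              "label": "Test_Network_DELETE"},
  "devname": "tap8a965b4e-84",
  "details": {"port_filter": true, "ovs_hybrid_plug": false},
  "type": "ovs"}]
"""


def summarize_vifs(network_info_json):
    """Return one summary dict per VIF found in a network_info JSON string."""
    summaries = []
    for vif in json.loads(network_info_json):
        network = vif.get("network", {})
        summaries.append({
            "devname": vif.get("devname"),
            "type": vif.get("type"),
            "bridge": network.get("bridge"),
            # The MTU Nova received from Neutron lives under network.meta.
            "mtu": network.get("meta", {}).get("mtu"),
            "ovs_hybrid_plug": vif.get("details", {}).get("ovs_hybrid_plug"),
        })
    return summaries
```

For the sample, this reports devname tap8a965b4e-84 on br-int with mtu 8500 and ovs_hybrid_plug false, which is the combination this bug is about.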
Thanks for looking and providing feedback, Ihar. What I see is that for ovs_hybrid_plug=true, os-vif sets the MTU for the qvo & qvb interfaces. This sets the Linux bridge to the correct MTU, which is carried over to the instance's tap interface when it is attached. When ovs_hybrid_plug=false, the tap interface is just attached to br-int without an MTU being set. It inherits the MTU of br-int, whatever that may be. It would seem that nova should set the MTU in the guest XML definition.
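The behaviour described in this comment can be written down as a toy model, which also shows why the problem looks intermittent. This is only a sketch of the observations reported in this bug, not code from os-vif or nova, and the function names are invented for illustration.

```python
def observed_tap_mtu(ovs_hybrid_plug, network_mtu, br_int_mtu):
    """Model the tap MTU as observed in this bug report.

    Hybrid plug: the qvo/qvb interfaces get the network MTU, and the tap
    device ends up with it too. Non-hybrid plug: the tap device is attached
    to br-int directly and takes whatever MTU br-int currently has.
    """
    if ovs_hybrid_plug:
        return network_mtu
    return br_int_mtu


def is_affected(ovs_hybrid_plug, network_mtu, br_int_mtu):
    """Return True when the tap MTU would differ from the network MTU."""
    return observed_tap_mtu(ovs_hybrid_plug, network_mtu, br_int_mtu) != network_mtu
```

With a network MTU of 9000, a non-hybrid port is affected when br-int sits at 1500, but not when br-int was left at 9000 by an earlier hybrid configuration, which matches the "Additional info" in the description.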
This is broken in OSP 12 also.
I have another customer experiencing a similar issue. Both global_physnet_mtu and path_mtu are set to 9000 with advertise = true, but for some reason the tap devices are set to an MTU of 1500. I've added the case to the BZ. Logs are in a private comment.
The fix is still under review upstream. I will update the BZ with any progress.
Hi there, If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to -. Thanks, Alex
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2714