Bug 1451342 - configure guest MTU based on underlying network
Summary: configure guest MTU based on underlying network
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.2.5
Assignee: Dominik Holler
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On: 1412234 1452756 1590327
Blocks: 1510336
 
Reported: 2017-05-16 12:45 UTC by Dominik Holler
Modified: 2018-09-17 13:43 UTC (History)
CC: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Adds the ability to manage the MTU of VM networks in a centralized way, extending oVirt's existing ability to manage the MTU of host networks. Reason: This enables the use of big MTUs ("jumbo frames") for OVN networks, which improves their network throughput. Result: The MTU of the network is propagated all the way down to the guest in the VM.
Clone Of:
Environment:
Last Closed: 2018-07-31 15:25:08 UTC
oVirt Team: Network
rule-engine: ovirt-4.2+
ylavi: exception+
mburman: testing_plan_complete?
mburman: testing_ack+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 76887 0 'None' MERGED Define MTU explicitly 2021-02-05 15:00:25 UTC
oVirt gerrit 90247 0 'None' MERGED webadmin: Remove property MTUOverrideSupported 2021-02-05 15:00:25 UTC
oVirt gerrit 90248 0 'None' MERGED utils: Move getMtuActualValue to utils 2021-02-05 15:00:24 UTC
oVirt gerrit 90249 0 'None' MERGED core: Propagate MTU to VMs 2021-02-05 15:00:25 UTC
oVirt gerrit 90268 0 'None' MERGED webadmin: Allow MTU for external networks 2021-02-05 15:00:24 UTC
oVirt gerrit 90324 0 'None' MERGED backend, frontend: Hide magic value for default MTU 2021-02-05 15:00:25 UTC
oVirt gerrit 90325 0 'None' MERGED backend: Allow MTU for external networks 2021-02-05 15:00:26 UTC
oVirt gerrit 90327 0 'None' MERGED backend, packing: Add default MTU for tunnelled networks 2021-02-05 15:00:26 UTC
oVirt gerrit 90347 0 'None' MERGED core: Update machine type to rhel 7.4 version 2021-02-05 15:00:26 UTC
oVirt gerrit 90352 0 'None' MERGED Support MTU on OpenStack Networking API 2021-02-05 15:00:26 UTC
oVirt gerrit 92533 0 'None' MERGED Add note about update of external networks 2021-02-05 15:00:26 UTC
oVirt gerrit 92535 0 'None' MERGED core: Update external network MTU during sync 2021-02-05 15:00:26 UTC
oVirt gerrit 92546 0 'None' MERGED core: Apply MTU on autodefined network 2021-02-05 15:00:27 UTC
oVirt gerrit 92560 0 'None' MERGED core: Do not set MTU for passthrough 2021-02-05 15:00:27 UTC
oVirt gerrit 92610 0 'None' MERGED webadmin: Remove property MTUOverrideSupported 2021-02-05 15:00:27 UTC
oVirt gerrit 92611 0 'None' MERGED backend, frontend: Hide magic value for default MTU 2021-02-05 15:00:27 UTC
oVirt gerrit 92612 0 'None' MERGED utils: Move getMtuActualValue to utils 2021-02-05 15:00:27 UTC
oVirt gerrit 92613 0 'None' MERGED backend, packing: Add default MTU for tunnelled networks 2021-02-05 15:00:27 UTC
oVirt gerrit 92614 0 'None' MERGED core: Propagate MTU to VMs 2021-02-05 15:00:27 UTC
oVirt gerrit 92615 0 'None' MERGED backend: Allow MTU for external networks 2021-02-05 15:00:27 UTC
oVirt gerrit 92616 0 'None' MERGED webadmin: Allow MTU for external networks 2021-02-05 15:00:28 UTC
oVirt gerrit 92618 0 'None' MERGED Support MTU on OpenStack Networking API 2021-02-05 15:00:28 UTC
oVirt gerrit 92619 0 'None' MERGED core: Update external network MTU during sync 2021-02-05 15:00:28 UTC
oVirt gerrit 92620 0 'None' MERGED core: Apply MTU on autodefined network 2021-02-05 15:00:28 UTC
oVirt gerrit 92621 0 'None' MERGED core: Do not set MTU for passthrough 2021-02-05 15:00:28 UTC
oVirt gerrit 94393 0 'None' MERGED Add note about update of external networks 2021-02-05 15:00:29 UTC

Description Dominik Holler 2017-05-16 12:45:46 UTC
Description of problem:

It is not possible to transfer a large amount of data between two VMs connected by a logical network provided by ovirt-provider-ovn if the two VMs are located on different hosts.

Version-Release number of selected component (if applicable):
ovirt-provider-ovn-driver-1.0-6.el7.centos.noarch
openvswitch-ovn-central-2.7.0-1.el7.centos.x86_64
openvswitch-ovn-host-2.7.0-1.el7.centos.x86_64
openvswitch-ovn-common-2.7.0-1.el7.centos.x86_64
openvswitch-2.7.0-1.el7.centos.x86_64
python-openvswitch-2.7.0-1.el7.centos.noarch


How reproducible:


Steps to Reproduce:
1. Create VMs in oVirt on two different hosts
2. Create a logical network on ovirt-provider-ovn
3. Connect the two VMs via the logical network
4. Ensure that the setup is correct by pinging between the VMs
5. Transfer a large amount of data between the two VMs by:
   a) waiting for data on the first VM:
        nc -l 9999 > /dev/null
   b) creating random data on the second VM:
        dd if=/dev/urandom of=data bs=4k count=256k
   c) sending the data from the second to the first VM:
        time nc $IP_OF_FIRST_VM 9999 < data

Actual results:

nc does not succeed.

Expected results:

nc succeeds when the two VMs are on different hosts, just as it does when both VMs are on the same host.


Additional info:

Comment 1 Dan Kenigsberg 2017-05-16 13:10:16 UTC
Most likely, this is a problem with the underlying OVN, not ovirt-provider-ovn.
Dominik, would you provide more information on how nc "does not succeed"?

Mor, can you reproduce this, attaching openvswitch logs, possibly running it in debug mode?

Comment 2 Dominik Holler 2017-05-16 14:22:28 UTC
> Dominik, would you provide more information on how nc "does not succeed"?

Actual results:

nc is blocking.

Comment 3 Mor 2017-05-17 07:40:17 UTC
(In reply to Dan Kenigsberg from comment #1)
> Most likely, this is a problem with the underlying OVN, not
> ovirt-provider-ovn.
> Dominik, would you provide more information on how nc "does not succeed"?
> 
> Mor, can you reproduce this, attaching openvswitch logs, possibly running it
> in debug mode?

I can confirm that it is reproducible in my environment: with the default MTU of 1500, the transfer test (I used iperf) did not even start. When I set the MTU on the interface to 1400, the test worked as expected, with results of ~840 Mbps.

I also tested it on an OVN network without a subnet, and the issue is relevant there as well.

I see that we plan to fix this on the subnet entity, but maybe we should think more generally, and also provide documentation for this issue? We need to support different environments running on various network configurations that could affect the MTU value.

P.S.: In the automation we test a packet size of 1300 over the tunnel, and I can quickly adjust that to higher values. When I started testing OVN 2.6, I remember testing higher values successfully.

Comment 4 Dan Kenigsberg 2017-05-17 07:52:26 UTC
shouldn't OVN fragment the guest packets in such a case?

Comment 5 Dan Kenigsberg 2017-05-23 15:45:30 UTC
(In reply to Dan Kenigsberg from comment #4)
> shouldn't OVN fragment the guest packets in such a case?

According to Lance, OVS does not do so by default.

We'd better propagate the MTU from the tunnel to the vNIC via libvirt's new <mtu> element.
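For illustration, a guest interface definition carrying libvirt's <mtu> element might look like the following sketch (the bridge name and MTU value are hypothetical, chosen to match the Geneve overhead discussed later in this bug):

```xml
<!-- Hypothetical libvirt domain XML fragment: the network's MTU is
     propagated to the guest NIC via the <mtu> element. -->
<interface type='bridge'>
  <source bridge='ovirtmgmt'/>
  <model type='virtio'/>
  <!-- 1442 = 1500 (underlying MTU) - 58 bytes of Geneve overhead -->
  <mtu size='1442'/>
</interface>
```

With this in place, the guest driver picks up the MTU from the device instead of relying on a manually configured value inside the guest.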

Comment 6 Dominik Holler 2017-05-23 15:50:13 UTC
The functionality of libvirt's new <mtu> element depends on #1412234 and #1452756

Comment 7 Dan Kenigsberg 2017-07-12 11:28:26 UTC
We no longer use the too-big default MTU since a smaller one is advertised by OVN DHCP. Let us keep this bug in order to set a nice MTU per interface based on the underlying network for the interface.

Comment 8 Michal Skrivanek 2018-04-17 07:45:39 UTC
Please re-target if you require a machine type update in 4.2. 4.2 GA was released with i440fx-7.3.0, and that is how it needs to stay until 4.3.

Comment 9 Dominik Holler 2018-06-12 11:03:23 UTC
During 4.2.z, the new functionality will be effective only for users manually choosing machineType >= 7.4.0 for the VM, or setting it as the default in their engine-config with:

engine-config --set "ClusterEmulatedMachines=pc-i440fx-rhel7.4.0,pc-i440fx-2.9,pseries-rhel7.5.0,s390-ccw-virtio-2.6" --cver=4.2

The related code changes, without changing the default machine type in 4.2, should be ready in 4.2.5.

Comment 10 RHV bug bot 2018-07-02 15:34:07 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Open patch attached]

For more info please contact: infra

Comment 11 Michael Burman 2018-07-05 14:47:52 UTC
Tested flows:

1) libvirt flow - native networks - PASS
MTU is passed to the XML and to the guest's interface:
 <mtu size='9000'/>

ping with custom MTU is working

2) libvirt flow - physnet networks + OVN in OVS cluster - BLOCKED BZ 1598461
OVS switch type doesn't support custom MTU

3) libvirt flow - auto define - BLOCKED BZ 1598461
OVS switch type doesn't support custom MTU

4) OVN DHCP flow - PASS
DHCP MTU is passed to the XML and to the guest's interface.
ping with custom MTU is working.
- In order to ping VMs on different hosts, we need to set the tunnel network (ovirtmgmt by default) to a higher MTU as well:
MTU of OVN network + 58 bytes Geneve overhead = MTU of underlying network
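The overhead rule above is simple arithmetic; a minimal sketch of the calculation (the variable names are illustrative):

```shell
# Compute the largest usable OVN network MTU for a given underlying
# (tunnel) network MTU, subtracting the 58-byte Geneve overhead.
underlying_mtu=1500
geneve_overhead=58
ovn_mtu=$((underlying_mtu - geneve_overhead))
echo "$ovn_mtu"   # prints 1442
```

For example, an OVN network with an MTU of 8942 requires the underlying tunnel network to carry an MTU of at least 9000.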

5) Hotplug flow - PASS

Not sure if this can be fully verified at the moment, as OVS and custom MTU don't work together.

Comment 12 Dan Kenigsberg 2018-07-05 20:15:08 UTC
(In reply to Michael Burman from comment #11)

> 
> 3) libvirt flow - auto define - BLOCKED BZ 1598461
> OVS switch type doesn't support custom MTU

Thanks for filing it.

> 
> Not sure if this can be verified at the moment as OVS and custom MTU doesn't
> work.

Since OVS switchType is still under TechPreview, and MTU feature is dearly required by production-ready OVN, I believe we should accept the feature in its partial state.

Comment 13 Michael Burman 2018-07-08 07:38:54 UTC
(In reply to Dan Kenigsberg from comment #12)
> (In reply to Michael Burman from comment #11)
> 
> > 
> > 3) libvirt flow - auto define - BLOCKED BZ 1598461
> > OVS switch type doesn't support custom MTU
> 
> Thanks for filing it.
> 
> > 
> > Not sure if this can be verified at the moment as OVS and custom MTU doesn't
> > work.
> 
> Since OVS switchType is still under TechPreview, and MTU feature is dearly
> required by production-ready OVN, I believe we should accept the feature in
> its partial state.

Fine with me. Based on comments 11 and 12 moving this to verified.
Verified on - rhvm-4.2.5.1_SNAPSHOT-71.g54dde01.0.scratch.master.el7ev.noarch

Comment 14 Dan Kenigsberg 2018-07-16 11:39:15 UTC
Dominik, should we formally document this feature now? If so, how?

Comment 15 Dominik Holler 2018-07-16 12:06:40 UTC
Good idea. In the
Administration Guide
the sections
6.1.7. Logical Network General Settings Explained
and
6.1.2. Creating a New Logical Network in a Data Center or Cluster
would require an update.

Comment 16 Sandro Bonazzola 2018-07-31 15:25:08 UTC
This bugzilla is included in oVirt 4.2.5 release, published on July 30th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

