Description of problem:
It is not possible to transfer a large amount of data between two VMs connected by a logical network provided by ovirt-provider-ovn, if the two VMs are located on different hosts.

Version-Release number of selected component (if applicable):
ovirt-provider-ovn-driver-1.0-6.el7.centos.noarch
openvswitch-ovn-central-2.7.0-1.el7.centos.x86_64
openvswitch-ovn-host-2.7.0-1.el7.centos.x86_64
openvswitch-ovn-common-2.7.0-1.el7.centos.x86_64
openvswitch-2.7.0-1.el7.centos.x86_64
python-openvswitch-2.7.0-1.el7.centos.noarch

How reproducible:

Steps to Reproduce:
1. Create VMs in oVirt on two different hosts
2. Create a logical network on ovirt-provider-ovn
3. Connect the two VMs via the logical network
4. Ensure that the setup is correct by pinging between the VMs
5. Transfer a large amount of data between the two VMs by:
   a) waiting for data on the first VM: nc -l 9999 > /dev/null
   b) creating random data on the second VM: dd if=/dev/urandom of=data bs=4k count=256k
   c) sending the data from the second VM to the first: time nc $IP_OF_FIRST_VM 9999 < data

Actual results:
nc does not succeed.

Expected results:
nc succeeds when the VMs are on different hosts, the same way it does when both VMs are on the same host.

Additional info:
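A quick way to check whether such a failure is MTU-related rather than a general connectivity problem is to probe the path between the two VMs with non-fragmentable ICMP packets. This is a hedged sketch only; the payload sizes assume a standard 1500-byte Ethernet MTU and are not part of the original report:

  # 1472 bytes of payload + 28 bytes of ICMP/IP headers = a 1500-byte packet
  ping -M do -s 1472 -c 3 $IP_OF_FIRST_VM
  # A smaller probe that should fit inside the geneve tunnel
  ping -M do -s 1300 -c 3 $IP_OF_FIRST_VM

If the full-size probe fails while the smaller one succeeds, large TCP segments are being dropped inside the tunnel, which matches the nc hang described above.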
Most likely, this is a problem with the underlying OVN, not ovirt-provider-ovn. Dominik, would you provide more information on how nc "does not succeed"? Mor, can you reproduce this, attaching openvswitch logs, possibly running it in debug mode?
> Dominik, would you provide more information on how nc "does not succeed"?

Actual results: nc is blocking.
(In reply to Dan Kenigsberg from comment #1)
> Most likely, this is a problem with the underlying OVN, not
> ovirt-provider-ovn.
> Dominik, would you provide more information on how nc "does not succeed"?
>
> Mor, can you reproduce this, attaching openvswitch logs, possibly running it
> in debug mode?

I can confirm that it is reproducible in my environment. With the default MTU of 1500 the transfer test (I used iperf) did not even start. When I set the MTU on the interface to 1400, the test worked as expected, with results of ~840Mbps. I also tested it on an OVN network without a subnet, and it is relevant there as well.

I see that we plan to fix it on the subnet entity, but maybe we should think more generally, and also provide documentation for this issue? We need to support different environments running on various network configurations that could affect the MTU value.

P.S: In the automation we test a packet size of 1300 over the tunnel; I will quickly adjust that to higher values. When I started testing OVN 2.6, I remember testing higher values successfully.
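For reference, a minimal sketch of the workaround described above, assuming iperf3 and a guest interface named eth0 (both are assumptions, not taken from the comment):

  # On both VMs: lower the interface MTU so packets fit inside the geneve tunnel
  ip link set dev eth0 mtu 1400

  # On the first VM: start the server
  iperf3 -s

  # On the second VM: run the throughput test against the first VM
  iperf3 -c $IP_OF_FIRST_VM -t 30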
shouldn't OVN fragmentize the guest packets in such a case?
(In reply to Dan Kenigsberg from comment #4)
> shouldn't OVN fragmentize the guest packets in such a case?

According to Lance, OVS does not do so by default. We'd better propagate the mtu from the tunnel to the vNIC via libvirt's new <mtu> element.
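For context, libvirt (3.1.0 and later) lets the MTU be pushed to the guest interface via a <mtu> child of <interface> in the domain XML. A minimal, hypothetical snippet (bridge name and size are illustrative, not the actual vdsm-generated XML):

  <interface type='bridge'>
    <source bridge='br-int'/>
    <virtualport type='openvswitch'/>
    <!-- 1500 on the underlying network minus 58 bytes of geneve overhead -->
    <mtu size='1442'/>
    <model type='virtio'/>
  </interface>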
The functionality of libvirt's new <mtu> element depends on #1412234 and #1452756
We no longer use the too-big default MTU, since a smaller one is advertised by OVN DHCP. Let us keep this bug open in order to set a proper MTU per interface, based on the underlying network of that interface.
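For illustration only: the MTU advertised by OVN's native DHCP is an ordinary DHCPv4 option on the logical switch's DHCP_Options row. ovirt-provider-ovn configures it automatically, but a manual ovn-nbctl sketch (subnet, addresses, MAC and port name are placeholders) looks roughly like this:

  # Create a DHCP_Options row for the subnet and keep its UUID
  uuid=$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \
      options='"server_id"="10.0.0.1" "server_mac"="c0:ff:ee:00:00:01" "lease_time"="3600" "mtu"="1442"')

  # Attach the options to the VM's logical switch port
  ovn-nbctl lsp-set-dhcpv4-options vm1_port "$uuid"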
please re-target if you require a machine type update in 4.2. 4.2 GA was released with i440fx-7.3.0 and that's how it needs to stay until 4.3
During 4.2.z the new functionality will be effective only for users manually choosing machineType >= 7.4.0 for the VM, or setting it as the default in their engine-config:

engine-config --set "ClusterEmulatedMachines=pc-i440fx-rhel7.4.0,pc-i440fx-2.9,pseries-rhel7.5.0,s390-ccw-virtio-2.6" --cver=4.2

The related code changes, without changing the default machine type in 4.2, should be ready in 4.2.5.
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Open patch attached] For more info please contact: infra
Tested flows:

1) libvirt flow - native networks - PASS
   MTU is passed to the XML and to the guest's interface: <mtu size='9000'/>
   ping with custom MTU is working

2) libvirt flow - physnet networks + OVN in OVS cluster - BLOCKED BZ 1598461
   OVS switch type doesn't support custom MTU

3) libvirt flow - auto define - BLOCKED BZ 1598461
   OVS switch type doesn't support custom MTU

4) OVN dhcp flow - PASS
   dhcp MTU is passed to the XML and to the guest's interface
   ping with custom MTU is working
   - In order to ping VMs on different hosts, we need to set the tunnel network (ovirtmgmt by default) with a higher MTU as well:
     MTU of the OVN network + 58 bytes of geneve overhead = MTU of the underlying network (see the worked example below)

5) Hotplug flow - PASS

Not sure if this can be verified at the moment as OVS and custom MTU doesn't work.
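To make the overhead rule above concrete (numbers are illustrative): with the default ovirtmgmt MTU of 1500, an OVN network can use at most 1500 - 58 = 1442; conversely, an OVN network configured with MTU 9000 needs the underlying tunnel network raised to at least 9000 + 58 = 9058, otherwise large packets between hosts are dropped.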
(In reply to Michael Burman from comment #11)
>
> 3) libvirt flow - auto define - BLOCKED BZ 1598461
> OVS switch type doesn't support custom MTU

Thanks for filing it.

> Not sure if this can be verified at the moment as OVS and custom MTU doesn't
> work.

Since OVS switchType is still under TechPreview, and MTU feature is dearly required by production-ready OVN, I believe we should accept the feature in its partial state.
(In reply to Dan Kenigsberg from comment #12)
> (In reply to Michael Burman from comment #11)
> >
> > 3) libvirt flow - auto define - BLOCKED BZ 1598461
> > OVS switch type doesn't support custom MTU
>
> Thanks for filing it.
>
> > Not sure if this can be verified at the moment as OVS and custom MTU doesn't
> > work.
>
> Since OVS switchType is still under TechPreview, and MTU feature is dearly
> required by production-ready OVN, I believe we should accept the feature in
> its partial state.

Fine with me. Based on comments 11 and 12, moving this to verified.

Verified on - rhvm-4.2.5.1_SNAPSHOT-71.g54dde01.0.scratch.master.el7ev.noarch
Dominik, should we formally document this feature now? If so, how?
Good idea. In the Administration Guide, the sections "6.1.7. Logical Network General Settings Explained" and "6.1.2. Creating a New Logical Network in a Data Center or Cluster" would require an update.
This bug is included in the oVirt 4.2.5 release, published on July 30th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.