Bug 1598461 - [OVS] - Support setting a custom MTU on OVS host
Summary: [OVS] - Support setting a custom MTU on OVS host
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.23
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.2.7
Assignee: Edward Haas
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: OpenVswitch_Support
 
Reported: 2018-07-05 14:30 UTC by Michael Burman
Modified: 2018-11-02 14:37 UTC (History)
CC List: 5 users

Fixed In Version: vdsm-0:4.20.22-47.git72c9d7f.el7ev.x86_64
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-02 14:37:47 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: exception+


Attachments
logs (572.71 KB, application/x-gzip)
2018-07-05 14:35 UTC, Michael Burman
no flags Details
terminal log of session to manual fix the issue (1.94 KB, text/plain)
2018-07-05 19:05 UTC, Dominik Holler
no flags Details
failedqa logs (937.63 KB, application/x-gzip)
2018-08-14 14:42 UTC, Michael Burman
no flags Details
screenshot (118.73 KB, image/png)
2018-08-14 14:43 UTC, Michael Burman
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 93253 0 ovirt-4.2 MERGED net: ovs: Make ovs driver instantiation a singletone 2021-01-06 10:11:02 UTC
oVirt gerrit 93254 0 ovirt-4.2 MERGED net: ovs: Remove the network_*_setup factories 2021-01-06 10:10:59 UTC
oVirt gerrit 93255 0 ovirt-4.2 MERGED net: ovs: Use two stages in Nets*Setup and remove context 2021-01-06 10:10:59 UTC
oVirt gerrit 93256 0 ovirt-4.2 MERGED net: link.iface now supports setting the mtu 2021-01-06 10:11:38 UTC
oVirt gerrit 93257 0 ovirt-4.2 MERGED net: ovs: Support setting the mtu on an ovs base net 2021-01-06 10:11:38 UTC
oVirt gerrit 93799 0 master MERGED net tests: Add mtu validation for vlans 2021-01-06 10:11:38 UTC
oVirt gerrit 93803 0 master MERGED net: Report correct mtu on the faked vlan iface 2021-01-06 10:11:00 UTC
oVirt gerrit 93824 0 ovirt-4.2 MERGED net: Report correct mtu on the faked vlan iface 2021-01-06 10:11:03 UTC
oVirt gerrit 93856 0 master MERGED net: Disabling IPv6 on an iface with no IPv6 should pass silently 2021-01-06 10:11:03 UTC
oVirt gerrit 93870 0 master MERGED net: A southbound iface with no nets on top should have mtu=1500 2021-01-06 10:11:00 UTC
oVirt gerrit 93871 0 master MERGED net: Update the southbound iface mtu to the max mtus of all nets 2021-01-06 10:11:00 UTC
oVirt gerrit 93924 0 ovirt-4.2 MERGED net: Checking if ipv6 is disabled, should not assume sysctl path 2021-01-06 10:11:03 UTC
oVirt gerrit 93925 0 ovirt-4.2 MERGED net: Disabling IPv6 on an iface with no IPv6 should pass silently 2021-01-06 10:11:03 UTC
oVirt gerrit 93926 0 ovirt-4.2 MERGED net: A southbound iface with no nets on top should have mtu=1500 2021-01-06 10:11:03 UTC
oVirt gerrit 93927 0 ovirt-4.2 MERGED net: Update the southbound iface mtu to the max mtus of all nets 2021-01-06 10:11:01 UTC

Description Michael Burman 2018-07-05 14:30:49 UTC
Description of problem:
[OVS] - Support setting a custom MTU on OVS host

Currently we do not support setting a custom MTU on a host in an OVS switch type cluster. This prevents testing BZ 1451342 for the physnet and auto-define flows in an OVS cluster.

Trying to set a custom MTU on a host in an OVS cluster results in an out-of-sync network that cannot be synced, apparently because vdsm does not support it.

Version-Release number of selected component (if applicable):
vdsm-4.20.33-1.el7ev.x86_64
4.2.5.1_SNAPSHOT-71.g54dde01.0.scratch.master.el7ev
openvswitch-ovn-common-2.9.0-47.el7fdp.3.x86_64
openvswitch-selinux-extra-policy-1.0-3.el7fdp.noarch
openvswitch-2.9.0-47.el7fdp.3.x86_64
openvswitch-ovn-host-2.9.0-47.el7fdp.3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add clean host to OVS switch type cluster
2. Create network with custom MTU 9000
3. Attach the network to the host

Actual results:
The network is out-of-sync and the custom MTU is not applied on the host

Expected results:
The custom MTU should be applied on the host and the network should be in sync

Additional info:
See also BZ 1451342

Comment 1 Michael Burman 2018-07-05 14:35:46 UTC
Created attachment 1456786 [details]
logs

Comment 2 Dan Kenigsberg 2018-07-05 15:05:57 UTC
Can you share the output of `vdsm-client Host getCapabilities` after setting up a network with a high MTU, as well as that of `ovs-vsctl show` ?

Comment 3 Michael Burman 2018-07-05 15:24:32 UTC
(In reply to Dan Kenigsberg from comment #2)
> Can you share the output of `vdsm-client Host getCapabilities` after setting
> up a network with a high MTU, as well as that of `ovs-vsctl show` ?

Nothing reported there:

  Bridge "vdsmbr_GJjZ3BKk"
        Port "enp12s0f0"
            Interface "enp12s0f0"
        Port "custom9000"
            Interface "custom9000"
                type: internal
        Port "vdsmbr_GJjZ3BKk"
            Interface "vdsmbr_GJjZ3BKk"
                type: internal

 "custom9000": {
            "ipv6autoconf": false, 
            "addr": "", 
            "dhcpv6": false, 
            "ipv6addrs": [], 
            "mtu": 1500, 
            "dhcpv4": false, 
            "netmask": "", 
            "ipv4defaultroute": false, 
            "stp": false, 
            "ipv4addrs": [], 
            "ipv6gateway": "::", 
            "gateway": "", 
            "opts": {}, 
            "ports": [
                "enp12s0f0"
            ]
        }

  "custom9000": {
            "iface": "custom9000", 
            "ipv6autoconf": false, 
            "addr": "", 
            "nics": [
                "enp12s0f0"
            ], 
            "dhcpv6": false, 
            "ipv6addrs": [], 
            "switch": "ovs", 
            "bridged": true, 
            "mtu": 1500, 
            "ports": [
                "enp12s0f0"
            ], 
            "dhcpv4": false, 
            "netmask": "", 
            "ipv4defaultroute": false, 
            "stp": false, 
            "ipv4addrs": [], 
            "ipv6gateway": "::", 
            "gateway": "", 
            "bond": ""
        }
    },

Comment 4 Dominik Holler 2018-07-05 19:05:30 UTC
Created attachment 1456830 [details]
terminal log of session to manual fix the issue

As a workaround, the MTU can be set manually on the ethernet interface by:
ovs-vsctl set int eth1 mtu_request=9000
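To confirm it took effect (assuming the same interface name eth1 as above), the requested and applied values can be read back with, for example, `ovs-vsctl get Interface eth1 mtu_request mtu` or `ip link show eth1`.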

Comment 5 Dominik Holler 2018-07-06 14:49:59 UTC
According to https://gerrit.ovirt.org/#/c/88613/ it looks like setting the MTU for OVS is currently only supported in ovirt-4.3, not in ovirt-4.2.

Comment 6 Dan Kenigsberg 2018-07-13 22:25:25 UTC
It should not be hard to backport, according to Edy.

Comment 7 Michael Burman 2018-08-07 08:44:09 UTC
Same result on 4.2.6.1_SNAPSHOT-89.g295078e.0.scratch.master.el7ev
and vdsm-4.20.36-1.el7ev.x86_64

Comment 8 Dan Kenigsberg 2018-08-07 11:10:24 UTC
The RHV Bugzilla Automation and Verification Bot (2018-08-06 15:03:36 IDT) made a mistake. This code is not yet available for QE.

Comment 9 Michael Burman 2018-08-08 14:03:12 UTC
What is this version? It looks very strange. The latest we have is vdsm-4.20.36-1.el7ev.x86_64

Comment 10 Michael Burman 2018-08-13 05:27:04 UTC
I have no version with this fix. Please move to ON_QA when a version is available.

Comment 11 Dan Kenigsberg 2018-08-13 13:35:58 UTC
The version *is* available at http://satellite6-ops.rhev-ci-vms.eng.rdu2.redhat.com/pulp/repos/RHEVM/Library/custom/RHV_Snapshot_Nightly_Release/rhv-snapshot-4_2-rpms/vdsm-4.20.22-47.git72c9d7f.el7ev.x86_64.rpm

It is newer than 4.20.36, but its n-v-r does not show it.
You can either `yum downgrade` to the specified version, or wait until CI fixes this versioning bug in the vdsm nightlies.

Comment 12 Michael Burman 2018-08-14 14:39:31 UTC
Tested on vdsm-4.20.37-1.el7ev.x86_64 and
4.2.6.3_SNAPSHOT-93.g584f531.0.scratch.master.el7ev
This is failedQA.

- When attaching a network with a custom MTU on an OVS host, it remains out-of-sync forever, even when sending refresh caps.

- However, vdsm reports everything correctly.

"cust5": {
            "ipv6autoconf": false, 
            "addr": "", 
            "dhcpv6": false, 
            "ipv6addrs": [], 
            "mtu": 5000, 

"net-2": {
            "ipv6autoconf": false, 
            "addr": "", 
            "dhcpv6": false, 
            "ipv6addrs": [], 
            "mtu": 9000,

"cus6": {
            "ipv6autoconf": false, 
            "addr": "", 
            "dhcpv6": false, 
            "ipv6addrs": [], 
            "mtu": 5000, 

So it looks like the vdsm code is OK, but the engine side is not. I would like to keep tracking this in this specific bug. Attaching logs.


- Steps:
1. Attach a network with MTU 9000 to an OVS host

- The network(s) remain out-of-sync in the engine forever
- Refresh caps does not help
- vdsm reports the correct custom MTU

So the vdsm code is OK, but the engine is not.

Comment 13 Michael Burman 2018-08-14 14:42:51 UTC
Created attachment 1475884 [details]
failedqa logs

Comment 14 Michael Burman 2018-08-14 14:43:26 UTC
Created attachment 1475886 [details]
screenshot

Comment 15 Dan Kenigsberg 2018-08-15 06:47:07 UTC
Can you spot any difference between what Vdsm reports on an OvS cluster and on a LinuxBridge cluster? Can you hover over the UI and see why Engine thinks it's out-of-sync?

Comment 16 Michael Burman 2018-08-15 07:11:23 UTC
(In reply to Dan Kenigsberg from comment #15)
> Can you spot any difference between what Vdsm reports on an OvS cluster to a
> LinuxBridge cluster? Can you hover over the UI and see why Engine thinks
> it's out-of-sync?

It's out-of-sync because of the MTU difference, of course, between what is set on the host and what is defined on the DC.

Another issue is that the MTU remains on the physical interface when the network is detached from the host, and this is a difference between the cluster types. vdsm also continues to report the old MTU on the NIC when the network is detached from the host. Even when a network with the default MTU is attached, the custom MTU still stays on the NIC. This behavior is clearly not OK.

So it basically means that once a network with a custom MTU has been attached to a NIC, that MTU will remain there forever, even after the network is detached and a new network is attached in the future.

Comment 17 Dan Kenigsberg 2018-08-15 12:46:51 UTC
Can you spot any difference between what *Vdsm* reports on an OvS cluster and on a LinuxBridge cluster? Can you compare the outputs of getCapabilities?

Comment 18 Michael Burman 2018-08-15 13:00:53 UTC
(In reply to Dan Kenigsberg from comment #17)
> Can you spot any difference between what *Vdsm* reports on an OvS cluster to
> a LinuxBridge cluster ? Can you compare the outputs of getCapabilities ?

I can't see anything different, except what I already wrote:
on OVS, vdsm keeps reporting the custom MTU on the NIC even when the network is detached; on legacy it does not.

Comment 19 Edward Haas 2018-08-15 16:03:54 UTC
I am having a hard time figuring out what exactly is not working.

We have the following scenarios covered in the functional tests:
- test_add_net_with_mtu
- test_removing_a_net_updates_the_mtu
- test_adding_a_net_updates_the_mtu
- test_add_slave_to_a_bonded_network_with_non_default_mtu

See: https://github.com/oVirt/vdsm/blob/master/tests/network/functional/link_mtu_test.py

Based on your results, I'm guessing that one or more scenarios are not covered and actually fail.
Can you please help pinpoint these scenarios?
(This focuses on the VDSM side at the API level.)

Comment 20 Michael Burman 2018-08-15 16:14:16 UTC
(In reply to Edward Haas from comment #19)
> I am having hard time figuring out what is not working exactly.
> 
> We have these following tests covered in the functional tests:
> - test_add_net_with_mtu
> - test_removing_a_net_updates_the_mtu
> - test_adding_a_net_updates_the_mtu
> - test_add_slave_to_a_bonded_network_with_non_default_mtu
> 
> See:
> https://github.com/oVirt/vdsm/blob/master/tests/network/functional/
> link_mtu_test.py
> 
> I'm guessing based on your results that one or more scenarios are not
> covered and actually fail.
> Can you please help pin point these scenarios?
> (this is focusing on the VDSM side at the API level)

In your tests you don't check the UI, so you missed it. The networks are out of sync; on the host (vdsm) it is OK.
- You are not testing it with a VLAN network
- You are not checking whether the MTU remains on the NIC after the network is detached

Comment 21 Edward Haas 2018-08-16 09:20:01 UTC
(In reply to Michael Burman from comment #20)
> 
> In your tests you don't check the UI, so you missed it. The networks are out
> of sync, on the host it is ok(vdsm)

The tests are indeed VDSM API ones; the Engine side was assumed not to have any influence, because it should work no differently from a Linux bridge network. (Engine is not supposed to have any special code that differentiates it.)

> - You not testing it with vlan network
Yes, we do; this is the only way to share multiple networks on the same NIC.

> - You not checking if the MTU remain on the NIC after network detach
We check that the MTU on the base NIC (southbound) is changed to the maximum MTU of all the networks remaining on it.
If all networks have been removed from that base NIC, we do not restore its default MTU, because we no longer manage it (and therefore do not care about it).

With OVS, the reported VLAN interfaces and bridges are mocks (mimicking how Engine sees the model in a Linux bridge deployment); VDSM does this magic, and maybe the MTU is not declared correctly there. I'll try to check whether this is related.
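For illustration only, a minimal Python sketch of the southbound-MTU rule described above (this is not the actual vdsm code; the names are made up):

    def southbound_mtu(remaining_net_mtus, current_mtu):
        # The base (southbound) NIC carries the maximum MTU of the networks
        # still defined on top of it. If no networks remain, the NIC is no
        # longer managed, so its MTU is left untouched.
        return max(remaining_net_mtus) if remaining_net_mtus else current_mtu

    # Two networks (MTU 1500 and 9000) on the same NIC -> NIC MTU 9000.
    assert southbound_mtu([1500, 9000], current_mtu=1500) == 9000
    # All networks removed -> MTU kept as-is; this is the behavior that
    # comment 22 below argues differs from the Linux bridge case.
    assert southbound_mtu([], current_mtu=9000) == 9000

Note that the later patches linked above ("net: A southbound iface with no nets on top should have mtu=1500") revise the case where no networks remain.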

Comment 22 Michael Burman 2018-08-16 09:39:57 UTC
(In reply to Edward Haas from comment #21)
> (In reply to Michael Burman from comment #20)
> > 
> > In your tests you don't check the UI, so you missed it. The networks are out
> > of sync, on the host it is ok(vdsm)
> 
> The tests are VDSM API ones indeed, Engine side was assumed not to have
> influence because it should work no different from a Linux bridge network.
> (Engine is not supposed to have any special code that differentiate it)
> 
> > - You not testing it with vlan network
> Yes we do, this is the only way to share multiple networks on the same NIC.
> 
> > - You not checking if the MTU remain on the NIC after network detach
> We check that the MTU changed on the base nic (southbound) per the maximum
> MTU from all the remaining networks on it.
> If all networks have been removed from the same base nic, we do not restore
> the default mtu for that nic, because we do not manage it any more (and
> therefore, do not care about it).
> 
> With OVS, the reported VLAN interfaces and bridges are mocks (mimicking how
> Engines sees the model per the Linux bridge deployment), VDSM does this
> magic and maybe the mtu is not declared correctly. I'll try to check if it
> is related.

Obviously something is missing on the engine side.
Note that on a Linux host you do return to the default MTU when removing the networks, so I think it should behave the same here.

Comment 23 Michael Burman 2018-08-22 08:31:49 UTC
This bug shouldn't be in MODIFIED.

Tested on 4.2.6.4-0.0.master.20180821115903.git1327b2f.el7 with vdsm-4.20.37-3.git924eec4.el7.x86_64; this is failedQA.

Scenario passed -
1. Attach a VM network with a custom MTU on an OVS host - PASS

Scenarios failed -
1. Attach a VLAN network with a custom MTU on an OVS host - FAIL
The network remains out-of-sync forever (even though the host and vdsm report it correctly).
2. Detach a network with a custom MTU from an OVS host - FAIL
The custom MTU remains on the host NIC, which is wrong; the NIC should return to its default MTU on network detach (just like on a Linux-type host).

Comment 24 Michael Burman 2018-08-30 08:56:13 UTC
Verified upstream on - vdsm-4.20.39-5.giteee4cd2.el7.x86_64 with 4.2.6.5-0.0.master.20180828115009.gite4659c4.el7

Comment 25 Raz Tamir 2018-09-05 08:56:46 UTC
QE verification bot: the bug was verified upstream

Comment 26 Sandro Bonazzola 2018-11-02 14:37:47 UTC
This bug is included in the oVirt 4.2.7 release, published on November 2nd 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

