Description of problem:
[OVS] - Support setting a custom MTU on OVS host

Currently we don't support setting a custom MTU on a host in an OVS switch type cluster. This is preventing the testing of BZ 1451342 for the physnet and auto-define flows in an OVS cluster. Trying to set a custom MTU on a host in an OVS cluster ends up with an out-of-sync network that cannot be synced, because vdsm apparently doesn't support it.

Version-Release number of selected component (if applicable):
vdsm-4.20.33-1.el7ev.x86_64
4.2.5.1_SNAPSHOT-71.g54dde01.0.scratch.master.el7ev
openvswitch-ovn-common-2.9.0-47.el7fdp.3.x86_64
openvswitch-selinux-extra-policy-1.0-3.el7fdp.noarch
openvswitch-2.9.0-47.el7fdp.3.x86_64
openvswitch-ovn-host-2.9.0-47.el7fdp.3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add a clean host to an OVS switch type cluster
2. Create a network with custom MTU 9000
3. Attach the network to the host

Actual results:
The network is out-of-sync and the custom MTU is not applied on the host.

Expected results:
Should work.

Additional info:
See also BZ 1451342
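A quick host-side sanity check of whether the custom MTU was actually applied (a sketch; the interface name is an assumption):

    # OVS view of the interface (shows both the mtu and mtu_request columns)
    ovs-vsctl list Interface enp12s0f0 | grep -i mtu
    # kernel view of the same NIC
    ip -o link show enp12s0f0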
Created attachment 1456786 [details]
logs
Can you share the output of `vdsm-client Host getCapabilities` after setting up a network with a high MTU, as well as that of `ovs-vsctl show`?
(In reply to Dan Kenigsberg from comment #2)
> Can you share the output of `vdsm-client Host getCapabilities` after setting
> up a network with a high MTU, as well as that of `ovs-vsctl show`?

Nothing about the high MTU is reported there:

    Bridge "vdsmbr_GJjZ3BKk"
        Port "enp12s0f0"
            Interface "enp12s0f0"
        Port "custom9000"
            Interface "custom9000"
                type: internal
        Port "vdsmbr_GJjZ3BKk"
            Interface "vdsmbr_GJjZ3BKk"
                type: internal

    "custom9000": {
        "ipv6autoconf": false,
        "addr": "",
        "dhcpv6": false,
        "ipv6addrs": [],
        "mtu": 1500,
        "dhcpv4": false,
        "netmask": "",
        "ipv4defaultroute": false,
        "stp": false,
        "ipv4addrs": [],
        "ipv6gateway": "::",
        "gateway": "",
        "opts": {},
        "ports": ["enp12s0f0"]
    }

    "custom9000": {
        "iface": "custom9000",
        "ipv6autoconf": false,
        "addr": "",
        "nics": ["enp12s0f0"],
        "dhcpv6": false,
        "ipv6addrs": [],
        "switch": "ovs",
        "bridged": true,
        "mtu": 1500,
        "ports": ["enp12s0f0"],
        "dhcpv4": false,
        "netmask": "",
        "ipv4defaultroute": false,
        "stp": false,
        "ipv4addrs": [],
        "ipv6gateway": "::",
        "gateway": "",
        "bond": ""
    }
    },
Created attachment 1456830 [details]
terminal log of a session manually fixing the issue

As a workaround, the MTU can be set manually on the ethernet interface with:

    ovs-vsctl set int eth1 mtu_request=9000
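To confirm the workaround took effect (interface name as in the example above):

    # mtu_request asks OVS to program the MTU; the mtu column reflects the applied value
    ovs-vsctl get Interface eth1 mtu
    # the kernel view should agree
    ip -o link show eth1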
Per https://gerrit.ovirt.org/#/c/88613/, it looks like setting the MTU for OVS is currently only supported in ovirt-4.3, but not in ovirt-4.2.
It should not be hard to backport, according to Edy.
Same result on 4.2.6.1_SNAPSHOT-89.g295078e.0.scratch.master.el7ev and vdsm-4.20.36-1.el7ev.x86_64
The RHV Bugzilla Automation and Verification Bot (2018-08-06 15:03:36 IDT) made a mistake: this code is not yet available for QE.
What is this version? It looks very strange. The latest we have is vdsm-4.20.36-1.el7ev.x86_64.
I have no version with this fix. Please move to ON_QA when a version is available.
The version *is* available at
http://satellite6-ops.rhev-ci-vms.eng.rdu2.redhat.com/pulp/repos/RHEVM/Library/custom/RHV_Snapshot_Nightly_Release/rhv-snapshot-4_2-rpms/vdsm-4.20.22-47.git72c9d7f.el7ev.x86_64.rpm

It is newer than 4.20.36, but its n-v-r does not show it. You can either `yum downgrade` to the specified version, or wait until CI solves this versioning bug in the vdsm nightlies.
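For example (a sketch; the package spec is taken from the rpm filename above, and the snapshot repo is assumed to be enabled on the host):

    yum downgrade vdsm-4.20.22-47.git72c9d7f.el7ev.x86_64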
Tested on vdsm-4.20.37-1.el7ev.x86_64 and 4.2.6.3_SNAPSHOT-93.g584f531.0.scratch.master.el7ev

This is failedQA:
- When attaching a network with a custom MTU to an OVS host, it remains out-of-sync forever, even after sending refresh caps.
- But vdsm reports everything correctly:

    "cust5": {
        "ipv6autoconf": false,
        "addr": "",
        "dhcpv6": false,
        "ipv6addrs": [],
        "mtu": 5000,

    "net-2": {
        "ipv6autoconf": false,
        "addr": "",
        "dhcpv6": false,
        "ipv6addrs": [],
        "mtu": 9000,

    "cus6": {
        "ipv6autoconf": false,
        "addr": "",
        "dhcpv6": false,
        "ipv6addrs": [],
        "mtu": 5000,

So it looks like the vdsm code is ok, but the engine side is not. I would like to keep tracking this in this specific bug. Attaching logs.

Steps:
1. Attach a network with MTU 9000 to an OVS host
- The network(s) stay out-of-sync in the engine forever
- Refresh caps doesn't help
- vdsm reports the correct custom MTU

So the vdsm code is ok, the engine is not.
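For reference, the per-network MTU values reported above can be pulled straight from the host (a quick sketch):

    # getCapabilities returns JSON; this lists every reported mtu value
    vdsm-client Host getCapabilities | grep '"mtu"'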
Created attachment 1475884 [details]
failedqa logs
Created attachment 1475886 [details]
screenshot
Can you spot any difference between what Vdsm reports on an OvS cluster and on a LinuxBridge cluster? Can you hover over the UI and see why Engine thinks it's out-of-sync?
(In reply to Dan Kenigsberg from comment #15)
> Can you spot any difference between what Vdsm reports on an OvS cluster and
> on a LinuxBridge cluster? Can you hover over the UI and see why Engine
> thinks it's out-of-sync?

It's out-of-sync because of the MTU difference, of course, between what is set on the host and what is set on the DC.

Another issue is that the MTU remains on the physical interface when the network is detached from the host, and this is a difference between the cluster types. Also, vdsm continues to report the old MTU on the NIC when the network is detached from the host. Even when attaching a network with the default MTU, the custom one still stays on the NIC. The behavior is clearly not ok (a quick host-side check is sketched below).

So it basically means that once a network with a custom MTU has been attached to a NIC, it will remain there forever, even after detaching the network and on any new attachment in the future.
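The leftover MTU can be confirmed directly on the host after the detach (interface name taken from the earlier capabilities output):

    # still reports the custom value, e.g. "mtu 9000", even though the network is gone
    ip -o link show enp12s0f0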
Can you spot any difference between what *Vdsm* reports on an OvS cluster and on a LinuxBridge cluster? Can you compare the outputs of getCapabilities?
(In reply to Dan Kenigsberg from comment #17)
> Can you spot any difference between what *Vdsm* reports on an OvS cluster
> and on a LinuxBridge cluster? Can you compare the outputs of
> getCapabilities?

I can't see anything different, except what I already wrote: on OVS, vdsm keeps reporting the custom MTU on the NIC even when the network is detached; on legacy it does not.
I am having a hard time figuring out what exactly is not working.

We have the following tests covered in the functional tests:
- test_add_net_with_mtu
- test_removing_a_net_updates_the_mtu
- test_adding_a_net_updates_the_mtu
- test_add_slave_to_a_bonded_network_with_non_default_mtu

See: https://github.com/oVirt/vdsm/blob/master/tests/network/functional/link_mtu_test.py

I'm guessing, based on your results, that one or more scenarios are not covered and actually fail. Can you please help pinpoint these scenarios?
(This is focusing on the VDSM side at the API level.)
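For reference, a sketch of how an individual scenario from that file might be run (assuming a prepared vdsm development checkout; these tests generally need root privileges and a suitable test network environment):

    # run a single MTU scenario from the linked functional test module
    pytest tests/network/functional/link_mtu_test.py -k test_add_net_with_mtu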
(In reply to Edward Haas from comment #19)
> I am having a hard time figuring out what exactly is not working.
>
> We have the following tests covered in the functional tests:
> - test_add_net_with_mtu
> - test_removing_a_net_updates_the_mtu
> - test_adding_a_net_updates_the_mtu
> - test_add_slave_to_a_bonded_network_with_non_default_mtu
>
> See:
> https://github.com/oVirt/vdsm/blob/master/tests/network/functional/link_mtu_test.py
>
> I'm guessing, based on your results, that one or more scenarios are not
> covered and actually fail. Can you please help pinpoint these scenarios?
> (This is focusing on the VDSM side at the API level.)

In your tests you don't check the UI, so you missed it. The networks are out of sync, while on the host (vdsm) everything is ok.
- You are not testing it with a vlan network
- You are not checking whether the MTU remains on the NIC after network detach
(In reply to Michael Burman from comment #20)
> In your tests you don't check the UI, so you missed it. The networks are
> out of sync, while on the host (vdsm) everything is ok.

The tests are indeed VDSM API ones; the Engine side was assumed to have no influence, because it should work no differently from a Linux bridge network. (Engine is not supposed to have any special code that differentiates the two.)

> - You are not testing it with a vlan network

Yes we do; this is the only way to share multiple networks on the same NIC.

> - You are not checking whether the MTU remains on the NIC after network detach

We check that the MTU changes on the base nic (southbound) per the maximum MTU of all the remaining networks on it. If all networks have been removed from the same base nic, we do not restore the default MTU for that nic, because we no longer manage it (and therefore do not care about it). This rule is illustrated in the sketch below.

With OVS, the reported VLAN interfaces and bridges are mocks (mimicking how Engine sees the model in the Linux bridge deployment); VDSM does this magic, and maybe the MTU is not declared correctly. I'll try to check whether it is related.
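To illustrate the max-MTU rule above, a sketch with assumed names: suppose two VLAN networks with MTU 9000 and MTU 5000 share the base nic enp12s0f0.

    # while both networks are attached, the base nic carries the maximum
    ip -o link show enp12s0f0        # expect "mtu 9000"
    # after removing the MTU-9000 network, vdsm lowers it to the remaining maximum
    ip -o link show enp12s0f0        # expect "mtu 5000"
    # after removing the last network, the previous value is intentionally left as-is
    ip -o link show enp12s0f0        # unchanged; vdsm no longer manages this nic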
(In reply to Edward Haas from comment #21)
> We check that the MTU changes on the base nic (southbound) per the maximum
> MTU of all the remaining networks on it. If all networks have been removed
> from the same base nic, we do not restore the default MTU for that nic,
> because we no longer manage it (and therefore do not care about it).

Obviously something on the engine side is missing. Note that on a linux host you do return to the default MTU when removing the networks, so it should be the same here, I think.
Bug shouldn't be on MODIFIED.

Tested on 4.2.6.4-0.0.master.20180821115903.git1327b2f.el7 with vdsm-4.20.37-3.git924eec4.el7.x86_64 and failedQA.

Scenario PASS:
1. Attach VM network with custom MTU on OVS host - PASS

Scenarios failed:
1. Attach vlan network with custom MTU on OVS host - FAIL
   The network remains out-of-sync forever (even when the host and vdsm report correctly).
2. Detach network with custom MTU from OVS host - FAIL
   The custom MTU remains on the host NIC, which is wrong; it should return to its default MTU on network detach (just like on a linux type host).
Verified upstream on vdsm-4.20.39-5.giteee4cd2.el7.x86_64 with 4.2.6.5-0.0.master.20180828115009.gite4659c4.el7
QE verification bot: the bug was verified upstream
This bugzilla is included in oVirt 4.2.7 release, published on November 2nd 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.