Bug 1676708 - [UI] hint after updating mtu on networks connected to running VMs and indicate vNICs out of sync
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 4.3.0
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.6
Target Release: 4.4.6.4
Assignee: eraviv
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1113630 1766414 1848986
 
Reported: 2019-02-12 22:34 UTC by Sergey
Modified: 2021-05-05 05:35 UTC (History)
6 users

Fixed In Version: ovirt-engine-4.4.6.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1766414
Environment:
Last Closed: 2021-05-05 05:35:54 UTC
oVirt Team: Network
Embargoed:
pm-rhel: ovirt-4.4+


Attachments (Terms of Use)
engine and vdsm logs (124.44 KB, application/gzip)
2019-02-14 10:34 UTC, Sergey


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 113770 0 master MERGED core: vnics out of sync after network update 2021-04-13 06:42:16 UTC

Description Sergey 2019-02-12 22:34:59 UTC
Description of problem:
After updating a network's MTU from 1500 to 9000 while a VM is connected to that network, all MTUs (for the bridge and for the VM's vnet devices) are updated to the correct values. However, after migrating the VM to another host, the MTU is 1500 again. Restarting the VM fixes this, and migration then preserves the 9000 MTU.
This can lead to serious connectivity problems between VMs. For example: the MTU is changed and everything works, but some time later you decide to upgrade the hosts and put them into maintenance, the VM migrates to another host, and network connectivity randomly fails.

before migration:
vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master test-net-mtu state UNKNOWN mode DEFAULT group default qlen 1000
after migration:
vnet12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master test-net-mtu state UNKNOWN mode DEFAULT group default qlen 1000

Version-Release number of selected component (if applicable):
ovirt-engine 4.3.0 (also 4.2.8)

How reproducible:
100%

Steps to Reproduce:
1. create a network with the default 1500 MTU
2. start a VM with a NIC attached to this network
3. change the network's MTU to 9000
4. check that the MTU also changed for the VM's vNIC:
 virsh -r domiflist test-vm
 ip link show vnet_from_first_command
 the MTU must be 9000
5. migrate the VM to another host
6. check the vnet MTU with the same commands from step 4; now it is 1500 again

Actual results:
MTU for vnet is 1500 after migration

Expected results:
MTU for vnet is 9000 after migration
OR alternatively:
migration should be prohibited until the VM is restarted
OR
the engine should mark the VM as having a next-run configuration

Additional info:

Comment 1 Dominik Holler 2019-02-13 21:53:46 UTC
Sergey, would you please share the vdsm.log of the source and destination hosts, and most importantly, the engine.log covering the migration?

Comment 2 Michael Burman 2019-02-14 10:17:29 UTC
QE can't reproduce on 4.3.0.4-0.1.el7

Please note that it is not supported to update a network's MTU while it is in use by a VM; the change will fail on the vdsm side:
"VDSM host_mixed_3 command HostSetupNetworksVDS failed: Bridge mtu has interfaces set([u'vnet0']) connected"

You first need to unplug the vNIC from the VM, update the network's MTU, wait until the change is applied successfully on the host (UI notification), then plug the vNIC back in. The MTU is then updated successfully and preserved after migration.
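The unplug/update/plug workaround can be scripted against the oVirt REST API. The sketch below is a dry run that only prints the requests it would make; the endpoint paths (vNIC `deactivate`/`activate` actions, `PUT` on the network) follow the oVirt API shape as I understand it, and `ENGINE`, `VM`, `NIC`, and `NET` are placeholder values, not taken from this report — verify both against your engine version before running for real:

```shell
# Dry-run sketch of: unplug vNIC -> raise network MTU -> plug vNIC back.
# All IDs are placeholders; `step` just echoes the request it would make.
ENGINE="https://engine.example.com/ovirt-engine/api"
VM="123"; NIC="456"; NET="789"

step() { echo "$@"; }  # swap `echo` for an authenticated curl call on a live engine

step POST "$ENGINE/vms/$VM/nics/$NIC/deactivate"                       # 1. unplug the vNIC
step PUT "$ENGINE/networks/$NET" '<network><mtu>9000</mtu></network>'  # 2. update the MTU
# 3. wait for the engine to report the change applied on the hosts (UI notification)
step POST "$ENGINE/vms/$VM/nics/$NIC/activate"                         # 4. plug the vNIC back
```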

Comment 3 Sergey 2019-02-14 10:34:14 UTC
Created attachment 1534745 [details]
engine and vdsm logs

Attached are engine and vdsm logs from the source and destination hosts. Don't pay attention to the errors about failed network creation; I created the network on the wrong interface in our test environment.
Migrating VM name: empty-no-os
Net name: test-vlan-noconn
Net VDSM Name: on68b632b6f2134

Before migration:
34: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast master on68b632b6f2134 state UNKNOWN group default qlen 1000

After:
36: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master on68b632b6f2134 state UNKNOWN group default qlen 1000

Comment 4 Sergey 2019-02-14 12:05:28 UTC
(In reply to Michael Burman from comment #2)
> QE can't reproduce on 4.3.0.4-0.1.el7
> 
> Please note that it is not supported to update network's MTU while it used
> by VM, the change will fail on vdsm side:
> "VDSM host_mixed_3 command HostSetupNetworksVDS failed: Bridge mtu has
> interfaces set([u'vnet0']) connected"
> 
> You need first unplug the vNIC from the VM, update the network's MTU, wait
> the change applied successfully on the host(UI notification), plug the vNIC
> back. Now the MTU updated successfully and preserved after migration.

But in fact I've tested on 2 installations and both gave no errors while updating the MTU, and the MTU was actually changed on the host side. On both I used the
"Linux Bridge" switch type and a VLAN network; maybe the network type (connected or VLAN) is critical here.

Comment 5 Dominik Holler 2019-02-14 12:12:26 UTC
(In reply to Sergey from comment #4)
> (In reply to Michael Burman from comment #2)
> > QE can't reproduce on 4.3.0.4-0.1.el7
> > 
> > Please note that it is not supported to update network's MTU while it used
> > by VM, the change will fail on vdsm side:
> > "VDSM host_mixed_3 command HostSetupNetworksVDS failed: Bridge mtu has
> > interfaces set([u'vnet0']) connected"
> > 
> > You need first unplug the vNIC from the VM, update the network's MTU, wait
> > the change applied successfully on the host(UI notification), plug the vNIC
> > back. Now the MTU updated successfully and preserved after migration.
> 
> But in fact I've tested on 2 installations and both gave no errors while
> updating MTU, and actually changed MTU on host side, on both I've used
> "Linux Bridge" switch type and VLAN network, maybe network type(connected or
> vlan) is critical here.


The behavior of the host should not depend on the network type.
Looks like the updated MTU was never propagated to libvirt and the guest OS.
The expected behavior is documented in
https://ovirt.org/develop/release-management/features/network/managed_mtu_for_vm_networks.html#update-mtu-flow

Do you have a suggestion about what would help you know that the unplug/plug step is required?

Comment 6 Sergey 2019-02-14 13:22:45 UTC
(In reply to Dominik Holler from comment #5)
> The behavior of the host should not depend on the network type.
> Looks like the updated MTU was never propagated to libvirt and the guest OS.
> The expected behavior is documented in
> https://ovirt.org/develop/release-management/features/network/
> managed_mtu_for_vm_networks.html#update-mtu-flow
> 
> Do you have a suggestion what would help you to know that the unplug/plug
> step is required?

Thanks for the link, now I can see that it should not work.
When the MTU on the VM device changed to 9000 without any action on the VM, and pings with large packets started to flow (after also changing the MTU inside the guest), it made me believe that migration should also work without problems; it was the only missing piece for the MTU update to be fully functional, from my point of view :)

Maybe a warning message when saving a network with a changed MTU, stating that a NIC unplug/plug or a VM shutdown/power-on is required for the MTU change to take effect; it could also include a list of affected VMs.
Or a "next run config", but next run has a drawback: it won't be cleared after unplugging/plugging the NIC.

Comment 8 Michael Burman 2021-04-21 11:42:43 UTC
Verified on - rhvm-4.4.6.5-447.gd80dda7.9.el8ev.noarch

Comment 9 Sandro Bonazzola 2021-05-05 05:35:54 UTC
This bug is included in the oVirt 4.4.6 release, published on May 4th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.
