Bug 1168246

Summary: [Rhevm-upgrade 3.4>3.5] Upgrade breaks custom MTU to default MTU
Product: Red Hat Enterprise Virtualization Manager Reporter: Michael Burman <mburman>
Component: ovirt-engine Assignee: Alona Kaplan <alkaplan>
Status: CLOSED NOTABUG QA Contact: Michael Burman <mburman>
Severity: urgent Docs Contact:
Priority: high    
Version: 3.5.0 CC: ecohen, gklein, iheim, lpeer, lsurette, lvernia, mburman, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: network
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-11-30 11:37:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1164308, 1164311    

Description Michael Burman 2014-11-26 13:37:09 UTC
Description of problem:
[Rhevm-upgrade 3.4>3.5] Upgrade breaks custom MTU to default MTU.
Networks that had a custom MTU (for example 5000) and were attached to a host before the rhevm upgrade had their MTU changed to the default MTU (1500) after the upgrade, and those networks became unsynchronized.

Version-Release number of selected component (if applicable):
3.5.0-0.22.el6ev
vdsm-4.16.7.5-1.el6ev

Steps to Reproduce:
1. rhevm 3.4 > configure custom MTU on network, for example 5000 
2. run rhevm upgrade 3.4>3.5
3. go to SetupNetworks and to networks tab and check MTU

Actual results:
In SetupNetworks, the network became unsynchronized, and in the networks tab, the MTU changed back to the default MTU (1500).

Expected results:
MTU shouldn't break after rhevm upgrade

Comment 1 Lior Vernia 2014-11-27 15:00:46 UTC
Having looked at the code, there really aren't any changes to MTU from an upgrade script point of view; it is only that the engine interprets the same DB value differently.

So my current hypothesis is as follows: the networks that appear as out-of-sync had had default MTU configured on them prior to the upgrade, but on their specific hosts had gotten an MTU that's different from the one configured as default after the upgrade (typically 1500).

Could you try to reproduce and check this? You would need to have a network with default MTU on your 3.4 deployment, and configure it on a host. Then after the upgrade, set the default MTU to something different, and the network should be out-of-sync. And this would not be a bug.

Any other situation should not lead to out-of-sync networks, if my hypothesis is correct. Networks with custom MTU should stay with the custom MTU and be in sync. It should not be related to whether the network was attached by label, nor to my understanding whether it was attached to a NIC shared by other networks.

Please let us know!

Comment 2 Michael Burman 2014-11-30 07:05:42 UTC
Lior, 

Your hypothesis is not right. The networks that appear as out-of-sync had a custom MTU (5000) configured on them prior to the upgrade, and so did the hosts (5000).

Comment 3 Alona Kaplan 2014-11-30 10:34:36 UTC
Trying to reproduce the bug, this is what I found:
- If a network with default MTU shares a NIC with other networks, vdsm gives it the MTU of the base (parent) interface.
- Since the base interface gets the MTU of the network with the highest MTU value, networks with default MTU get the MTU of the highest-MTU network that shares the same base NIC with them.
- In 3.5, as a fix for bug https://bugzilla.redhat.com/1043808, setting the actual value of a default MTU was moved to the engine side. A DefaultMTU config value was added to the engine. This means that all networks sent to vdsm will have an explicit MTU. If a network is marked as having default MTU when it is added or updated, the MTU value sent to vdsm is DefaultMTU.
New code was also added to the unsync logic: if a network is marked as having default MTU, but its actual MTU on the host differs from DefaultMTU, it is marked as unsynced.
- When upgrading the engine to 3.5, all networks with default MTU (at the DC level) whose actual MTU (at the host level) differs from the DefaultMTU config value are marked as unsynced.

So, if the networks that appear as out-of-sync had default MTU configured on them prior to the upgrade, it is not a bug.
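
The behaviour described in the list above can be illustrated with a minimal Python sketch. This is not engine or vdsm code: the Network class, the function names and the DEFAULT_MTU constant are hypothetical stand-ins, and the value 1500 is only the typical default mentioned in this bug. It assumes the simplified model from this comment, where a default-MTU network inherits the MTU of the shared base NIC before the upgrade, and 3.5 compares the host MTU against DefaultMTU.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for the engine's DefaultMTU config value (typically 1500).
DEFAULT_MTU = 1500


@dataclass
class Network:
    name: str
    # None means the network is marked "default MTU" at the DC level.
    configured_mtu: Optional[int]


def base_nic_mtu(networks_on_nic: List[Network]) -> int:
    """Base NIC gets the largest MTU explicitly requested by any network on it."""
    explicit = [n.configured_mtu for n in networks_on_nic if n.configured_mtu is not None]
    return max(explicit, default=DEFAULT_MTU)


def host_mtu_before_upgrade(network: Network, networks_on_nic: List[Network]) -> int:
    """Pre-3.5 model: a default-MTU network inherits the base NIC's MTU."""
    if network.configured_mtu is None:
        return base_nic_mtu(networks_on_nic)
    return network.configured_mtu


def is_in_sync_after_upgrade(network: Network, actual_host_mtu: int) -> bool:
    """3.5 model: a default-MTU network is expected to have exactly DEFAULT_MTU on the host."""
    expected = DEFAULT_MTU if network.configured_mtu is None else network.configured_mtu
    return actual_host_mtu == expected


# Two networks sharing one NIC: one with custom MTU 5000, one with default MTU.
storage = Network("storage", 5000)
vm_net = Network("vm_net", None)
shared_nic = [storage, vm_net]

for net in shared_nic:
    actual = host_mtu_before_upgrade(net, shared_nic)
    print(net.name, actual, is_in_sync_after_upgrade(net, actual))
```

In this model the custom-MTU network stays in sync after the upgrade, while the default-MTU network that inherited 5000 from the shared NIC is reported as out-of-sync.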

Michael, you described in the bug that those networks had a custom MTU (at both the DC and host level) prior to the upgrade, and that after the upgrade they changed to default MTU (only at the DC level, which is why they are unsynced).
I couldn't reproduce this behaviour. Please try to reproduce it again, making sure the networks are configured with default MTU prior to the upgrade.

Comment 4 Michael Burman 2014-11-30 11:05:49 UTC
Alona, 

I couldn't reproduce this behavior either, with networks configured with custom MTU (5000) before the upgrade.
When trying to reproduce this with a simple 3.4>3.5 upgrade, the custom MTU doesn't change and the networks stay synced.
But I have to say that the original bug was on a mixed environment, with many more DCs, clusters, hosts and storage domains.

Comment 5 Lior Vernia 2014-11-30 11:37:19 UTC
So the behavior we've managed to introduce (of networks sharing interfaces with other custom-MTU networks) is confusing but expected (and inevitable). Please re-open if you manage to reproduce the other phenomena.