Bug 1071660

Summary: MTU Mismatch between Host Bridge interface and VM Tap interface
Product: [Retired] oVirt
Component: vdsm
Version: 3.4
Status: CLOSED INSUFFICIENT_DATA
Severity: unspecified
Priority: unspecified
Reporter: Jonas Israelsson <jonas>
Assignee: Dan Kenigsberg <danken>
QA Contact: Gil Klein <gklein>
CC: acathrow, bazulay, danken, gklein, iheim, jonas, laine, masayag, mgoldboi, s.kieske, yeylon
Target Milestone: ---
Target Release: 3.4.1
Hardware: Unspecified
OS: Unspecified
Whiteboard: network
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-03-22 15:58:34 UTC
Attachments:
Description          Flags
San net config       none
The supervdsm log    none

Description Jonas Israelsson 2014-03-02 18:15:07 UTC
Description of problem:

In oVirt I have defined a network called (and used as) SAN, on which I have also enabled Jumbo Frames.

My host's physical interface em2 is attached to this network and therefore has an MTU of 9000.

em2       Link encap:Ethernet  HWaddr D0:67:E5:F9:2E:1C
          inet6 addr: fe80::d267:e5ff:fef9:2e1c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:503209263 errors:0 dropped:0 overruns:0 frame:0
          TX packets:483040537 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2395173035170 (2.1 TiB)  TX bytes:408352341106 (380.3 GiB) 

Still, VM TAP interfaces attached to this bridge get a default MTU of 1500, and as a result the NIC within the VM is anything but stable.

Here is such a TAP interface, created when the VM starts,

# Host Virtual Nic
vnet26    Link encap:Ethernet  HWaddr FE:1A:4A:2F:D2:A8
           inet6 addr: fe80::fc1a:4aff:fe2f:d2a8/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:8 errors:0 dropped:0 overruns:0 frame:0
           TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:500
           RX bytes:648 (648.0 b)  TX bytes:468 (468.0 b)

If I manually set the MTU on that interface, everything works as expected.
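The comparison above (em2 at 9000, vnet26 at 1500) can be checked for every port of a bridge at once. The following is a minimal sketch, not part of VDSM or libvirt: the helper names `read_mtu` and `mismatched_ports` are hypothetical, and it assumes the standard Linux sysfs layout in which `/sys/class/net/<iface>/mtu` holds the MTU and `/sys/class/net/<bridge>/brif/` lists the bridge's ports.

```python
from pathlib import Path


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Return the MTU of one interface as reported by sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mismatched_ports(bridge, sysfs_root="/sys/class/net"):
    """Return (bridge_mtu, {port: mtu}) for ports whose MTU differs from the bridge's.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real host the default is what you want.
    """
    bridge_mtu = read_mtu(bridge, sysfs_root)
    brif = Path(sysfs_root) / bridge / "brif"  # one entry per enslaved port
    mismatches = {}
    for port in sorted(entry.name for entry in brif.iterdir()):
        port_mtu = read_mtu(port, sysfs_root)
        if port_mtu != bridge_mtu:
            mismatches[port] = port_mtu
    return bridge_mtu, mismatches
```

On the host in this report, calling `mismatched_ports("SAN")` would list every port whose MTU disagrees with the bridge, which is quicker than reading `ifconfig` output for each vnet device in turn.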

The engine is currently on Beta2 but has been upgraded several times. This behaviour was introduced somewhere along the way: the several machines I have running all (still) have Jumbo Frames on the interface attached to this bridge, whereas newly created VMs get 1500. I also found out just now that if I reboot a VM, that too gets the incorrect MTU on that NIC. It's as if the MTU override is lost in space.


Version-Release number of selected component (if applicable):
Engine: 3.5.0-0.0.master.20140130172954.git9257b30.el6
Node: oVirt Node Hypervisor release 3.0.3 (0.999.201401231512draft.el6)

How reproducible:
Not sure :-)
Try to do an MTU override on a network and attach a VM to it.

Comment 1 Jonas Israelsson 2014-03-02 18:36:25 UTC
Hmmm, I just noticed the version number of the Engine. I have not seen that before; does it mean I have managed to add some nightly build? I was quite sure I had only added the 3.4 repos.

Comment 2 Jonas Israelsson 2014-03-02 18:43:16 UTC
[root@dashboard yum.repos.d]# rpm -qa | grep -i ovirt
ovirt-engine-lib-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-engine-setup-base-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-host-deploy-1.2.0-0.2.master.20140120.gitdeb0453.el6.noarch
ovirt-host-deploy-java-1.2.0-0.2.master.20140120.gitdeb0453.el6.noarch
ovirt-engine-setup-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-iso-uploader-3.5.0-0.0.master.20140120.gitd18e6f7.el6.noarch
ovirt-engine-websocket-proxy-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-webadmin-portal-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-userportal-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-sdk-python-3.5.0.0-1.20140128.gitf272277.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.5.0-0.0.master.20140130172954.git9257b30.el6.noarch
ovirt-log-collector-3.5.0-0.0.master.20140130.git2122d59.el6.noarch
ovirt-image-uploader-3.5.0-0.0.master.20140120.git7801707.el6.noarch
ovirt-release-el6-10.0.1-3.noarch
ovirt-engine-cli-3.4.0.3-1.el6.noarch
ovirt-engine-dbscripts-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-backend-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-restapi-3.4.0-0.7.beta2.el6.noarch
ovirt-engine-tools-3.4.0-0.7.beta2.el6.noarch

Comment 3 Dan Kenigsberg 2014-03-02 22:31:34 UTC
Jonas, could you specify your libvirt version and the output of `ifconfig <bridge-of-vnet26>` and that of `virsh -r dumpxml <name-of-your-vm>`?

Laine, shouldn't libvirt copy the MTU from the bridge to the tap device?

Comment 4 Jonas Israelsson 2014-03-02 23:00:07 UTC
Sorry, I should have checked this; I did not know the architecture.

SAN       Link encap:Ethernet  HWaddr D0:67:E5:F9:2E:1C  
          inet addr:192.168.43.11  Bcast:192.168.43.255  Mask:255.255.255.0
          inet6 addr: fe80::d267:e5ff:fef9:2e1c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:386710758 errors:0 dropped:0 overruns:0 frame:0
          TX packets:384811517 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2343055570732 (2.1 TiB)  TX bytes:396696696938 (369.4 GiB)

So I guess what seems to have happened here is that the engine settings on this bridge were somehow lost?

Comment 5 Jonas Israelsson 2014-03-02 23:01:19 UTC
Created attachment 869710 [details]
San net config

Comment 6 Jonas Israelsson 2014-03-02 23:05:03 UTC
SAN             8000.d067e5f92e1c       no              em2
                                                        vnet0
                                                        vnet11
                                                        vnet13
                                                        vnet16
                                                        vnet19
                                                        vnet21
                                                        vnet26
                                                        vnet4
                                                        vnet6
                                                        vnet8
                                                        vnet9

Comment 7 Jonas Israelsson 2014-03-02 23:29:48 UTC
Not allowed to raise the MTU beyond 1500 on the bridge.

[root@ft admin]# ifconfig SAN mtu 9000
SIOCSIFMTU: Invalid argument
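
One possible reading of the error above, offered as an assumption and not as a finding of this report: to my understanding, the Linux bridge driver of this era refuses a bridge MTU larger than the smallest MTU among its ports. If that holds here, a single port at 1500 (such as vnet26) would be enough to make `ifconfig SAN mtu 9000` fail. The sketch below only models that rule; `max_bridge_mtu` and `can_set_bridge_mtu` are hypothetical names, and the 1500 default for a bridge without ports is likewise an assumption.

```python
ETH_DATA_LEN = 1500  # assumed default ceiling for a bridge that has no ports


def max_bridge_mtu(port_mtus):
    """Largest MTU the bridge could take under the assumed rule: min over ports."""
    return min(port_mtus) if port_mtus else ETH_DATA_LEN


def can_set_bridge_mtu(requested, port_mtus):
    """True if the requested bridge MTU does not exceed the smallest port MTU."""
    return requested <= max_bridge_mtu(port_mtus)


# Values taken from the ifconfig output in this report: em2 at 9000, vnet26 at 1500.
ports = {"em2": 9000, "vnet26": 1500}
allowed = can_set_bridge_mtu(9000, list(ports.values()))  # False under this model
```

Under this model, every port would have to be raised to 9000 before the bridge itself could be.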

Comment 8 Dan Kenigsberg 2014-03-03 00:30:22 UTC
Did you change the MTU while the network was already in use by VMs? Does the issue reproduce on new hosts or after boot?

Could you share your ifcfg-SAN, and supervdsm.log with the setupNetworks() setting it?

Comment 9 Jonas Israelsson 2014-03-03 06:44:23 UTC
> Have you changed the MTU while the network has already been in use by VMs?

No, I have not changed anything related to the network config since installing this system.

> Does the issue reproduce on new hosts or after boot?

I currently only have a single host and it cannot easily be rebooted. I'll do my best to make time for maintenance and inform my users.

> Could you share your ifcfg-SAN, 

[root@ft network-scripts]# more ifcfg-SAN 
# Generated by VDSM version 4.14.1-2.el6
DEVICE=SAN
ONBOOT=yes
TYPE=Bridge
DELAY=0
IPADDR=192.168.43.11
NETMASK=255.255.255.0
BOOTPROTO=none
MTU=9000
NM_CONTROLLED=no
STP=no
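
The file above asks for MTU=9000 while the running bridge reported 1500, so the persistent configuration and the live state disagree. A minimal sketch of checking for that drift follows; `parse_ifcfg` and `mtu_drift` are hypothetical helper names, and the parser assumes the plain KEY=VALUE format shown above, without handling shell expansion.

```python
def parse_ifcfg(text):
    """Parse KEY=VALUE lines from an ifcfg file, skipping comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def mtu_drift(ifcfg_text, live_mtu):
    """Return (configured, live) when the two MTUs differ, otherwise None.

    Also returns None when the file sets no MTU at all.
    """
    configured = parse_ifcfg(ifcfg_text).get("MTU")
    if configured is None or int(configured) == live_mtu:
        return None
    return int(configured), live_mtu
```

The live value would come from the running interface, for example from `/sys/class/net/SAN/mtu` on a Linux host.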


> and supervdsm.log with the setupNetworks() setting it?

Attached

Comment 10 Jonas Israelsson 2014-03-03 06:47:23 UTC
Created attachment 869801 [details]
The supervdsm log

Comment 11 Dan Kenigsberg 2014-03-03 10:52:20 UTC
I see no apparent clue as to why the bridge ignored the MTU setting. I'd appreciate a reproduction - maybe with another network on the same host.

Would an explicit
  ifconfig vnet0 mtu 9000
for each vnet* make it possible to set the MTU on the bridge and provide an immediate workaround?
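
The workaround asked about here can be sketched as a command list: raise every port first, then the bridge. This is an illustration only. `mtu_commands` is a hypothetical helper, it uses `ip link set dev <name> mtu <n>` from iproute2 in place of `ifconfig`, and it only builds the commands without running them. Note that the reporter's reply in comment 12 says that setting the vnet MTUs made no difference on his host.

```python
def mtu_commands(bridge, ports, mtu):
    """Build 'ip link' commands that raise each port's MTU, then the bridge's.

    The ports come first on the assumption that the bridge cannot be
    raised above its smallest port.
    """
    commands = [["ip", "link", "set", "dev", port, "mtu", str(mtu)] for port in ports]
    commands.append(["ip", "link", "set", "dev", bridge, "mtu", str(mtu)])
    return commands


# Example with a subset of the ports listed in comment 6.
example = mtu_commands("SAN", ["em2", "vnet0", "vnet26"], 9000)
```

Each entry can be passed to `subprocess.run(cmd, check=True)` as root once the list has been reviewed.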

Comment 12 Jonas Israelsson 2014-03-03 11:07:35 UTC
> I see no apparent clue on why the bridge ignored the MTU setting. I'd 
> appreciate a reproduction - maybe with another network on the same host.

I do have a free NIC in this host that I could try and play around with..

> Would an explicit
>   ifconfig vnet0 mtu 9000
> for each vnet* make it possible to set the MTU on the bridge and provide
> an immediate workaround?

I tried this yesterday; it made no difference.

I have now rebooted the host, and the bridge is back to MTU 9000.

SAN       Link encap:Ethernet  HWaddr D0:67:E5:F9:2E:1C  
          inet addr:192.168.43.11  Bcast:192.168.43.255  Mask:255.255.255.0
          inet6 addr: fe80::d267:e5ff:fef9:2e1c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:28617 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21750 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:191312618 (182.4 MiB)  TX bytes:5328456 (5.0 MiB)

I fired up my first VM, and that too now gets the right MTU.

vnet0     Link encap:Ethernet  HWaddr FE:1A:4A:2F:D2:A6  
          inet6 addr: fe80::fc1a:4aff:fe2f:d2a6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:648 (648.0 b)  TX bytes:648 (648.0 b)


I'll do my best to try and reproduce this again.

Comment 13 Dan Kenigsberg 2014-03-22 15:58:34 UTC
Please reopen this bug when it reproduces or substantial new information is found.