Bug 697021

Summary: Patch needed to allow MTU >1500 on vif prior to connecting to bridge
Product: Red Hat Enterprise Linux 5
Component: kernel-xen
Version: 5.6
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Target Release: ---
Keywords: ZStream
Reporter: Madison Kelly <mkelly>
Assignee: Paolo Bonzini <pbonzini>
QA Contact: Virtualization Bugs <virt-bugs>
Docs Contact:
CC: dhoward, drjones, herrold, jzheng, leiwang, lersek, pasik, pbonzini, plyons, qguan, qwan, xen-maint, yuzhang
Whiteboard:
Fixed In Version: kernel-2.6.18-283.el5
Doc Type: Bug Fix
Doc Text:
Prior to this update, MTU was constrained to 1500 unless Scatter/Gather I/O (SG) was supported by the NIC; in the case of netback, this meant unless SG was supported by the front-end. Because the hotplugging scripts ran before features had been negotiated with the front-end, SG would still be disabled at that point, breaking anything using larger MTUs (for example, cluster communication using that NIC). This update inverts the behavior and assumes SG to be present until negotiations prove otherwise (in such a case, MTU is automatically reduced).
Story Points: ---
Clone Of:
: 733416
Environment:
Last Closed: 2012-02-21 03:34:23 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 720347, 746225
Bug Blocks: 514489, 697310, 733416, 733651, 738389
Attachments:
Description | Flags
Patch for arbitrary vif MTU before connect to bridge against 2.6.18-238.9 kernel | none
Patch to 2.6.18-238.9.1's kernel spec | none
Actual patch to kernel-2.6.spec (first patch was full spec file) | none
Updated userland patch which works with multiple, out of sync domU ethX to dom0 xenbrY devices | none

Description Madison Kelly 2011-04-15 16:01:22 UTC
Description of problem:

Xen 3.0.3 does not have a patch applied that allows an arbitrary MTU size on vifs. This causes the bridge to drop its MTU to 1500 when the vif connects, breaking anything that uses larger MTUs on that bridge, such as cluster communication.

The patch is discussed here:

http://lists.xensource.com/archives/html/xen-devel/2011-02/msg00413.html
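
To see the symptom described above, it helps to compare the MTU of the bridge before and after a vif joins it. The following is a minimal sketch, not part of the referenced patch: it reads the value from sysfs, and the function names and the SYSFS_NET override are hypothetical, introduced only so the helpers can be pointed at a test directory.

```shell
# Print the MTU of a network device by reading sysfs.
# SYSFS_NET is a hypothetical override for the sysfs path; it
# defaults to the real /sys/class/net.
get_mtu() {
    dev=$1
    cat "${SYSFS_NET:-/sys/class/net}/$dev/mtu"
}

# Succeed only if the device's MTU is at least the wanted value.
# Returns 2 if the MTU could not be read at all.
mtu_at_least() {
    dev=$1
    want=$2
    have=$(get_mtu "$dev") || return 2
    [ "$have" -ge "$want" ]
}
```

For example, `mtu_at_least xenbr0 9000 || echo "xenbr0 is below 9000"` could be run right after a domU starts.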

Version-Release number of selected component (if applicable):

xen-3.0.3-120.el5_6.1
kernel-2.6.18-238.5.1.el5

How reproducible:

100%

Steps to Reproduce:
1. Install Xen, edit /etc/xen/scripts/vif-bridge to match that referenced by the email.
2. Try to start a domU. Without the patch, the domU start will fail.
  
Actual results:

MTU on vifs can't be set > 1500 before connecting to the bridge.

Expected results:

MTU on vifs can be set > 1500 before connecting to the bridge.

Additional info:

Comment 1 Madison Kelly 2011-04-17 05:17:35 UTC
Created attachment 492662 [details]
Patch for arbitrary vif MTU before connect to bridge against 2.6.18-238.9 kernel

With the help of Pasi Kärkkäinen, I've attempted to adapt the patch discussed in the description (comment #0) for the latest EL5.6 kernel, 2.6.18-238.9.1. I've compiled the kernel with this patch, but have not yet tested it. I will do so in the next day or so. In the meantime, I'm hoping those more qualified could review and test the patch.

Comment 2 Madison Kelly 2011-04-17 05:19:25 UTC
Created attachment 492663 [details]
Patch to 2.6.18-238.9.1's kernel spec

This is the patch for the kernel-2.6.spec to apply the patch. This adds the patch after all pre-existing patches were applied and is entered as Patch28000.

Comment 3 Madison Kelly 2011-04-17 05:21:32 UTC
Sorry, comment #2 is the full spec file, not a patch. I'll post the patch after this comment.

Comment 4 Madison Kelly 2011-04-17 05:22:52 UTC
Created attachment 492664 [details]
Actual patch to kernel-2.6.spec (first patch was full spec file)

Attachment 492663 (the second attachment) was the full spec file. This is the actual patch.

Comment 5 Madison Kelly 2011-04-17 05:41:10 UTC
Could someone with proper access add Olaf (olaf) to the CC list, given that he is the original author of the patch? Thanks.

Comment 6 Madison Kelly 2011-04-17 16:49:14 UTC
rhbz #697310 is the vif-bridge component of this patch.

Comment 8 Madison Kelly 2011-04-22 23:15:40 UTC
I've finished testing 697310 and I can now successfully create and live-migrate domUs with large MTUs with this kernel and the two patches applied from that bug. If someone could review/test and, if it's ok, merge, I'd much appreciate it.

Comment 9 Paolo Bonzini 2011-04-26 07:24:50 UTC
Thanks digimer.  Yes, we plan to merge the patches, but unfortunately the development phase has already ended for RHEL5.7 so it will take some time.

Comment 10 Madison Kelly 2011-04-26 13:22:38 UTC
So this patch won't be applied until the 5.8 release then, eh? If so, I'll try to keep a patched RPM available for 5.6 and 5.7 for those who wish to use this patch in the meantime, and I'll update these two tickets as I go, if that will help.

Thanks for replying. :)

Comment 11 Madison Kelly 2011-04-30 20:59:06 UTC
Created attachment 495998 [details]
Updated userland patch which works with multiple, out of sync domU ethX to dom0 xenbrY devices

I found a bug: when a domU was connected to out-of-order bridges (e.g. xenbr0 and xenbr2 mapped to eth0 and eth1 in the domU), the patched userland tools would try to get the MTU of the 'xenbr' whose sequence number matched the interface number in the domU (so eth1 would look up xenbr1 instead of xenbr2).

With the help of Paolo and Pasi, I was able to solve this by no longer having Xen set up the bridges (which is not recommended anyway) and altering the xen-network-common.sh and xen-network-common-bonding.sh scripts to set the MTU of the vif prior to connecting to the bridge in their add_to_bridge() functions.
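
The approach in the previous paragraph can be sketched roughly as follows. This is not the code from the attachment: the function name, the RUN dry-run hook and the SYSFS_NET override are all assumptions made for illustration. The point it shows is that the MTU is taken from the bridge the vif is actually joining, not guessed from a sequence number.

```shell
# Hypothetical helper, not the code from the attachment.
# RUN is an assumed hook: set RUN=echo to print the commands instead
# of executing them (executing them needs root and real devices).
# SYSFS_NET is an assumed override for the sysfs path.
add_to_bridge_with_mtu() {
    bridge=$1
    dev=$2
    # Copy the bridge's current MTU onto the vif first, so that adding
    # the vif cannot pull the bridge down to the vif's smaller MTU.
    mtu=$(cat "${SYSFS_NET:-/sys/class/net}/$bridge/mtu") || return 1
    ${RUN:-} ip link set dev "$dev" mtu "$mtu" || return 1
    ${RUN:-} brctl addif "$bridge" "$dev" || return 1
    ${RUN:-} ip link set dev "$dev" up
}
```

Running `RUN=echo; add_to_bridge_with_mtu xenbr2 vif1.1` prints the three commands without touching the network, which is a safe way to check the ordering.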

Note that this was tested with xend-config.sxp set to '(network-script /bin/true)' and the bridge built in /etc/sysconfig/network-scripts/ifcfg-xenbrX.
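
For reference, a bridge definition of the kind mentioned above might look like the fragment below. The values are illustrative assumptions (bridge name xenbr0, MTU 9000, no address configured), not the configuration that was actually tested; the file would live at /etc/sysconfig/network-scripts/ifcfg-xenbr0.

```shell
# ifcfg-xenbr0 (illustrative values, not the tested configuration)
DEVICE=xenbr0
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes
MTU=9000
```

The physical interface enslaved to the bridge would normally need a matching MTU as well, since a bridge cannot use a larger MTU than its smallest port.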

Comment 12 Madison Kelly 2011-05-01 01:50:09 UTC
Comment on attachment 495998 [details]
Updated userland patch which works with multiple, out of sync domU ethX to dom0 xenbrY devices

Please ignore this, it was meant for bug #697310

Comment 19 RHEL Program Management 2011-08-04 04:15:01 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 31 Jarod Wilson 2011-08-30 19:22:18 UTC
Patch(es) available in kernel-2.6.18-283.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 32 Paolo Bonzini 2011-08-31 08:38:28 UTC
> - on both machines, ensure the Xen bridge has its MTU set to 9000, for example
> by adding this (huge hack) to /etc/init.d/xend at the end of function
> await_daemons_up
> 
> sleep 5
> ip link set mtu 9000 dev vif0.1
> ip link set mtu 9000 dev peth0
> ip link set mtu 9000 dev xenbr0

This is not necessary if you have the patch for bug 733417.

Comment 36 Martin Prpič 2011-10-27 09:21:09 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Prior to this update, MTU was constrained to 1500 unless Scatter/Gather I/O (SG) was supported by the NIC; in the case of netback, this meant unless SG was supported by the front-end. Because the hotplugging scripts ran before features had been negotiated with the front-end, SG would still be disabled at that point, breaking anything using larger MTUs (for example, cluster communication using that NIC). This update inverts the behavior and assumes SG to be present until negotiations prove otherwise (in such a case, MTU is automatically reduced).
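
The inverted behavior in the technical note can be modelled in a few lines. This is only a toy model of the logic as described, not the kernel patch itself: the function names are hypothetical, and the upper bound of 65535 minus the Ethernet header length is an assumption about the limit that applies when SG is available.

```shell
ETH_DATA_LEN=1500                    # standard Ethernet payload size
ETH_HLEN=14                          # Ethernet header length
SG_MAX_MTU=$((65535 - $ETH_HLEN))    # assumed upper bound when SG is available

# Largest MTU the vif accepts for a given SG state (1 = SG available).
vif_max_mtu() {
    if [ "$1" -eq 1 ]; then echo "$SG_MAX_MTU"; else echo "$ETH_DATA_LEN"; fi
}

# MTU after feature negotiation: unchanged when it still fits,
# otherwise reduced to the new maximum.
mtu_after_negotiation() {
    current=$1
    frontend_sg=$2
    max=$(vif_max_mtu "$frontend_sg")
    if [ "$current" -gt "$max" ]; then echo "$max"; else echo "$current"; fi
}
```

With SG assumed present up front, the hotplug script can set 9000 before negotiation; `mtu_after_negotiation 9000 0` then shows the value falling back to 1500 for a front-end without SG.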

Comment 37 Qixiang Wan 2011-11-15 08:55:06 UTC
Tested with the same steps as bug 738608 comment 6.
 
Host is kernel-xen-2.6.18-297.el5 x86_64, xen-3.0.3-135.el5.

Tested against the RHEL5.8 20111030.0 (kernel-xen-2.6.18-294.el5) and RHEL6.2 20111109.1 (kernel-2.6.32-220.el6) PV guests (both i386 and x86_64). The guests all kept the right MTU value after migration. Changing this bug to VERIFIED.
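
One common way to confirm that a large MTU really works end to end, separate from the test steps referenced above (which are not reproduced here), is to send a ping that exactly fills one packet with fragmentation disallowed. The sketch below only computes the payload size and prints the command; the function names are hypothetical and the address in the example is a documentation-range placeholder.

```shell
# ICMP echo payload that exactly fills one IPv4 packet of the given MTU:
# subtract the 20-byte IPv4 header (no options) and the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print the ping command that probes the path without fragmentation.
# "-M do" asks iputils ping to set the Don't Fragment bit.
mtu_probe_cmd() {
    host=$1
    mtu=$2
    echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") $host"
}
```

For an MTU of 9000, `mtu_probe_cmd 192.0.2.10 9000` prints a ping command with a payload of 8972 bytes; if any hop has a smaller MTU, that ping should fail rather than fragment.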

Comment 38 errata-xmlrpc 2012-02-21 03:34:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html