Bug 125122

Summary:	(NET) Jumbo frames do not work properly with UDP
Product:	Red Hat Enterprise Linux 3	Reporter:	Tom Sightler <ttsig>
Component:	kernel	Assignee:	Jeff Garzik <jgarzik>
Status:	CLOSED WONTFIX	QA Contact:	Brian Brock <bbrock>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	3.0	CC:	k.georgiou, marco, pdemauro, peterm, petrides, riel, tao
Target Milestone:	---
Target Release:	---
Hardware:	i686
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2007-10-19 19:25:01 UTC	Type:	---

Description Tom Sightler 2004-06-02 20:31:47 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510

Description of problem:
After upgrading several machines to RHEL3 we noticed that NFS
connections would hang if we used UDP over a jumbo frame gigabit
connection.  TCP connections worked fine.

After further investigation I believe we have discovered what might be
a global problem with UDP and Jumbo frames.  If you configure two
RHEL3 boxes for Jumbo frames and run tracepath (which uses UDP) from
one to the other you get output like this:

# tracepath 10.10.91.62
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.283ms pmtu 9000
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.088ms pmtu 8166
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.064ms pmtu 4352
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.054ms pmtu 2002
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.039ms pmtu 1492
 1:  10.10.91.62 (10.10.91.62)             0.263ms reached
     Resume: pmtu 1492 hops 1 back 1

However, I think you should get output like this:

# tracepath 10.10.91.62
 1?: [LOCALHOST]     pmtu 9000
 1:  10.10.91.62 (10.10.91.62)             0.431ms reached
     Resume: pmtu 9000 hops 1 back 1

The second output is generated from a RHEL3 system running the stock
2.4.21 kernel.  You also get the same output from RHEL2.1 and a stock
2.4.26 kernel.
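
For comparing the two outputs across many kernels, the PMTU values can be
pulled out of tracepath's text mechanically. The following is only a
sketch: the function names are hypothetical, and it assumes the output
format shown above (hop lines ending in "pmtu N" and a final "Resume:"
line), which may differ between tracepath versions.

```python
import re

# Patterns assume the tracepath output format quoted in this report.
PMTU_RE = re.compile(r"\bpmtu\s+(\d+)")
RESUME_RE = re.compile(r"Resume:\s+pmtu\s+(\d+)")

BAD_OUTPUT = """\
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.283ms pmtu 9000
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.088ms pmtu 8166
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.064ms pmtu 4352
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.054ms pmtu 2002
 1:  10.10.91.61 (10.10.91.61)  asymm 65   0.039ms pmtu 1492
 1:  10.10.91.62 (10.10.91.62)             0.263ms reached
     Resume: pmtu 1492 hops 1 back 1
"""

GOOD_OUTPUT = """\
 1?: [LOCALHOST]     pmtu 9000
 1:  10.10.91.62 (10.10.91.62)             0.431ms reached
     Resume: pmtu 9000 hops 1 back 1
"""


def pmtu_steps(output):
    """Return the pmtu value from every hop line, in order of appearance."""
    steps = []
    for line in output.splitlines():
        if "Resume:" in line:
            continue  # the summary line is handled by final_pmtu()
        match = PMTU_RE.search(line)
        if match:
            steps.append(int(match.group(1)))
    return steps


def final_pmtu(output):
    """Return the pmtu from the 'Resume:' line, or None if it is missing."""
    match = RESUME_RE.search(output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for name, text in (("bad", BAD_OUTPUT), ("good", GOOD_OUTPUT)):
        print(name, pmtu_steps(text), final_pmtu(text))
```

A kernel showing the problem reports a descending series of pmtu values
that ends below the interface MTU; a healthy one reports a single value
equal to the configured MTU.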

After going through the patch list for the RHEL3 kernel I discovered
that the patch which seems to cause this error is
linux-2.4.21-ipsec.patch.  Building a kernel with this patch removed
returns the system to what I believe is the correct behaviour.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-15.EL

How reproducible:
Always

Steps to Reproduce:
1. Connect two systems with jumbo-capable NICs using a crossover cable
(we tested the tg3, acenic, and e1000 drivers)
2. Configure the interfaces for jumbo frames (we used an MTU of 9000)
3. Run tracepath between the two systems.


Actual Results:  tracepath shows the link with a PMTU of 1492

Expected Results:  tracepath should show the link with a PMTU of 9000

Additional info:

I tested this on a large number of kernels and only the RHEL3 kernels
showed this issue.  Backing out the ipsec patches in the RHEL3 kernel
seems to eliminate this issue.  The issue occurs whenever the ipsec
patches are applied, even if IPsec support is not compiled in.

I'm pretty confident this is incorrect behaviour, but someone might
prove me wrong.

Comment 2 Eric Paris 2004-08-13 19:00:32 UTC

I wrote a test program which basically just dumps large UDP frames on
the wire.  No problems whatsoever; I will attach the little program
if needed.  I do, however, see exactly what is shown above when
tracepath is run.  None of the tracepath packets larger than 1500
bytes show up in a tcpdump of the sending interface.  It looks like a
bug in PMTU discovery itself: the discovery frames never make it to
the interface.  The issue is not actually with UDP frames.
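
The test program mentioned above was never attached, so the following is
not that program. It is a minimal sketch of the same idea: send one UDP
datagram sized to fill a single frame at a given MTU. The function names
are hypothetical, and the two socket option numbers are the Linux values
from <linux/in.h>, hard-coded here on the assumption that the Python
build may not export them.

```python
import socket

IP_HEADER = 20    # IPv4 header without options, in bytes
UDP_HEADER = 8    # UDP header, in bytes

# Linux-specific values from <linux/in.h> (assumed, see lead-in).
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2   # always set DF; never fragment locally


def max_udp_payload(mtu):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - UDP_HEADER


def send_probe(host, port, mtu, dont_fragment=True):
    """Send one datagram that exactly fills a frame of the given MTU.

    With dont_fragment=True, Linux raises OSError (EMSGSIZE) if the
    kernel believes the path MTU is smaller than the datagram.
    Returns the number of payload bytes handed to the kernel.
    """
    payload = b"\x00" * max_udp_payload(mtu)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if dont_fragment:
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
        return sock.sendto(payload, (host, port))
```

For an MTU of 9000 the payload is 9000 - 20 - 8 = 8972 bytes. A call such
as send_probe("10.10.91.62", 9, 9000), with the port chosen arbitrarily,
combined with tcpdump on the sending interface, shows whether a full-size
frame actually leaves the machine.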

Comment 3 Tom Sightler 2004-08-13 20:29:07 UTC

OK, I'll accept that, but it still seems a little different than what
I experienced way back when I filed the bug.  However, I have just
tested and found that disabling PMTU discovery does indeed return
NFS/UDP behaviour to normal.

If I'm understanding what you are saying, this basically means that
jumbo frames don't really work at all on RHEL3 (TCP/UDP/whatever)
since the system always does PMTU discovery unless explicitly turned
off system wide.  Is that right?
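
The comment above does not say how PMTU discovery was disabled. One
common system-wide switch on Linux is the net.ipv4.ip_no_pmtu_disc
sysctl, where 0 means discovery is enabled. The sketch below reads and
writes that setting through /proc; the helper names are hypothetical,
and writing requires root.

```python
NO_PMTU_DISC = "/proc/sys/net/ipv4/ip_no_pmtu_disc"


def pmtu_discovery_enabled(path=NO_PMTU_DISC):
    """Return True if the sysctl reads 0 (PMTU discovery enabled)."""
    with open(path) as handle:
        return handle.read().strip() == "0"


def set_pmtu_discovery(enabled, path=NO_PMTU_DISC):
    """Write 0 to enable PMTU discovery system-wide, 1 to disable it."""
    with open(path, "w") as handle:
        handle.write("0\n" if enabled else "1\n")


if __name__ == "__main__":
    state = "enabled" if pmtu_discovery_enabled() else "disabled"
    print("PMTU discovery is", state)
```

The path parameter exists only so the helpers can be exercised against
an ordinary file instead of the live sysctl.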

Thanks,
Tom

Comment 5 none 2005-07-29 10:46:08 UTC

I enabled jumbo frames (9000 bytes MTU) on two Dell PowerEdge 1850
servers running Red Hat Enterprise Linux 3 ES Update 5 x86_64 using
the following commands:

ifconfig eth1 mtu 9000
service network restart

After the MTU change, I was able to transfer (scp) very small files
between the servers, and ping also worked fine, but when I tried to
transfer bigger files, the copy process got stuck.
Also, the NFS mounts no longer worked.

I reverted the MTU change and everything worked as before.

The switch is a Dell PowerConnect 5324 and it should support jumbo
frames just fine.

Is this a RHEL3 issue?

Thanks.

Comment 6 RHEL Program Management 2007-10-19 19:25:01 UTC

This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.