From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510

Description of problem:
After upgrading several machines to RHEL3 we noticed that NFS connections would hang if we used UDP over a jumbo frame gigabit connection. TCP connections worked fine. After further investigation I believe we have discovered what might be a global problem with UDP and jumbo frames. If you configure two RHEL3 boxes for jumbo frames and run tracepath (which uses UDP) from one to the other, you get output like this:

# tracepath 10.10.91.62
 1:  10.10.91.61 (10.10.91.61)    asymm 65   0.283ms pmtu 9000
 1:  10.10.91.61 (10.10.91.61)    asymm 65   0.088ms pmtu 8166
 1:  10.10.91.61 (10.10.91.61)    asymm 65   0.064ms pmtu 4352
 1:  10.10.91.61 (10.10.91.61)    asymm 65   0.054ms pmtu 2002
 1:  10.10.91.61 (10.10.91.61)    asymm 65   0.039ms pmtu 1492
 1:  10.10.91.62 (10.10.91.62)               0.263ms reached
     Resume: pmtu 1492 hops 1 back 1

However, I think you should get output like this:

# tracepath 10.10.91.62
 1?: [LOCALHOST]     pmtu 9000
 1:  10.10.91.62 (10.10.91.62)               0.431ms reached
     Resume: pmtu 9000 hops 1 back 1

The second output is generated from a RHEL3 system running the stock 2.4.21 kernel. You also get the same output from RHEL2.1 and a stock 2.4.26 kernel.

After going through the patch list for the RHEL3 kernel I discovered that the patch which seems to cause this error is linux-2.4.21-ipsec.patch. Building a kernel with this patch removed returns the system to what I believe is the correct behaviour.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-15.EL

How reproducible:
Always

Steps to Reproduce:
1. Connect two systems with jumbo-capable NICs with a crossover cable (we tested the tg3, acenic, and e1000 drivers)
2. Configure the interfaces for jumbo frames (we used an MTU of 9000)
3. Run tracepath between the two systems.

Actual Results:
tracepath shows the link with a PMTU of 1492

Expected Results:
tracepath should show the link with a PMTU of 9000

Additional info:
I tested this on a large number of kernels and only the RHEL3 kernels showed this issue. Backing out the ipsec patches in the RHEL3 kernel seems to eliminate it. The issue occurs whenever the ipsec patches are applied, even if IPsec is not compiled in. I'm pretty confident this is incorrect behaviour, but someone might prove me wrong.
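For anyone repeating the test across several kernels, comparing the two tracepath runs by eye gets tedious. The following is a minimal Python sketch that pulls the PMTU values out of tracepath output. The function names and the two sample constants are my own, hypothetical choices; the sample text is copied from the report above, and the parsing assumes the output format shown there.

```python
import re
from typing import List, Optional

# Output copied from the report: the run where the PMTU collapses to 1492.
COLLAPSED_RUN = """\
 1:  10.10.91.61 (10.10.91.61)  asymm 65  0.283ms pmtu 9000
 1:  10.10.91.61 (10.10.91.61)  asymm 65  0.088ms pmtu 8166
 1:  10.10.91.61 (10.10.91.61)  asymm 65  0.064ms pmtu 4352
 1:  10.10.91.61 (10.10.91.61)  asymm 65  0.054ms pmtu 2002
 1:  10.10.91.61 (10.10.91.61)  asymm 65  0.039ms pmtu 1492
 1:  10.10.91.62 (10.10.91.62)  0.263ms reached
     Resume: pmtu 1492 hops 1 back 1
"""

# Output copied from the report: the run the reporter considers correct.
EXPECTED_RUN = """\
 1?: [LOCALHOST]     pmtu 9000
 1:  10.10.91.62 (10.10.91.62)  0.431ms reached
     Resume: pmtu 9000 hops 1 back 1
"""


def final_pmtu(output: str) -> Optional[int]:
    """Return the PMTU from the 'Resume:' summary line, or None if absent."""
    match = re.search(r"Resume:\s*pmtu\s+(\d+)", output)
    return int(match.group(1)) if match else None


def pmtu_steps(output: str) -> List[int]:
    """Return every per-hop 'pmtu N' value in order, skipping the summary."""
    steps = []
    for line in output.splitlines():
        if "Resume:" in line:
            continue
        match = re.search(r"\bpmtu\s+(\d+)", line)
        if match:
            steps.append(int(match.group(1)))
    return steps


if __name__ == "__main__":
    for name, run in (("collapsed", COLLAPSED_RUN), ("expected", EXPECTED_RUN)):
        print(name, pmtu_steps(run), "final:", final_pmtu(run))
```

A run whose final PMTU is below the configured interface MTU (9000 here) would indicate the behaviour described in this report.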
I wrote a test program which basically just dumps large UDP frames on the wire. No problems whatsoever; I will attach the little program if needed. I do, however, see exactly the same thing as shown above when tracepath is run. None of the tracepath packets larger than 1500 bytes show up in a tcpdump of the sending interface. It looks like the bug is that the actual PMTU discovery frames never even make it to the interface. The issue is not actually with UDP frames.
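The test program itself was not attached, so the following is only a rough Python sketch of the same idea, not the program used above. It sends one UDP datagram sized to fill a single jumbo frame: with a 9000-byte MTU, a 20-byte IPv4 header (no options) and an 8-byte UDP header, that leaves 8972 bytes of payload. The function names, the port number and the target address in the usage section are placeholders of mine.

```python
import socket

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def max_udp_payload(mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def send_datagram(host: str, port: int, size: int) -> int:
    """Send one UDP datagram of `size` zero bytes; return bytes handed to the kernel."""
    payload = b"\x00" * size
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Placeholder target: the peer address from the report and an arbitrary port.
    size = max_udp_payload(9000)
    sent = send_datagram("10.10.91.62", 9999, size)
    print(f"sent {sent} bytes of UDP payload")
```

Running tcpdump on the sending interface while this runs shows whether the datagram leaves as one large frame or not at all, which is the same check described in the comment above.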
OK, I'll accept that, although it still seems a little different from what I experienced way back when I filed the bug. However, I have just tested and found that disabling PMTU discovery does indeed return NFS/UDP behaviour to normal. If I'm understanding what you are saying, this basically means that jumbo frames don't really work at all on RHEL3 (TCP/UDP/whatever), since the system always does PMTU discovery unless it is explicitly turned off system-wide. Is that right? Thanks, Tom
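The comment above does not say how PMTU discovery was disabled. As a sketch only, and assuming the system-wide switch meant here is the Linux `ip_no_pmtu_disc` sysctl, the Python below reads that setting and also shows the per-socket alternative, the `IP_MTU_DISCOVER` socket option set to `IP_PMTUDISC_DONT`. The numeric fallbacks (10 and 0) are the Linux values and are used only if Python's `socket` module does not export the names; the function names are hypothetical.

```python
import socket

# Linux values from <linux/in.h>; used only if the socket module lacks the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)

# Assumed location of the system-wide switch on Linux.
SYSCTL_PATH = "/proc/sys/net/ipv4/ip_no_pmtu_disc"


def system_pmtu_discovery_disabled(path: str = SYSCTL_PATH) -> bool:
    """True if the sysctl file holds a non-zero value (discovery turned off)."""
    with open(path) as handle:
        return handle.read().strip() != "0"


def udp_socket_without_pmtu_discovery() -> socket.socket:
    """Create a UDP socket that opts out of PMTU discovery for itself only."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
    return sock


if __name__ == "__main__":
    print("system-wide PMTU discovery disabled:", system_pmtu_discovery_disabled())
    with udp_socket_without_pmtu_discovery() as sock:
        print("per-socket setting:", sock.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER))
```

The per-socket option only helps programs you control; kernel-side users such as NFS over UDP would presumably still need the system-wide setting.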
I enabled jumbo frames (9000-byte MTU) on two Dell PowerEdge 1850 servers running Red Hat Enterprise Linux 3 ES Update 5 x86_64 using the following commands:

    ifconfig eth1 mtu 9000
    service network restart

After the MTU change, I was able to transfer (scp) very small files between the servers, and ping also worked fine, but when I tried to transfer bigger files the copy process got stuck. The NFS mounts also stopped working. I reverted the MTU change and everything worked as before. The switch is a Dell PowerConnect 5324 and it should support jumbo frames just fine. Is this a RHEL3 issue? Thanks.
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission-critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.