Bug 484796 - tulip driver MTU problems when using dot1q vlans
Summary: tulip driver MTU problems when using dot1q vlans
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ivan Vecera
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-02-09 23:02 UTC by Andreas Thienemann
Modified: 2009-09-02 08:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 08:58:11 UTC
Target Upstream Version:
Embargoed:


Attachments
Backport of Tulip patch to 2.6.18-92.1.22.el5 (7.41 KB, patch)
2009-02-11 00:53 UTC, Andreas Thienemann


Links
System: Red Hat Product Errata
ID: RHSA-2009:1243
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.4 kernel security and bug fix update
Last Updated: 2009-09-01 08:53:34 UTC

Description Andreas Thienemann 2009-02-09 23:02:34 UTC
Description of problem:
The tulip driver in kernel 2.6.18-92.1.13.el5, as shipped with RHEL 5.2, has MTU problems when 802.1q VLAN tagging is used: large packets are dropped even though they fit within the interface MTU.

Version-Release number of selected component (if applicable):
module version 1.1.13
kernel version 2.6.18-92.1.13.el5

How reproducible:
always

Steps to Reproduce:
1. Configure tulip device to use vlan tagging. e.g. vconfig add eth7 200
2. Configure vlan device to have a testing IP Address. e.g. ifconfig eth7.200 192.168.23.2
3. Configure switch to enable tagged vlans on the connected port.
4. Ping another device through the interface: ping -n -M do -s 1469 192.168.23.1
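The size at which pings stop getting through can be located automatically rather than by trial and error. The following is a minimal Python sketch, not part of the original report: it bisects over the payload size using the same ping flags as step 4. The helper names ping_ok and largest_payload are hypothetical, the 1472-byte upper bound assumes a 1500-byte MTU, and the "-W 1" reply timeout is an addition to the flags used in this report.

```python
import subprocess


def ping_ok(host, payload):
    """Send one echo request with the DF bit set; True if a reply arrived.

    Uses the flags from the report (-n -M do -c 1 -s SIZE) plus -W 1,
    a one-second reply timeout that is an assumption of this sketch.
    """
    cmd = ["ping", "-n", "-M", "do", "-c", "1", "-W", "1",
           "-s", str(payload), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def largest_payload(probe, low=0, high=1472):
    """Binary search for the largest payload size that probe() accepts.

    Assumes probe is monotonic: every size up to some limit works and
    every larger size fails. Returns None if even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example call (needs the test network from the steps above):
#   largest_payload(lambda size: ping_ok("192.168.23.1", size))
```

The probe is passed in as a function so that the search logic can be checked without network access, for example with a stand-in such as lambda size: size <= 1468.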

Actual results:

# ping -n -M do -s 1469 192.168.23.1
PING 192.168.23.1 (192.168.23.1) 1469(1497) bytes of data.

--- 192.168.23.1 ping statistics ---
11 packets transmitted, 0 received, 100% packet loss, time 10013ms


Expected results:

# ping -n -M do -s 1468 192.168.23.1
PING 192.168.23.1 (192.168.23.1) 1468(1496) bytes of data.
1476 bytes from 192.168.23.1: icmp_seq=0 ttl=64 time=1.08 ms
1476 bytes from 192.168.23.1: icmp_seq=1 ttl=64 time=1.16 ms

--- 192.168.23.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 98104ms
rtt min/avg/max/mdev = 1.037/5.657/15.947/5.665 ms, pipe 2


Additional info:

Running the same command with a full-size payload on an interface driven by 8139too, also with dot1q tagging, works as expected:
# ping -n -M do -s 1472 172.16.5.253
PING 172.16.5.253 (172.16.5.253) 1472(1500) bytes of data.
1480 bytes from 172.16.5.253: icmp_seq=1 ttl=64 time=0.781 ms
1480 bytes from 172.16.5.253: icmp_seq=2 ttl=64 time=0.777 ms
1480 bytes from 172.16.5.253: icmp_seq=3 ttl=64 time=0.767 ms

--- 172.16.5.253 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.767/0.775/0.781/0.005 ms

Capturing the traffic generated by "ping -n -M do -c 1 -s 1468 192.168.23.2; ping -n -M do -c 1 -s 1469 192.168.23.2" shows that the magic limit for Ethernet layer 2 frames is 1514 bytes: the 1514-byte request is answered, while the 1515-byte request gets no reply.

[root@bb1 ~]# tcpdump -i eth7 -e -n vlan
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth7, link-type EN10MB (Ethernet), capture size 96 bytes
23:38:01.315924 00:80:c8:b9:39:2c > 00:e0:81:60:b5:a0, ethertype 802.1Q (0x8100), length 1514: vlan 200, p 0, ethertype IPv4, 192.168.23.1 > 192.168.23.2: ICMP echo request, id 7689, seq 1, length 1476
23:38:01.316945 00:e0:81:60:b5:a0 > 00:80:c8:b9:39:2c, ethertype 802.1Q (0x8100), length 1514: vlan 200, p 0, ethertype IPv4, 192.168.23.2 > 192.168.23.1: ICMP echo reply, id 7689, seq 1, length 1476
23:38:01.322251 00:80:c8:b9:39:2c > 00:e0:81:60:b5:a0, ethertype 802.1Q (0x8100), length 1515: vlan 200, p 0, ethertype IPv4, 192.168.23.1 > 192.168.23.2: ICMP echo request, id 7945, seq 1, length 1477
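The frame lengths in the capture follow directly from the ping payload sizes. As a rough Python sketch (not part of the original report; the constant and function names are chosen for illustration), assuming an IPv4 header without options and frame lengths counted without the FCS, as tcpdump reports them:

```python
ICMP_HEADER = 8   # ICMP echo header
IPV4_HEADER = 20  # IPv4 header without options (assumption)
ETH_HEADER = 14   # destination MAC + source MAC + EtherType, FCS excluded
VLAN_TAG = 4      # 802.1Q tag inserted after the source MAC


def ip_length(payload):
    """IP packet length for a ping with the given -s payload size."""
    return payload + ICMP_HEADER + IPV4_HEADER


def tagged_frame_length(payload):
    """Length of the 802.1Q-tagged Ethernet frame as tcpdump reports it."""
    return ip_length(payload) + ETH_HEADER + VLAN_TAG


for payload in (1468, 1469, 1472):
    print(payload, ip_length(payload), tagged_frame_length(payload))
# prints:
# 1468 1496 1514
# 1469 1497 1515
# 1472 1500 1518
```

A payload of 1468 bytes yields the 1514-byte frame that still passes, and 1469 yields the 1515-byte frame that is lost. A full 1500-byte IP packet needs a 1518-byte tagged frame. Since 1514 bytes is also the size of a maximum untagged frame (1500 + 14), the observed limit is consistent with the driver not allowing for the 4-byte VLAN tag.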

There has been a patch for this issue posted to linux-netdev: http://kerneltrap.org/mailarchive/linux-netdev/2008/4/15/1442034

Please see to backporting this fix into future kernel updates. kthxbye. :)

Setting severity to high, as the current behaviour of silently dropping packets without sending ICMP "Fragmentation needed" messages breaks networking in many cases.

Comment 1 Andreas Thienemann 2009-02-11 00:53:50 UTC
Created attachment 331505
Backport of Tulip patch to 2.6.18-92.1.22.el5

This is a backport to the current 2.6.18 release from Red Hat.

[root@bb1 ~]# ping -n -M do -s 1472 192.168.23.2
PING 192.168.23.2 (192.168.23.2) 1472(1500) bytes of data.
1480 bytes from 192.168.23.2: icmp_seq=1 ttl=64 time=1.07 ms
1480 bytes from 192.168.23.2: icmp_seq=2 ttl=64 time=1.09 ms
1480 bytes from 192.168.23.2: icmp_seq=3 ttl=64 time=1.07 ms

--- 192.168.23.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 1.075/1.082/1.094/0.028 ms
[root@bb1 ~]#

Tests show that this patch corrects the MTU issue with tulip cards.

Comment 3 RHEL Program Management 2009-03-04 10:24:20 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 4 Ivan Vecera 2009-03-04 14:29:01 UTC
The proposed patch was posted to netdev@... but it wasn't pulled. I'm going to ask the maintainers for the reason.

Comment 5 Ivan Vecera 2009-03-05 10:27:44 UTC
The maintainers have no problem with including the patch. Note that the original patch contained several style errors, so I have submitted a corrected version.
See: http://marc.info/?l=linux-netdev&m=123624856505509&w=2

Comment 6 Ivan Vecera 2009-03-09 09:00:35 UTC
Fortunately, the proposed patch was accepted upstream. I'm going to send it for review.

Comment 8 Don Zickus 2009-03-23 15:53:11 UTC
The fix is included in kernel-2.6.18-136.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 11 errata-xmlrpc 2009-09-02 08:58:11 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

