Bug 809908 - pch_gbe transfer length error with VLAN tagging and default MTU
Summary: pch_gbe transfer length error with VLAN tagging and default MTU
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: i686
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Veaceslav Falico
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-04-04 16:14 UTC by Carsten Clasohm
Modified: 2014-09-30 23:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 851682 (view as bug list)
Environment:
Last Closed: 2012-09-04 19:00:18 UTC
Type: Bug
Embargoed:


Attachments
sosreport from embedded device (334.86 KB, application/x-xz)
2012-04-04 16:14 UTC, Carsten Clasohm
lines from /var/log/messages (535 bytes, application/octet-stream)
2012-04-04 16:15 UTC, Carsten Clasohm
count vlan header on length verification (999 bytes, patch)
2012-04-05 05:19 UTC, Veaceslav Falico
upstream/fedora patch (1.03 KB, patch)
2012-04-10 07:22 UTC, Veaceslav Falico

Description Carsten Clasohm 2012-04-04 16:14:17 UTC
Created attachment 575180
sosreport from embedded device

Description of problem:

While testing an embedded device with an Intel EG20T Gigabit Ethernet Controller, we encountered a problem with the pch_gbe kernel module and VLAN tagging.


Version-Release number of selected component (if applicable):

kernel-3.1.0-7.fc16.i686


How reproducible:

always


Steps to Reproduce:
1. configure tagged VLANs
2. ping -s 3000 IP_ADDR
  
Actual results:

100% packet loss, and the following message is displayed on the console each time a ping packet is sent:

[  422.262549] pch_gbe: Transfer length Error: skb len: 1518 > max: 1518


Expected results:

ping should be able to send large packets.


Additional info:

Adding "MTU=1518" to /etc/sysconfig/network-scripts/ifcfg-p4p1 and restarting the network works around the problem. (The MTU of the VLAN interface stays at 1500.)
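
The numbers in the error message and the effect of the workaround can be checked with a little arithmetic. The following C sketch is only an illustration, not driver code. It assumes that the driver computes its maximum frame size as the parent interface MTU plus the Ethernet header and FCS, and that it rejects any skb longer than that maximum minus the 4-byte FCS, which would be consistent with the logged "skb len: 1518 > max: 1518". The constant and function names are made up for this example; the values correspond to the kernel's ETH_HLEN, ETH_FCS_LEN and VLAN_HLEN.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte counts matching the kernel's ETH_HLEN, ETH_FCS_LEN and VLAN_HLEN.
 * Redefined under illustrative names so the sketch builds in user space. */
enum {
    ETH_HEADER_LEN = 14, /* destination MAC + source MAC + EtherType */
    ETH_FCS_BYTES  = 4,  /* frame check sequence, added after the skb */
    VLAN_TAG_LEN   = 4   /* 802.1Q tag inserted after the source MAC */
};

/* ASSUMPTION: the frame limit is derived from the parent interface MTU. */
static unsigned int max_frame_size(unsigned int parent_mtu)
{
    return parent_mtu + ETH_HEADER_LEN + ETH_FCS_BYTES;
}

/* Length of a full-size tagged frame as queued for transmit (no FCS yet). */
static unsigned int tagged_skb_len(unsigned int vlan_mtu)
{
    return vlan_mtu + ETH_HEADER_LEN + VLAN_TAG_LEN;
}

/* ASSUMPTION: transmit rejects skb_len > max_frame_size - FCS. */
static bool tx_length_ok(unsigned int skb_len, unsigned int parent_mtu)
{
    unsigned int max = max_frame_size(parent_mtu);

    assert(max >= ETH_FCS_BYTES); /* guard against unsigned underflow */
    return skb_len <= max - ETH_FCS_BYTES;
}
```

Under these assumptions, tx_length_ok(tagged_skb_len(1500), 1500) is false, because the 1518-byte tagged frame exceeds the 1514-byte limit. Raising the parent MTU to 1518 lifts the limit to 1532, so the same frame passes while the VLAN interface keeps its MTU of 1500.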

Comment 1 Carsten Clasohm 2012-04-04 16:15:13 UTC
Created attachment 575181
lines from /var/log/messages

Comment 2 Josh Boyer 2012-04-04 16:28:05 UTC
(In reply to comment #0)
> Created attachment 575180
> sosreport from embedded device
> 
> Description of problem:
> 
> While testing an embedded device with an Intel EG20T Gigabit Ethernet
> Controller, we encountered a problem with the pch_gbe kernel module and VLAN
> tagging.
> 
> 
> Version-Release number of selected component (if applicable):
> 
> kernel-3.1.0-7.fc16.i686

That's relatively old now.  Can you try with the 3.3.1 kernel update?  If it still has issues with that, you might want to email the netdev list.

Comment 3 Carsten Clasohm 2012-04-04 21:01:34 UTC
We tried to install Fedora 16 on the device and then update the kernel, but were unable to boot afterwards. For some reason, anaconda created a bios_grub partition, and the system said "no operating system found" after the reboot.

Because our priority is on getting pch_gbe to work with RHEL, we have decided to not spend more time on this.

Comment 4 Veaceslav Falico 2012-04-05 05:19:04 UTC
Created attachment 575272
count vlan header on length verification

Here's the proposed patch (against rhel6, but should apply cleanly to fedora kernel also).
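
The attachment itself is not reproduced on this page. As a rough illustration of what "counting the VLAN header" in the length verification could mean, the C sketch below widens a transmit length limit by one 802.1Q tag. It is a hypothetical reconstruction based only on the attachment title, not the attached patch or any upstream code. The function name, the constants, and the assumed original limit (maximum frame size minus the 4-byte FCS) are all assumptions.

```c
#include <stdbool.h>

enum {
    FCS_BYTES      = 4, /* frame check sequence, not part of the skb */
    VLAN_TAG_BYTES = 4  /* one 802.1Q tag */
};

/* Hypothetical VLAN-aware check: accept frames up to the assumed untagged
 * limit (max_frame_size - FCS) plus room for a single VLAN tag.
 * Expects max_frame_size >= FCS_BYTES. */
static bool tx_length_ok_vlan_aware(unsigned int skb_len,
                                    unsigned int max_frame_size)
{
    return skb_len <= max_frame_size - FCS_BYTES + VLAN_TAG_BYTES;
}
```

With a maximum frame size of 1518, this check accepts the 1518-byte tagged frame from the report and still rejects anything longer.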

Comment 6 Josh Boyer 2012-04-05 12:18:08 UTC
(In reply to comment #4)
> Created attachment 575272
> count vlan header on length verification
> 
> Here's the proposed patch (against rhel6, but should apply cleanly to fedora
> kernel also).

Except it doesn't apply at all.  The Fedora kernels are all on 3.3 or newer and this patch is against either RHEL or an older Fedora kernel.

Could you send this fix, with a proper changelog and signed-off-by, to netdev?

Comment 7 Veaceslav Falico 2012-04-10 07:21:42 UTC
(In reply to comment #6)
> Except it doesn't apply at all.  The Fedora kernels are all on 3.3 or newer and
> this patch is against either RHEL or an older Fedora kernel.

Yep, sorry, my bad. I'll attach the corrected patch for upstream/Fedora now; please test it if you have the hardware.

> 
> Could you send this fix, with a proper changelog and signed-off-by, to netdev?

Will do once it has been tested (I don't have the needed hardware at the moment).

Comment 8 Veaceslav Falico 2012-04-10 07:22:43 UTC
Created attachment 576409
upstream/fedora patch

Comment 9 Josh Boyer 2012-04-10 18:21:47 UTC
(In reply to comment #8)
> Created attachment 576409
> upstream/fedora patch

I'm guessing you can do a scratch build in koji with this for testing purposes?  If not, let me know and I can get one built.  I don't have the hardware either, so best to wait until you have access.

Comment 10 Andy Cress 2012-06-11 18:45:30 UTC
Veaceslav,

It appears from the 2012-04-10 comment that this patch tested ok on F16 with an EG20T PCH, is that right?

We have had a similar problem trying to get the pch_gbe driver working on RHEL6 (for the same customer that Carsten was working with).  
Backporting the 1.00 driver from F15/F16 has issues starting tx/rx after the link up completes.  
We got pretty close with the pch_gbe 0.91-NAPI driver base and adding some patches, but it keeps getting transmit timeouts.
Can you attach the driver source/tar that you mentioned above for el6?   

Andy

Comment 11 Josh Boyer 2012-09-04 17:46:36 UTC
(In reply to comment #8)
> Created attachment 576409
> upstream/fedora patch

Did that ever get sent upstream?  It doesn't seem to be in the latest kernel tree.

Comment 12 Andy Cress 2012-09-04 18:37:02 UTC
Here's the commit for it:
commit 4487e64de63b8e42efe5a5543871c42c5a5859d9
Author: Andy Cress <andycress>
Date:   Thu Jul 26 06:01:17 2012 +0000
    pch_gbe: vlan skb len fix

Comment 13 Josh Boyer 2012-09-04 19:00:18 UTC
OK.  That's in 3.6-rc1, so it's fixed in F18.

