Bug 186177 - tcp checksum errors
Summary: tcp checksum errors
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: athlon
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard: MassClosed
Depends On:
Blocks:
Reported: 2006-03-22 00:17 UTC by Ted Kaczmarek
Modified: 2008-08-02 23:40 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-01-20 04:39:01 UTC
Type: ---
Embargoed:


Attachments
tcpdump from Centos 4.2 box (5.69 KB, application/octet-stream)
2006-03-22 00:17 UTC, Ted Kaczmarek

Description Ted Kaczmarek 2006-03-22 00:17:37 UTC
Description of problem: I am seeing TCP checksum errors.


Version-Release number of selected component (if applicable):
kernel-2.6.15-1.1833_FC4


How reproducible: I was debugging a problem with jconsole remote connections
to or from an FC4 dev box I have. jconsole TCP packets appear to be the only
traffic showing the issue.


Steps to Reproduce:
1. Start jconsole.
2. Try to connect to a remote server.

Actual results: excessive TCP checksum failures


Expected results: minimal TCP checksum failures


Additional info: Using Java 5 packages built from jpackage-supplied spec files
as well as tomcat5 RPMs from jpackage. This is a dual Athlon box, and I see the
same issue with all of these kernels:
kernel-smp-2.6.15-1.1831_FC4
kernel-smp-2.6.14-1.1656_FC4
kernel-smp-2.6.15-1.1833_FC4

I captured packets locally and on two different remote machines (another FC4 box
and one CentOS 4.2 box); the bad checksums were always coming from the Athlon
SMP FC4 box. I saw the same issue whether using e1000 or 3c59x. To be honest, I
am not sure whether this is really a kernel issue, but I could not find anything
on Google to tell me otherwise.
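
For reference, a capture like the ones described above can be cross-checked independently of tcpdump by recomputing the TCP checksum by hand: the one's-complement sum from RFC 1071, taken over the IPv4 pseudo-header plus the TCP segment. The following is a minimal sketch and not part of the original report. The function names and the sample addresses are hypothetical, and it assumes IPv4 and a raw TCP segment (header plus payload) already extracted from the capture.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header followed by the TCP segment."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo_header + segment)


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """With the checksum field left in place, a valid segment yields 0."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

Note that a capture taken on the sending host can show bad checksums simply because the NIC fills them in after the capture point; the captures described here were also taken on remote machines, where that explanation would not apply.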

Tyan 2462, dual Athlon MP 2400+, 2 GB of RAM.

00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ISA (rev 02)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] IDE (rev 01)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ACPI (rev 01)
00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] USB (rev 07)
00:08.0 VGA compatible controller: ATI Technologies Inc Radeon R250 Lf [FireGL 9000] (rev 02)
00:08.1 Display controller: ATI Technologies Inc Radeon R250 Ln [Radeon Mobility 9000 M9] [Secondary] (rev 02)
00:0a.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:0a.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:0b.0 FireWire (IEEE 1394): Texas Instruments TSB12LV23 IEEE-1394 Controller
00:0c.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller
00:0d.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0d.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)
00:10.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)

Comment 1 Ted Kaczmarek 2006-03-22 00:17:37 UTC
Created attachment 126446
tcpdump from Centos 4.2 box

Comment 2 Anssi Johansson 2006-07-20 01:13:17 UTC
I wonder:
a) Is the problem still present in the current kernels?
b) Might this be related to http://bugzilla.kernel.org/show_bug.cgi?id=4922 ?

For what it's worth, I ran into a suspiciously similar problem back in mid-April
when I was setting up a development server for our web site. I was using Fedora
Core 5 (regular x86 version with SMP) with whatever kernel was the newest at
that time. People were able to browse the web site running on that server, but
eventually the TCP connection stalled, usually when they tried to POST a form to
the web server. Transferring data with 'scp' was equally painful:
transferring a tiny 50MB file to another computer was impossible, as the
connection kept stalling after a few megabytes. I also did some dumps with
tcpdump, and they exhibited checksum problems similar to those of the original
reporter. Replacing the cable with a brand new cat6 cable didn't have any effect
on this. I also tried a different switch, and nothing changed.

The motherboard is an Asus A8N32-SLI, equipped with an AMD Athlon 64 X2 4800+
processor, 4GB of RAM and a SATA hard disk.

lspci:
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)
00:00.4 RAM memory: nVidia Corporation C51 Memory Controller 4 (rev a2)
00:00.5 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.6 RAM memory: nVidia Corporation C51 Memory Controller 3 (rev a2)
00:00.7 RAM memory: nVidia Corporation C51 Memory Controller 2 (rev a2)
00:02.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:03.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:04.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:09.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a4)
00:0a.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a4)
00:0a.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:0b.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:0b.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a4)
00:0d.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2)
00:0f.0 IDE interface: nVidia Corporation CK804 IDE (rev f3)
00:10.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:11.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:12.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:13.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:16.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:17.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 15)
04:06.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] (rev 01)
04:06.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] (Secondary) (rev 01)
04:0b.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)

The motherboard has two ethernet connectors: the first uses the 'sky2' driver
and the second the 'forcedeth' driver. The most interesting thing is that I had
these problems ONLY when using the ethernet connector with the sky2 driver;
swapping the network cable to the forcedeth-controlled ethernet connector made
my problems disappear.

On the other hand, the sky2 driver appears to have quite a lot of problems of
its own (see
https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.15/+bug/38865),
and it appears that there are a few open bug reports about sky2 in Red Hat's and
kernel.org's bugzilla as well. However, the other bug reports don't really
mention the checksum errors at all, so I'll write my message here in case my
problem isn't only sky2-related.

At that time I didn't need a second network connection, so I just used the
forcedeth-operated ethernet controller and forgot about the sky2 controller.
However, I'm going to need the second controller in a few weeks, so I'll
probably start testing the issue more thoroughly next week to see if I can
figure something out.

Comment 3 Dave Jones 2006-09-17 02:05:27 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.

Comment 4 Dave Jones 2006-10-16 21:04:40 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 5 Jon Stanley 2008-01-20 04:39:01 UTC
(this is a mass-close to kernel bugs in NEEDINFO state)

As indicated previously, there has been no update on the progress of this bug;
therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.

If you believe that this bug was closed in error, please feel free to reopen
this bug.

