Bug 157647 - tg3 network broken on some chipsets
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: John W. Linville
QA Contact: Brian Brock
Reported: 2005-05-13 09:30 EDT by Doug Ledford
Modified: 2007-11-30 17:11 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-07-06 15:41:28 EDT

Description Doug Ledford 2005-05-13 09:30:13 EDT
Description of problem:

Certain versions of the tg3 chipset have networking problems.  The problems exist
in both UP and SMP kernels.  One affected example is the built-in Broadcom
chipset on Dell PE2650 machines.  Other tg3 chipsets, such as the ones in the
Netgear gigabit LAN cards I have, don't exhibit the problem.

Version-Release number of selected component (if applicable):

How reproducible:

Every time.

Steps to Reproduce:
1. Install on a Dell PE2650
2. Attempt any meaningful network transfer
3. Watch with tcpdump on the other host: packets will be sent, but they won't be
properly received on the affected machines.
Actual results:
Packet loss, ICMP reassembly timeout messages, piss poor network performance,
generally really sucky networking.

Expected results:
Good gigabit network throughput

Additional info:
This is going through a Netgear gigabit Ethernet switch.  Speed may be a
factor.  Tried both NFS via UDP and HTTP via TCP.  Both sucked rocks.  Kernels
tested were 2.6.11-1_1284FC4 and 2.6.11-1_1284FC4smp.  Installed tree was the
re0503.0 tree.  I tried to copy an 80MB file via NFS v3 TCP mount from a RHEL3
server to the FC4 machine and these are the results:

[dledford@pe-fc4 ~]$ time cp /dist/FC4/i386/Fedora/base/stage2.img /tmp

real    7m45.691s
user    0m0.008s
sys     0m0.301s

(This copy was interrupted; it didn't finish.)
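
For scale, the timing above puts a ceiling on the throughput actually achieved. The sketch below is only arithmetic on the numbers already reported; it assumes "80MB" means 80 * 1024 * 1024 bytes (the report does not say) and that the whole file had been transferred, which it had not, so the real rate was lower still.

```python
# Upper bound on throughput for the interrupted copy.
# Assumption: "80MB" means 80 * 1024 * 1024 bytes; the report does not specify.
FILE_SIZE_BYTES = 80 * 1024 * 1024
ELAPSED_SECONDS = 7 * 60 + 45.691  # "real 7m45.691s" from the time output


def throughput_upper_bound(size_bytes, seconds):
    """Return (MiB/s, Mbit/s), assuming the whole file had been copied."""
    bytes_per_second = size_bytes / seconds
    mib_per_second = bytes_per_second / (1024 * 1024)
    mbit_per_second = bytes_per_second * 8 / 1_000_000
    return mib_per_second, mbit_per_second


mib_s, mbit_s = throughput_upper_bound(FILE_SIZE_BYTES, ELAPSED_SECONDS)
print(f"at most {mib_s:.2f} MiB/s ({mbit_s:.2f} Mbit/s)")
```

Even as an upper bound this is well under 2 Mbit/s on a gigabit link.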

The tcpdump log on the two machines showed lots of these entries:

(From RHEL3 server)
09:12:37.934096 dledford.xsintricity.com.nfs > pe.xsintricity.com.1390227174:
reply ERR 1448 (DF)
09:12:37.934225 pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
37649 win 32447 <nop,nop,timestamp 696056 134637964,nop,nop,sack sack 1
{39097:40545} > (DF)
09:12:38.142921 dledford.xsintricity.com.nfs > pe.xsintricity.com.2638413729:
reply ERR 1448 (DF)
09:12:38.143063 pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
40545 win 32580 <nop,nop,timestamp 696265 134637985> (DF)
09:12:38.143105 dledford.xsintricity.com.nfs > pe.xsintricity.com.581951787:
reply ERR 1448 (DF)
09:12:38.143117 dledford.xsintricity.com.nfs > pe.xsintricity.com.3204149527:
reply ERR 1448 (DF)
09:12:38.143146 pe.xsintricity.com.1811703941 > dledford.xsintricity.com.nfs:
144 read [|nfs] (DF)
09:12:38.182803 pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
41993 win 32580 <nop,nop,timestamp 696305 134637985> (DF)
09:12:38.182853 dledford.xsintricity.com.nfs > pe.xsintricity.com.3725932352:
reply ERR 1448 (DF)
09:12:38.182866 dledford.xsintricity.com.nfs > pe.xsintricity.com.2398206208:
reply ERR 1448 (DF)

(From FC4 client)
09:12:38.314218 IP dledford.xsintricity.com.nfs > pe.xsintricity.com.2829902664:
reply ERR 1448
09:12:38.314232 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,sack sack 1
{1252521:1255417} >
09:12:38.314258 IP dledford.xsintricity.com.nfs > pe.xsintricity.com.3935615357:
reply ERR 1448
09:12:38.314270 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,sack sack 1
{1252521:1256865} >
09:12:38.314357 IP dledford.xsintricity.com.nfs > pe.xsintricity.com.477524051:
reply ERR 1448
09:12:38.314371 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,sack sack 1
{1252521:1258313} >
09:12:38.314403 IP dledford.xsintricity.com.nfs > pe.xsintricity.com.2152342361:
reply ERR 1448
09:12:38.314442 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: . ack
1258313 win 32580 <nop,nop,timestamp 705569 134638915>
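
One way to read the client-side ACKs above is to compare each cumulative ack number with the left edge of its SACK block. The sketch below does that for lines copied from the FC4 capture (with the line wraps joined); the regular expressions and the helper name `sack_hole` are illustrative assumptions, not part of tcpdump. In these lines the gap is 1448 bytes, the same number printed on the `reply ERR 1448` lines, while the SACKed range grows by 1448 bytes per ACK. That pattern would be consistent with a single missing segment holding up the stream, though the capture alone does not prove it.

```python
import re

# ACK lines copied from the FC4 client capture, with wrapped lines joined.
ACK_LINES = [
    "09:12:38.314232 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: "
    ". ack 1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,"
    "sack sack 1 {1252521:1255417} >",
    "09:12:38.314270 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: "
    ". ack 1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,"
    "sack sack 1 {1252521:1256865} >",
    "09:12:38.314371 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: "
    ". ack 1251073 win 32447 <nop,nop,timestamp 705569 134638915,nop,nop,"
    "sack sack 1 {1252521:1258313} >",
    "09:12:38.314442 IP pe.xsintricity.com.796 > dledford.xsintricity.com.nfs: "
    ". ack 1258313 win 32580 <nop,nop,timestamp 705569 134638915>",
]

ACK_RE = re.compile(r"\back (\d+) win")    # cumulative ack number
SACK_RE = re.compile(r"\{(\d+):(\d+)\}")   # first SACK block {left:right}


def sack_hole(line):
    """Return (hole_bytes, sacked_bytes), or None if the line has no SACK block."""
    ack = ACK_RE.search(line)
    sack = SACK_RE.search(line)
    if not ack or not sack:
        return None
    ack_no = int(ack.group(1))
    left, right = int(sack.group(1)), int(sack.group(2))
    # hole: bytes between the cumulative ack and the first SACKed byte
    return left - ack_no, right - left


for line in ACK_LINES:
    print(line.split()[0], sack_hole(line))
```

The last line carries no SACK block: the cumulative ack has jumped to 1258313, the right edge of the previous SACK block.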

This looks like it might be a hardware checksumming problem.  These are the
relevant dmesg lines about the tg3 adapter in use on the client:

tg3.c:v3.25 (March 24, 2005)
ACPI: PCI Interrupt 0000:04:06.0[A] -> GSI 28 (level, low) -> IRQ 217
eth0: Tigon3 [partno(BCM95701A10) rev 0105 PHY(5701)] (PCIX:133MHz:64-bit)
10/100/1000BaseT Ethernet 00:06:5b:3f:c0:8c
eth0: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[1] TSOcap[0]

These are the tg3 detection messages from the RHEL3 server:

tg3.c:v3.22RH (February 11, 2005)
PCI: Assigned IRQ 5 for device 00:0a.0
divert: allocating divert_blk for eth1
eth1: Tigon3 [partno(AC91002A1) rev 0105 PHY(5701)] (PCI:33MHz:32-bit)
10/100/1000BaseT Ethernet 00:09:5b:8c:86:da
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
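
To make the two capability lines above easier to compare, here is a small sketch that parses the `Name[value]` pairs and lists the ones that differ. The helper names (`parse_flags`, `differing_flags`) are hypothetical and the regular expression is an assumption based only on the two lines shown. It reports what differs, not what causes the packet loss.

```python
import re

# Capability lines copied from the two dmesg excerpts.
CLIENT_FLAGS = ("eth0: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] "
                "Split[0] WireSpeed[1] TSOcap[0]")   # FC4 client, tg3 v3.25
SERVER_FLAGS = ("eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] "
                "Split[0] WireSpeed[1] TSOcap[0]")   # RHEL3 server, tg3 v3.22RH

FLAG_RE = re.compile(r"(\w+)\[(\d)\]")


def parse_flags(line):
    """Map each flag name on a tg3 capability line to its integer value."""
    return {name: int(value) for name, value in FLAG_RE.findall(line)}


def differing_flags(a, b):
    """Return {name: (a_value, b_value)} for flags present in both but unequal."""
    return {name: (a[name], b[name])
            for name in a if name in b and a[name] != b[name]}


client = parse_flags(CLIENT_FLAGS)
server = parse_flags(SERVER_FLAGS)
print(differing_flags(client, server))
```

Both adapters report RXcsums[1]; the flags that differ are LinkChgREG and MIirq. The dmesg lines also show a different bus (PCIX:133MHz:64-bit versus PCI:33MHz:32-bit), part number, and driver version.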
Comment 1 John W. Linville 2005-05-18 09:57:45 EDT
Doug, the latest rawhide has tg3 v3.27. Could you give that a try as well?
Comment 3 Dave Jones 2005-06-27 19:14:42 EDT
Mass update for bugs reported against -test:
Updating version field to FC4 final. Please retest with final FC4 release if you
have not already done so. Thanks.
Comment 4 John W. Linville 2005-07-06 15:41:28 EDT
Closing due to lack of response.  Please re-open if this continues to be a 
