Bug 334411 - Watchdog timeout e1000 (7.3.20-k2-NAPI)
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: i686 Linux
Priority: urgent
Severity: high
Assigned To: Andy Gospodarek
QA Contact: Martin Jenner
Blocks: 461297
Reported: 2007-10-16 10:34 EDT by Marcus Alves Grando
Modified: 2014-06-29 18:59 EDT (History)
Doc Type: Bug Fix
Last Closed: 2009-05-18 15:04:02 EDT

Attachments
rhel4-82544-workaround.patch (536 bytes, patch)
2008-05-08 15:07 EDT, Andy Gospodarek
Intel patch to decrease the size of the ring descriptors and increase their numbers to compensate (1.38 KB, patch)
2008-05-16 10:58 EDT, Frank Hirtz
e1000-rhel4-changes.diff (265.66 KB, patch)
2008-11-17 14:34 EST, Andy Gospodarek
e1000-disable-adaptive-interrupt-throttling.patch (1.96 KB, patch)
2008-12-23 17:10 EST, Andy Gospodarek

Description Marcus Alves Grando 2007-10-16 10:34:52 EDT
Hello,

I am seeing many watchdog timeouts with the e1000 driver, version 7.2.7-k2-NAPI
(B024014F7239D211AAA851C).

# modinfo e1000
filename:       /lib/modules/2.6.9-55.0.2.ELsmp/kernel/drivers/net/e1000/e1000.ko
author:         Intel Corporation, <linux.nics@intel.com>
description:    Intel(R) PRO/1000 Network Driver
license:        GPL
version:        7.2.7-k2-NAPI B024014F7239D211AAA851C

# dmesg
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <67>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <65>
buffer_info[next_to_clean]
  time_stamp           <5809242>
  next_to_watch        <6b>
  jiffies              <5809b12>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <67>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <65>
buffer_info[next_to_clean]
  time_stamp           <5809242>
  next_to_watch        <6b>
  jiffies              <580a2e2>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <67>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <65>
buffer_info[next_to_clean]
  time_stamp           <5809242>
  next_to_watch        <6b>
  jiffies              <580aab2>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
e1000: eth1: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex

I have now disabled TSO to see whether that works better.

Regards
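
For anyone triaging similar logs, here is a minimal Python sketch that parses "Detected Tx Unit Hang" reports and computes how long the oldest pending buffer has been waiting (jiffies minus time_stamp). It assumes the bracketed values are hexadecimal and that jiffies is a 32-bit counter on i686; the HZ value of 1000 is an assumption about this kernel build, not something stated in the report. The function names are illustrative only.

```python
import re

HANG_MARKER = "Detected Tx Unit Hang"
# Matches lines such as "  time_stamp           <5809242>".
FIELD_RE = re.compile(r"^\s*(.+?)\s+<([0-9a-fA-F]+)>\s*$")

# First two reports from the dmesg output in the description.
SAMPLE = """\
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <67>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <65>
buffer_info[next_to_clean]
  time_stamp           <5809242>
  next_to_watch        <6b>
  jiffies              <5809b12>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <67>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <65>
buffer_info[next_to_clean]
  time_stamp           <5809242>
  next_to_watch        <6b>
  jiffies              <580a2e2>
  next_to_watch.status <0>
"""


def parse_hang_reports(text):
    """Return one dict per hang report, mapping field name to its integer value."""
    reports = []
    current = None
    for line in text.splitlines():
        if HANG_MARKER in line:
            current = {}
            reports.append(current)
            continue
        match = FIELD_RE.match(line)
        if current is not None and match:
            current[match.group(1)] = int(match.group(2), 16)
    return reports


def stalled_jiffies(report):
    """Jiffies elapsed since the buffer at next_to_clean was queued (32-bit wrap-safe)."""
    return (report["jiffies"] - report["time_stamp"]) & 0xFFFFFFFF


if __name__ == "__main__":
    HZ = 1000  # assumption: timer frequency of this kernel build
    for report in parse_hang_reports(SAMPLE):
        ticks = stalled_jiffies(report)
        print(f"stalled for {ticks} jiffies (~{ticks / HZ:.3f} s at HZ={HZ})")
```

In the sample, time_stamp stays fixed while jiffies advances by 0x7d0 between reports, which suggests the same buffer is stuck across successive checks.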
Comment 1 Marcus Alves Grando 2007-10-16 10:37:09 EDT
# lspci -v
13:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet
Controller (Copper) (rev 02)
	Subsystem: Intel Corporation PRO/1000 XT Server Adapter
	Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 193
	Memory at fa220000 (64-bit, non-prefetchable) [size=128K]
	Memory at fa200000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at 9ce0 [size=32]
	Expansion ROM at fa100000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4] PCI-X non-bridge device.
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

18:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet
Controller (Copper) (rev 02)
	Subsystem: Intel Corporation PRO/1000 XT Server Adapter
	Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 201
	Memory at fa020000 (64-bit, non-prefetchable) [size=128K]
	Memory at fa000000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at 8ce0 [size=32]
	Expansion ROM at f9f00000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4] PCI-X non-bridge device.
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

# lspci -n
13:01.0 Class 0200: 8086:1008 (rev 02)
18:01.0 Class 0200: 8086:1008 (rev 02)
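
When collecting this information from several machines, the numeric IDs can be pulled out of `lspci -n` output with a short Python sketch like the one below. The regular expression and the small name table are illustrative assumptions; the only entry in the table is the 8086:1008 pairing that the `lspci -v` output above already shows.

```python
import re

# Accepts both "13:01.0 Class 0200: 8086:1008" and the newer "13:01.0 0200: 8086:1008".
LSPCI_N_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?:Class\s+)?(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)

# Illustrative table; taken from the lspci -v output in this comment.
KNOWN_DEVICES = {
    ("8086", "1008"): "Intel 82544EI Gigabit Ethernet Controller (Copper)",
}


def parse_lspci_n(text):
    """Return a list of (slot, vendor_id, device_id) tuples from `lspci -n` output."""
    devices = []
    for line in text.splitlines():
        match = LSPCI_N_RE.match(line.strip())
        if match:
            devices.append((match["slot"], match["vendor"], match["device"]))
    return devices


if __name__ == "__main__":
    sample = (
        "13:01.0 Class 0200: 8086:1008 (rev 02)\n"
        "18:01.0 Class 0200: 8086:1008 (rev 02)\n"
    )
    for slot, vendor, device in parse_lspci_n(sample):
        name = KNOWN_DEVICES.get((vendor, device), "unknown")
        print(f"{slot}: {vendor}:{device} -> {name}")
```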
Comment 2 Andy Gospodarek 2008-01-08 14:47:56 EST
I've seen many reports of problems like these and I'm currently trying to narrow
down the problem.

Is this a system where you can easily reproduce these?  If so, would you be
willing to try out one of my test kernels (or even a 4.6 kernel)?  You should be
able to get a 4.6 kernel by running up2date, and you can find my test kernels here:

http://people.redhat.com/agospoda/#rhel4

Comment 3 Jesse Brandeburg 2008-01-08 17:25:03 EST
Please post the full output of lspci -vvv.

We are also tracking many similar issues at
http://sourceforge.net/tracker/?group_id=42302&atid=447449

There seem to be a few causes for these issues. Some are known problems that
can be fixed with either an EEPROM upgrade (usually associated with PCIe parts)
or the driver patch for ASPM; the rest seem to be some kind of
incompatibility with the cache coherency protocol used on some AMD systems.

The 82544EI is a Gigabit Ethernet part from 2001, doesn't support PCI-X, and I
bet you're running it in a high end server, am I correct?

Comment 4 Andy Gospodarek 2008-01-08 17:35:51 EST
Thanks, Jesse.  I've spent most of the day sifting through bugzillas related to
this, and the only ones without a clear solution that I know of involve the
82544EI and 82546EB (mostly older PCI hardware that has not moved to e1000e
upstream).  Those issues are the ones I want to make sure we fix, since most of
the problems on 8257x are fixed with either firmware or the recent ASPM fix
upstream.
Comment 5 Marcus Alves Grando 2008-03-24 20:49:04 EDT
(In reply to comment #3)
> Please list a full lspci -vvv, 
> 
[...]
> The 82544EI is a Gigabit Ethernet part from 2001, doesn't support PCI-X, and I
> bet you're running it in a high end server, am I correct?
> 

Ok, lspci -vvv below:

---------------
# lspci -vvv
00:00.0 Host bridge: Broadcom CMIC-HE (rev 22)
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-

00:00.1 Host bridge: Broadcom CMIC-HE
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-

00:00.2 Host bridge: Broadcom CMIC-HE
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-

00:00.3 Host bridge: Broadcom CMIC-HE
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-

00:03.0 SCSI storage controller: Adaptec AIC-7892P U160/m (rev 02)
	Subsystem: Dell: Unknown device 010a
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 32 (10000ns min, 6250ns max), Cache Line Size 10
	Interrupt: pin A routed to IRQ 177
	BIST result: 00
	Region 0: I/O ports at ec00 [disabled] [size=256]
	Region 1: Memory at fe102000 (64-bit, non-prefetchable) [size=4K]
	Expansion ROM at fe000000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
(prog-if 00 [VGA])
	Subsystem: Dell: Unknown device 010a
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+ ParErr- Stepping+
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 32 (2000ns min), Cache Line Size 10
	Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: I/O ports at e800 [size=256]
	Region 2: Memory at fe101000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [5c] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0f.0 Host bridge: Broadcom CSB5 South Bridge (rev 93)
	Subsystem: Broadcom CSB5 South Bridge
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Latency: 64

00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93) (prog-if 82 [Master
PriP])
	Subsystem: Broadcom CSB5 IDE Controller
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 64
	Region 0: I/O ports at <ignored>
	Region 1: I/O ports at <ignored>
	Region 2: I/O ports at <ignored>
	Region 3: I/O ports at <ignored>
	Region 4: I/O ports at 08b0 [size=16]

00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 05) (prog-if
10 [OHCI])
	Subsystem: Broadcom OSB4/CSB5 OHCI USB Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 32 (20000ns max)
	Interrupt: pin A routed to IRQ 10
	Region 0: Memory at fe100000 (32-bit, non-prefetchable) [size=4K]

00:0f.3 ISA bridge: Broadcom CSB5 LPC bridge
	Subsystem: Broadcom: Unknown device 0230
	Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0

00:10.0 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

00:10.2 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

00:11.0 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

00:11.2 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

00:12.0 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

00:12.2 Host bridge: Broadcom CIOB30 (rev 03)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Capabilities: [60] PCI-X non-bridge device.
		Command: DPERE- ERO- RBC=0 OST=4
		Status: Bus=0 Dev=0 Func=0 64bit+ 133MHz+ SCD- USC-, DC=bridge, DMMRBC=0,
DMOST=4, DMCRS=0, RSCEM-

03:01.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID (rev 01)
	Subsystem: Dell MegaRAID 518 DELL PERC 4/DC RAID Controller
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=slow >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 32, Cache Line Size 10
	Interrupt: pin A routed to IRQ 185
	Region 0: Memory at fce00000 (32-bit, prefetchable) [size=64K]
	Expansion ROM at fcf00000 [disabled] [size=64K]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

13:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet
Controller (Copper) (rev 02)
	Subsystem: Intel Corporation PRO/1000 XT Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 64 (63750ns min), Cache Line Size 10
	Interrupt: pin A routed to IRQ 193
	Region 0: Memory at fa220000 (64-bit, non-prefetchable) [size=128K]
	Region 2: Memory at fa200000 (64-bit, non-prefetchable) [size=128K]
	Region 4: I/O ports at 9ce0 [size=32]
	Expansion ROM at fa100000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [e4] PCI-X non-bridge device.
		Command: DPERE- ERO+ RBC=0 OST=0
		Status: Bus=19 Dev=1 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2,
DMOST=0, DMCRS=1, RSCEM-
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000

18:01.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet
Controller (Copper) (rev 02)
	Subsystem: Intel Corporation PRO/1000 XT Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 64 (63750ns min), Cache Line Size 10
	Interrupt: pin A routed to IRQ 201
	Region 0: Memory at fa020000 (64-bit, non-prefetchable) [size=128K]
	Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=128K]
	Region 4: I/O ports at 8ce0 [size=32]
	Expansion ROM at f9f00000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [e4] PCI-X non-bridge device.
		Command: DPERE- ERO+ RBC=0 OST=0
		Status: Bus=24 Dev=1 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2,
DMOST=0, DMCRS=1, RSCEM-
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000
---------------

Since I turned off TSO, this server has had no more problems with watchdog
timeouts. The server is a Dell PowerEdge; I don't remember which model right now.

Is there another test or anything else you need?

Regards
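
TSO is normally turned off with `ethtool -K eth0 tso off`, and that setting does not survive a reboot unless it is reapplied. To confirm the current state on a given host, the output of `ethtool -k eth0` can be checked with a Python sketch along these lines. The sample text is an illustrative assumption rather than output captured from this server, and the label differs between ethtool versions ("tcp segmentation offload" or "tcp-segmentation-offload"), so both spellings are accepted.

```python
def tso_enabled(ethtool_k_output):
    """Return True/False for the TSO line in `ethtool -k` output, or None if absent."""
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # Normalise "tcp-segmentation-offload" and "tcp segmentation offload".
        normalized = name.strip().lower().replace("-", " ")
        if normalized == "tcp segmentation offload":
            words = value.split()
            return bool(words) and words[0] == "on"
    return None


if __name__ == "__main__":
    # Hypothetical sample, not taken from the reporter's machine.
    sample = (
        "Offload parameters for eth0:\n"
        "rx-checksumming: on\n"
        "tx-checksumming: on\n"
        "scatter-gather: on\n"
        "tcp segmentation offload: off\n"
    )
    print("TSO enabled:", tso_enabled(sample))
```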
Comment 6 Jesse Brandeburg 2008-03-25 11:48:11 EDT
Thanks for the reply. Since this is an older e1000 driver, I am pretty sure we
fixed this problem in later releases.  In particular, I think Red Hat pulled a fix
into RHEL5 for this:

see this bug: https://bugzilla.redhat.com/show_bug.cgi?id=206540

I can only see some of the comments, so I'm not sure what was eventually pulled
into RHEL5 to fix this issue.  The problems sound a lot alike, so I would guess
that it is the same root cause: the 82544 TSO-specific workaround doesn't play
well with some packet offsets and header lengths that are sent to the driver.

Unfortunately, I think that the workaround may not have even made it into our
standalone driver (and therefore probably not upstream either).

As for debug, we have a descriptor ring dump patch here which will print enough
information to the system log for me to debug the issue:
https://sourceforge.net/tracker/download.php?group_id=42302&atid=447451&file_id=205190&aid=1460945

You can download the SourceForge e1000-7.2.7.tar.gz and try it with the patch
applied.
Comment 7 Andy Gospodarek 2008-03-25 12:14:00 EDT
Jesse,

We already include the patch referenced for rhel5 in bug 206540.  In fact, it
has been included since rhel4.6.

Marcus,

Please update your kernel to at least 2.6.9-62 (this should be easy via RHN) and
you will have the fix you need.  You can also always try my test kernels if you
like.  They contain patches that I hope to include in RHEL and are available here:

http://people.redhat.com/agospoda/

Please do not use the Intel driver suggested in comment #6, as it deviates
heavily from upstream and is unsupported by us.

Please let me know how the latest kernel on RHN works for you.  

Thanks!
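
A quick way to check whether a host already runs at least the kernel named above is to compare release strings such as the one reported by `uname -r`. The Python sketch below is a simplification that compares only the leading numeric components (it is not a full RPM version comparison), and the helper names are hypothetical.

```python
import re


def kernel_key(release):
    """Split e.g. '2.6.9-55.0.2.ELsmp' into ((2, 6, 9), (55, 0, 2)) for comparison."""
    version, _, rest = release.partition("-")
    base = tuple(int(part) for part in version.split("."))
    # Keep only the leading dotted numbers of the build part, dropping ".ELsmp" etc.
    match = re.match(r"\d+(?:\.\d+)*", rest)
    build = tuple(int(part) for part in match.group(0).split(".")) if match else ()
    return base, build


def meets_minimum(release, minimum="2.6.9-62"):
    """True if `release` is at least `minimum` under the simplified ordering above."""
    return kernel_key(release) >= kernel_key(minimum)


if __name__ == "__main__":
    for release in ("2.6.9-55.0.2.ELsmp", "2.6.9-67.0.7"):
        print(release, "->", meets_minimum(release))
```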
Comment 8 Marcus Alves Grando 2008-04-01 16:09:24 EDT
Well, it happened again with 2.6.9-67.0.7 (e1000 7.3.20-k2):

# dmesg
ip_tables: (C) 2000-2002 Netfilter core team
TCP: Treason uncloaked! Peer 201.62.16.112:28111/80 shrinks window
1208303136:1208303137. Repaired.
TCP: Treason uncloaked! Peer 201.62.16.112:28111/80 shrinks window
1208303136:1208303137. Repaired.
TCP: Treason uncloaked! Peer 200.219.177.202:1429/80 shrinks window
1943947205:1943947206. Repaired.
TCP: Treason uncloaked! Peer 200.219.177.202:1429/80 shrinks window
1943947205:1943947206. Repaired.
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <a7>
  TDT                  <91>
  next_to_use          <91>
  next_to_clean        <a5>
buffer_info[next_to_clean]
  time_stamp           <1152bf>
  next_to_watch        <b2>
  jiffies              <115df2>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <a7>
  TDT                  <91>
  next_to_use          <91>
  next_to_clean        <a5>
buffer_info[next_to_clean]
  time_stamp           <1152bf>
  next_to_watch        <b2>
  jiffies              <1165c2>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX

# ethtool -i eth0
driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: N/A
bus-info: 0000:13:01.0

# ethtool -d eth0
MAC Registers
-------------
0x00000: CTRL (Device control register)  0x0AFC0269
      Duplex:                            full
      Endian mode (buffers):             little
      Link reset:                        reset
      Set link up:                       1
      Invert Loss-Of-Signal:             no
      Receive flow control:              enabled
      Transmit flow control:             disabled
      VLAN mode:                         disabled
      Auto speed detect:                 enabled
      Speed select:                      1000Mb/s
      Force speed:                       no
      Force duplex:                      no
0x00008: STATUS (Device status register) 0x00007B83
      Duplex:                            full
      Link up:                           link config
      TBI mode:                          disabled
      Link speed:                        1000Mb/s
      Bus type:                          PCI-X
      Bus speed:                         100MHz
      Bus width:                         64-bit
0x00100: RCTL (Receive control register) 0x00008002
      Receiver:                          enabled
      Store bad packets:                 disabled
      Unicast promiscuous:               disabled
      Multicast promiscuous:             disabled
      Long packet:                       disabled
      Descriptor minimum threshold size: 1/2
      Broadcast accept mode:             accept
      VLAN filter:                       disabled
      Cononical form indicator:          disabled
      Discard pause frames:              filtered
      Pass MAC control frames:           don't pass
      Receive buffer size:               2048
0x02808: RDLEN (Receive desc length)     0x00001000
0x02810: RDH   (Receive desc head)       0x000000CC
0x02818: RDT   (Receive desc tail)       0x000000CA
0x02820: RDTR  (Receive delay timer)     0x00000000
0x00400: TCTL (Transmit ctrl register)   0x0103F0FA
      Transmitter:                       enabled
      Pad short packets:                 enabled
      Software XOFF Transmission:        disabled
      Re-transmit on late collision:     enabled
0x03808: TDLEN (Transmit desc length)    0x00001000
0x03810: TDH   (Transmit desc head)      0x00000071
0x03818: TDT   (Transmit desc tail)      0x00000071
0x03820: TIDV  (Transmit delay timer)    0x00000008

# ethtool -e eth0
Offset		Values
------		------
0x0000		00 02 b3 d4 96 9c 20 05 00 00 00 00 00 00 00 00 
0x0010		28 c2 02 96 0b 66 07 11 86 80 08 10 86 80 6c f2 
0x0020		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0030		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0040		ff db 21 00 18 40 ff ff ff ff ff ff ff ff ff ff 
0x0050		ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
0x0060		2c 00 00 40 0f 10 ff ff ff ff ff ff ff ff ff ff 
0x0070		ff ff ff ff ff ff ff ff ff ff ff ff ff ff 4b 03

Do you need more info?

Regards
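
To relate the register dump to the hang reports, the ring size and the number of descriptors still owned by the hardware can be worked out from TDLEN, TDH, and TDT. The Python sketch below assumes 16-byte transmit descriptors and the usual head/tail convention (hardware owns the entries from head up to, but not including, tail); the function names are illustrative.

```python
DESCRIPTOR_BYTES = 16  # assumption: 16-byte e1000 transmit descriptors


def ring_entries(tdlen):
    """Number of descriptors in a ring whose length register holds `tdlen` bytes."""
    return tdlen // DESCRIPTOR_BYTES


def pending_descriptors(tdh, tdt, entries):
    """Descriptors queued for the hardware: from head up to (not including) tail."""
    return (tdt - tdh) % entries


if __name__ == "__main__":
    entries = ring_entries(0x1000)  # TDLEN from the ethtool -d output
    # TDH/TDT from the hang report in this comment.
    print("ring entries:", entries)
    print("pending at hang:", pending_descriptors(0xA7, 0x91, entries))
    # TDH/TDT from the ethtool -d output.
    print("pending in register dump:", pending_descriptors(0x71, 0x71, entries))
```

Under these assumptions the ring holds 256 descriptors and 234 of them were waiting on the hardware at the time of the hang, while the later register dump shows head equal to tail, that is, an empty ring.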
Comment 9 Jesse Brandeburg 2008-04-01 17:12:38 EDT
Hi Marcus, does disabling TSO still solve your problem?  Is there a chance you
could try the debug patch that I referenced earlier in this bug?

It basically adds a very large amount of debug output only if your server's Tx
hangs.  This will help me resolve your Tx hang.  Andy could possibly rebuild a
kernel or a version of the driver for you with the e1000_dump function in
place.  I'm sorry, but there is likely no way for me to reproduce this, so the
only option I have is to have you (Marcus) run tests for me.
Comment 11 Andy Gospodarek 2008-04-11 11:28:24 EDT
(In reply to comment #10)
> 
> We recommended disabling TSO there as well, and have been awaiting feedback on
> its impact. That being said, this wasn't frequent beforehand.

Frank, what do you mean by beforehand?  

It wasn't frequent before enabling TSO?  Before upgrading the kernel?  Something
else?

Thanks.
Comment 12 Andy Gospodarek 2008-04-11 11:39:11 EDT
Jesse,

I was thinking about the fact that we get complaints about 82546EB as well as
82544 and wondered if you feel like the 'rhel5 patch' in comment #6 should be
expanded a bit.  Is the 82546 hardware similar enough to warrant a change like this?

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 4a6c600..163016a 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3350,6 +3350,7 @@ e1000_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
                        switch (adapter->hw.mac_type) {
                                unsigned int pull_size;
                        case e1000_82544:
+                       case e1000_82546:
                                /* Make sure we have room to chop off 4 bytes,
                                 * and that the end alignment will work out to
                                 * this hardware's requirements

Just so you know, I'm going to try and add your irq test patch as well as your
debug patch from comment #6 to my test kernels, so I can help you get to the
bottom of these problems.
Comment 14 Jesse Brandeburg 2008-04-11 12:24:39 EDT
The 82546 is a completely different (much newer) core than the 82544, so this
workaround is not necessary.
Comment 15 Andy Gospodarek 2008-04-11 12:27:08 EDT
Thanks, Jesse!
Comment 16 Andy Gospodarek 2008-04-14 09:32:50 EDT
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel4

Please test them and report back your results.
Comment 29 Marcus Alves Grando 2008-04-23 08:10:43 EDT
Andy and Jesse,

I've tested your new kernel with the debug patch; the result is below:

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <39>
  TDT                  <24>
  next_to_use          <24>
  next_to_clean        <37>
buffer_info[next_to_clean]
  time_stamp           <92e4d40>
  next_to_watch        <3c>
  jiffies              <92e5577>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <39>
  TDT                  <24>
  next_to_use          <24>
  next_to_clean        <37>
buffer_info[next_to_clean]
  time_stamp           <92e4d40>
  next_to_watch        <3c>
  jiffies              <92e5d47>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <39>
  TDT                  <24>
  next_to_use          <24>
  next_to_clean        <37>
buffer_info[next_to_clean]
  time_stamp           <92e4d40>
  next_to_watch        <3c>
  jiffies              <92e6517>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
Register dump
CTRL             0afc0269
STATUS           00007b83
RCTL             00008002
RDLEN            00001000
RDH              00000022
RDT              00000020
RDTR             00000000
TCTL             0103f0fa
TDBAL            361d8000
TDBAH            00000000
TDLEN            00001000
TDH              00000039
TDT              00000024
TIDV             00000008
TXDCTL           00000000
TADV             00000000
TARC0            00000000
TDBAL1           00000000
TDBAH1           00000000
TDLEN1           00000000
TDH1             00000000
TDT1             00000000
TXDCTL1          00000000
TARC1            00000000
CTRL_EXT         00000dd0
ERT              00000000
TX Desc ring0 dump
T[ desc]     [address 63:0  ] [vl pt Sdcdt ln] [bi->dma       ] leng  ntw
timestmp bi->skb
T[0x000]     00000000052824AA 000000008B000036 00000000052824AA 0036    0
00000000092E4DCF dcc29880
T[0x001]     00000000361CEAAA 000000008B000036 00000000361CEAAA 0036    1
00000000092E4DCF e3d65580
T[0x002]     0000322237E246AA 0000000020000000 0000000000000000 0036    3
00000000092E4DD0 00000000
T[0x003]     000000001B5FE8AA 00000200AB100114 000000001B5FE8AA 0114    3
00000000092E4DD0 e329d580
T[0x004]     00003222965FCB54 0000000020000000 0000000000000000 0004    5
00000000092E4DD0 00000000
T[0x005]     000000001ECB20AA 00000200AB100113 000000001ECB20AA 0113    5
00000000092E4DD0 da1f5880
T[0x006]     000032221EF850AA 0000000020000000 0000000000000000 017D    7
00000000092E4DD0 00000000
T[0x007]     000000001633B0AA 00000200AB100114 000000001633B0AA 0114    7
00000000092E4DD0 dab51680
T[0x008]     0000000037DDF6AA 000000008B000036 0000000037DDF6AA 0036    8
00000000092E4DD0 e4370e80
T[0x009]     00003222965FCB58 0000000020000000 0000000000000000 04A4    A
00000000092E4DD3 00000000
T[0x00A]     0000000018E768AA 00000200AB100115 0000000018E768AA 0115    A
00000000092E4DD3 e2594680
T[0x00B]     00000000379922AA 000000008B000036 00000000379922AA 0036    B
00000000092E4DD3 e189c680
T[0x00C]     0000000037FC62AA 000000008B000036 0000000037FC62AA 0036    C
00000000092E4DD3 dac64b80
T[0x00D]     00000000052F84AA 000000008B000036 00000000052F84AA 0036    D
00000000092E4DD4 dcdfbb80
T[0x00E]     0000000037E30EAA 000000008B000036 0000000037E30EAA 0036    E
00000000092E4DD6 dc180380
T[0x00F]     00000000366FA69E 000000008B000042 00000000366FA69E 0042    F
00000000092E4DD7 e45f1d80
T[0x010]     00003222965FD6AC 0000000020000000 0000000000000000 0004   11
00000000092E4DD7 00000000
T[0x011]     000000001D4940AA 00000200AB100114 000000001D4940AA 0114   11
00000000092E4DD7 e2290380
T[0x012]     0000000035C2E8AA 000000008B000036 0000000035C2E8AA 0036   12
00000000092E4DD8 da663680
T[0x013]     0000000035C2BCAA 000000008B000036 0000000035C2BCAA 0036   13
00000000092E4DD9 e4f68780
T[0x014]     0000000035C304AA 000000008B000036 0000000035C304AA 0036   14
00000000092E4DDA e5609080
T[0x015]     0000000036EAC89E 000000008B000042 0000000036EAC89E 0042   15
00000000092E4DDB d9c9fa80
T[0x016]     0000000037D0D4AA 000000008B000036 0000000037D0D4AA 0036   16
00000000092E4DDC dd416780
T[0x017]     0000000036CC86AA 000000008B000036 0000000036CC86AA 0036   17
00000000092E4DDC ddcb3180
T[0x018]     00003222E5C8F8D9 0000000020000000 0000000000000000 0004   1A
00000000092E4DDD 00000000
T[0x019]     00000000178680AA 0000020022100154 00000000178680AA 0154   19
00000000092E4DDD 00000000
T[0x01A]     00000000178681FE 00000200AB100004 00000000178681FE 0004   1A
00000000092E4DDD daac3380
T[0x01B]     00003222052B1CAA 0000000020000000 0000000000000000 0036   1C
00000000092E4DDF 00000000
T[0x01C]     00000000175050AA 00000200AB100113 00000000175050AA 0113   1C
00000000092E4DDF dec14a80
T[0x01D]     0000322223A7B0AA 0000000020000000 0000000000000000 017E   20
00000000092E4DE1 00000000
T[0x01E]     0000000037D976AA 0000020022100036 0000000037D976AA 0036   1E
00000000092E4DE1 00000000
T[0x01F]     00000000C8DBF5A0 0000020022100214 00000000C8DBF5A0 0214   1F
00000000092E4DE1 00000000
T[0x020]     00000000C8DBF7B4 00000200AB100004 00000000C8DBF7B4 0004   20
00000000092E4DE1 dd255c80
T[0x021]     0000000037952AAA 000000008B000036 0000000037952AAA 0036   21
00000000092E4DE1 e16e5180
T[0x022]     00003222187A90AA 0000000020000000 0000000000000000 0154   23
00000000092E4DE1 00000000
T[0x023]     000000001B47F8AA 00000200AB100114 000000001B47F8AA 0114   23
00000000092E4DE1 e4869080
T[0x024]     00003222EFAB4000 0000000020000000 0000000000000000 03A8   27
00000000092E4D40 00000000 NTU
T[0x025]     00000000366FDEAA 0000000022100036 0000000000000000 0036   25
00000000092E4D40 00000000
T[0x026]     00000001FF466000 0000000022100354 0000000000000000 0354   26
00000000092E4D40 00000000
T[0x027]     00000001FF466354 00000000AB100004 0000000000000000 0004   27
00000000092E4D40 00000000
T[0x028]     000000003798AAAA 000000008B000036 0000000000000000 0036   28
00000000092E4D40 00000000
T[0x029]     000032221A14D8AA 0000000020000000 0000000000000000 0114   2A
00000000092E4D40 00000000
T[0x02A]     0000000022D9D8AA 00000000AB100114 0000000000000000 0114   2A
00000000092E4D40 00000000
T[0x02B]     00003222E8112231 0000000020000000 0000000000000000 0004   2C
00000000092E4D40 00000000
T[0x02C]     0000000017DF10AA 00000000AB10017E 0000000000000000 017E   2C
00000000092E4D40 00000000
T[0x02D]     00003222167A7A25 0000000020000000 0000000000000000 0004   30
00000000092E4D40 00000000
T[0x02E]     0000000037995AAA 0000000022100036 0000000000000000 0036   2E
00000000092E4D40 00000000
T[0x02F]     0000000114C67000 00000000221005B0 0000000000000000 05B0   2F
00000000092E4D40 00000000
T[0x030]     0000000114C675B0 00000000AB100004 0000000000000000 0004   30
00000000092E4D40 00000000
T[0x031]     0000322236706A9E 0000000020000000 0000000000000000 0042   34
00000000092E4D40 00000000
T[0x032]     0000000037D364AA 0000000022100036 0000000000000000 0036   32
00000000092E4D40 00000000
T[0x033]     0000000114C675B4 0000000022100588 0000000000000000 0588   33
00000000092E4D40 00000000
T[0x034]     0000000114C67B3C 00000000AB100004 0000000000000000 0004   34
00000000092E4D40 00000000
T[0x035]     0000322236EAA69E 0000000020000000 0000000000000000 0042   36
00000000092E4D40 00000000
T[0x036]     0000000015E5C8AA 00000000AB100115 0000000000000000 0115   36
00000000092E4D40 00000000
T[0x037]     000032220021180E 0584360027000823 0000000000000000 0004   3C
00000000092E4D40 00000000 NTC
T[0x038]     00000000154E30AA 0000030026100033 00000000154E30AA 0033   38
00000000092E4D40 00000000
T[0x039]     00000000154E30DD 0000030026100004 00000000154E30DD 0004   39
00000000092E4D40 00000000
T[0x03A]     00000001EA2FF000 000003002610081A 00000001EA2FF000 081A   3A
00000000092E4D40 00000000
T[0x03B]     00000001EA2FF81A 0000030026100004 00000001EA2FF81A 0004   3B
00000000092E4D40 00000000
T[0x03C]     00000001EA2FF81E 00000300AF100004 00000001EA2FF81E 0004   3C
00000000092E4D40 d7dd7680
T[0x03D]     0000322297011000 0000000020000000 0000000000000000 03B4   40
00000000092E4D40 00000000
T[0x03E]     0000000037F53EAA 0000020022100036 0000000037F53EAA 0036   3E
00000000092E4D40 00000000
T[0x03F]     000000013D0C408C 0000020022100580 000000013D0C408C 0580   3F
00000000092E4D40 00000000
T[0x040]     000000013D0C460C 00000200AB100004 000000013D0C460C 0004   40
00000000092E4D40 de7cd080
T[0x041]     00003222C79F45B0 0000000020000000 0000000000000000 0004   44
00000000092E4D40 00000000
T[0x042]     0000000037DA66AA 0000020022100036 0000000037DA66AA 0036   42
00000000092E4D40 00000000
T[0x043]     000000013D0C4610 0000020022100580 000000013D0C4610 0580   43
00000000092E4D40 00000000
T[0x044]     000000013D0C4B90 00000200AB100004 000000013D0C4B90 0004   44
00000000092E4D40 dac9bc80
T[0x045]     000032221ADFF000 0000000020000000 0000000000000000 05A8   4A
00000000092E4D40 00000000
T[0x046]     0000000037D162AA 0000020022100036 0000000037D162AA 0036   46
00000000092E4D40 00000000
T[0x047]     000000013D0C4B94 0000020022100468 000000013D0C4B94 0468   47
00000000092E4D40 00000000
T[0x048]     000000013D0C4FFC 0000020022100004 000000013D0C4FFC 0004   48
00000000092E4D40 00000000
T[0x049]     000000013D0C5000 0000020022100114 000000013D0C5000 0114   49
00000000092E4D40 00000000
T[0x04A]     000000013D0C5114 00000200AB100004 000000013D0C5114 0004   4A
00000000092E4D40 dbb54680
T[0x04B]     00000000052AAA9E 000000000200003A 00000000052AAA9E 003A   4C
00000000092E4D40 00000000
T[0x04C]     00000000052AAAD8 000000008B000008 00000000052AAAD8 0008   4C
00000000092E4D40 deb7b780
T[0x04D]     0000322237E5BCAA 0000000020000000 0000000000000000 0036   4E
00000000092E4D41 00000000
T[0x04E]     0000000016C988AA 00000200AB100114 0000000016C988AA 0114   4E
00000000092E4D41 e3ad2e80
T[0x04F]     0000000037D3DCAA 000000008B000036 0000000037D3DCAA 0036   4F
00000000092E4D42 efb35180
T[0x050]     000032221F42C8AA 0000000020000000 0000000000000000 0113   53
00000000092E4D42 00000000
T[0x051]     000000001ABAD8AA 000002002210017C 000000001ABAD8AA 017C   51
00000000092E4D42 00000000
T[0x052]     00000001FAE35000 0000020022100211 00000001FAE35000 0211   52
00000000092E4D42 00000000
T[0x053]     00000001FAE35211 00000200AB100004 00000001FAE35211 0004   53
00000000092E4D42 da1f5780
T[0x054]     00000000052B58AA 000000008B000036 00000000052B58AA 0036   54
00000000092E4D42 e4b50d80
T[0x055]     0000322216D8A0AA 0000000020000000 0000000000000000 0113   5A
00000000092E4D47 00000000
T[0x056]     0000000037CDCAAA 0000020022100036 0000000037CDCAAA 0036   56
00000000092E4D47 00000000
T[0x057]     000000010E760B08 00000200221004F4 000000010E760B08 04F4   57
00000000092E4D47 00000000
T[0x058]     000000010E760FFC 0000020022100004 000000010E760FFC 0004   58
00000000092E4D47 00000000
T[0x059]     000000010E761000 0000020022100088 000000010E761000 0088   59
00000000092E4D47 00000000
T[0x05A]     000000010E761088 00000200AB100004 000000010E761088 0004   5A
00000000092E4D47 e07b4780
T[0x05B]     00000000052B1CAA 000000008B000036 00000000052B1CAA 0036   5B
00000000092E4D47 dac98980
T[0x05C]     0000000037D1B2AA 000000008B000036 0000000037D1B2AA 0036   5C
00000000092E4D48 de293c80
T[0x05D]     0000322237EA28AA 0000000020000000 0000000000000000 0036   5E
00000000092E4D49 00000000
T[0x05E]     0000000023A450AA 00000200AB100114 0000000023A450AA 0114   5E
00000000092E4D49 dcca3380
T[0x05F]     0000000036B30EAA 000000008B000036 0000000036B30EAA 0036   5F
00000000092E4D4C ddd97980
T[0x060]     0000000037DA68AA 000000008B000036 0000000037DA68AA 0036   60
00000000092E4D4C e40a5080
T[0x061]     000000003793ECAA 000000008B000036 000000003793ECAA 0036   61
00000000092E4D4D dde0c580
T[0x062]     0000000037DDFC9E 000000008B000042 0000000037DDFC9E 0042   62
00000000092E4D4F dd4ef680
T[0x063]     000000000527B4AA 000000008B000036 000000000527B4AA 0036   63
00000000092E4D4F dcc71480
T[0x064]     000000000524CE9E 000000008B000042 000000000524CE9E 0042   64
00000000092E4D50 ddcb3480
T[0x065]     0000000037E672AA 000000008B000036 0000000037E672AA 0036   65
00000000092E4D50 dfc6ed80
T[0x066]     0000000037D3EEAA 000000008B000036 0000000037D3EEAA 0036   66
00000000092E4D51 ebf01980
T[0x067]     00003222E8B8F000 0000000020000000 0000000000000000 0588   68
00000000092E4D52 00000000
T[0x068]     000000001C1720AA 00000200AB10017D 000000001C1720AA 017D   68
00000000092E4D52 e4c30080
T[0x069]     0000000037FBC6AA 000000008B000036 0000000037FBC6AA 0036   69
00000000092E4D52 de354b80
T[0x06A]     0000000037D934AA 000000008B000036 0000000037D934AA 0036   6A
00000000092E4D54 e3ad0280
T[0x06B]     00003222E8B8F58C 0000000020000000 0000000000000000 0588   6C
00000000092E4D55 00000000
T[0x06C]     000000001F6CB0AA 00000200AB100114 000000001F6CB0AA 0114   6C
00000000092E4D55 e5c9cc80
T[0x06D]     000000003791D6AA 000000008B000036 000000003791D6AA 0036   6D
00000000092E4D57 e5344980
T[0x06E]     000032221937F0AA 0000000020000000 0000000000000000 0114   6F
00000000092E4D58 00000000
T[0x06F]     00000000160128AA 00000200AB100114 00000000160128AA 0114   6F
00000000092E4D58 f0622580
T[0x070]     0000000037EC36AA 000000008B000036 0000000037EC36AA 0036   70
00000000092E4D59 e359bc80
T[0x071]     00000000379522AA 000000008B000036 00000000379522AA 0036   71
00000000092E4D5A e07b4580
T[0x072]     0000000036CC78AA 000000008B000036 0000000036CC78AA 0036   72
00000000092E4D5C dd3ad580
T[0x073]     0000322237C06000 0000000020000000 0000000000000000 0510   74
00000000092E4D5D 00000000
T[0x074]     000000001C7658AA 00000200AB10017C 000000001C7658AA 017C   74
00000000092E4D5D ee8a3580
T[0x075]     0000000037DA62AA 000000008B000036 0000000037DA62AA 0036   75
00000000092E4D5D db53cc80
T[0x076]     00000000052EF2AA 000000008B000036 00000000052EF2AA 0036   76
00000000092E4D5D e1d1be80
T[0x077]     0000322237C06514 0000000020000000 0000000000000000 03CE   78
00000000092E4D5F 00000000
T[0x078]     000000001625C8AA 00000200AB100115 000000001625C8AA 0115   78
00000000092E4D5F d9ded780
T[0x079]     000000003791F4AA 000000008B000036 000000003791F4AA 0036   79
00000000092E4D60 e4f68180
T[0x07A]     0000000037E6F8AA 000000008B000036 0000000037E6F8AA 0036   7A
00000000092E4D61 dbb1a080
T[0x07B]     0000000037E9B8AA 000000008B000036 0000000037E9B8AA 0036   7B
00000000092E4D61 de5f4780
T[0x07C]     0000000037D966AA 000000008B000036 0000000037D966AA 0036   7C
00000000092E4D61 daeeb680
T[0x07D]     00000000379388AA 000000008B000036 00000000379388AA 0036   7D
00000000092E4D62 e1d87280
T[0x07E]     000000003795AEAA 000000008B000036 000000003795AEAA 0036   7E
00000000092E4D64 da7f8880
T[0x07F]     00000000366F56AA 000000008B000036 00000000366F56AA 0036   7F
00000000092E4D65 e386c880
T[0x080]     0000000037E6FCAA 000000008B000036 0000000037E6FCAA 0036   80
00000000092E4D68 e4b87980
T[0x081]     00000000052FAEAA 000000008B000036 00000000052FAEAA 0036   81
00000000092E4D6C dd1a7880
T[0x082]     00000000366F7EAA 000000008B000036 00000000366F7EAA 0036   82
00000000092E4D6C de00c280
T[0x083]     0000000036C8AAAA 000000008B000036 0000000036C8AAAA 0036   83
00000000092E4D73 e4a1ba80
T[0x084]     0000000037E19AAA 000000008B000036 0000000037E19AAA 0036   84
00000000092E4D75 dec14180
T[0x085]     000032223D0C4C1C 0000000020000000 0000000000000000 0004   86
00000000092E4D76 00000000
T[0x086]     00000000184F40AA 00000200AB100264 00000000184F40AA 0264   86
00000000092E4D76 e45f1a80
T[0x087]     000000003795EC9E 000000008B000042 000000003795EC9E 0042   87
00000000092E4D76 e43e4880
T[0x088]     0000000037E1CA9E 000000008B000042 0000000037E1CA9E 0042   88
00000000092E4D77 e2e7be80
T[0x089]     00000000379436AA 000000008B000036 00000000379436AA 0036   89
00000000092E4D7B de00cd80
T[0x08A]     000000000528DEAA 000000008B000036 000000000528DEAA 0036   8A
00000000092E4D7C da405880
T[0x08B]     00003222366E0AAA 0000000020000000 0000000000000000 0036   8C
00000000092E4D7D 00000000
T[0x08C]     00000000243EC0AA 00000200AB100115 00000000243EC0AA 0115   8C
00000000092E4D7D da88fd80
T[0x08D]     000000003790EA9E 000000008B000042 000000003790EA9E 0042   8D
00000000092E4D7E e149a780
T[0x08E]     0000322237FAC6AA 0000000020000000 0000000000000000 0036   8F
00000000092E4D7E 00000000
T[0x08F]     000000002082B0AA 00000200AB100113 000000002082B0AA 0113   8F
00000000092E4D7E e52d3080
T[0x090]     0000000037DA4CAA 000000008B000036 0000000037DA4CAA 0036   90
00000000092E4D7E dabdc980
T[0x091]     00003222052EF8AA 0000000020000000 0000000000000000 0036   92
00000000092E4D7F 00000000
T[0x092]     000000001957E8AA 00000200AB100113 000000001957E8AA 0113   92
00000000092E4D7F e2407880
T[0x093]     0000322237DD66AA 0000000020000000 0000000000000000 0036   94
00000000092E4D80 00000000
T[0x094]     000000001679C8AA 00000200AB100114 000000001679C8AA 0114   94
00000000092E4D80 de593680
T[0x095]     0000000037DDFEAA 000000008B000036 0000000037DDFEAA 0036   95
00000000092E4D80 e4d2c680
T[0x096]     0000322237E8CCAA 0000000020000000 0000000000000000 0036   97
00000000092E4D81 00000000
T[0x097]     0000000015D430AA 00000200AB100114 0000000015D430AA 0114   97
00000000092E4D81 d9f7e980
T[0x098]     0000000035C2BAAA 000000008B000036 0000000035C2BAAA 0036   98
00000000092E4D81 d99c9e80
T[0x099]     00003222379898AA 0000000020000000 0000000000000000 0036   9C
00000000092E4D82 00000000
T[0x09A]     0000000037D9FAAA 0000020022100036 0000000037D9FAAA 0036   9A
00000000092E4D82 00000000
T[0x09B]     0000000137C065AC 0000020022100336 0000000137C065AC 0336   9B
00000000092E4D82 00000000
T[0x09C]     0000000137C068E2 00000200AB100004 0000000137C068E2 0004   9C
00000000092E4D82 daac3080
T[0x09D]     00003222208A40AA 0000000020000000 0000000000000000 01A6   9E
00000000092E4D84 00000000
T[0x09E]     000000001799B8AA 00000200AB10017D 000000001799B8AA 017D   9E
00000000092E4D84 dab82b80
T[0x09F]     0000000037E322AA 000000008B000036 0000000037E322AA 0036   9F
00000000092E4D84 de06e080
T[0x0A0]     00003222E9757000 0000000020000000 0000000000000000 0580   A3
00000000092E4D84 00000000
T[0x0A1]     0000000037CFF8AA 0000020022100036 0000000037CFF8AA 0036   A1
00000000092E4D84 00000000
T[0x0A2]     00000001EA97B5B4 0000020022100486 00000001EA97B5B4 0486   A2
00000000092E4D84 00000000
T[0x0A3]     00000001EA97BA3A 00000200AB100004 00000001EA97BA3A 0004   A3
00000000092E4D84 de0b3380
T[0x0A4]     00000000366DBEAA 000000008B000036 00000000366DBEAA 0036   A4
00000000092E4D84 dc194680
T[0x0A5]     000032221C7650AA 0000000020000000 0000000000000000 01A5   A8
00000000092E4D89 00000000
T[0x0A6]     0000000037F45AAA 0000020022100036 0000000037F45AAA 0036   A6
00000000092E4D89 00000000
T[0x0A7]     00000000CC3C95B4 00000200221005B0 00000000CC3C95B4 05B0   A7
00000000092E4D89 00000000
T[0x0A8]     00000000CC3C9B64 00000200AB100004 00000000CC3C9B64 0004   A8
00000000092E4D89 e3ce1780
T[0x0A9]     0000000037CF749E 000000008B000042 0000000037CF749E 0042   A9
00000000092E4D8C f0104780
T[0x0AA]     00003222166CC8AA 0000000020000000 0000000000000000 01A5   AB
00000000092E4D8D 00000000
T[0x0AB]     000000001746A0AA 00000200AB1001B3 000000001746A0AA 01B3   AB
00000000092E4D8D e293e380
T[0x0AC]     0000000037FA86AA 000000008B000036 0000000037FA86AA 0036   AC
00000000092E4D8D e593c380
T[0x0AD]     00003222965FC000 0000000020000000 0000000000000000 05B0   AE
00000000092E4D8F 00000000
T[0x0AE]     0000000017C8B8AA 00000200AB100114 0000000017C8B8AA 0114   AE
00000000092E4D8F d9c9f780
T[0x0AF]     0000000037D0CAAA 000000008B000036 0000000037D0CAAA 0036   AF
00000000092E4D90 e42dea80
T[0x0B0]     000032222399C0AA 0000000020000000 0000000000000000 0114   B1
00000000092E4D92 00000000
T[0x0B1]     0000000025DAD8AA 00000200AB100115 0000000025DAD8AA 0115   B1
00000000092E4D92 da4cba80
T[0x0B2]     0000000037FA94AA 000000008B000036 0000000037FA94AA 0036   B2
00000000092E4D97 eecc8780
T[0x0B3]     00000000052E9AAA 000000008B000036 00000000052E9AAA 0036   B3
00000000092E4D97 dd1a7d80
T[0x0B4]     00003222164370AA 0000000020000000 0000000000000000 0114   B7
00000000092E4D98 00000000
T[0x0B5]     0000000037E9BEAA 0000020022100036 0000000037E9BEAA 0036   B5
00000000092E4D98 00000000
T[0x0B6]     00000001E78905B4 00000200221003B0 00000001E78905B4 03B0   B6
00000000092E4D98 00000000
T[0x0B7]     00000001E7890964 00000200AB100004 00000001E7890964 0004   B7
00000000092E4D98 e5401880
T[0x0B8]     0000000037F48AAA 000000008B000036 0000000037F48AAA 0036   B8
00000000092E4D98 e1118980
T[0x0B9]     0000000037E2AA9E 000000008B000042 0000000037E2AA9E 0042   B9
00000000092E4D9A db53c780
T[0x0BA]     00000000379374AA 000000008B000036 00000000379374AA 0036   BA
00000000092E4D9C dab82e80
T[0x0BB]     000032220030A5B0 0000000020000000 0000000000000000 0004   BD
00000000092E4D9D 00000000
T[0x0BC]     0000000015B808AA 0000020022100213 0000000015B808AA 0213   BC
00000000092E4D9D 00000000
T[0x0BD]     0000000015B80ABD 00000200AB100004 0000000015B80ABD 0004   BD
00000000092E4D9D dd0cc280
T[0x0BE]     000032220030A5B4 0000000020000000 0000000000000000 05B0   BF
00000000092E4D9E 00000000
T[0x0BF]     0000000018F790AA 00000200AB100114 0000000018F790AA 0114   BF
00000000092E4D9E e44f5180
T[0x0C0]     0000000037940AAA 000000008B000036 0000000037940AAA 0036   C0
00000000092E4DA0 de5f4680
T[0x0C1]     0000000036B2C6AA 000000008B000036 0000000036B2C6AA 0036   C1
00000000092E4DA1 de7cd280
T[0x0C2]     0000000037E354AA 000000008B000036 0000000037E354AA 0036   C2
00000000092E4DA1 ddd83980
T[0x0C3]     000032220030AFFC 0000000020000000 0000000000000000 0004   C4
00000000092E4DA3 00000000
T[0x0C4]     000000001709D0AA 00000200AB100114 000000001709D0AA 0114   C4
00000000092E4DA3 dbed6880
T[0x0C5]     0000000037E8F8AA 000000008B000036 0000000037E8F8AA 0036   C5
00000000092E4DA5 e3c81180
T[0x0C6]     000032221750D0AA 0000000020000000 0000000000000000 0114   C7
00000000092E4DA6 00000000
T[0x0C7]     0000000022BF80AA 00000200AB10020D 0000000022BF80AA 020D   C7
00000000092E4DA6 e4f03580
T[0x0C8]     0000000036CD2AA2 000000008B00003E 0000000036CD2AA2 003E   C8
00000000092E4DA6 dfce4880
T[0x0C9]     0000000037935C9E 000000008B000042 0000000037935C9E 0042   C9
00000000092E4DA6 d9c4f280
T[0x0CA]     0000000037D96EA2 000000008B00003E 0000000037D96EA2 003E   CA
00000000092E4DA6 e1515880
T[0x0CB]     000000003793AAA2 000000008B00003E 000000003793AAA2 003E   CB
00000000092E4DA6 e329db80
T[0x0CC]     00000000360C06A2 000000008B00003E 00000000360C06A2 003E   CC
00000000092E4DA6 ef4bbe80
T[0x0CD]     00003222176A08AA 0000000020000000 0000000000000000 017C   CF
00000000092E4DA7 00000000
T[0x0CE]     0000000019FA10AA 000002002210025D 0000000019FA10AA 025D   CE
00000000092E4DA7 00000000
T[0x0CF]     0000000019FA1307 00000200AB100004 0000000019FA1307 0004   CF
00000000092E4DA7 e1d1b680
T[0x0D0]     0000322218DE08AA 0000000020000000 0000000000000000 017D   D1
00000000092E4DAA 00000000
T[0x0D1]     0000000015B710AA 00000200AB1001A5 0000000015B710AA 01A5   D1
00000000092E4DAA df679780
T[0x0D2]     0000000037946EAA 000000008B000036 0000000037946EAA 0036   D2
00000000092E4DAA e149a680
T[0x0D3]     00000000366F54AA 000000008B000036 00000000366F54AA 0036   D3
00000000092E4DAA e22b2380
T[0x0D4]     0000000035C306AA 000000008B000036 0000000035C306AA 0036   D4
00000000092E4DAB e026b880
T[0x0D5]     000000003794D6AA 000000008B000036 000000003794D6AA 0036   D5
00000000092E4DAB dd440980
T[0x0D6]     0000322223A7B8AA 0000000020000000 0000000000000000 0114   D9
00000000092E4DAC 00000000
T[0x0D7]     000000003798C4AA 0000020022100036 000000003798C4AA 0036   D7
00000000092E4DAC 00000000
T[0x0D8]     00000000CC3CA11C 0000020022100198 00000000CC3CA11C 0198   D8
00000000092E4DAC 00000000
T[0x0D9]     00000000CC3CA2B4 00000200AB100004 00000000CC3CA2B4 0004   D9
00000000092E4DAC e1d1b180
T[0x0DA]     00000000366F5EAA 000000008B000036 00000000366F5EAA 0036   DA
00000000092E4DAD df8b0280
T[0x0DB]     0000000037FC849E 000000008B000042 0000000037FC849E 0042   DB
00000000092E4DAE e5609d80
T[0x0DC]     00003222DB414000 0000000020000000 0000000000000000 0594   DF
00000000092E4DAF 00000000
T[0x0DD]     0000000037CF6AAA 0000020022100036 0000000037CF6AAA 0036   DD
00000000092E4DAF 00000000
T[0x0DE]     00000000609B00E0 00000200221002C6 00000000609B00E0 02C6   DE
00000000092E4DAF 00000000
T[0x0DF]     00000000609B03A6 00000200AB100004 00000000609B03A6 0004   DF
00000000092E4DAF daeeb780
T[0x0E0]     0000000036EA869E 000000008B000042 0000000036EA869E 0042   E0
00000000092E4DAF e5609b80
T[0x0E1]     00000000360BB6AA 000000008B000036 00000000360BB6AA 0036   E1
00000000092E4DAF dd440380
T[0x0E2]     0000000037F964AA 000000008B000036 0000000037F964AA 0036   E2
00000000092E4DAF de3e1680
T[0x0E3]     00003222E759E30E 0000000020000000 0000000000000000 0004   E4
00000000092E4DB1 00000000
T[0x0E4]     000000001ACB08AA 00000200AB100114 000000001ACB08AA 0114   E4
00000000092E4DB1 e1858880
T[0x0E5]     0000000037F4569E 000000008B000042 0000000037F4569E 0042   E5
00000000092E4DB4 da44a880
T[0x0E6]     00003222FF0375A0 0000000020000000 0000000000000000 02E7   E7
00000000092E4DB5 00000000
T[0x0E7]     00000000194E78AA 00000200AB10025D 00000000194E78AA 025D   E7
00000000092E4DB5 dbf4e780
T[0x0E8]     000032223798DEAA 0000000020000000 0000000000000000 0036   E9
00000000092E4DB9 00000000
T[0x0E9]     000000001B0CC09E 00000200AB100120 000000001B0CC09E 0120   E9
00000000092E4DB9 e215b180
T[0x0EA]     0000322237CFFCAA 0000000020000000 0000000000000000 0036   EC
00000000092E4DBC 00000000
T[0x0EB]     000000001BE8A4AA 00000200221001A5 000000001BE8A4AA 01A5   EB
00000000092E4DBC 00000000
T[0x0EC]     000000001BE8A64F 00000200AB100004 000000001BE8A64F 0004   EC
00000000092E4DBC dac9bb80
T[0x0ED]     000032222573CA25 0000000020000000 0000000000000000 0004   EE
00000000092E4DBC 00000000
T[0x0EE]     000000001AF250AA 00000200AB100113 000000001AF250AA 0113   EE
00000000092E4DBC dd416d80
T[0x0EF]     0000322236C8A4AA 0000000020000000 0000000000000000 0036   F2
00000000092E4DBC 00000000
T[0x0F0]     000000003792B6AA 0000020022100036 000000003792B6AA 0036   F0
00000000092E4DBC 00000000
T[0x0F1]     00000001E1195000 00000200221005B0 00000001E1195000 05B0   F1
00000000092E4DBC 00000000
T[0x0F2]     00000001E11955B0 00000200AB100004 00000001E11955B0 0004   F2
00000000092E4DBC db154180
T[0x0F3]     0000322236C8A6AA 0000000020000000 0000000000000000 0036   F4
00000000092E4DBD 00000000
T[0x0F4]     000000001A9A20AA 00000200AB100115 000000001A9A20AA 0115   F4
00000000092E4DBD ddb81080
T[0x0F5]     0000000037E2DEAA 000000008B000036 0000000037E2DEAA 0036   F5
00000000092E4DBD e0935380
T[0x0F6]     000032223798F6AA 0000000020000000 0000000000000000 0036   F7
00000000092E4DBF 00000000
T[0x0F7]     0000000024B060AA 00000200AB100114 0000000024B060AA 0114   F7
00000000092E4DBF db989b80
T[0x0F8]     00000000379546AA 000000008B000036 00000000379546AA 0036   F8
00000000092E4DC3 ddb53480
T[0x0F9]     00003222D5FFDFFC 0000000020000000 0000000000000000 0004   FA
00000000092E4DC6 00000000
T[0x0FA]     00000000247D60AA 00000200AB100114 00000000247D60AA 0114   FA
00000000092E4DC6 f639f080
T[0x0FB]     000000003790D2AA 000000008B000036 000000003790D2AA 0036   FB
00000000092E4DC6 e5c9c880
T[0x0FC]     00003222DF2039A9 0000000020000000 0000000000000000 0004   FF
00000000092E4DCC 00000000
T[0x0FD]     0000000036CBDC9E 0000020022100042 0000000036CBDC9E 0042   FD
00000000092E4DCC 00000000
T[0x0FE]     00000001E2B9B680 000002002210059C 00000001E2B9B680 059C   FE
00000000092E4DCC 00000000
T[0x0FF]     00000001E2B9BC1C 00000200AB100004 00000001E2B9BC1C 0004   FF
00000000092E4DCC e3f0ac80

RX Desc ring dump
R[desc]      [address 63:0  ] [vl er S cks ln] [bi->dma       ] [bi->skb]
R[0x000]     000000001E530812 000000009B1D0468 000000001E530812 de354e80
R[0x001]     000000001C0E2012 000000002FBB0185 000000001C0E2012 e189cd80
R[0x002]     0000000017570012 000000003B1F0040 0000000017570012 e5854180
R[0x003]     00000000187A9812 000000004D8A0040 00000000187A9812 dc194080
R[0x004]     0000000022D84012 0000000020A0029A 0000000022D84012 e3797880
R[0x005]     0000000017238012 0000000038B403DD 0000000017238012 dda96180
R[0x006]     000000001F66F812 000000002A790046 000000001F66F812 e39aa980
R[0x007]     000000001B34F812 00000000E3C2018E 000000001B34F812 dc4c8680
R[0x008]     0000000019F81812 0000000032280040 0000000019F81812 e1515a80
R[0x009]     000000001A47A012 000000001E7C0046 000000001A47A012 e3e56280
R[0x00A]     00000000192DF812 00000000744F0040 00000000192DF812 dae8d580
R[0x00B]     00000000167A7012 0000000067510290 00000000167A7012 e2290080
R[0x00C]     000000001AB39012 00000000BE0F01CE 000000001AB39012 e593cd80
R[0x00D]     0000000017794812 00000000A90A01DF 0000000017794812 e43e4280
R[0x00E]     000000001AA77812 0000000051290246 000000001AA77812 ed3a1080
R[0x00F]     000000001694E812 00000000512B0040 000000001694E812 dda96c80
R[0x010]     00000000179F6012 000000003E790040 00000000179F6012 de593b80
R[0x011]     0000000024DCE012 00000000A66F0040 0000000024DCE012 e326bd80
R[0x012]     0000000016DA6812 0000000084B60040 0000000016DA6812 dadcf280
R[0x013]     000000001E508812 000000001DBE0040 000000001E508812 ddcb3680
R[0x014]     000000001F5A2012 00000000E3AB0040 000000001F5A2012 deec8b80
R[0x015]     000000001DEB1812 000000004DF70219 000000001DEB1812 e386cc80
R[0x016]     000000001A883012 00000000CF120225 000000001A883012 dc732680
R[0x017]     0000000018563012 00000000C850031B 0000000018563012 e5344180
R[0x018]     0000000015E57012 0000000030A50040 0000000015E57012 daffd880
R[0x019]     000000001D466812 00000000830E0040 000000001D466812 dc9a2980
R[0x01A]     000000001CF1C012 000000009F5300FE 000000001CF1C012 e329d080
R[0x01B]     0000000017B6C012 0000000078D70215 0000000017B6C012 e4b87680
R[0x01C]     000000001C7E6812 0000000034400040 000000001C7E6812 dae8d280
R[0x01D]     000000001A40F012 00000000F55602C3 000000001A40F012 de1b8580
R[0x01E]     000000001F8BF012 0000000043D10046 000000001F8BF012 ef28dd80
R[0x01F]     000000001AA03012 0000000067320286 000000001AA03012 de80e580
R[0x020]     000000001FBB8012 0000000032F20231 000000001FBB8012 db989e80
R[0x021]     0000000019073812 00000000362B0040 0000000019073812 dab21c80 NTU
R[0x022]     0000000021AB1012 00000000CE750040 0000000021AB1012 ee8a3680 NTC
R[0x023]     000000001B47F012 000000004B3E020A 000000001B47F012 dab21380
R[0x024]     000000001FCEB812 000000007CAA0040 000000001FCEB812 dd416080
R[0x025]     000000001B781812 00000000D5050040 000000001B781812 e2755180
R[0x026]     0000000015E57812 000000001F65004E 0000000015E57812 df8b0d80
R[0x027]     00000000227A7812 0000000043C401D5 00000000227A7812 df137880
R[0x028]     000000001FA0D012 00000000807E0046 000000001FA0D012 ed908680
R[0x029]     000000001B782812 000000006B030295 000000001B782812 da088a80
R[0x02A]     0000000017DA7812 00000000AC0901E7 0000000017DA7812 e093ed80
R[0x02B]     000000001C4CA812 000000003DB20060 000000001C4CA812 f01a2880
R[0x02C]     00000000196FA012 00000000F5860290 00000000196FA012 dcca3480
R[0x02D]     0000000015940812 000000005C92022F 0000000015940812 e593ce80
R[0x02E]     000000001788F012 00000000DA1B0046 000000001788F012 e38e9880
R[0x02F]     0000000019B8E012 00000000583F01DA 0000000019B8E012 dd86b480
R[0x030]     00000000175F4812 00000000AEA201DC 00000000175F4812 dd255680
R[0x031]     0000000020E8D012 00000000AA950040 0000000020E8D012 d94e9780
R[0x032]     000000001B20D012 000000000612020C 000000001B20D012 e2e7bd80
R[0x033]     000000001ECEE812 0000000091250040 000000001ECEE812 e4f03d80
R[0x034]     000000001D3CC012 00000000E3300040 000000001D3CC012 da96de80
R[0x035]     000000001A9AC012 00000000768A0040 000000001A9AC012 e46a9b80
R[0x036]     000000001A47A812 00000000D3D60040 000000001A47A812 eecc8c80
R[0x037]     0000000017418012 0000000048810040 0000000017418012 e3c81d80
R[0x038]     000000001C4D1012 0000000044180042 000000001C4D1012 e595cb80
R[0x039]     00000000177A3012 0000000085360042 00000000177A3012 d9c4fb80
R[0x03A]     0000000016091012 00000000110501D8 0000000016091012 da663b80
R[0x03B]     000000001AFB9012 00000000A5F10040 000000001AFB9012 dc8c8a80
R[0x03C]     00000000159C3812 00000000D7AA0040 00000000159C3812 e4d94c80
R[0x03D]     00000000150BA012 000000008D700040 00000000150BA012 dcc9b480
R[0x03E]     0000000017A79812 0000000094D20040 0000000017A79812 e22b2d80
R[0x03F]     0000000016E56812 000000009D910042 0000000016E56812 de593d80
R[0x040]     00000000178C2012 00000000144604A7 00000000178C2012 dcc29e80
R[0x041]     000000001DC0F012 00000000913302A3 000000001DC0F012 dd4efc80
R[0x042]     000000001B175012 0000000071650218 000000001B175012 dc654680
R[0x043]     0000000017F91812 0000000071C5021D 0000000017F91812 e1c46380
R[0x044]     00000000197AA812 00000000732E0046 00000000197AA812 dbadf480
R[0x045]     0000000019484012 00000000E729004E 0000000019484012 e293e180
R[0x046]     000000001901A012 0000000025270040 000000001901A012 e3358180
R[0x047]     0000000024086812 00000000C7CF0040 0000000024086812 de0b3280
R[0x048]     0000000017943812 0000000035480253 0000000017943812 dc083180
R[0x049]     0000000024314012 000000007FBA0040 0000000024314012 e4f2a580
R[0x04A]     000000001A47D012 000000001F420272 000000001A47D012 e4749880
R[0x04B]     00000000179F3012 000000008A8601EF 00000000179F3012 e5854b80
R[0x04C]     000000002034F812 0000000017750288 000000002034F812 da96d280
R[0x04D]     000000001DC31812 0000000083120438 000000001DC31812 e4f2ab80
R[0x04E]     0000000016F50812 00000000928C02B8 0000000016F50812 e5663180
R[0x04F]     000000001CBEF812 00000000FE4401FF 000000001CBEF812 da674180
R[0x050]     000000002532F012 0000000081DB0040 000000002532F012 daffdd80
R[0x051]     000000001784B812 0000000090BD0040 000000001784B812 db53cb80
R[0x052]     000000001DA91012 00000000C4010297 000000001DA91012 e2f66b80
R[0x053]     000000001C254012 00000000C4D60040 000000001C254012 da70e780
R[0x054]     0000000016DD7012 00000000A4B5030A 0000000016DD7012 dd4efb80
R[0x055]     00000000165CE812 00000000E6080277 00000000165CE812 daac3a80
R[0x056]     000000001E5C3812 000000009CD80040 000000001E5C3812 e55b4c80
R[0x057]     0000000018BD4012 000000003DAD0046 0000000018BD4012 ddc21c80
R[0x058]     00000000172EB812 000000001D0D0247 00000000172EB812 e4d94780
R[0x059]     0000000019173812 00000000CC320040 0000000019173812 d7a18380
R[0x05A]     00000000157B4812 000000004A190040 00000000157B4812 dec31680
R[0x05B]     00000000208A4812 000000000B9C0040 00000000208A4812 da7f8480
R[0x05C]     000000001C409012 00000000CD1B02D8 000000001C409012 efe7ea80
R[0x05D]     0000000018D86012 00000000DEB801EB 0000000018D86012 e301e080
R[0x05E]     0000000019965812 00000000AF8702A7 0000000019965812 dc021c80
R[0x05F]     000000002046A812 00000000A5610217 000000002046A812 e4370480
R[0x060]     0000000022111812 00000000F8560040 0000000022111812 e3f0a980
R[0x061]     00000000175CF812 000000007D34048E 00000000175CF812 e1168480
R[0x062]     000000001C2F1012 0000000037C40040 000000001C2F1012 efdbe880
R[0x063]     0000000017349012 00000000DBC70046 0000000017349012 daeebc80
R[0x064]     0000000017F4F012 0000000013E50040 0000000017F4F012 dad55580
R[0x065]     00000000216F1012 00000000C8A70040 00000000216F1012 de80ea80
R[0x066]     0000000018D6A812 000000005FEA0478 0000000018D6A812 e2cdb280
R[0x067]     000000001506F812 000000008B5F0042 000000001506F812 df395d80
R[0x068]     0000000025CC6812 00000000D2B40233 0000000025CC6812 e385d080
R[0x069]     000000001A2C2012 0000000047270046 000000001A2C2012 e22b2b80
R[0x06A]     000000001C7E6012 000000008BEE02A9 000000001C7E6012 e40a5480
R[0x06B]     0000000015886012 00000000D6F90040 0000000015886012 ee35bc80
R[0x06C]     0000000015FD6012 00000000C9AB0046 0000000015FD6012 e5fa7c80
R[0x06D]     0000000018BD4812 0000000003650040 0000000018BD4812 de5f4c80
R[0x06E]     0000000018CE5812 00000000E8610214 0000000018CE5812 e5401080
R[0x06F]     000000001F024812 0000000055E50339 000000001F024812 e4c7d580
R[0x070]     00000000168DD812 0000000067C301B4 00000000168DD812 dc926180
R[0x071]     0000000015E74812 00000000CBAB0040 0000000015E74812 e152ea80
R[0x072]     000000001AA25812 0000000033AE0040 000000001AA25812 ddb81780
R[0x073]     000000001B884812 000000005C7C0246 000000001B884812 ecd82380
R[0x074]     000000001DAAD812 000000005CAB0040 000000001DAAD812 de983b80
R[0x075]     00000000179D7012 0000000015BE0217 00000000179D7012 e578ed80
R[0x076]     0000000015BD4812 00000000F2F20040 0000000015BD4812 dc926580
R[0x077]     0000000018359012 0000000018380046 0000000018359012 e5854880
R[0x078]     0000000019D43812 00000000AEF50046 0000000019D43812 e4d2c280
R[0x079]     000000001A477012 000000004DEA0046 000000001A477012 dd3da580
R[0x07A]     0000000019394812 0000000046840252 0000000019394812 dd416a80
R[0x07B]     00000000158DE012 0000000078CF039F 00000000158DE012 dc083e80
R[0x07C]     0000000015C29812 000000003F8001C4 0000000015C29812 e5401380
R[0x07D]     0000000015CC8812 00000000FA310040 0000000015CC8812 da96d980
R[0x07E]     0000000025DE5812 0000000096520040 0000000025DE5812 e1858680
R[0x07F]     000000001506F012 00000000D9C30252 000000001506F012 e44f5680
R[0x080]     0000000015B83012 0000000041C2016B 0000000015B83012 dd440080
R[0x081]     00000000187D9812 000000005AC401C4 00000000187D9812 dbb54c80
R[0x082]     00000000165CF812 00000000ABD703A8 00000000165CF812 ee35b680
R[0x083]     000000001733A012 0000000033F30040 000000001733A012 dabdce80
R[0x084]     00000000194FF812 00000000904B0040 00000000194FF812 db772480
R[0x085]     000000001697C012 0000000027430042 000000001697C012 ddb81980
R[0x086]     000000001AE02012 00000000226D0042 000000001AE02012 e1b6fe80
R[0x087]     000000001AEB9012 0000000053570046 000000001AEB9012 dc6fc380
R[0x088]     000000001F078012 00000000F13501CC 000000001F078012 dd760880
R[0x089]     000000001A6AE012 0000000059D10040 000000001A6AE012 e4d40b80
R[0x08A]     000000001670D012 000000005D880236 000000001670D012 de5c5580
R[0x08B]     000000001AFCF812 000000005FA70040 000000001AFCF812 daffd180
R[0x08C]     000000001E6F8812 00000000E29901DF 000000001E6F8812 e2ffeb80
R[0x08D]     000000001D30C012 00000000120E01CF 000000001D30C012 dab51480
R[0x08E]     0000000016C60812 00000000DD4F0328 0000000016C60812 e026b780
R[0x08F]     000000001DFE9012 00000000FD740251 000000001DFE9012 dd3da180
R[0x090]     00000000181BA012 00000000B6790040 00000000181BA012 daa4e980
R[0x091]     000000001D446012 00000000279F01E4 000000001D446012 de9a4d80
R[0x092]     0000000019E6F812 00000000BCEF01BC 0000000019E6F812 e2cdbd80
R[0x093]     000000001AE1D012 00000000F15403DC 000000001AE1D012 da7f8780
R[0x094]     00000000176E3012 000000002BE20040 00000000176E3012 e329d480
R[0x095]     000000001BC26012 00000000AB130040 000000001BC26012 de5f4e80
R[0x096]     00000000183B6812 0000000020DD0046 00000000183B6812 e5911080
R[0x097]     000000001C5BC812 00000000CEFD02F7 000000001C5BC812 e57b5380
R[0x098]     000000001E0C9812 00000000545D0250 000000001E0C9812 f6aba480
R[0x099]     00000000196BF812 00000000C4000042 00000000196BF812 d9385080
R[0x09A]     000000001B96B812 00000000200B02FB 000000001B96B812 e24aaa80
R[0x09B]     000000001D175812 000000000620024C 000000001D175812 db53c980
R[0x09C]     0000000018184012 00000000B1430040 0000000018184012 dd3c8380
R[0x09D]     000000001E776812 0000000017D60040 000000001E776812 e2594980
R[0x09E]     000000001D68D812 000000002C390342 000000001D68D812 e359b980
R[0x09F]     0000000017B69012 000000002E8D0052 0000000017B69012 e19b2080
R[0x0A0]     000000001AE2B812 0000000034DE0046 000000001AE2B812 e2564380
R[0x0A1]     000000001E898812 000000006CFC0040 000000001E898812 db989780
R[0x0A2]     0000000016AF1812 00000000121B02C8 0000000016AF1812 d9dedb80
R[0x0A3]     0000000018E9F012 000000004C0E0046 0000000018E9F012 dfc9d980
R[0x0A4]     000000001D768012 00000000007A0040 000000001D768012 e1d87380
R[0x0A5]     0000000023F4D012 00000000FEB90040 0000000023F4D012 dd416180
R[0x0A6]     00000000216F1812 00000000D03E0046 00000000216F1812 e3e8bd80
R[0x0A7]     000000001C30F812 00000000188D0250 000000001C30F812 ed3a1c80
R[0x0A8]     000000001D141012 0000000005220040 000000001D141012 dde0c880
R[0x0A9]     0000000016546812 00000000613D01BF 0000000016546812 dbf70c80
R[0x0AA]     000000001F180812 000000009BA50040 000000001F180812 e152e580
R[0x0AB]     0000000019990012 0000000078B30457 0000000019990012 f6f05280
R[0x0AC]     000000001929A012 00000000369F0040 000000001929A012 dab21180
R[0x0AD]     0000000018AA8012 00000000AE260307 0000000018AA8012 e4b87880
R[0x0AE]     000000001A068012 000000009D3C0040 000000001A068012 dd901480
R[0x0AF]     0000000018744012 00000000C6E30040 0000000018744012 de593880
R[0x0B0]     000000001C9A3812 0000000074F000DA 000000001C9A3812 e2cde980
R[0x0B1]     00000000156E5012 000000002A3F038E 00000000156E5012 de8c1880
R[0x0B2]     0000000017A32812 0000000083C00040 0000000017A32812 da1f0780
R[0x0B3]     0000000022986012 000000005A4201DD 0000000022986012 e53fc580
R[0x0B4]     000000001DEB1012 00000000885201BD 000000001DEB1012 da4cb280
R[0x0B5]     000000001D30C812 0000000006980046 000000001D30C812 da636d80
R[0x0B6]     00000000187D9012 000000004FE80040 00000000187D9012 e0266a80
R[0x0B7]     000000001ED51812 00000000A4C10040 000000001ED51812 ed416d80
R[0x0B8]     00000000181A2012 00000000F6B202F3 00000000181A2012 e1db4380
R[0x0B9]     0000000015812012 00000000D0AD024D 0000000015812012 db154880
R[0x0BA]     00000000164A1012 000000004F060040 00000000164A1012 e329da80
R[0x0BB]     0000000018E7E812 000000008B1C0040 0000000018E7E812 da314780
R[0x0BC]     0000000016573812 00000000D0150040 0000000016573812 dc81f780
R[0x0BD]     0000000018FF0012 000000003DB80040 0000000018FF0012 dd3c8480
R[0x0BE]     00000000181BA812 00000000DE220040 00000000181BA812 ebf01c80
R[0x0BF]     0000000017F91012 000000004B7D0040 0000000017F91012 e2290780
R[0x0C0]     000000002091B812 00000000C63E0040 000000002091B812 e52d3780
R[0x0C1]     000000001FFB6012 00000000AE4D0040 000000001FFB6012 da68a880
R[0x0C2]     000000001848C012 0000000057430040 000000001848C012 e06e7380
R[0x0C3]     000000001EA3E012 00000000EE5C0040 000000001EA3E012 e2290980
R[0x0C4]     0000000019E56012 00000000BF260227 0000000019E56012 dc02dd80
R[0x0C5]     0000000020F70812 000000005EEC01DB 0000000020F70812 daec3b80
R[0x0C6]     000000001D87B812 000000000B2601A7 000000001D87B812 e4f4f080
R[0x0C7]     000000001D45A012 000000003A2A0040 000000001D45A012 da4cb180
R[0x0C8]     0000000017597812 00000000F1D602AE 0000000017597812 e1973280
R[0x0C9]     000000001AF8E812 00000000394201D6 000000001AF8E812 dab84880
R[0x0CA]     000000001BEA3812 0000000045320040 000000001BEA3812 d7a18680
R[0x0CB]     0000000017868812 00000000FED60040 0000000017868812 e593c480
R[0x0CC]     0000000018550012 000000004DCB0040 0000000018550012 da088880
R[0x0CD]     00000000192FA012 00000000DE5D0040 00000000192FA012 e46a9980
R[0x0CE]     0000000017A16812 000000003B9C0277 0000000017A16812 dad13780
R[0x0CF]     000000001D466012 0000000025BF0040 000000001D466012 dd255180
R[0x0D0]     0000000023719012 000000007A950252 0000000023719012 e1d1b880
R[0x0D1]     000000001AFC9012 0000000009EF0040 000000001AFC9012 e1723680
R[0x0D2]     0000000015081812 00000000F4780040 0000000015081812 de9a4e80
R[0x0D3]     0000000015062812 00000000CBDA0040 0000000015062812 ddcc5d80
R[0x0D4]     000000001B506012 00000000883D02AE 000000001B506012 e359b380
R[0x0D5]     0000000025465012 00000000855B0040 0000000025465012 ee8a3e80
R[0x0D6]     00000000196F0812 000000004B430040 00000000196F0812 e5854280
R[0x0D7]     000000001DDCF812 00000000632C01C6 000000001DDCF812 ef4bb780
R[0x0D8]     000000001A2C3012 00000000FEBA0246 000000001A2C3012 dabdc280
R[0x0D9]     000000001E3FD012 00000000CDC00040 000000001E3FD012 e4b50780
R[0x0DA]     0000000016C82012 000000009CB90040 0000000016C82012 dd440880
R[0x0DB]     000000001C453012 00000000869C0245 000000001C453012 de4f4d80
R[0x0DC]     000000001C9A3012 000000005C0A0040 000000001C9A3012 dcc9b880
R[0x0DD]     0000000016B38812 0000000046CA0229 0000000016B38812 dae87780
R[0x0DE]     000000001B826812 00000000A2110040 000000001B826812 efdda080
R[0x0DF]     000000001AD79012 00000000A0D002B7 000000001AD79012 dc288b80
R[0x0E0]     0000000024772012 00000000B11703F1 0000000024772012 df395480
R[0x0E1]     000000001B100812 0000000028730046 000000001B100812 e1858380
R[0x0E2]     000000001E2F1812 00000000A5CE0040 000000001E2F1812 db989a80
R[0x0E3]     0000000016E5D012 0000000067EA01BC 0000000016E5D012 d9906480
R[0x0E4]     000000001F04C812 00000000B7A30046 000000001F04C812 e6fc0b80
R[0x0E5]     0000000018D5F812 00000000D00800FF 0000000018D5F812 de3cf580
R[0x0E6]     000000001B294012 0000000091650040 000000001B294012 daab7280
R[0x0E7]     0000000016C4E812 00000000A30E031C 0000000016C4E812 efc7c980
R[0x0E8]     000000001B71E012 0000000018590305 000000001B71E012 e2cdb680
R[0x0E9]     000000001E767012 000000006D9F0040 000000001E767012 da68aa80
R[0x0EA]     000000001C172812 00000000FD450040 000000001C172812 e4f4f980
R[0x0EB]     0000000015C13012 000000008ED10040 0000000015C13012 e50b4080
R[0x0EC]     0000000015445012 000000003F160040 0000000015445012 e2755880
R[0x0ED]     00000000204B8812 00000000F8CC05EE 00000000204B8812 e0fa2780
R[0x0EE]     000000001F7D1012 00000000DA2E01CB 000000001F7D1012 e5021e80
R[0x0EF]     000000002187D812 000000004399040D 000000002187D812 e16e5980
R[0x0F0]     0000000015E51012 00000000DB670040 0000000015E51012 dc194a80
R[0x0F1]     0000000020DEF012 0000000021D70220 0000000020DEF012 e4370880
R[0x0F2]     0000000023088812 00000000B1B10040 0000000023088812 e2cde880
R[0x0F3]     0000000015081012 00000000396D0292 0000000015081012 ddcf4480
R[0x0F4]     0000000018257012 00000000588C02D6 0000000018257012 dd874080
R[0x0F5]     000000001557C812 000000009D670040 000000001557C812 ddd97b80
R[0x0F6]     000000001B08F012 00000000D1BF0050 000000001B08F012 e3e8b780
R[0x0F7]     00000000184C8812 00000000BE6D0040 00000000184C8812 e385d680
R[0x0F8]     000000001B71E812 00000000F4AB0040 000000001B71E812 e30c1480
R[0x0F9]     000000001E2B4012 00000000B02F0040 000000001E2B4012 d99c9580
R[0x0FA]     000000001B855812 00000000764A0040 000000001B855812 dfabd480
R[0x0FB]     0000000021390812 00000000832B03F8 0000000021390812 dcca3180
R[0x0FC]     000000001E776012 00000000C56F01DE 000000001E776012 db28f080
R[0x0FD]     000000001B8B6012 00000000356C0267 000000001B8B6012 e21e7780
R[0x0FE]     00000000165B2812 00000000283C01C3 00000000165B2812 dac47d80
R[0x0FF]     000000001B256012 00000000B60D0040 000000001B256012 dbe47680
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: RX
Comment 33 Marcus Alves Grando 2008-05-07 14:07:36 EDT
Jesse, any news?
Comment 34 Jesse Brandeburg 2008-05-07 19:41:27 EDT
Hi Marcus, sorry to take so long to respond.  Looking at the log you sent (which
is useful, thank you!), I see some odd behavior around the point of failure in
the log.

I think we are probably seeing more (we've seen some before) of a bad reaction
between the TSO code and the workarounds for 82544.  As far as I know the
last_tx_tso workaround is not necessary for 82544, only for 82545/82546, but I
think the code is running it, because I see TSO descriptor setups ending with a
4 byte descriptor, which is the workaround.

the kernel you're testing doesn't have CONFIG_DEBUG_SLAB enabled does it?

I also see some descriptors which seem not quite right.

My recommendation is to continue with TSO disabled.  

Andy, we might try to make a driver that doesn't ever set last_tx_tso if
mac_type == 82544.

Comment 35 Andy Gospodarek 2008-05-08 15:07:11 EDT
Created attachment 304888 [details]
rhel4-82544-workaround.patch

Jesse, if you suspect that the 82544 doesn't need the workaround, then a simple
patch like this would probably do the trick.  I can integrate it into my test
kernels if this seems reasonable enough.
Comment 36 Jesse Brandeburg 2008-05-15 15:13:43 EDT
Andy, the patch is completely reasonable, but I don't know for sure if it will
fix the problem.  Can the reporter test it?

Comment 38 Frank Hirtz 2008-05-16 10:58:07 EDT
Created attachment 305695 [details]
Intel patch to decrease the size of the ring descriptors and increase their numbers to compensate
Comment 39 Jesse Brandeburg 2008-05-16 12:01:49 EDT
(In reply to comment #38)
> Created an attachment (id=305695) [edit]
> Intel patch to decrease the size of the ring descriptors and increase their
> numbers to compensate

Who at Intel gave you this patch?  I am the lead of the group that maintains 
and develops these drivers.  While I know about the errata and this patch, I 
STRONGLY suggest this not be made a default change to the driver.  It is not 
applicable to many of the parts that this driver supports and this patch has a 
very large performance impact in the general case.

Comment 40 Jesse Brandeburg 2008-05-16 12:29:50 EDT
additionally, this bug is on an 82544, which will *not* be solved by the patch 
submitted by Frank.
Comment 41 Andy Gospodarek 2008-05-16 12:36:49 EDT
Jesse, I now have a couple concerns about this patch as well.  

1.  I'd like to understand the performance impact before rolling something like
this as the default.  We were told that there could be some 82546 issues that
this patch would reveal.

2.  If there is specific hardware that this might help, I'd rather see a change
in e1000_xmit_frame that only writes small hunks to each descriptor for that
hardware rather than for all devices.

Sorry to confuse the issue a bit here, but we now seem to be working an 82544
and 82546 problem in this same bug.  This is getting to be a bit confusing for
everyone.  :-/

For 82544 we still plan to test the patch you suggested and I attached in
comment #35 where we disable the TSO workaround and that patch should be in my
test kernels soon.

The patch posted in comment #38 was to address the 82546 issue and may not make
it into my next test kernel build (scheduled for today).
Comment 42 Jesse Brandeburg 2008-05-16 13:53:18 EDT
okay, can we get another bug going (or can i get added to an existing one) for
the 82546 tx hang issue?

I think that is the best route at this point.  It can be public or private, your
call.
Comment 43 Marcus Alves Grando 2008-05-16 14:27:14 EDT
Guys,

Since yesterday at 21:00 the patch provided in comment #35 has been running.
So far no watchdog timeout has happened.

I'll wait until Thursday to see if it works well. That's enough time for it to
happen again.

Regards
Comment 44 Andy Gospodarek 2008-05-16 14:40:44 EDT
Marcus, thank you for the feedback.  I will look forward to more news from you
-- hopefully good news!  :)

Comment 45 Andy Gospodarek 2008-05-16 23:03:27 EDT
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel4

Please test them and report back your results.
Comment 47 Andy Gospodarek 2008-05-22 15:35:30 EDT
I realize this is getting confusing at this point (I need to open a new BZ to
address the 82545/6 issues), but I created a patch that will now set the tx
descriptor size (setting the number of descriptors was already available) as a
module parameter.

The patch is here:

http://people.redhat.com/agospoda/rhel4/0019-e1000-add-module-parameter-to-set-transmit-descript.patch

And the test kernels are here:

http://people.redhat.com/agospoda/#rhel4

I have done some light testing and it appears to work.  Using the new parameter 
'TxDescPower' is a bit tricky, so I'll explain it a bit.  Here are the comments
I added to the code to address it.

/* Transmit Descriptor Power
 *
 * Valid Range: 6-12
 * This value represents the size-order of each transmit descriptor.
 * The valid size for descriptors would be 2^6 (64) to 2^12 (4096) bytes
 * each.  As this value decreases one may want to consider increasing
 * the TxDescriptors value to maintain the same amount of frame memory.
 *
 * Default Value: 12
 */

I'll break it down with an example.  On the 82545/6 the default number of
descriptors is this (as reported by ethtool):

Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256

and the maximums that can be used are these:

Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096

The default (and current) value for TxDescPower is 12, which translates to
(2^12) 4096 bytes.  In the default case, that would mean that about (4096*256)
1MB of kernel memory would be allocated (minus the small ring-buffer accounting
overhead).

If you have a system with 82545/6 that seems to experience tx unit hangs under
heavy load you may benefit from lowering the TxDescPower so that each DMA will
be smaller and shorter, reducing the chances of a tx unit hang.  Remember that
each time you lower TxDescPower by a value of '1' you are halving the size of
each descriptor buffer, so you should double TxDescriptors to continue to have
the same amount of kernel buffer memory available.

To maintain 1MB of buffer memory you should use pairs like this:

TxDescPower = 12 with TxDescriptors = 256 -> 1MB of buffer memory
TxDescPower = 11 with TxDescriptors = 512 -> 1MB of buffer memory 
TxDescPower = 10 with TxDescriptors = 1024 -> 1MB of buffer memory 
TxDescPower = 9 with TxDescriptors = 2048 -> 1MB of buffer memory 
TxDescPower = 8 with TxDescriptors = 4096 -> 1MB of buffer memory

Since you cannot allocate more than 4096 ring buffer entries, you will start to
use less memory when you use a TxDescPower = 7 or 6, so staying above 8 is
probably a good idea.  I would also recommend stepping down TxDescPower by 1
with each set of tests to find a setting that will work best for your system
rather than jumping all the way to TxDescPower = 8 and TxDescriptors = 4096
right away.
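
The pairing rule above is plain arithmetic, so it can be checked with a short
script.  This is only a sketch: the function names are hypothetical and are not
part of the driver or the patch; the 6-12 range, the 4096-entry limit and the
1MB target are the figures quoted in this comment.

```python
# Sketch of the TxDescPower / TxDescriptors arithmetic described above.
# Function names are hypothetical; the limits come from the comment text.

MAX_TX_DESCRIPTORS = 4096        # "Pre-set maximums" reported by ethtool
MIN_POWER, MAX_POWER = 6, 12     # valid range for TxDescPower


def tx_buffer_bytes(desc_power, descriptors):
    """Total transmit buffer memory, ignoring ring accounting overhead."""
    if not MIN_POWER <= desc_power <= MAX_POWER:
        raise ValueError("TxDescPower must be between 6 and 12")
    if not 0 < descriptors <= MAX_TX_DESCRIPTORS:
        raise ValueError("TxDescriptors must be between 1 and 4096")
    return (1 << desc_power) * descriptors


def descriptors_for(desc_power, target_bytes=1 << 20):
    """Descriptor count that keeps target_bytes of buffer, capped at 4096."""
    wanted = target_bytes >> desc_power
    return min(wanted, MAX_TX_DESCRIPTORS)


if __name__ == "__main__":
    for power in range(MAX_POWER, MIN_POWER - 1, -1):
        count = descriptors_for(power)
        total_kb = tx_buffer_bytes(power, count) // 1024
        print("TxDescPower=%d TxDescriptors=%d -> %d KB" % (power, count, total_kb))
```

Because the descriptor count is capped at 4096, the total falls below 1MB once
TxDescPower drops under 8, which is the reason given above for staying at 8 or
higher.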

There are two possible things to consider when modifying TxDescPower:

1.  As the descriptor size decreases, it is more likely that one frame will
span more than a single descriptor and an extra DMA transfer may be needed to
transmit the frame.  I have not been able to gather any quantitative data on the
performance hit when doing this, but it certainly seems like it would be a
problem in a scenario where there are always large frames being transmitted on
the network -- in the case where there are only small frames it seems possible
there would not be a negative performance impact.

2.  Since there is some overhead for management in each ring descriptor,
increasing the number of descriptors will consume additional memory.  This
value should be around 160 bytes / descriptor on 32-bit systems and 320 bytes on
64-bit systems.  Not significant numbers, but this could consume over 1MB in
some cases.
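
To put the second point in numbers, here is a rough sketch that multiplies the
approximate per-descriptor figures quoted above by the ring length.  The
function name is hypothetical and the byte counts are the estimates from this
comment, not values measured on a running system.

```python
# Approximate per-descriptor management overhead quoted above, in bytes,
# keyed by the word size of the system.
OVERHEAD_BYTES = {32: 160, 64: 320}


def ring_overhead_bytes(descriptors, word_size=32):
    """Approximate bookkeeping memory for a ring of the given length."""
    return descriptors * OVERHEAD_BYTES[word_size]


if __name__ == "__main__":
    for bits in (32, 64):
        for count in (256, 4096):
            kb = ring_overhead_bytes(count, bits) // 1024
            print("%d-bit, %4d descriptors: %d KB" % (bits, count, kb))
```

With these estimates, only the 64-bit case with the full 4096 descriptors
crosses the 1MB mark.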

These two caveats seem trivial on a system that regularly sees tx-timeouts and
the havoc that occurs on affected systems, but I wanted to make sure anyone
testing with them would be aware.
Comment 48 Marcus Alves Grando 2008-05-22 18:59:56 EDT
Andy and Jesse,

Since I applied rhel4-82544-workaround.patch on my system with 82544, the
watchdog timeout has not happened again. That's already 1 week now; without the
patch the watchdog happened twice a day (more or less).

So, I can conclude that 82544 doesn't need last_tx_tso to be set to 1.

One watchdog timeout resolved. Thanks guys.

About 82545/6, I don't have any system with that model. Sorry that I cannot help
with this.
Comment 49 Andy Gospodarek 2008-05-22 19:19:06 EDT
Thank you so much for the feedback, Marcus.  Your testing is appreciated.
Comment 50 Andy Gospodarek 2008-06-16 09:42:43 EDT
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel5

Please test them and report back your results.
Comment 51 Marcus Alves Grando 2008-08-04 14:29:01 EDT
(In reply to comment #50)
> My test kernels have been updated to include a patch for this bugzilla.
> 
> http://people.redhat.com/agospoda/#rhel5
> 
> Please test them and report back your results.

Andy,

The latest patch already works fine. Did you change anything else, or just include it in your test kernel?

Regards
Comment 52 Andy Gospodarek 2008-08-04 14:50:13 EDT
Nothing new in that kernel, my scripts just updated the bugzilla automatically.  Sorry for the noise.
Comment 56 RHEL Product and Program Management 2008-09-03 09:05:14 EDT
Updating PM score.
Comment 58 Marcus Alves Grando 2008-09-19 10:17:12 EDT
Andy and Jesse,

Today, I've updated one server with RHAS 5.2 to 2.6.18-115.el5, and I saw this problem too. It's another Intel model (82541GI) and seems to have the same problem. I've added some additional patches[1]. See below:

# lspci 
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 09)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 09)
00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 09)
00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 09)
00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 09)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
01:00.0 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (A-Segment Bridge) (rev 06)
01:00.2 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (B-Segment Bridge) (rev 06)
02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4 (rev 06)
05:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
05:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
06:07.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller (rev 05)
07:08.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller (rev 05)
08:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
08:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
0b:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]

dmesg:

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <ae>
  TDT                  <98>
  next_to_use          <98>
  next_to_clean        <aa>
buffer_info[next_to_clean]
  time_stamp           <f9b9ec>
  next_to_watch        <af>
  jiffies              <f9bc4e>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <ae>
  TDT                  <98>
  next_to_use          <98>
  next_to_clean        <aa>
buffer_info[next_to_clean]
  time_stamp           <f9b9ec>
  next_to_watch        <af>
  jiffies              <f9be42>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <ae>
  TDT                  <98>
  next_to_use          <98>
  next_to_clean        <aa>
buffer_info[next_to_clean]
  time_stamp           <f9b9ec>
  next_to_watch        <af>
  jiffies              <f9c036>
  next_to_watch.status <0>
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <ae>
  TDT                  <98>
  next_to_use          <98>
  next_to_clean        <aa>
buffer_info[next_to_clean]
  time_stamp           <f9b9ec>
  next_to_watch        <af>
  jiffies              <f9c22a>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
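
For anyone comparing several of these reports, a small parser can turn the hex
fields into numbers.  This is a sketch only: the helper names are hypothetical,
the ring size of 256 is an assumption (the default TxDescriptors value
mentioned earlier in this bug, not confirmed for this NIC), and jiffies
wrap-around is ignored.  Results are in jiffies because the kernel HZ value is
not given here.

```python
import re

# One "Detected Tx Unit Hang" report, copied from the dmesg output above.
REPORT = """\
  Tx Queue             <0>
  TDH                  <ae>
  TDT                  <98>
  next_to_use          <98>
  next_to_clean        <aa>
buffer_info[next_to_clean]
  time_stamp           <f9b9ec>
  next_to_watch        <af>
  jiffies              <f9bc4e>
  next_to_watch.status <0>
"""

RING_SIZE = 256  # assumption: default TxDescriptors, not stated for this NIC

FIELD = re.compile(r"^\s*(.+?)\s+<([0-9a-fA-F]+)>\s*$")


def parse_hang_report(text):
    """Return a dict of field name -> integer (the driver prints hex)."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def stalled_jiffies(fields):
    """Jiffies elapsed since the buffer at next_to_clean was time-stamped."""
    return fields["jiffies"] - fields["time_stamp"]


def pending_descriptors(fields, ring_size=RING_SIZE):
    """Descriptors queued by the driver but not yet cleaned."""
    return (fields["next_to_use"] - fields["next_to_clean"]) % ring_size


if __name__ == "__main__":
    report = parse_hang_report(REPORT)
    print("stalled for %d jiffies" % stalled_jiffies(report))
    print("%d descriptors pending" % pending_descriptors(report))
```

Running the same functions over the four reports above shows time_stamp staying
at f9b9ec while jiffies advances, meaning the same buffer is still waiting to
be cleaned in every report.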

# modinfo e1000
filename:       /lib/modules/2.6.18-115.el5PAE/kernel/drivers/net/e1000/e1000.ko
version:        7.3.20-k2-NAPI
license:        GPL
description:    Intel(R) PRO/1000 Network Driver
author:         Intel Corporation, <linux.nics@intel.com>
srcversion:     B472EA7E306A6055CC76482

At this time TSO is ON; I'll now disable it to test.

[1] Additional patches:
0021-e1000-correctly-set-TSO6-fields-via-ethtool.patch
0022-e1000-add-module-parameter-to-set-tx-descriptor-pow.patch
0023-e1000-restart-receive-unit-workaroun-on-ESB2-hardwa.patch
0024-e1000-disable-TSO-workaround-on-82544.patch
0025-e1000-remove-e1000_clean_tx_irq-call-from-e1000_net.patch

Maybe the same workaround as for the 82544EI would apply?
Comment 59 Andy Gospodarek 2008-09-25 11:32:33 EDT
I don't think the 82541 should have the same changes that were done for 82544.  I'm pretty sure those chips are different enough that there will probably need to be a different fix for that hardware.
Comment 64 Andy Gospodarek 2008-11-14 13:18:44 EST
The customer can try 2 more times since TxDescPower=7 and 6 are both supported, but I cannot guarantee either will work.  Did TxDescPower=8 allow their system to be stable for a longer period of time than previous runs?

With the 82546 there are known issues that this patch hopes to work around by shrinking the amount of data per DMA transfer, but there is no guaranteed way to avoid them.  There is no guarantee that moving back to the older driver will help anything either.  The best way is to consider replacing the hardware with something a bit newer.
Comment 67 Andy Gospodarek 2008-11-17 14:33:09 EST
(In reply to comment #66)
> Masahiro,
> 
> From the sysreport
> 
> system was booted on Nov 11 16:51:50 with kernel  2.6.9-77.EL.gtest.47.
> The issue has resurfaced on 11 17:02:19 with TxDescPower = 8.
> 
> Customer claims he has not seen the issue with 2.6.9-55.0.12. Any kernel
> above this, has an issue with  the NIC.  if we can point to some change in
> the driver that has exposed the bug in the NIC then it would be great.
> Otherwise I am afraid whether we would be able to get the customer accept
> this is an issue with h/w.
> 
> checking with the customer for TxDescPower=7 and 6 
> 
> Ranjith
> 
>

I have spoken with individuals at Intel at length about this issue and they assure me that the 82545/82546 hardware does have an issue where DMA units can hang, and if I remember correctly the high-traffic situations are most often how this is reproduced.  The goal of the workaround patch is to reduce the time that it takes for a DMA transfer to occur.  This limits the chance that we hit the hardware bug, but does not eliminate it.

As I look at the differences between 2.6.9-55 and the latest RHEL4 kernel, the one change that stands out to me is the adaptive interrupt moderation patches that were added during that time.  If we have increased the efficiency of our receive path, we may find more frequent data transfers are occurring and the increased utilization may cause enough contention to make this hardware problem more apparent.  I'm not really sure, though, as I don't have a setup that has ever been able to reliably reproduce the failure.

Any thoughts or comments on the above statements, Jesse?
Comment 68 Andy Gospodarek 2008-11-17 14:34:43 EST
Created attachment 323789 [details]
e1000-rhel4-changes.diff

Here is one giant patch that has all the changes from 2.6.9-55 to the latest rhel4 kernel.
Comment 69 Jesse Brandeburg 2008-11-17 18:49:50 EST
(In reply to comment #67)
> As I look at the differences between 2.6.9-55 and the latest RHEL4 kernel, the
> one change that stands out to me is the adaptive interrupt moderation patches
> that were added during that time.  If we have increased the efficiency of our
> receive path, we may find more frequent data transfers are occurring and the
> increased utilization may cause enough contention to make this hardware problem
> more apparent.  I'm not really sure, though, as I don't have a setup that has
> ever been able to reliably reproduce the failure.

We have not heard many reports about this either.  I did hear a couple of sporadic reports about the "adaptive interrupts" causing some issues for people with pci/pci-x hardware.

At this stage of the game for e1000 I would suggest that you could try to disable adaptive interrupt moderation by default, by defaulting driver parameter InterruptThrottleRate=8000 in e1000_param.c (set default member)

or use modprobe.conf to do the same temporarily; don't forget the commas between the per-port values when the system has multiple ports.

options e1000 InterruptThrottleRate=8000,8000,8000,8000
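
As a sketch of the comma rule, the line can be generated for any number of
ports.  The helper name is hypothetical; the parameter name
InterruptThrottleRate and the value 8000 are the ones given in this comment.

```python
def e1000_itr_options(ports, rate=8000):
    """Build a modprobe.conf line with one InterruptThrottleRate value per port."""
    if ports < 1:
        raise ValueError("need at least one port")
    values = ",".join(str(rate) for _ in range(ports))
    return "options e1000 InterruptThrottleRate=%s" % values


if __name__ == "__main__":
    # Four-port example, matching the line above.
    print(e1000_itr_options(4))
```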

I hope we can continue to work on this to figure it out.
Comment 70 Andy Gospodarek 2008-12-15 16:15:48 EST
Marcus, is there any change to your system when you try Jesse's suggestion from comment #69?

I am trying to finish up my RHEL4 issues since we have an update to RHEL4 soon.  If disabling the adaptive interrupt throttle rate resolves this issue, I could always submit a patch to disable it by default on the affected hardware, but I do not even want to consider this if it will not resolve your issue.  Thanks!
Comment 71 Jesse Brandeburg 2008-12-16 15:47:36 EST
Andy, one suggestion is to set the default for InterruptThrottleRate to 8000 in the driver for RHEL4, this will disable Adaptive moderation in e1000.  If you didn't want to impact latency or only wanted to change fewer setups, you could limit the default option to be 8000 for everything less than 82546.

This is a safe path and makes latency the same as it would have been for 7.2.7
Comment 72 Andy Gospodarek 2008-12-23 17:10:33 EST
Created attachment 327776 [details]
e1000-disable-adaptive-interrupt-throttling.patch

Good suggestion, Jesse.  I'll probably propose this for the next rhel4 update.
Comment 73 Jesse Brandeburg 2008-12-31 19:00:48 EST
Hi Andy, thanks for doing that work.  I'm concerned your patch may be backwards.  82545/6 are the most likely to work with AIM enabled.  I think the e1000 in RHEL4 supports some PCIe devices still, right?  82571 for sure works well with AIM.

I actually added the adaptive interrupt moderation code specifically FOR 82545/6 adapters (and mainly for 82571 back when e1000 supported that family)

My feeling is that if you're going to disable it at all, you probably should just make the default 8000 for all e1000 supported devices (aka PCI/PCI-X) to stay consistent/safe.

The places where I have predominantly seen problem reports are PCI-only systems.  Most 82545/6 adapters are plugged into PCI-X.

The people who like the low latency settings usually will hand-tune their ITR setting, but it might be worth a mention in the release notes if you do change the ITR default (back to the old default) to 8000
Comment 74 Marcus Alves Grando 2009-01-05 12:58:33 EST
(In reply to comment #70)
> Marcus, is there any change to your system when you try Jesse's suggestion from
> comment #69?
> 
> I am trying to finish up my RHEL4 issues since we have an update to RHEL4 soon.
>  If disabling the adaptive interrupt throttle rate resolves this issue, I could
> always submit a patch to disable it by default on the affected hardware, but I
> do not even want to consider this if it will not resolve your issue.  Thanks!

Guys, I can't reproduce this anymore, since we replaced the related servers. I've tried installing again, but without real usage the NICs work fine.

Regards
Comment 75 Andy Gospodarek 2009-01-05 13:56:01 EST
Jesse, the reason I proposed disabling adaptive interrupt moderation for 82545/6 was that the hunks that added AIM support were added during the update where complaints about tx timeouts on 82545/6 devices went from non-existent to frequent.  You seem quite sure that AIM is not helping create this problem, so I will not propose the patch in comment #60 for the next RHEL4 update.
Comment 77 Marcus Alves Grando 2009-01-20 08:41:15 EST
Andy, do you expect to commit this patch before 4.8? I didn't see it in 2.6.9-79.

Regards
Comment 80 Vivek Goyal 2009-01-26 10:09:51 EST
Committed in 80.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Comment 99 errata-xmlrpc 2009-05-18 15:04:02 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html
