Bug 219496

Summary: frequent transmit timeouts on e1000: dev3749
Product: [Fedora] Fedora
Reporter: Diego Novillo <dnovillo>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 6
CC: jesse.brandeburg, jonstanley, wtogami
Hardware: x86_64   
OS: Linux   
Last Closed: 2008-02-08 04:26:20 UTC
Bug Blocks: 427887    

Description Diego Novillo 2006-12-13 15:55:55 UTC
Description of problem:

I'm seeing frequent transmit timeout messages in /var/log/messages, along with
kernel messages such as 'kernel: e1000: dev3749: e1000_clean_tx_irq: Detected Tx
Unit Hang'.

I had the machine connected to a network switch, and every few days the switch
would become confused and have to be power-cycled.  Whenever that happened, all
the machines connected to that switch were cut off from the network.

I now have the machine connected directly to the router and that problem seems
to have stopped.  However, I keep seeing these messages:

Dec 12 14:36:42 legolas kernel: e1000: dev3749: e1000_clean_tx_irq: Detected Tx Unit Hang
Dec 12 14:36:42 legolas kernel:   Tx Queue             <0>
Dec 12 14:36:42 legolas kernel:   TDH                  <4f>
Dec 12 14:36:42 legolas kernel:   TDT                  <4f>
Dec 12 14:36:42 legolas kernel:   next_to_use          <4f>
Dec 12 14:36:42 legolas kernel:   next_to_clean        <62>
Dec 12 14:36:42 legolas kernel: buffer_info[next_to_clean]
Dec 12 14:36:42 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:42 legolas kernel:   next_to_watch        <64>
Dec 12 14:36:42 legolas kernel:   jiffies              <103d617c7>
Dec 12 14:36:42 legolas kernel:   next_to_watch.status <0>
Dec 12 14:36:44 legolas kernel: e1000: dev3749: e1000_clean_tx_irq: Detected Tx Unit Hang
Dec 12 14:36:44 legolas kernel:   Tx Queue             <0>
Dec 12 14:36:44 legolas kernel:   TDH                  <4f>
Dec 12 14:36:44 legolas kernel:   TDT                  <4f>
Dec 12 14:36:44 legolas kernel:   next_to_use          <4f>
Dec 12 14:36:44 legolas kernel:   next_to_clean        <62>
Dec 12 14:36:44 legolas kernel: buffer_info[next_to_clean]
Dec 12 14:36:44 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:44 legolas kernel:   next_to_watch        <64>
Dec 12 14:36:44 legolas kernel:   jiffies              <103d61f98>
Dec 12 14:36:44 legolas kernel:   next_to_watch.status <0>
Dec 12 14:36:46 legolas kernel: e1000: dev3749: e1000_clean_tx_irq: Detected Tx Unit Hang
Dec 12 14:36:46 legolas kernel:   Tx Queue             <0>
Dec 12 14:36:46 legolas kernel:   TDH                  <4f>
Dec 12 14:36:46 legolas kernel:   TDT                  <4f>
Dec 12 14:36:46 legolas kernel:   next_to_use          <4f>
Dec 12 14:36:46 legolas kernel:   next_to_clean        <62>
Dec 12 14:36:46 legolas kernel: buffer_info[next_to_clean]
Dec 12 14:36:46 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:46 legolas kernel:   next_to_watch        <64>
Dec 12 14:36:46 legolas kernel:   jiffies              <103d62768>
Dec 12 14:36:46 legolas kernel:   next_to_watch.status <0>
Dec 12 14:36:48 legolas kernel: e1000: dev3749: e1000_clean_tx_irq: Detected Tx Unit Hang
Dec 12 14:36:48 legolas kernel:   Tx Queue             <0>
Dec 12 14:36:48 legolas kernel:   TDH                  <4f>
Dec 12 14:36:48 legolas kernel:   TDT                  <4f>
Dec 12 14:36:48 legolas kernel:   next_to_use          <4f>
Dec 12 14:36:48 legolas kernel:   next_to_clean        <62>
Dec 12 14:36:48 legolas kernel: buffer_info[next_to_clean]
Dec 12 14:36:48 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:48 legolas kernel:   next_to_watch        <64>
Dec 12 14:36:48 legolas kernel:   jiffies              <103d62f38>
Dec 12 14:36:48 legolas kernel:   next_to_watch.status <0>
Dec 12 14:36:49 legolas kernel: NETDEV WATCHDOG: dev3749: transmit timed out
Dec 12 14:36:50 legolas kernel: e1000: dev3749: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
Dec 12 14:36:50 legolas kernel: e1000: dev3749: e1000_watchdog: 10/100 speed: disabling TSO
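
For reference, the time_stamp and jiffies fields in the dumps above give a rough idea of how long the descriptor at next_to_clean had been stuck. The sketch below is only an illustration: it assumes that time_stamp is the jiffies value recorded when that buffer was queued, and that HZ is 1000 on this kernel (the jiffies counter advances by about 2000 between dumps logged 2 seconds apart, which is consistent with that, but it is an inference and not something stated in the report). The names LOG, FIELD and stall_jiffies are made up for the example.

```python
import re

# time_stamp / jiffies lines copied from the four "Tx Unit Hang" dumps above.
# Field values are hexadecimal.
LOG = """\
Dec 12 14:36:42 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:42 legolas kernel:   jiffies              <103d617c7>
Dec 12 14:36:44 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:44 legolas kernel:   jiffies              <103d61f98>
Dec 12 14:36:46 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:46 legolas kernel:   jiffies              <103d62768>
Dec 12 14:36:48 legolas kernel:   time_stamp           <103d60cd7>
Dec 12 14:36:48 legolas kernel:   jiffies              <103d62f38>
"""

FIELD = re.compile(r"kernel:\s+(time_stamp|jiffies)\s+<([0-9a-f]+)>")

HZ = 1000  # ASSUMPTION: timer ticks per second, inferred from the log spacing


def stall_jiffies(log):
    """Return (jiffies - time_stamp) for each dump, in log order."""
    stalls = []
    stamp = None
    for line in log.splitlines():
        match = FIELD.search(line)
        if match is None:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        if name == "time_stamp":
            stamp = value
        elif stamp is not None:
            # A jiffies line closes the dump opened by the last time_stamp.
            stalls.append(value - stamp)
            stamp = None
    return stalls


if __name__ == "__main__":
    for ticks in stall_jiffies(LOG):
        print(f"{ticks} jiffies, about {ticks / HZ:.1f} s at HZ={HZ}")
```

With the values above this gives stalls of 2800, 4801, 6801 and 8801 jiffies. The time_stamp never changes between dumps, so it appears to be the same buffer that stays stuck until the NETDEV watchdog reports the transmit timeout at 14:36:49.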

Version-Release number of selected component (if applicable):

The system is a Dimension 9200 Dual Core, running Fedora Core 6 with kernel
2.6.18-1.2849.fc6.

$ lspci
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HH (ICH8DH) LPC Interface Controller (rev 02)
00:1f.2 RAID bus controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) SATA RAID Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc Unknown device 7183
01:00.1 Display controller: ATI Technologies Inc Unknown device 71a3

$ sudo ethtool -a dev3749
Pause parameters for dev3749:
Autonegotiate:  on
RX:             on
TX:             on

$ sudo ethtool -i dev3749
driver: e1000
version: 7.1.9-k4-NAPI
firmware-version: 1.1-0
bus-info: 0000:00:19.0
$ sudo ethtool -g dev3749
Ring parameters for dev3749:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256
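
As a side calculation, the TX ring size reported here can be combined with the next_to_use and next_to_clean indices from the hang dumps to estimate how many descriptors the driver was still waiting to clean. This is a sketch using ordinary circular-buffer arithmetic, not the driver's own code; it assumes the ring held 256 entries when the hang was logged, and the function name pending_descriptors is made up for the example.

```python
# Values taken from the "Tx Unit Hang" dumps in the description (hex in the log).
TDH = 0x4F            # hardware head
TDT = 0x4F            # hardware tail
NEXT_TO_USE = 0x4F    # next slot the driver would fill
NEXT_TO_CLEAN = 0x62  # next slot the driver expects to reclaim

TX_RING_SIZE = 256    # ASSUMPTION: current TX setting from `ethtool -g`


def pending_descriptors(next_to_use, next_to_clean, ring_size):
    """Number of slots queued by the driver but not yet cleaned.

    Indices wrap around the ring, so the difference is taken modulo the
    ring size.
    """
    return (next_to_use - next_to_clean) % ring_size


if __name__ == "__main__":
    pending = pending_descriptors(NEXT_TO_USE, NEXT_TO_CLEAN, TX_RING_SIZE)
    print(f"descriptors awaiting cleanup: {pending} of {TX_RING_SIZE}")
    print(f"hardware head == tail: {TDH == TDT}")
```

Under these assumptions 237 of the 256 slots are still waiting to be cleaned while the hardware head and tail are equal, which would fit a ring that filled up with descriptors the driver never saw completed.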

$ sudo ethtool -k dev3749
Offload parameters for dev3749:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off

How reproducible:

Frequently.  On average I see these messages in /var/log/messages* every few
days.  Sometimes they appear many times in one day; at other times a couple of
weeks go by without any.

Steps to Reproduce:

I'm not quite sure how to trigger this.  It seems to happen randomly.  Googling
around, I found that disabling TSO may help, so I have disabled it.  I don't yet
know whether that fixes the problem.
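
For anyone repeating the TSO experiment, the sketch below shows one way to read the TSO state from `ethtool -k` output and to build the matching `ethtool -K <dev> tso on|off` command. It only constructs the argument list and does not run anything; the helper names tso_enabled and tso_command are made up for the example, and the sample text is the output pasted earlier in this report.

```python
# Output of `ethtool -k dev3749` as pasted in this report.
SAMPLE = """\
Offload parameters for dev3749:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
"""

# Label used in the output above, plus the hyphenated spelling printed by
# newer ethtool releases.
TSO_LABELS = ("tcp segmentation offload", "tcp-segmentation-offload")


def tso_enabled(ethtool_k_output):
    """Return True/False for the TSO state, or None if no TSO line is found."""
    for line in ethtool_k_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in TSO_LABELS:
            return value.strip().startswith("on")
    return None


def tso_command(dev, enable):
    """Build (but do not execute) the ethtool command that toggles TSO."""
    return ["ethtool", "-K", dev, "tso", "on" if enable else "off"]


if __name__ == "__main__":
    print("TSO currently enabled:", tso_enabled(SAMPLE))
    # Running the command needs root, e.g. via subprocess.run(cmd, check=True).
    print("to re-enable:", " ".join(tso_command("dev3749", True)))
```

Note that the kernel log above also shows the driver disabling TSO by itself once the link comes up at 100 Mbps, so the state reported by `ethtool -k` may not reflect a manual setting alone.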

Comment 1 Jesse Brandeburg 2007-11-11 06:58:27 UTC
We've fixed quite a few bugs in e1000, especially for the 82566 network
connection.  Is there any chance you could try the latest e1000 driver from
e1000.sourceforge.net and see whether the problem still occurs (with TSO on)?



Comment 2 Jon Stanley 2008-01-08 01:48:40 UTC
(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt
to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself on this bug; however, this version of Fedora is no longer
maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently
Fedora 8). If the bug no longer occurs, please close it; otherwise I'll close it
in a few days if no further information is lodged.

Thanks for using Fedora!

Comment 3 Jon Stanley 2008-02-08 04:26:20 UTC
Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA,
since no information has been lodged for over 30 days.

Please re-open this bug or file a new one if you can provide the requested data,
and thanks for filing the original report!