Bug 620047

Summary:	eth0 (r8169): transmit queue 0 timed out
Product:	[Fedora] Fedora	Reporter:	Bernd Bartmann <bernd.bartmann>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED WONTFIX	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	high	Docs Contact:
Priority:	low
Version:	13	CC:	anton, dougsland, drees76, fredrik, gansalmon, itamar, jonathan, kernel-maint, lsof, madhu.chinakonda, pb, robin.bowes
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-06-29 12:43:43 UTC	Type:	---

Description Bernd Bartmann 2010-07-31 16:16:44 UTC

Description of problem:
I found the call trace shown below in my /var/log/messages file after the network connection to the system dropped for a few seconds:

Jul 19 00:01:55 beverly kernel: ------------[ cut here ]------------
Jul 19 00:01:55 beverly kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf0/0x192()
Jul 19 00:01:55 beverly kernel: Hardware name: MS-7514
Jul 19 00:01:55 beverly kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Jul 19 00:01:55 beverly kernel: Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dvb_pll cx22702 uinput cx88_dvb cx88_vp3054_i2c videobuf_dvb snd_hda_codec_atihdmi snd_hda_codec_realtek tda10021 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer cx8800 snd cx8802 cx88xx ir_common soundcore budget_av i2c_i801 saa7146_vv v4l2_common budget_core dvb_core videodev tveeprom v4l1_compat ir_core v4l2_compat_ioctl32 saa7146 btcx_risc videobuf_dma_sg ttpci_eeprom videobuf_core iTCO_wdt iTCO_vendor_support snd_page_alloc r8169 mii ppdev parport_pc parport microcode raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Jul 19 00:01:55 beverly kernel: Pid: 0, comm: swapper Not tainted 2.6.33.6-147.fc13.x86_64 #1
Jul 19 00:01:55 beverly kernel: Call Trace:
Jul 19 00:01:55 beverly kernel: <IRQ>  [<ffffffff8104aecc>] warn_slowpath_common+0x77/0x8f
Jul 19 00:01:55 beverly kernel: [<ffffffff8104af31>] warn_slowpath_fmt+0x3c/0x3e
Jul 19 00:01:55 beverly kernel: [<ffffffff8139bd5f>] ? netif_tx_lock+0x3f/0x68
Jul 19 00:01:55 beverly kernel: [<ffffffff8139be78>] dev_watchdog+0xf0/0x192
Jul 19 00:01:55 beverly kernel: [<ffffffff81068f15>] ? sched_clock_local+0x1c/0x82
Jul 19 00:01:55 beverly kernel: [<ffffffff8106903e>] ? sched_clock_cpu+0xc3/0xce
Jul 19 00:01:55 beverly kernel: [<ffffffff81057d4a>] run_timer_softirq+0x1ba/0x25e
Jul 19 00:01:55 beverly kernel: [<ffffffff8106bc41>] ? ktime_get+0x60/0xb9
Jul 19 00:01:55 beverly kernel: [<ffffffff81051039>] __do_softirq+0xe0/0x1a1
Jul 19 00:01:55 beverly kernel: [<ffffffff8106fc80>] ? tick_program_event+0x25/0x27
Jul 19 00:01:55 beverly kernel: [<ffffffff8100aa1c>] call_softirq+0x1c/0x30
Jul 19 00:01:55 beverly kernel: [<ffffffff8100c21d>] do_softirq+0x41/0x7e
Jul 19 00:01:55 beverly kernel: [<ffffffff81050e8c>] irq_exit+0x36/0x78
Jul 19 00:01:55 beverly kernel: [<ffffffff81020244>] smp_apic_timer_interrupt+0x89/0x97
Jul 19 00:01:55 beverly kernel: [<ffffffff8100a4d3>] apic_timer_interrupt+0x13/0x20
Jul 19 00:01:55 beverly kernel: <EOI>  [<ffffffff8101139d>] ? mwait_idle+0x75/0x83
Jul 19 00:01:55 beverly kernel: [<ffffffff8101134f>] ? mwait_idle+0x27/0x83
Jul 19 00:01:55 beverly kernel: [<ffffffff81008bfd>] cpu_idle+0xa5/0xdf
Jul 19 00:01:55 beverly kernel: [<ffffffff81413215>] rest_init+0x79/0x7b
Jul 19 00:01:55 beverly kernel: [<ffffffff81ba6df8>] start_kernel+0x40e/0x419
Jul 19 00:01:55 beverly kernel: [<ffffffff81ba62bc>] x86_64_start_reservations+0xa7/0xab
Jul 19 00:01:55 beverly kernel: [<ffffffff81ba63b8>] x86_64_start_kernel+0xf8/0x107
Jul 19 00:01:55 beverly kernel: ---[ end trace b9947213556af327 ]---
Jul 19 00:01:55 beverly kernel: r8169: eth0: link up
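
The key line in the trace above is the NETDEV WATCHDOG message, which names the interface, the driver, and the stalled transmit queue. As a minimal sketch for scanning syslog lines for such events (the function name find_watchdog_timeouts and the regular expression are illustrative assumptions, modelled only on the message format shown in this report):

```python
import re

# Pattern modelled on the message format seen in this report, e.g.
# "NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out".
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(lines):
    """Return (iface, driver, queue) for every watchdog timeout line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            events.append(
                (match.group("iface"), match.group("driver"), int(match.group("queue")))
            )
    return events


if __name__ == "__main__":
    # Two lines copied from the log above; only the second one should match.
    sample = [
        "Jul 19 00:01:55 beverly kernel: Hardware name: MS-7514",
        "Jul 19 00:01:55 beverly kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out",
    ]
    print(find_watchdog_timeouts(sample))
```

Any iterable of lines works as input, so an open log file can be passed directly; reading /var/log/messages usually requires root.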


Version-Release number of selected component (if applicable):
2.6.33.6-147.fc13.x86_64

How reproducible:
Sometimes

Steps to Reproduce:
1. Create heavy network load on an r8169 network interface, e.g. with NFS traffic

Comment 1 Chuck Ebbert 2010-08-04 13:06:31 UTC

Does adding "pcie_aspm=off" to the kernel boot options help?

Comment 2 Bernd Bartmann 2010-08-04 15:39:26 UTC

Just added this option to the kernel line. We'll see if this improves the situation. I won't have time to run more extensive network load tests before the weekend.

BTW: What does this option mean?
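
For reference, pcie_aspm=off is a kernel boot parameter that disables PCI Express Active State Power Management (ASPM), the power-saving feature that lets idle PCIe links drop into low-power states. Whether ASPM is actually involved in this timeout is only a hypothesis at this point. A minimal sketch for checking that the option is really present on a kernel command line (the helper name has_boot_option and the sample command line are made up for illustration):

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


if __name__ == "__main__":
    # Hypothetical command line; on a running system the real one
    # can be read from /proc/cmdline.
    sample = "ro root=/dev/sda1 rhgb quiet pcie_aspm=off"
    print("pcie_aspm=off present:", has_boot_option(sample, "pcie_aspm=off"))
```

Splitting on whitespace before comparing avoids false positives from options that merely contain the same text as a substring.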

Comment 3 Bernd Bartmann 2010-08-08 17:23:15 UTC

OK, I ran some network stress tests: I copied ~50 GB from an NFS server to the system while, in parallel, watching some video files via a Samba share from this system and keeping two SSH sessions open to monitor the log files. So far so good.
Also, the system is now on kernel-2.6.33.6-147.2.4.fc13.x86_64.

Comment 4 Bernd Bartmann 2010-08-09 06:20:49 UTC

*** Bug 541716 has been marked as a duplicate of this bug. ***

Comment 5 David Rees 2010-10-03 15:44:28 UTC

I've seen the same thing once now on an Opteron server running 2.6.34.7-56.fc13.x86_64 with an e1000e NIC. Is it possible or likely that this is the same issue? I have only seen this once; otherwise the system has been stable for quite some time now.

Bug #620253 also appears to be the same as this one except with a different NIC.

Comment 6 Fredrik Chabot 2010-11-10 08:40:20 UTC

After about 3 months uptime with heavy network traffic:

Nov  9 22:03:58 sql8 kernel: ------------[ cut here ]------------
Nov  9 22:03:58 sql8 kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf0/0x192()
Nov  9 22:03:58 sql8 kernel: Hardware name: RS500-E6-PS4
Nov  9 22:03:58 sql8 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Nov  9 22:03:58 sql8 kernel: Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss autofs4 fuse sunrpc cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 uinput snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iTCO_wdt i2c_i801 e1000e ioatdma iTCO_vendor_support i2c_core dca snd_timer snd joydev microcode soundcore snd_page_alloc raid1 mptsas mptscsih mptbase scsi_transport_sas [last unloaded: scsi_wait_scan]
Nov  9 22:03:58 sql8 kernel: Pid: 0, comm: swapper Tainted: G   M    W  2.6.33.6-147.fc13.x86_64 #1
Nov  9 22:03:58 sql8 kernel: Call Trace:
Nov  9 22:03:58 sql8 kernel: <IRQ>  [<ffffffff8104aecc>] warn_slowpath_common+0x77/0x8f
Nov  9 22:03:58 sql8 kernel: [<ffffffff8104af31>] warn_slowpath_fmt+0x3c/0x3e
Nov  9 22:03:58 sql8 kernel: [<ffffffff8139bd5f>] ? netif_tx_lock+0x3f/0x68
Nov  9 22:03:58 sql8 kernel: [<ffffffff8139be78>] dev_watchdog+0xf0/0x192
Nov  9 22:03:58 sql8 kernel: [<ffffffff81057ab2>] ? internal_add_timer+0xca/0xcc
Nov  9 22:03:58 sql8 kernel: [<ffffffff81057b76>] ? cascade+0x65/0x7f
Nov  9 22:03:58 sql8 kernel: [<ffffffff81057d4a>] run_timer_softirq+0x1ba/0x25e
Nov  9 22:03:58 sql8 kernel: [<ffffffff81051039>] __do_softirq+0xe0/0x1a1
Nov  9 22:03:58 sql8 kernel: [<ffffffff81099aa2>] ? handle_IRQ_event+0x5b/0x11c
Nov  9 22:03:58 sql8 kernel: [<ffffffff8100aa1c>] call_softirq+0x1c/0x30
Nov  9 22:03:58 sql8 kernel: [<ffffffff8100c21d>] do_softirq+0x41/0x7e
Nov  9 22:03:58 sql8 kernel: [<ffffffff81050e8c>] irq_exit+0x36/0x78
Nov  9 22:03:58 sql8 kernel: [<ffffffff8100b957>] do_IRQ+0xa7/0xbe
Nov  9 22:03:58 sql8 kernel: [<ffffffff8142a553>] ret_from_intr+0x0/0x11
Nov  9 22:03:58 sql8 kernel: <EOI>  [<ffffffff81268b29>] ? acpi_idle_enter_simple+0x112/0x146
Nov  9 22:03:58 sql8 kernel: [<ffffffff81268b22>] ? acpi_idle_enter_simple+0x10b/0x146
Nov  9 22:03:58 sql8 kernel: [<ffffffff8126883d>] acpi_idle_enter_bm+0xd3/0x2ad
Nov  9 22:03:58 sql8 kernel: [<ffffffff8135f4a1>] cpuidle_idle_call+0x94/0xef
Nov  9 22:03:58 sql8 kernel: [<ffffffff81008bfd>] cpu_idle+0xa5/0xdf
Nov  9 22:03:58 sql8 kernel: [<ffffffff81413215>] rest_init+0x79/0x7b
Nov  9 22:03:58 sql8 kernel: [<ffffffff81ba6df8>] start_kernel+0x40e/0x419
Nov  9 22:03:58 sql8 kernel: [<ffffffff81ba62bc>] x86_64_start_reservations+0xa7/0xab
Nov  9 22:03:58 sql8 kernel: [<ffffffff81ba63b8>] x86_64_start_kernel+0xf8/0x107
Nov  9 22:03:58 sql8 kernel: ---[ end trace d9d3a1889f8925bf ]---
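
Both this e1000e trace and the r8169 trace in the description reach dev_watchdog via run_timer_softirq, so the warning is raised by the generic network watchdog rather than by driver code. A minimal sketch for comparing two traces mechanically, assuming the frame format shown in these logs (the names reliable_frames and common_frames are illustrative; frames prefixed with "?" are skipped because the kernel marks them as unreliable):

```python
import re

# Matches frames such as "[<ffffffff8139be78>] dev_watchdog+0xf0/0x192",
# optionally marked unreliable with a leading "? ".
TRACE_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<guess>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(lines):
    """Return function names of trace frames that are not marked with '?'."""
    frames = []
    for line in lines:
        match = TRACE_RE.search(line)
        if match and not match.group("guess"):
            frames.append(match.group("func"))
    return frames


def common_frames(trace_a, trace_b):
    """Return the frames of trace_a, in order, that also occur in trace_b."""
    seen = set(reliable_frames(trace_b))
    return [func for func in reliable_frames(trace_a) if func in seen]


if __name__ == "__main__":
    # A few frames copied from the r8169 and e1000e traces in this report.
    r8169_trace = [
        "kernel: [<ffffffff8139bd5f>] ? netif_tx_lock+0x3f/0x68",
        "kernel: [<ffffffff8139be78>] dev_watchdog+0xf0/0x192",
        "kernel: [<ffffffff81020244>] smp_apic_timer_interrupt+0x89/0x97",
    ]
    e1000e_trace = [
        "kernel: [<ffffffff8139be78>] dev_watchdog+0xf0/0x192",
        "kernel: [<ffffffff8100b957>] do_IRQ+0xa7/0xbe",
    ]
    print(common_frames(r8169_trace, e1000e_trace))
```

Shared frames only show that the same watchdog path fired; they do not by themselves prove a common root cause across the two NIC drivers.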

Comment 7 Peter Bieringer 2010-12-05 16:07:37 UTC

I ran into a similar problem on F13 (where I kept the old 2.6.33 kernel running), and now, after upgrading to F14, this issue hits me again.

Dec  5 16:45:58 system kernel: [ 2847.008084] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0xc6/0x12e()
Dec  5 16:45:58 system kernel: [ 2847.008095] Hardware name: A9830IMS
Dec  5 16:45:58 system kernel: [ 2847.008105] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Dec  5 16:45:58 system kernel: [ 2847.008115] Modules linked in: fuse sit tunnel4 f71882fg tun lockd sunrpc xt_hl nf_nat_tftp nf_conntrack_tftp ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_ipv6 nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE ip6table_mangle xt_TCPMSS iptable_mangle ip6t_LOG xt_limit xt_owner ipt_LOG iptable_nat nf_nat ip6t_REJECT ip6table_filter ip6_tables cpufreq_ondemand acpi_cpufreq mperf ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt uinput dvb_pll mt352 snd_hda_codec_realtek snd_hda_intel stv0299 snd_hda_codec snd_hwdep snd_seq b2c2_flexcop_pci snd_seq_device snd_pcm b2c2_flexcop dvb_core snd_timer cx24123 serio_raw iTCO_wdt cx24113 s5h1420 i2c_i801 usblp iTCO_vendor_support joydev snd e1000e soundcore snd_page_alloc usb_storage i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Dec  5 16:45:58 system kernel: [ 2847.008396] Pid: 0, comm: swapper Not tainted 2.6.35.9-64.fc14.i686 #1
Dec  5 16:45:58 system kernel: [ 2847.008406] Call Trace:
Dec  5 16:45:58 system kernel: [ 2847.008432]  [<c0439321>] warn_slowpath_common+0x6a/0x7f
Dec  5 16:45:58 system kernel: [ 2847.008450]  [<c072a78b>] ? dev_watchdog+0xc6/0x12e
Dec  5 16:45:58 system kernel: [ 2847.008467]  [<c04393a9>] warn_slowpath_fmt+0x2b/0x2f
Dec  5 16:45:58 system kernel: [ 2847.008483]  [<c072a78b>] dev_watchdog+0xc6/0x12e
Dec  5 16:45:58 system kernel: [ 2847.008502]  [<c04213e9>] ? hpet_legacy_next_event+0xf/0x11
Dec  5 16:45:58 system kernel: [ 2847.008521]  [<c0458965>] ? clockevents_program_event+0xc7/0xd9
Dec  5 16:45:58 system kernel: [ 2847.008537]  [<c04551b5>] ? ktime_get+0x5d/0x8d
Dec  5 16:45:58 system kernel: [ 2847.008553]  [<c04597e1>] ? tick_dev_program_event+0x29/0x109
Dec  5 16:45:58 system kernel: [ 2847.008571]  [<c04438b4>] run_timer_softirq+0x167/0x20e
Dec  5 16:45:58 system kernel: [ 2847.008588]  [<c072a6c5>] ? dev_watchdog+0x0/0x12e
Dec  5 16:45:58 system kernel: [ 2847.008605]  [<c043e75e>] __do_softirq+0xa9/0x14a
Dec  5 16:45:58 system kernel: [ 2847.008622]  [<c043e832>] do_softirq+0x33/0x3d
Dec  5 16:45:58 system kernel: [ 2847.008637]  [<c043ea3b>] irq_exit+0x31/0x64
Dec  5 16:45:58 system kernel: [ 2847.008653]  [<c0404c0c>] do_IRQ+0x7d/0x91
Dec  5 16:45:58 system kernel: [ 2847.008669]  [<c04038f0>] common_interrupt+0x30/0x38
Dec  5 16:45:58 system kernel: [ 2847.008686]  [<c04300e0>] ? sched_setscheduler+0x6/0x11
Dec  5 16:45:58 system kernel: [ 2847.008705]  [<c05ef124>] ? intel_idle+0xf2/0x119
Dec  5 16:45:58 system kernel: [ 2847.008724]  [<c06f4258>] cpuidle_idle_call+0x6e/0xc1
Dec  5 16:45:58 system kernel: [ 2847.008739]  [<c040214c>] cpu_idle+0x8e/0xaf
Dec  5 16:45:58 system kernel: [ 2847.008756]  [<c0793b1d>] rest_init+0x71/0x73
Dec  5 16:45:58 system kernel: [ 2847.008774]  [<c0a1d7d7>] start_kernel+0x34a/0x34f
Dec  5 16:45:58 system kernel: [ 2847.008791]  [<c0a1d0c9>] i386_start_kernel+0xc9/0xd0
Dec  5 16:45:58 system kernel: [ 2847.008803] ---[ end trace da0be79886ba25f8 ]---
Dec  5 16:45:58 system kernel: [ 2847.008858] e1000e 0000:02:00.0: eth1: Reset adapter


Unfortunately, a reboot is required afterwards to bring the network back to life.

Mainboard: MSI-9830
CPU: Atom N270

Comment 8 Need Real Name 2011-02-14 22:42:30 UTC

Workaround: boot with pcie_aspm=off
See bug 538920
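
A minimal sketch of applying this workaround, assuming a GRUB legacy style configuration in which each boot entry has a line starting with "kernel" (the helper name add_kernel_option and the sample entry, including its root device and initrd name, are hypothetical):

```python
def add_kernel_option(grub_conf_text, option):
    """Append `option` to every 'kernel' line that does not already carry it."""
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] == "kernel" and option not in words:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical boot entry, for illustration only.
    sample = (
        "title Fedora\n"
        "\troot (hd0,0)\n"
        "\tkernel /vmlinuz-2.6.33.6-147.fc13.x86_64 ro root=/dev/sda1\n"
        "\tinitrd /initramfs.img\n"
    )
    print(add_kernel_option(sample, "pcie_aspm=off"), end="")
```

The function only transforms text and is safe to run repeatedly; back up the real configuration file before writing any result to it, and reboot for the option to take effect.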

Comment 9 Robin Bowes 2011-03-22 17:36:29 UTC

I am seeing the same crash regularly when I stress the network, e.g. by copying large files over NFS.

Comment 10 Bug Zapper 2011-06-01 12:24:53 UTC

This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Bug Zapper 2011-06-29 12:43:43 UTC

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.