Bug 620047
Summary: | eth0 (r8169): transmit queue 0 timed out | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bernd Bartmann <bernd.bartmann> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Priority: | low |
Version: | 13 | CC: | anton, dougsland, drees76, fredrik, gansalmon, itamar, jonathan, kernel-maint, lsof, madhu.chinakonda, pb, robin.bowes |
Hardware: | x86_64 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2011-06-29 12:43:43 UTC |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Bernd Bartmann
2010-07-31 16:16:44 UTC
Does adding "pcie_aspm=off" to the kernel boot options help?

Just added this option to the kernel line. We'll see if this improves the situation. I won't have time to run more extensive network load tests before the weekend. BTW: What does this option mean?

OK, I ran some network stress tests: copied ~50 GB from an NFS server to the system while watching some video files via a Samba share from this system, with two SSH sessions open to monitor the log files. So far so good. Also, the system is now on kernel-2.6.33.6-147.2.4.fc13.x86_64.

*** Bug 541716 has been marked as a duplicate of this bug. ***

I've seen the same thing once now on an Opteron server running 2.6.34.7-56.fc13.x86_64 and an e1000e NIC. Is it possible/likely that this is the same issue? I have only seen this once, and otherwise the system has been stable for quite some time now.

Bug #620253 also appears to be the same as this one except with a different NIC.

After about 3 months uptime with heavy network traffic:
Nov 9 22:03:58 sql8 kernel: ------------[ cut here ]------------
Nov 9 22:03:58 sql8 kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xf0/0x192()
Nov 9 22:03:58 sql8 kernel: Hardware name: RS500-E6-PS4
Nov 9 22:03:58 sql8 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Nov 9 22:03:58 sql8 kernel: Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss autofs4 fuse sunrpc cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 uinput snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iTCO_wdt i2c_i801 e1000e ioatdma iTCO_vendor_support i2c_core dca snd_timer snd joydev microcode soundcore snd_page_alloc raid1 mptsas mptscsih mptbase scsi_transport_sas [last unloaded: scsi_wait_scan]
Nov 9 22:03:58 sql8 kernel: Pid: 0, comm: swapper Tainted: G M W 2.6.33.6-147.fc13.x86_64 #1
Nov 9 22:03:58 sql8 kernel: Call Trace:
Nov 9 22:03:58 sql8 kernel: <IRQ> [<ffffffff8104aecc>] warn_slowpath_common+0x77/0x8f
Nov 9 22:03:58 sql8 kernel: [<ffffffff8104af31>] warn_slowpath_fmt+0x3c/0x3e
Nov 9 22:03:58 sql8 kernel: [<ffffffff8139bd5f>] ? netif_tx_lock+0x3f/0x68
Nov 9 22:03:58 sql8 kernel: [<ffffffff8139be78>] dev_watchdog+0xf0/0x192
Nov 9 22:03:58 sql8 kernel: [<ffffffff81057ab2>] ? internal_add_timer+0xca/0xcc
Nov 9 22:03:58 sql8 kernel: [<ffffffff81057b76>] ? cascade+0x65/0x7f
Nov 9 22:03:58 sql8 kernel: [<ffffffff81057d4a>] run_timer_softirq+0x1ba/0x25e
Nov 9 22:03:58 sql8 kernel: [<ffffffff81051039>] __do_softirq+0xe0/0x1a1
Nov 9 22:03:58 sql8 kernel: [<ffffffff81099aa2>] ? handle_IRQ_event+0x5b/0x11c
Nov 9 22:03:58 sql8 kernel: [<ffffffff8100aa1c>] call_softirq+0x1c/0x30
Nov 9 22:03:58 sql8 kernel: [<ffffffff8100c21d>] do_softirq+0x41/0x7e
Nov 9 22:03:58 sql8 kernel: [<ffffffff81050e8c>] irq_exit+0x36/0x78
Nov 9 22:03:58 sql8 kernel: [<ffffffff8100b957>] do_IRQ+0xa7/0xbe
Nov 9 22:03:58 sql8 kernel: [<ffffffff8142a553>] ret_from_intr+0x0/0x11
Nov 9 22:03:58 sql8 kernel: <EOI> [<ffffffff81268b29>] ? acpi_idle_enter_simple+0x112/0x146
Nov 9 22:03:58 sql8 kernel: [<ffffffff81268b22>] ? acpi_idle_enter_simple+0x10b/0x146
Nov 9 22:03:58 sql8 kernel: [<ffffffff8126883d>] acpi_idle_enter_bm+0xd3/0x2ad
Nov 9 22:03:58 sql8 kernel: [<ffffffff8135f4a1>] cpuidle_idle_call+0x94/0xef
Nov 9 22:03:58 sql8 kernel: [<ffffffff81008bfd>] cpu_idle+0xa5/0xdf
Nov 9 22:03:58 sql8 kernel: [<ffffffff81413215>] rest_init+0x79/0x7b
Nov 9 22:03:58 sql8 kernel: [<ffffffff81ba6df8>] start_kernel+0x40e/0x419
Nov 9 22:03:58 sql8 kernel: [<ffffffff81ba62bc>] x86_64_start_reservations+0xa7/0xab
Nov 9 22:03:58 sql8 kernel: [<ffffffff81ba63b8>] x86_64_start_kernel+0xf8/0x107
Nov 9 22:03:58 sql8 kernel: ---[ end trace d9d3a1889f8925bf ]---

I ran into a similar problem using F13 (kept the old 2.6.33 kernel running), and now, after upgrading to F14, this issue hits me again.
Dec 5 16:45:58 system kernel: [ 2847.008084] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0xc6/0x12e()
Dec 5 16:45:58 system kernel: [ 2847.008095] Hardware name: A9830IMS
Dec 5 16:45:58 system kernel: [ 2847.008105] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Dec 5 16:45:58 system kernel: [ 2847.008115] Modules linked in: fuse sit tunnel4 f71882fg tun lockd sunrpc xt_hl nf_nat_tftp nf_conntrack_tftp ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_ipv6 nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE ip6table_mangle xt_TCPMSS iptable_mangle ip6t_LOG xt_limit xt_owner ipt_LOG iptable_nat nf_nat ip6t_REJECT ip6table_filter ip6_tables cpufreq_ondemand acpi_cpufreq mperf ipv6 sha256_generic aes_i586 aes_generic cbc dm_crypt uinput dvb_pll mt352 snd_hda_codec_realtek snd_hda_intel stv0299 snd_hda_codec snd_hwdep snd_seq b2c2_flexcop_pci snd_seq_device snd_pcm b2c2_flexcop dvb_core snd_timer cx24123 serio_raw iTCO_wdt cx24113 s5h1420 i2c_i801 usblp iTCO_vendor_support joydev snd e1000e soundcore snd_page_alloc usb_storage i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Dec 5 16:45:58 system kernel: [ 2847.008396] Pid: 0, comm: swapper Not tainted 2.6.35.9-64.fc14.i686 #1
Dec 5 16:45:58 system kernel: [ 2847.008406] Call Trace:
Dec 5 16:45:58 system kernel: [ 2847.008432] [<c0439321>] warn_slowpath_common+0x6a/0x7f
Dec 5 16:45:58 system kernel: [ 2847.008450] [<c072a78b>] ? dev_watchdog+0xc6/0x12e
Dec 5 16:45:58 system kernel: [ 2847.008467] [<c04393a9>] warn_slowpath_fmt+0x2b/0x2f
Dec 5 16:45:58 system kernel: [ 2847.008483] [<c072a78b>] dev_watchdog+0xc6/0x12e
Dec 5 16:45:58 system kernel: [ 2847.008502] [<c04213e9>] ? hpet_legacy_next_event+0xf/0x11
Dec 5 16:45:58 system kernel: [ 2847.008521] [<c0458965>] ? clockevents_program_event+0xc7/0xd9
Dec 5 16:45:58 system kernel: [ 2847.008537] [<c04551b5>] ? ktime_get+0x5d/0x8d
Dec 5 16:45:58 system kernel: [ 2847.008553] [<c04597e1>] ? tick_dev_program_event+0x29/0x109
Dec 5 16:45:58 system kernel: [ 2847.008571] [<c04438b4>] run_timer_softirq+0x167/0x20e
Dec 5 16:45:58 system kernel: [ 2847.008588] [<c072a6c5>] ? dev_watchdog+0x0/0x12e
Dec 5 16:45:58 system kernel: [ 2847.008605] [<c043e75e>] __do_softirq+0xa9/0x14a
Dec 5 16:45:58 system kernel: [ 2847.008622] [<c043e832>] do_softirq+0x33/0x3d
Dec 5 16:45:58 system kernel: [ 2847.008637] [<c043ea3b>] irq_exit+0x31/0x64
Dec 5 16:45:58 system kernel: [ 2847.008653] [<c0404c0c>] do_IRQ+0x7d/0x91
Dec 5 16:45:58 system kernel: [ 2847.008669] [<c04038f0>] common_interrupt+0x30/0x38
Dec 5 16:45:58 system kernel: [ 2847.008686] [<c04300e0>] ? sched_setscheduler+0x6/0x11
Dec 5 16:45:58 system kernel: [ 2847.008705] [<c05ef124>] ? intel_idle+0xf2/0x119
Dec 5 16:45:58 system kernel: [ 2847.008724] [<c06f4258>] cpuidle_idle_call+0x6e/0xc1
Dec 5 16:45:58 system kernel: [ 2847.008739] [<c040214c>] cpu_idle+0x8e/0xaf
Dec 5 16:45:58 system kernel: [ 2847.008756] [<c0793b1d>] rest_init+0x71/0x73
Dec 5 16:45:58 system kernel: [ 2847.008774] [<c0a1d7d7>] start_kernel+0x34a/0x34f
Dec 5 16:45:58 system kernel: [ 2847.008791] [<c0a1d0c9>] i386_start_kernel+0xc9/0xd0
Dec 5 16:45:58 system kernel: [ 2847.008803] ---[ end trace da0be79886ba25f8 ]---
Dec 5 16:45:58 system kernel: [ 2847.008858] e1000e 0000:02:00.0: eth1: Reset adapter

Unfortunately, afterwards a reboot is required to bring the network back to life again.

Mainboard: MSI-9830
CPU: Atom N270
Workaround: boot with pcie_aspm=off
See bug 538920

I am seeing the same regular crash if I stress the network, e.g. copy large files over NFS.

This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
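
Note on the workaround discussed in this thread: the question of what "pcie_aspm=off" means is never answered in the comments. According to the kernel's boot parameter documentation, it disables PCIe Active State Power Management (ASPM). The following is a minimal Python sketch, not something taken from this bug, for checking that the option is really in effect on a running system and for scanning a log for the watchdog message quoted in the traces. The function names are hypothetical; /proc/cmdline and /sys/module/pcie_aspm/parameters/policy are standard Linux paths, though the policy file may be absent on kernels built without ASPM support.

```python
import re
from pathlib import Path

# Matches the message seen in the traces, e.g.
# "NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (\S+) \(([^)]+)\): transmit queue (\d+) timed out"
)


def aspm_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if the kernel command line contains pcie_aspm=off."""
    return "pcie_aspm=off" in cmdline.split()


def active_aspm_policy(policy_text: str):
    """Return the bracketed (active) entry from the ASPM policy file.

    The file lists all policies and marks the active one with brackets,
    e.g. "[default] performance powersave".
    """
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


def find_watchdog_timeouts(log_text: str):
    """Return (interface, driver, queue) for every watchdog timeout line."""
    return [
        (iface, driver, int(queue))
        for iface, driver, queue in WATCHDOG_RE.findall(log_text)
    ]


def main() -> None:
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        cmdline = cmdline_file.read_text()
        print("pcie_aspm=off on command line:", aspm_disabled_on_cmdline(cmdline))

    policy_file = Path("/sys/module/pcie_aspm/parameters/policy")
    if policy_file.exists():
        print("active ASPM policy:", active_aspm_policy(policy_file.read_text()))


if __name__ == "__main__":
    main()
```

After a stress test like the NFS copy described above, passing the contents of the system log to find_watchdog_timeouts() gives a quick way to see whether the timeout recurred and on which interface and driver.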