Bug 704758 - Netdev transmit queue timeout on Intel 82576 based NIC
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-05-14 18:03 UTC by Mike Hinz
Modified: 2012-08-16 13:50 UTC (History)
Last Closed: 2012-08-16 13:50:12 UTC



Description Mike Hinz 2011-05-14 18:03:48 UTC
Description of problem:  Extremely large numbers of TX and RX errors observed in normal operations and when testing with Iperf.  After errors are seen, NIC no longer receives or transmits data.  This occurs with Intel Quad Port 82576 NIC.  


Version-Release number of selected component (if applicable): 2.6.35.13-91.fc14.x86_64 #1 SMP 


How reproducible:  Always reproducible, but the time and data volume to failure vary: anywhere from under 100 seconds to almost 9 hours.


Steps to Reproduce:
1.  Use Iperf to transmit/receive data across NIC
2.  Failure will abruptly occur.  TX and RX error counts abruptly rise.  
  
Actual results:  82576 NIC fails to tx/rx data.  


Expected results:  82576 NIC should reliably tx/rx large volumes of data over a long period of time.  


Additional info:  See the below netstat output, backtrace, modinfo and ethtool version info, lspci -vv output, dmesg output

Output from a script run every 5 seconds, date and netstat -i:
"""
Fri May 13 22:09:16 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      808      0      0      0      185      0      0      0 BMRU
eth2       1500   0    21538      0      0      0    20987      0      0      0 BMRU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:21 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      821      0      0      0      185      0      0      0 BMRU
eth2       1500   0    21903      0      0      0    21344      0      0      0 BMRU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:26 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      824      0      0      0      185      0      0      0 BMRU
eth2       1500   0    22447      0      0      0    21880      0      0      0 BMRU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:31 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      825      0      0      0      185      0      0      0 BMRU
eth2       1500   0    22456 34359738360 8589934590 8589942780    21890 17179869180      0      0 BMRU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:36 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      830      0      0      0      185      0      0      0 BMRU
eth2       1500   0    22456 51539607540 12884901885 12884914170    21890 25769803770      0      0 BMRU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:41 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      839      0      0      0      188      0      0      0 BMRU
eth2       1500   0    22456 85899345900 21474836475 21474856950    21890 42949672950      0      0 BMU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
Fri May 13 22:09:46 GMT 2011
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0      851      0      0      0      190      0      0      0 BMRU
eth2       1500   0    22456 85899345900 21474836475 21474856950    21890 42949672950      0      0 BMU
lo        16436   0       12      0      0      0       12      0      0      0 LRU
virbr0     1500   0        0      0      0      0       20      0      0      0 BMRU
"""

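The eth2 error counters in the samples above follow a pattern that the report does not spell out: each RX-ERR, RX-DRP and TX-ERR value is an exact multiple of 4294967295 (0xFFFFFFFF). A 32-bit register read from a PCIe device that has stopped responding typically returns all ones, so the pattern would be consistent with the driver's statistics update accumulating such reads. This is an inference from the numbers, not something the report confirms. A minimal Python sketch of the arithmetic, with values copied from the eth2 rows (the `decompose` helper is a name chosen here for illustration):

```python
# Error counters for eth2 copied from the netstat -i samples above.
# Column order: RX-ERR, RX-DRP, RX-OVR, TX-ERR.
ALL_ONES = 0xFFFFFFFF  # 4294967295: all bits set in a 32-bit register

samples = {
    "22:09:31": (34359738360, 8589934590, 8589942780, 17179869180),
    "22:09:36": (51539607540, 12884901885, 12884914170, 25769803770),
    "22:09:41": (85899345900, 21474836475, 21474856950, 42949672950),
}


def decompose(value):
    """Return (whole multiples of 0xFFFFFFFF, remainder) for a counter value."""
    return divmod(value, ALL_ONES)


for stamp, counters in samples.items():
    # Print each counter as (multiple, remainder).
    print(stamp, [decompose(v) for v in counters])
```

RX-ERR, RX-DRP and TX-ERR come out as 8/2/4, 12/3/6 and 20/5/10 multiples with zero remainder, and RX-OVR leaves a remainder that grows by 4095 per step (8190, 12285, 20475). Counters that move in whole units of 0xFFFFFFFF are unlikely to reflect real packet errors.
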
Backtrace from /var/log/messages entry upon failure:
"""
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624563] ------------[ cut here ]------------
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624572] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0xf3/0x167()
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624575] Hardware name: X8ST3
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624578] NETDEV WATCHDOG: eth2 (igb): transmit queue 0 timed out
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624580] Modules linked in: fuse ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc rmd160 crypto_null camellia lzo lzo_compress cast6 cast5 deflate zlib_deflate cts ctr gcm ccm serpent blowfish twofish_x86_64 twofish_common ecb xcbc cbc sha256_generic sha512_generic des_generic cryptd aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 kvm_intel kvm uinput igb i2c_i801 i2c_core serio_raw e1000e ioatdma i7core_edac joydev iTCO_wdt iTCO_vendor_support edac_core dca microcode raid1 [last unloaded: scsi_wait_scan]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624660] Pid: 27, comm: events/0 Not tainted 2.6.35.13-91.fc14.x86_64 #1
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624663] Call Trace:
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624665]  <IRQ>  [<ffffffff8104dcf1>] warn_slowpath_common+0x85/0x9d
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624675]  [<ffffffff8104ddac>] warn_slowpath_fmt+0x46/0x48
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624680]  [<ffffffff813d65aa>] ? netif_tx_lock+0x44/0x6d
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624684]  [<ffffffff813d6714>] dev_watchdog+0xf3/0x167
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624688]  [<ffffffff8106b8f4>] ? sched_clock_cpu+0x42/0xc6
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624694]  [<ffffffff810688f7>] ? run_posix_cpu_timers+0x2a/0x5bb
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624698]  [<ffffffff8106a025>] ? hrtimer_run_pending+0x17/0xc7
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624703]  [<ffffffff81040451>] ? task_tick_fair+0x6c/0xf7
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624709]  [<ffffffff8105a1f4>] run_timer_softirq+0x1d6/0x2a3
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624713]  [<ffffffff813d6621>] ? dev_watchdog+0x0/0x167
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624719]  [<ffffffff81053dd9>] __do_softirq+0xf0/0x1bf
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624724]  [<ffffffff81072b8c>] ? tick_dev_program_event+0x36/0xf4
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624729]  [<ffffffff8101058b>] ? native_sched_clock+0x35/0x37
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624734]  [<ffffffff8100abdc>] call_softirq+0x1c/0x30
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624738]  [<ffffffff8100c338>] do_softirq+0x46/0x82
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624742]  [<ffffffff81053f65>] irq_exit+0x49/0x8b
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624748]  [<ffffffff81470c1a>] smp_apic_timer_interrupt+0x7e/0x8c
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624752]  [<ffffffff8100a693>] apic_timer_interrupt+0x13/0x20
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624754]  <EOI>  [<ffffffff81010b27>] ? native_read_tsc+0x6/0x16
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624763]  [<ffffffff812212be>] paravirt_read_tsc+0xe/0x12
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624768]  [<ffffffff812213af>] delay_tsc+0x35/0x74
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624772]  [<ffffffff81221309>] __delay+0xf/0x11
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624776]  [<ffffffff8122134d>] __const_udelay+0x42/0x44
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624789]  [<ffffffffa01327c4>] e1000_get_hw_semaphore_generic+0x44/0xd0 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624800]  [<ffffffffa01306be>] e1000_acquire_swfw_sync_82575+0x5e/0xa0 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624810]  [<ffffffffa0130738>] e1000_acquire_phy_82575+0x38/0x40 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624820]  [<ffffffffa0133f7d>] __e1000_read_phy_reg_igp+0x3d/0xa0 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624827]  [<ffffffff8103c1a5>] ? need_resched+0x23/0x2d
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624838]  [<ffffffffa01341d0>] e1000_read_phy_reg_igp+0x10/0x20 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624849]  [<ffffffffa013acf9>] e1000_read_phy_reg+0x19/0x20 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624858]  [<ffffffffa012a89c>] igb_update_stats+0xa6c/0xa90 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624863]  [<ffffffff81078555>] ? __raw_local_irq_save+0x1d/0x23
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624872]  [<ffffffffa012a917>] igb_watchdog_task+0x57/0x5d0 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624877]  [<ffffffff81062bed>] worker_thread+0x1c5/0x251
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624886]  [<ffffffffa012a8c0>] ? igb_watchdog_task+0x0/0x5d0 [igb]
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624891]  [<ffffffff81066a4b>] ? autoremove_wake_function+0x0/0x39
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624895]  [<ffffffff81062a28>] ? worker_thread+0x0/0x251
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624899]  [<ffffffff810665b1>] kthread+0x7f/0x87
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624903]  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624907]  [<ffffffff81066532>] ? kthread+0x0/0x87
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624911]  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
May 13 22:09:37 HOU-PM-0001 kernel: [  992.624915] ---[ end trace f5dc4241a34c579e ]---
"""

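For anyone checking whether their own logs show the same failure, the key line in the trace above is the NETDEV WATCHDOG message. The following Python sketch pulls the interface, driver and queue number out of such lines. The regular expression is written against the message text as it appears in this report; the function name `find_tx_timeouts` is hypothetical, and other kernel versions may word the message differently.

```python
import re

# Matches the message as it appears in this report, e.g.
# "NETDEV WATCHDOG: eth2 (igb): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(lines):
    """Yield (iface, driver, queue) for each watchdog timeout line found."""
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            yield match["iface"], match["driver"], int(match["queue"])


# Two lines copied from the /var/log/messages excerpt above.
log = [
    "May 13 22:09:37 HOU-PM-0001 kernel: [  992.624575] Hardware name: X8ST3",
    "May 13 22:09:37 HOU-PM-0001 kernel: [  992.624578] "
    "NETDEV WATCHDOG: eth2 (igb): transmit queue 0 timed out",
]
print(list(find_tx_timeouts(log)))
```

In practice the lines would come from iterating over an open log file rather than a hard-coded list.
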
Driver version from modinfo and ethtool -i:
"""
# modinfo igb
filename:       /lib/modules/2.6.35.13-91.fc14.x86_64/kernel/drivers/net/igb/igb.ko
version:        3.0.19
license:        GPL
description:    Intel(R) Gigabit Ethernet Network Driver
author:         Intel Corporation, <e1000-devel@lists.sourceforge.net>
srcversion:     9615CC3B262CEBC014E1DF9
alias:          pci:v00008086d000010D6sv*sd*bc*sc*i*
alias:          pci:v00008086d000010A9sv*sd*bc*sc*i*
alias:          pci:v00008086d000010A7sv*sd*bc*sc*i*
alias:          pci:v00008086d00001526sv*sd*bc*sc*i*
alias:          pci:v00008086d000010E8sv*sd*bc*sc*i*
alias:          pci:v00008086d0000150Dsv*sd*bc*sc*i*
alias:          pci:v00008086d000010E7sv*sd*bc*sc*i*
alias:          pci:v00008086d000010E6sv*sd*bc*sc*i*
alias:          pci:v00008086d00001518sv*sd*bc*sc*i*
alias:          pci:v00008086d0000150Asv*sd*bc*sc*i*
alias:          pci:v00008086d000010C9sv*sd*bc*sc*i*
alias:          pci:v00008086d00000440sv*sd*bc*sc*i*
alias:          pci:v00008086d0000043Csv*sd*bc*sc*i*
alias:          pci:v00008086d0000043Asv*sd*bc*sc*i*
alias:          pci:v00008086d00000438sv*sd*bc*sc*i*
alias:          pci:v00008086d00001527sv*sd*bc*sc*i*
alias:          pci:v00008086d00001516sv*sd*bc*sc*i*
alias:          pci:v00008086d00001511sv*sd*bc*sc*i*
alias:          pci:v00008086d00001510sv*sd*bc*sc*i*
alias:          pci:v00008086d0000150Fsv*sd*bc*sc*i*
alias:          pci:v00008086d0000150Esv*sd*bc*sc*i*
alias:          pci:v00008086d00001524sv*sd*bc*sc*i*
alias:          pci:v00008086d00001523sv*sd*bc*sc*i*
alias:          pci:v00008086d00001522sv*sd*bc*sc*i*
alias:          pci:v00008086d00001521sv*sd*bc*sc*i*
depends:        dca
vermagic:       2.6.35.13-91.fc14.x86_64 SMP mod_unload 
parm:           InterruptThrottleRate:Maximum interrupts per second, per vector, (max 100000), default 3=adaptive (array of int)
parm:           IntMode:Change Interrupt Mode (0=Legacy, 1=MSI, 2=MSI-X), default 2 (array of int)
parm:           Node:set the starting node to allocate memory on, default -1 (array of int)
parm:           LLIPort:Low Latency Interrupt TCP Port (0-65535), default 0=off (array of int)
parm:           LLIPush:Low Latency Interrupt on TCP Push flag (0,1), default 0=off (array of int)
parm:           LLISize:Low Latency Interrupt on Packet Size (0-1500), default 0=off (array of int)
parm:           RSS:Number of Receive-Side Scaling Descriptor Queues (0-8), default 1=number of cpus (array of int)
parm:           VMDQ:Number of Virtual Machine Device Queues: 0-1 = disable, 2-8 enable, default 0 (array of int)
parm:           max_vfs:Number of Virtual Functions: 0 = disable, 1-7 enable, default 0 (array of int)
parm:           QueuePairs:Enable TX/RX queue pairs for interrupt handling (0,1), default 1=on (array of int)
parm:           EEE:Enable/disable on parts that support the feature (array of int)
parm:           DMAC:Enable/disable on parts that support the feature (array of int)
parm:           debug:Debug level (0=none, ..., 16=all) (int)


# ethtool -i eth2
driver: igb
version: 3.0.19
firmware-version: 1.2-1
bus-info: 0000:05:00.0
"""

Also tried with the older driver version below, with the same results:
"""
# ethtool -i eth2
driver: igb
version: 2.1.0-k2
firmware-version: 15.255-15
bus-info: 0000:05:00.0
"""

Relevant section of output from dmesg:
"""
[    7.584479] Intel(R) Gigabit Ethernet Network Driver - version 3.0.19
[    7.584747] Copyright (c) 2007-2010 Intel Corporation.
[    7.585052] igb 0000:05:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[    7.585334] igb 0000:05:00.0: setting latency timer to 64
[    7.585680]   alloc irq_desc for 65 on node -1
[    7.585682]   alloc kstat_irqs on node -1
[    7.585688] igb 0000:05:00.0: irq 65 for MSI/MSI-X
[    7.585689]   alloc irq_desc for 66 on node -1
[    7.585691]   alloc kstat_irqs on node -1
[    7.585694] igb 0000:05:00.0: irq 66 for MSI/MSI-X
[    7.776181] igb 0000:05:00.0: DCA enabled
[    7.776527] igb 0000:05:00.0: Intel(R) Gigabit Ethernet Network Connection
[    7.776793] igb 0000:05:00.0: eth2: (PCIe:2.5GT/s:Width x4) 00:1b:21:75:af:64
[    7.777355] igb 0000:05:00.0: eth2: PBA No: E91609-003
[    7.777618] igb 0000:05:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    7.778143] igb 0000:05:00.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[    7.778421] igb 0000:05:00.1: setting latency timer to 64
[    7.778819]   alloc irq_desc for 67 on node -1
[    7.778821]   alloc kstat_irqs on node -1
[    7.778826] igb 0000:05:00.1: irq 67 for MSI/MSI-X
[    7.778827]   alloc irq_desc for 68 on node -1
[    7.778829]   alloc kstat_irqs on node -1
[    7.778832] igb 0000:05:00.1: irq 68 for MSI/MSI-X
[    7.969702] igb 0000:05:00.1: DCA enabled
[    7.970048] igb 0000:05:00.1: Intel(R) Gigabit Ethernet Network Connection
[    7.970313] igb 0000:05:00.1: eth3: (PCIe:2.5GT/s:Width x4) 00:1b:21:75:af:65
[    7.970878] igb 0000:05:00.1: eth3: PBA No: E91609-003
[    7.971140] igb 0000:05:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    7.971659] igb 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    7.971940] igb 0000:06:00.0: setting latency timer to 64
[    7.972257]   alloc irq_desc for 69 on node -1
[    7.972259]   alloc kstat_irqs on node -1
[    7.972264] igb 0000:06:00.0: irq 69 for MSI/MSI-X
[    7.972266]   alloc irq_desc for 70 on node -1
[    7.972267]   alloc kstat_irqs on node -1
[    7.972270] igb 0000:06:00.0: irq 70 for MSI/MSI-X
[    8.164215] igb 0000:06:00.0: DCA enabled
[    8.164557] igb 0000:06:00.0: Intel(R) Gigabit Ethernet Network Connection
[    8.164823] igb 0000:06:00.0: eth4: (PCIe:2.5GT/s:Width x4) 00:1b:21:75:af:66
[    8.165418] igb 0000:06:00.0: eth4: PBA No: E91609-003
[    8.165680] igb 0000:06:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[    8.166261] igb 0000:06:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[    8.166540] igb 0000:06:00.1: setting latency timer to 64
[    8.166848]   alloc irq_desc for 71 on node -1
[    8.166850]   alloc kstat_irqs on node -1
[    8.166855] igb 0000:06:00.1: irq 71 for MSI/MSI-X
[    8.166856]   alloc irq_desc for 72 on node -1
[    8.166858]   alloc kstat_irqs on node -1
[    8.166861] igb 0000:06:00.1: irq 72 for MSI/MSI-X
[    8.357754] igb 0000:06:00.1: DCA enabled
[    8.358098] igb 0000:06:00.1: Intel(R) Gigabit Ethernet Network Connection
[    8.358367] igb 0000:06:00.1: eth5: (PCIe:2.5GT/s:Width x4) 00:1b:21:75:af:67
[    8.358957] igb 0000:06:00.1: eth5: PBA No: E91609-003
[    8.359221] igb 0000:06:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
"""

Relevant portion of output from lspci -vv:
"""
05:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET2 Quad Port Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 256 bytes
	Interrupt: pin A routed to IRQ 18
	Region 0: Memory at f97e0000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at f9800000 (32-bit, non-prefetchable) [size=4M]
	Region 2: I/O ports at ac00 [size=32]
	Region 3: Memory at f97dc000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: igb
	Kernel modules: igb

05:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET2 Quad Port Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 256 bytes
	Interrupt: pin B routed to IRQ 19
	Region 0: Memory at f8fe0000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at f9000000 (32-bit, non-prefetchable) [size=4M]
	Region 2: I/O ports at a880 [size=32]
	Region 3: Memory at f8fdc000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: igb
	Kernel modules: igb

06:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET2 Quad Port Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 256 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fa7e0000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at fa800000 (32-bit, non-prefetchable) [size=4M]
	Region 2: I/O ports at bc00 [size=32]
	Region 3: Memory at fa7dc000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: igb
	Kernel modules: igb

06:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET2 Quad Port Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 256 bytes
	Interrupt: pin B routed to IRQ 17
	Region 0: Memory at f9fe0000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at fa000000 (32-bit, non-prefetchable) [size=4M]
	Region 2: I/O ports at b880 [size=32]
	Region 3: Memory at f9fdc000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: igb
	Kernel modules: igb
"""

Comment 1 Mike Hinz 2011-05-14 18:27:24 UTC
We can arrange for external access to this server should that be required for further troubleshooting.  Please email me and we'll arrange access.  Thanks.

Comment 2 Mike Hinz 2011-06-07 18:47:38 UTC
I wanted to update this bug in case anyone else has run into it.  After some research and a bunch of testing, we've found that setting pcie_aspm=off in the kernel parameters fixed this particular issue.  We've now been testing for more than 2 weeks on two different servers and have not encountered the issue again since the parameter was changed.  We still believe that there's a bug, but this very effectively solves the issue for us.  FYI.

Regards.


Mike.
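
To confirm that the workaround in the comment above is actually in effect after a reboot, one can look at the running kernel's command line. The Python sketch below assumes a Linux system where /proc/cmdline is readable and where the ASPM policy is exposed at /sys/module/pcie_aspm/parameters/policy; the helper names `aspm_setting` and `read_text_or_none` are chosen here for illustration.

```python
from pathlib import Path


def aspm_setting(cmdline):
    """Return the value of pcie_aspm= from a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "pcie_aspm" and sep:
            return value
    return None


def read_text_or_none(path):
    """Read a small /proc or /sys file; return None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    print("pcie_aspm on command line:", aspm_setting(cmdline))
    # Assumption: the sysfs file lists the policies with the active one in brackets.
    print("ASPM policy:", read_text_or_none("/sys/module/pcie_aspm/parameters/policy"))
```

A result of `off` from `aspm_setting` indicates the parameter was passed to the running kernel; `None` means it was not on the command line.
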

Comment 3 Albert Strasheim 2012-04-03 13:04:48 UTC
Similar issue here: pings started taking a very long time. Will try pcie_aspm=off or pcie_aspm.policy=performance.

[946170.720012] ------------[ cut here ]------------
[946170.720080] WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x257/0x260()
[946170.720158] Hardware name: MacPro3,1
[946170.720214] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
[946170.720272] Modules linked in: bluetooth rfkill des_generic md4 nls_utf8 cifs fscache lockd nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp xt_state ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ipt_MASQUERADE ib_uverbs iptable_nat nf_nat ib_umad nf_conntrack_ipv4 nf_conntrack rdma_cm ib_cm nf_defrag_ipv4 iw_cm ip6_tables ib_addr ib_sa coretemp mlx4_ib ib_mad ib_core mlx4_en snd_hda_codec_realtek snd_hda_intel snd_hda_codec joydev snd_hwdep snd_seq snd_seq_device snd_pcm iTCO_wdt applesmc i2c_i801 iTCO_vendor_support input_polldev ioatdma i5400_edac igb shpchp snd_timer snd soundcore edac_core dca snd_page_alloc e1000e i5k_amb mlx4_core microcode virtio_net kvm_intel kvm sunrpc binfmt_misc raid10 firewire_ohci firewire_core crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[946170.723459] Pid: 0, comm: swapper/0 Tainted: G          I  3.3.0-4.fc16.x86_64 #1
[946170.723537] Call Trace:
[946170.723592]  <IRQ>  [<ffffffff81057b1f>] warn_slowpath_common+0x7f/0xc0
[946170.723692]  [<ffffffff81057c16>] warn_slowpath_fmt+0x46/0x50
[946170.723752]  [<ffffffff810727c0>] ? __queue_work+0xe0/0x3e0
[946170.723812]  [<ffffffff810939a1>] ? trigger_load_balance+0x61/0x2d0
[946170.723874]  [<ffffffff81503a97>] dev_watchdog+0x257/0x260
[946170.723934]  [<ffffffff8106716c>] run_timer_softirq+0x12c/0x3b0
[946170.723993]  [<ffffffff8101bc59>] ? sched_clock+0x9/0x10
[946170.724060]  [<ffffffff81503840>] ? qdisc_reset+0x50/0x50
[946170.724121]  [<ffffffff8105f148>] __do_softirq+0xb8/0x230
[946170.724600]  [<ffffffff8101bbe3>] ? native_sched_clock+0x13/0x80
[946170.724659]  [<ffffffff8101bc59>] ? sched_clock+0x9/0x10
[946170.724717]  [<ffffffff8108dad5>] ? sched_clock_local+0x25/0x90
[946170.724777]  [<ffffffff815fd4dc>] call_softirq+0x1c/0x30
[946170.724837]  [<ffffffff81016455>] do_softirq+0x65/0xa0
[946170.724895]  [<ffffffff8105f55e>] irq_exit+0x9e/0xc0
[946170.724954]  [<ffffffff815fde2e>] smp_apic_timer_interrupt+0x6e/0x99
[946170.725021]  [<ffffffff815fcade>] apic_timer_interrupt+0x6e/0x80
[946170.725081]  <EOI>  [<ffffffff8101ce75>] ? mwait_idle+0x95/0x210
[946170.725177]  [<ffffffff81013239>] cpu_idle+0xd9/0x120
[946170.725236]  [<ffffffff815d230e>] rest_init+0x72/0x74
[946170.725296]  [<ffffffff81ceebf8>] start_kernel+0x3ba/0x3c5
[946170.725355]  [<ffffffff81cee346>] x86_64_start_reservations+0x131/0x135
[946170.725415]  [<ffffffff81cee140>] ? early_idt_handlers+0x140/0x140
[946170.725474]  [<ffffffff81cee44c>] x86_64_start_kernel+0x102/0x111
[946170.725533] ---[ end trace 25c3f439e11991f1 ]---

Comment 4 Fedora End Of Life 2012-08-16 13:50:16 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

