Description of problem:
While copying some large files from machine to machine on the same gig-E LAN, the copy paused and dmesg reported:

WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0xf3/0x164() (Not tainted)
Hardware name: C2SEA
NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Modules linked in: fuse w83627ehf hwmon_vid coretemp cpufreq_ondemand acpi_cpufreq freq_table ipv6 kvm_intel kvm uinput usblp snd_hda_codec_intelhdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore ppdev r8169 snd_page_alloc iTCO_wdt iTCO_vendor_support i2c_i801 parport_pc parport mii raid0 raid1 firewire_ohci pata_acpi ata_generic dm_multipath firewire_core crc_itu_t pata_it8213 i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.31.6-166.fc12.x86_64 #1
Call Trace:
 <IRQ>
 [<ffffffff810516f4>] warn_slowpath_common+0x84/0x9c
 [<ffffffff81051763>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff8138e831>] ? netif_tx_lock+0x44/0x6d
 [<ffffffff8138e99b>] dev_watchdog+0xf3/0x164
 [<ffffffff8106eb3b>] ? getnstimeofday+0x5b/0xaf
 [<ffffffff81064068>] ? __queue_work+0x3a/0x43
 [<ffffffff8106c562>] ? sched_clock_cpu+0x16e/0x176
 [<ffffffff8105bec4>] run_timer_softirq+0x19f/0x21c
 [<ffffffff8106e8b3>] ? clocksource_read+0xf/0x11
 [<ffffffff8102566a>] ? apic_write+0x16/0x18
 [<ffffffff81057614>] __do_softirq+0xdd/0x1ad
 [<ffffffff81012eac>] call_softirq+0x1c/0x30
 [<ffffffff810143fb>] do_softirq+0x47/0x8d
 [<ffffffff81057326>] irq_exit+0x44/0x86
 [<ffffffff8141ed92>] smp_apic_timer_interrupt+0x86/0x94
 [<ffffffff81012873>] apic_timer_interrupt+0x13/0x20
 <EOI>
 [<ffffffff812679dd>] ? acpi_idle_enter_bm+0x281/0x2b5
 [<ffffffff812679d6>] ? acpi_idle_enter_bm+0x27a/0x2b5
 [<ffffffff81353b7f>] ? cpuidle_idle_call+0x99/0xce
 [<ffffffff81010c60>] ? cpu_idle+0xa6/0xe9
 [<ffffffff8141489e>] ? start_secondary+0x1f3/0x234
---[ end trace f81f94a8bc7ef390 ]---
r8169: eth0: link up

After which the file copy resumed.

Version-Release number of selected component (if applicable):
kernel-2.6.31.6-166.fc12.x86_64

How reproducible:
Unknown

Steps to Reproduce:
1. Copy large files over gig-E network

Actual results:
Brief network outage

Expected results:
No error

Additional info:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
Jumbo frames in use (7200), copy performed via rsync.
Both machines have the same hardware/software configuration. There was no error on the destination (receiving) machine.
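Since reproducibility is listed as unknown, it may help to count how often the timeout shows up in saved dmesg output. The following is only a rough sketch: the regular expression is derived from the single NETDEV WATCHDOG line quoted above, and the function name `find_tx_timeouts` is made up for illustration. Other kernels may word the message differently.

```python
import re

# Pattern modelled on the one message quoted in this report:
#   NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text):
    """Return (interface, driver, queue) for each watchdog timeout in log_text."""
    return [
        (m.group("iface"), m.group("driver"), int(m.group("queue")))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


sample = "NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out"
print(find_tx_timeouts(sample))
```

Feeding it the text of a saved dmesg dump gives one tuple per timeout, which makes it easier to compare runs with and without a workaround.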
After restarting the transfer a few times when the outage lasted a little too long, I tried capping the rsync bandwidth about 5Mb/s below its unrestricted rate (--bwlimit=35000). That appears to prevent the problem from recurring. Does the driver need to reserve some bandwidth, or throttle transmission when the TX queue depth exceeds a threshold?
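For a sense of scale, here is a small sketch that converts the --bwlimit=35000 value above into a link rate. It assumes rsync interprets a bare --bwlimit number as units of 1024 bytes per second, as the rsync man page describes; the helper name `bwlimit_to_mbit` is hypothetical.

```python
def bwlimit_to_mbit(bwlimit_kib_per_s):
    """Convert an rsync --bwlimit value to Mbit/s.

    Assumption: the value is in units of 1024 bytes per second.
    """
    return bwlimit_kib_per_s * 1024 * 8 / 1_000_000


limit = 35000  # value from the comment above
mbit = bwlimit_to_mbit(limit)
print(f"--bwlimit={limit} is about {mbit:.1f} Mbit/s")
print(f"fraction of 1 Gbit/s line rate: {mbit / 1000:.2f}")
```

Under that assumption the cap is well below gigabit line rate, so the timeout does not seem to require a saturated link.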
I am also having this issue, but it happens both when transferring files from a computer to a SAN and vice versa. I also get it during ordinary internet use. This computer has run Ubuntu 9.04 64-bit and CentOS 5.4, where this is not an issue. The problems occur on kernel 2.6.32.9-70.fc12.x86_64, and also when I tried rawhide.
I am also having this issue. It happens when transferring large amounts of data. Module r8169 has been fixed since 2.6.34-rc4. I used a backported driver from 2.6.34-rc7 on FC13 2.6.33.4-95: the kernel works fine, but the connection itself doesn't. I also tried the driver from 2.6.30 on FC13, and the same bug appeared. Unfortunately, the fix from 2.6.34-rc4 only removes the kernel error; the network connection still freezes after transferring some data. In other words, this is not one bug but two.
1st bug: invalid reset_task processing in r8169, which causes the kernel error.
2nd bug: something much deeper, which either makes overflow conditions occur more frequently or handles overflow incorrectly.
The partial fix and workaround at https://bugzilla.kernel.org/show_bug.cgi?id=12411 may help.
This bug appears to be a duplicate of Bug 538920. Closing. *** This bug has been marked as a duplicate of bug 538920 ***