Bug 249185
| Summary: | Non Xen Kernel generates 'Detected Tx Unit Hang' messages on e1000 driver while Xen enabled Kernels do not. | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Greg Morgan <drkludge> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | ||
| Version: | 7 | CC: | chris.brown, i-kitayama, jesse.brandeburg |
| Hardware: | All | ||
| OS: | Linux | ||
| Doc Type: | Bug Fix | ||
| Last Closed: | 2008-01-09 17:11:29 UTC | Type: | --- |
|
Description
Greg Morgan
2007-07-22 10:12:36 UTC
The working driver is:

Jul 21 17:31:03 mowgli kernel: input: PC Speaker as /class/input/input3
Jul 21 17:31:03 mowgli kernel: Intel(R) PRO/1000 Network Driver - version 7.3.15-k2-NAPI
Jul 21 17:31:03 mowgli kernel: Copyright (c) 1999-2006 Intel Corporation.

The TX hang driver is:

Jul 21 17:22:59 mowgli kernel: Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
Jul 21 17:22:59 mowgli kernel: Copyright (c) 1999-2006 Intel Corporation.

If it helps: uptime reports "18:05:04 up 15:28,..." with no TX hang errors on the 7.3.15 driver while copying 320G from an Intel gigabit-enabled NFS server.

OK, so I revised the Summary and am providing one of those handy Time-Life charts. ;-) The real issue here is that the Xen kernels available in my grub menu do not generate the TX hang error messages, while the regular fc7 kernels do. Using gvim to create the table below between reboots, on an NFS-mounted home directory, was very unresponsive with the TX-hanging kernels. The Summary was updated accordingly.

| Intel Driver | Kernel | TX Hang Issues |
|---|---|---|
| 7.3.15-k2-NAPI | /boot/vmlinuz-2.6.20-2925.11.fc7xen | Rock Solid |
| 7.3.15-k2-NAPI | /boot/vmlinuz-2.6.20-2925.9.fc7xen | Rock Solid |
| 7.3.20-k2-NAPI | /boot/vmlinuz-2.6.21-1.3228.fc7 | TX Issues Encountered |
| 7.3.20-k2-NAPI | /boot/vmlinuz-2.6.22.1-27.fc7 | TX Issues Encountered |

One workaround to try is turning off TSO:
# ethtool -K eth0 tso off
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try to assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if you are still having problems with the latest kernel? Did Chuck's suggestion work for you?

If the problem no longer exists, please close this bug, or I'll do so in a few days if no additional information is lodged.

Cheers
Chris

Chris,

Ack. I'll check the machine tonight with the ethtool command from comment #3. Note that this is the same host mentioned in bug 200656. I did perform some of the ethtool operations in that bug. However, I am still using the Xen kernels at this point with no problems at all. The Xen kernel e1000 7.3.15-k2-NAPI driver is very similar to the 7.3.15tdh code that was provided to me in bug 200656 comment #10.

Regards,
Greg

Hi Greg,

Any change using ethtool?

Cheers
Chris

Chris,

Sorry for the many delays. The problem still exists with the ethtool command. This problem also exists in f8, as I posted in bug 398921. The 7.3.15-k2-NAPI Intel driver fixed the problem, but the 7.3.20-k2-NAPI Intel driver either regressed or added back a similar problem, which produces these messages:

Nov 25 18:19:03 mowgli kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Nov 25 18:19:03 mowgli kernel: Tx Queue <0>
Nov 25 18:19:03 mowgli kernel: TDH <5a>
Nov 25 18:19:03 mowgli kernel: TDT <5a>
Nov 25 18:19:03 mowgli kernel: next_to_use <5a>
Nov 25 18:19:03 mowgli kernel: next_to_clean <6e>
Nov 25 18:19:03 mowgli kernel: buffer_info[next_to_clean]
Nov 25 18:19:03 mowgli kernel: time_stamp <37cf825>
Nov 25 18:19:03 mowgli kernel: next_to_watch <6e>
Nov 25 18:19:03 mowgli kernel: jiffies <37d1100>
Nov 25 18:19:03 mowgli kernel: next_to_watch.status <0>
Nov 25 18:19:05 mowgli kernel: NETDEV WATCHDOG: eth0: transmit timed out

The problem occurs under load but can also occur during web surfing.

Hi Greg,

Thanks for the update. I'm closing this as a dupe of bug 398921 then. Thanks for filing that one. 2.6.24 is just around the corner, so we could see what that brings ... :)

*** This bug has been marked as a duplicate of 398921 ***
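
As an aid for reading the register dump in the Tx Unit Hang messages quoted in this report, here is a minimal Python sketch that extracts the `name <value>` pairs from the log text and computes how far `jiffies` had advanced past the buffer's `time_stamp`. It assumes the bracketed values are hexadecimal (as `5a` and `6e` suggest). The helper names `parse_tx_hang`, `stalled_jiffies`, and `stalled_seconds` are hypothetical, and the default tick rate `hz=1000` is an assumption about the kernel configuration, not something stated in this report.

```python
import re

# Matches fields such as "TDH <5a>" or "next_to_watch.status <0>".
# Assumption: the value inside the angle brackets is hexadecimal.
FIELD_RE = re.compile(r"([A-Za-z_][\w.\[\]]*)\s+<([0-9a-fA-F]+)>")


def parse_tx_hang(log_text):
    """Return {field_name: int} for every 'name <hex>' pair in the log text."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(log_text)}


def stalled_jiffies(fields):
    """Ticks elapsed between the buffer's time_stamp and the hang report."""
    return fields["jiffies"] - fields["time_stamp"]


def stalled_seconds(fields, hz=1000):
    """Convert the stall to seconds; hz=1000 is an assumed kernel tick rate."""
    return stalled_jiffies(fields) / hz


# Log excerpt copied from the Nov 25 messages in this bug.
SAMPLE = """\
Nov 25 18:19:03 mowgli kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Nov 25 18:19:03 mowgli kernel: Tx Queue <0>
Nov 25 18:19:03 mowgli kernel: TDH <5a>
Nov 25 18:19:03 mowgli kernel: TDT <5a>
Nov 25 18:19:03 mowgli kernel: next_to_use <5a>
Nov 25 18:19:03 mowgli kernel: next_to_clean <6e>
Nov 25 18:19:03 mowgli kernel: buffer_info[next_to_clean]
Nov 25 18:19:03 mowgli kernel: time_stamp <37cf825>
Nov 25 18:19:03 mowgli kernel: next_to_watch <6e>
Nov 25 18:19:03 mowgli kernel: jiffies <37d1100>
Nov 25 18:19:03 mowgli kernel: next_to_watch.status <0>
"""

if __name__ == "__main__":
    fields = parse_tx_hang(SAMPLE)
    print("TDH == TDT:", fields["TDH"] == fields["TDT"])
    print("stalled jiffies:", stalled_jiffies(fields))
    print("stalled seconds (assuming HZ=1000):", stalled_seconds(fields))
```

For this dump the gap between `jiffies` and `time_stamp` is 6363 ticks, which would be roughly 6.4 seconds if the kernel ticks at 1000 Hz. `TDH` and `TDT` are equal here, while `next_to_clean` (0x6e) differs from `next_to_use` (0x5a).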