Red Hat Bugzilla – Bug 79103
[tg3] eth0: Error, poll already scheduled
Last modified: 2013-07-02 22:08:16 EDT
Description of problem:
We just installed kernel-2.4.18-18.7.x and we are now seeing thousands of
kernel: tg3: eth0: Error, poll already scheduled messages
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Boot system
2. Read over network interface
Actual Results: /var/log/messages fills up with:
Dec 5 09:38:48 simba kernel: tg3: eth0: Error, poll already scheduled
Dec 5 09:39:19 simba last message repeated 52 times
Dec 5 09:40:23 simba last message repeated 140 times
Dec 5 09:41:38 simba last message repeated 101 times
Dec 5 09:42:39 simba last message repeated 230 times
Dec 5 09:43:43 simba last message repeated 161 times
Expected Results: No Error message
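To put a number on how often the error fires, the collapsed "last message repeated N times" lines have to be added back in. The following is a minimal sketch in Python, assuming the syslog format shown in the excerpt above; the function name count_poll_errors is made up for illustration and is not part of any tool mentioned in this report.

```python
import re

# syslogd collapses duplicate lines into "last message repeated N times".
REPEAT_RE = re.compile(r"last message repeated (\d+) times")
ERROR_TEXT = "tg3: eth0: Error, poll already scheduled"


def count_poll_errors(lines):
    """Return the total number of tg3 poll errors, including collapsed repeats."""
    total = 0
    previous_was_error = False
    for line in lines:
        if ERROR_TEXT in line:
            total += 1
            previous_was_error = True
            continue
        match = REPEAT_RE.search(line)
        if match and previous_was_error:
            # A repeat line directly following the error (or another repeat
            # of it) stands for N further copies of that error.
            total += int(match.group(1))
            continue
        # Any other message breaks the run of repeats.
        previous_was_error = False
    return total
```

Passing it an open file, for example count_poll_errors(open("/var/log/messages")), gives the total for the whole log. Repeat lines that follow some other message are deliberately not counted.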
Some more info:
After repeating this error message for 3 days (several thousand instances), the
machine stopped serving NFS, its only purpose. The message that looks most
unusual in the log is
Dec 8 18:11:34 simba rpc.statd: Can't callback simba.colorado.edu (100021,4), giving up.
I attempted to reboot the machine remotely (using shutdown), which only seemed to
cause the system load to rise to 8 and stay there. I could still log in and
attempted to reboot using reboot; the machine did not reboot and needed to be
Hope this helps.
Some more info that might help:
We only observe this error on a Dell PowerEdge 4600. Two other P4 machines that
use the 3Com 996BT NIC do not report this error at all.
This is kernel ID line from the machine that keeps reporting this error:
Dec 8 20:49:40 simba kernel: eth0: Tigon3 [partno(BCM95700A6) rev 7104
PHY(5401)] (PCI:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:06:5b:88:dc:7f
This is the kernel ID line from one of the machines that do not crash:
Dec 8 18:06:44 monalisa kernel: eth0: Tigon3 [partno(3C996B-T) rev 0105
PHY(5701)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:04:76:f1:0a:86
I spoke too soon. All three machines have now crashed with the 2.4.18-18.7.x
Could this be related to 69920?
Yes, the kernel 2.4.18-18.7.x tg3 crash is the one solved by driver version 1.2,
mentioned in bug 69920 :)
Should be fixed in experiment #1 rpms, below.
[snip comment from other bug report]
To all still experiencing problems,
1) please boot with "noapic" on the kernel command line. You can run "cat
/proc/cmdline" to check for sure.
2) I have posted some new rpms for testing, based on the latest errata:
latest production tg3 release, 1.2a, built into unofficial rpms:
but I would like people to test my experiment which should provide additional
...and if that doesn't work for people, fall back to experiment 2:
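For step 1, a scripted version of the "cat /proc/cmdline" check can be handy when several machines need verifying. This is only a sketch in Python; the helper name has_boot_option is hypothetical, and it assumes options on the kernel command line are separated by whitespace.

```python
def has_boot_option(cmdline, option="noapic"):
    """Return True if `option` appears as a whole token on the command line."""
    # Split on whitespace so that a longer option containing the same
    # letters is not mistaken for a match.
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

On the machine under test, has_boot_option(read_cmdline()) should return True once the system has been rebooted with "noapic".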
Feedback requested! On several systems, there is evidence that the lock-ups are
not directly related to the driver but more to the system board. So please make
sure to attach 'dmesg' and 'lspci -vvv' output in future bug reports.
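For anyone gathering the requested output, here is a rough sketch in Python that runs both commands and returns one block of text suitable for attaching. The function name collect_diagnostics and the "$ command" header format are illustrative choices, not a Bugzilla requirement. 'lspci -vvv' may show less detail when not run as root.

```python
import subprocess

# The two commands requested in the comment above.
DEFAULT_COMMANDS = [["dmesg"], ["lspci", "-vvv"]]


def collect_diagnostics(commands=DEFAULT_COMMANDS):
    """Run each command and return its output, labelled, as one string."""
    sections = []
    for cmd in commands:
        header = "$ " + " ".join(cmd)
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # keep error text with the output
                universal_newlines=True,
            )
            body = result.stdout
        except OSError as exc:
            # The command is missing or cannot be executed on this system.
            body = "failed to run: %s\n" % exc
        sections.append(header + "\n" + body)
    return "\n".join(sections)
```

Writing the returned string to a file, for example with open("tg3-report.txt", "w").write(collect_diagnostics()), gives a single attachment for the bug report.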
Are these additional changes to the 2.4.18-19.7.x kernel? We just put that on
one of our P4s with a 996BT NIC and we'll see how that runs for a few days.
This is fixed in the current 8.1 beta kernels, and also the
unofficial-errata-kernel version "aragorn2" that I have posted at
http://people.redhat.com/jgarzik/pub/ The "aragorn2" kernels are based off of
Red Hat's latest official errata kernel, 2.4.18-19.8.0, with the addition of an
updated e1000 and bugfixed tg3 driver.