Bug 53758 - Linux Error: eepro100: cmd_wait for(0xffffff80) timedout with(0xffffff80)!
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 6.2
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Depends On:
Blocks:
Reported: 2001-09-17 17:17 EDT by David S. Brown
Modified: 2008-08-01 12:22 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-09-30 11:39:10 EDT
Description David S. Brown 2001-09-17 17:17:02 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

Description of problem:
Intel eepro100 on an STL2 motherboard with two 1000 MHz processors and 1 GB of memory.
Running: Linux l2 2.2.19-6.2.7smp #1 SMP Thu Jun 14 07:42:45 EDT 2001 i686 unknown

lsmod
Module                  Size  Used by
nfs                    73472  40 (autoclean)
lockd                  45040   1 (autoclean) [nfs]
sunrpc                 63824   1 (autoclean) [nfs lockd]
e1000                  23408   1 (autoclean)
eepro100               17392   1 (autoclean)
aic7xxx               132960   4

gives error:
Linux Error: eepro100: cmd_wait for(0xffffff80) timedout with(0xffffff80)!

and then, sometimes, huge numbers of:
kernel: nfs: task 291659 can't get a request slot 

This is similar to, but not exactly like:
http://www.tux.org/hypermail/linux-eepro100/2001-May/0010.html
http://www.cs.helsinki.fi/linux/linux-kernel/2001-00/0792.html

If I see the nfs task slot message, the machine will hang; if I don't see
that message, the machine may be slow on nfs but will probably work, in a
half-hearted way.

One or both errors appear consistently once the machine has been up over 24 hours.

I called Red Hat telephone support; they said yes, they've seen this, and
no, they don't have a solution.

So, it's up to you.  They claim this is hard to fix because it's
intermittent, but it isn't for me.  I am flogging this machine with very high
traffic in a test environment.  It's pretty easy to reproduce if I flog it
enough.
 

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Simulate my environment (as described above)
2. Flog it with the compile and link of 25,000 files, everything being
mounted via nfs.
3. Repeat the above for 24 - 48 hours
	

Actual Results:  The machine will spit out one or both error messages above,
and the ethernet interface may lock up.  A reboot, or possibly
ifdown eth0; ifup eth0, will fix it.
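
To tell which of the two failure modes showed up during a test run, the kernel log can be tallied for both messages. The following is only a rough sketch: the function name count_nic_errors is made up for illustration, and the log location is an assumption (syslog setups differ), so the path is passed in as an argument.

```shell
# Hypothetical helper: count the two messages from this report in a log file.
# $1 is the path to the log to scan (for example a copy of the syslog file).
count_nic_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    cmd_wait_hits=$(grep -c 'eepro100: cmd_wait for(.*) timedout' "$1" || true)
    slot_hits=$(grep -c "nfs: task [0-9]* can't get a request slot" "$1" || true)
    echo "cmd_wait=$cmd_wait_hits request_slot=$slot_hits"
}
```

Usage might look like `count_nic_errors /var/log/messages`, assuming that is where kernel messages end up on the machine under test.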

Expected Results:  obvious.

Additional info:

I can't upgrade beyond 6.2; this is a test machine for testing the interaction of
our product with RH6.2.
Comment 1 Arjan van de Ven 2001-09-17 17:21:21 EDT
2.2.19 also has an "e100" driver for the same cards. Could you try this driver
instead? (See /etc/modules.conf or /etc/conf.modules for where to change the
driver used.)
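
For reference, the driver for an interface is normally selected by an "alias ethN <module>" line in that file. The sketch below assumes the file contains such a line naming eepro100; the function name switch_to_e100 is made up for illustration. It reads the file on stdin and writes the edited version to stdout, so nothing is changed in place.

```shell
# Hypothetical helper: rewrite "alias ethN eepro100" lines to use e100.
# Reads a modules.conf-style file on stdin, writes the result to stdout.
# Other alias lines (for example one for e1000) are left untouched.
switch_to_e100() {
    sed -E 's/^([[:space:]]*alias[[:space:]]+eth[0-9]+[[:space:]]+)eepro100([[:space:]]*)$/\1e100\2/'
}
```

One cautious way to use it would be `switch_to_e100 < /etc/modules.conf > /tmp/modules.conf.new`, then review the result before copying it back and reloading the interface.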
Comment 2 David S. Brown 2001-09-17 17:58:31 EDT
I have made the suggested change and will test for another 48 hours.

--dsbrown
Comment 3 David S. Brown 2001-09-19 14:44:40 EDT
Tested under e100 module.  

I no longer see: eepro100: cmd_wait for(0xffffff80) timedout with(0xffffff80)!

At least at this point, 48 hours later.

But I still see: 

kernel: nfs: task 291659 can't get a request slot 

I also got a very nasty:

Stuck on TLB IPI wait (CPU#3)
followed by a non-responsive terminal; I had to power-off reset.
(Those may be ones in the error message; I can't read my writing.)

So, does this mean RedHat thinks it has Four(4) processors on my Two(2) 
processor box?

I suspect a kernel problem?  A lot of the similar errors I've read on Bugzilla 
mention a context switch problem. 

Comment 4 Arjan van de Ven 2001-09-19 14:52:07 EDT
Stuck on TLB IPI wait (CPU#3)


That is the internal number of the CPU. Basically, BIOSes number the CPUs, but only
number 0 is needed; the rest are "free form"...

The message often indicates a hardware problem; passing "noapic" on the kernel
command line (e.g. at the lilo prompt) often seems to work around it.
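
After rebooting with the option, it can be worth confirming that the kernel actually received it. The sketch below assumes /proc/cmdline is available on the running kernel; the function name has_noapic is made up for illustration, and it accepts an alternate file path as an argument.

```shell
# Hypothetical helper: succeed if "noapic" appears as a whole word on the
# kernel command line. $1 is an optional file to read instead of /proc/cmdline.
has_noapic() {
    # Put each space-separated word on its own line, then match a whole line.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'noapic'
}
```

For example, `has_noapic && echo "noapic is active"`. If LILO is the boot loader, the option can usually be made permanent with an `append="noapic"` line in /etc/lilo.conf followed by rerunning /sbin/lilo, but check the existing configuration first.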
Comment 5 Bugzilla owner 2004-09-30 11:39:10 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
