Bug 142456

Summary: kernel BUG in e1000_reset_hw
Product: Red Hat Enterprise Linux 2.1
Reporter: satish <ksatishv>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Version: 2.1
CC: linville, riel
Hardware: ia64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-03-08 21:08:48 UTC

Description satish 2004-12-09 20:21:16 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9)
Gecko/20020408

Description of problem:
I have an application which writes a lot of data to a network-based
filesystem. After a while we hit a hard hang. This is seen only with
the e1000 driver. When I use tg3 I do not see the hard hang.
I am using a 2.4.18.e-12smp kernel.

<6> Netdev
<4> kernel BUG at e1000-hw.c:146
<4> swapper[0]: Nat consumption 8589934624
<4> e1000_reset_hw[1000] + 0x1d0 <---

show_stack
show_regs
die
die_if_kernel
ia64_fault

Is there any workaround or patch available for e1000? I tried
the latest version of e1000, i.e. 5.5, and still see the hang.

Version-Release number of selected component (if applicable):
4.3.9-k1

How reproducible:
Always

Steps to Reproduce:
1. Run the test program, which pumps a lot of data into the e1000 driver.

Actual Results:  Machine hangs due to Kernel BUG

Expected Results:  work fine

Additional info:

Comment 1 Rik van Riel 2004-12-09 20:53:29 UTC
*** Bug 142457 has been marked as a duplicate of this bug. ***

Comment 2 Rik van Riel 2004-12-09 20:55:06 UTC
Would using the tg3 driver be an acceptable workaround for now, or are
there specific reasons why e1000 is needed on this system?

Comment 3 Ernie Petrides 2004-12-09 21:48:29 UTC
*** Bug 142457 has been marked as a duplicate of this bug. ***

Comment 4 satish 2004-12-13 14:34:53 UTC
tg3 is acceptable, but I am not sure it will work on Intel hardware.
I tried loading it and it failed with "unknown device". That apart, I
am not sure we can convince our customers to use tg3 in place of
e1000. Is there any chance of getting e1000 fixed?

Comment 5 John W. Linville 2005-02-18 14:21:15 UTC
2.4.18.e-12smp (AS2.1-ia64 "Gold") is pretty old. Have you considered
using a later version?  There have been many updates since then,
including several to the e1000 driver.

Comment 6 satish 2005-02-24 19:43:24 UTC
We tried using the latest version of the e1000 driver and that did not
help. How will moving to a later kernel version help?

Comment 7 John W. Linville 2005-02-28 18:32:07 UTC
It is generally helpful to start with a current kernel.  That way, we
know that it is not some incompatibility between the later driver and
the earlier kernel perpetuating the problem.  Besides, even if we knew
exactly what to fix, we would apply that fix to the latest kernel.

Is there some reason to resist going to the later kernel?  If not,
then please recreate the issue with the latest kernel available.  Thanks!

Comment 8 John W. Linville 2005-03-08 21:08:48 UTC
The problem described in the initial problem report results from the
e1000 driver sleeping in interrupt context.  This problem is corrected
in later versions of the e1000 driver, including the version of e1000
in the latest AS2.1-ia64 kernels.
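
To illustrate the class of bug (this is not the actual driver change), here
is a minimal userspace sketch in C. It assumes a delay helper that must pick
a busy-wait when called from interrupt context and may only sleep in process
context. All names below (sim_in_interrupt, sleeping_delay_ms, busy_delay_ms,
context_safe_delay_ms) are hypothetical. In the kernel the corresponding
primitives would be in_interrupt(), schedule_timeout() and mdelay().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the kernel's in_interrupt(): in this userspace
 * sketch it is just a flag the caller sets by hand. */
static bool sim_in_interrupt = false;

/* Counters so a caller can observe which path was taken. */
static int sleeps_taken = 0;
static int busy_waits_taken = 0;

/* Sleeping delay: gives up the CPU. Only legal in process context;
 * the kernel analogue would be schedule_timeout(). */
static void sleeping_delay_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
    sleeps_taken++;
}

/* Busy-wait delay: spins until the deadline without sleeping, so it is
 * safe in interrupt context; the kernel analogue would be mdelay(). */
static void busy_delay_ms(long ms)
{
    struct timespec start, now;
    long long elapsed_ns;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed_ns = (long long)(now.tv_sec - start.tv_sec) * 1000000000LL
                   + (now.tv_nsec - start.tv_nsec);
    } while (elapsed_ns < ms * 1000000LL);
    busy_waits_taken++;
}

/* Choose the delay primitive based on the (simulated) execution context,
 * instead of unconditionally sleeping. */
static void context_safe_delay_ms(long ms)
{
    assert(ms >= 0);
    if (sim_in_interrupt)
        busy_delay_ms(ms);
    else
        sleeping_delay_ms(ms);
}
```

Calling context_safe_delay_ms(1) with sim_in_interrupt set to false takes
the sleeping path, and with it set to true takes the busy-wait path. A helper
that always sleeps would correspond to the failure described above when
reached from an interrupt handler or timer.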

If you are experiencing a problem while using the latest AS2.1-ia64
kernel, it is a different problem.  In that case, please open a new
bug or contact Red Hat through any other support channels available to
you.