Bug 138240 (IT_54308)

Summary: MCA in tulip on ifconfig down/reboot
Product: Red Hat Enterprise Linux 3
Reporter: Grant Grundler <grant.grundler>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 3.0
CC: jgarzik, petrides, riel, tao
Hardware: ia64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-05-18 13:28:27 UTC
Bug Blocks: 132991

Description Grant Grundler 2004-11-06 00:22:44 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux ia64; en-US; rv:1.7)
Gecko/20040917 Firefox/0.9.3

Description of problem:
The existing tulip driver (0.9.15-pre12) in 2.4.21-20.EL
kernel has two known bugs:
1) In the ifconfig down path, tulip_remove_one() calls
   pci_free_consistent() before calling unregister_netdev().
   The fix is to move the unregister_netdev() call a few lines up
   in the source. Jeff Garzik originally sent me this fix more than
   6 months ago. This fix should be in all RHEL releases by now.

2) tulip_stop_rxtx() doesn't wait for DMA to fully stop, despite what
   the function name implies. Charlie Brett (HP) gets credit for
   finding this last April. I submitted a patch to Jeff Garzik but
   don't recall it getting accepted or what the outcome was.
   Patch is still available from:
       ftp://ftp.parisc-linux.org/patches/diff-2.6.6-tulip_stop_rxtx
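
To illustrate item 2, here is a minimal user-space sketch of what "wait for DMA to fully stop" could look like: clear the Rx/Tx enable bits, then poll the status register until both process-state fields read idle, with a bounded number of polls. This is not the patch referenced above. struct fake_tulip, read_csr5() and stop_rxtx_and_wait() are hypothetical names, the chip is simulated in memory, and the CSR5/CSR6 bit masks are assumptions that should be checked against the chip documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register bit layout; verify against the 21x4x documentation. */
enum {
    CSR6_RX_ON    = 0x00000002, /* assumed: start/stop receive bit */
    CSR6_TX_ON    = 0x00002000, /* assumed: start/stop transmit bit */
    CSR5_RX_STATE = 0x000E0000, /* assumed: receive process state field */
    CSR5_TX_STATE = 0x00700000  /* assumed: transmit process state field */
};

/* Hypothetical in-memory stand-in for the chip's registers. */
struct fake_tulip {
    uint32_t csr5;              /* status register */
    uint32_t csr6;              /* operating mode register */
    unsigned polls_until_idle;  /* busy reads left for the in-flight frame */
};

/* Simulated CSR5 read: once Rx/Tx are disabled, the DMA engines keep
 * reporting "busy" for polls_until_idle reads, then report idle. */
static uint32_t read_csr5(struct fake_tulip *chip)
{
    if (!(chip->csr6 & (CSR6_RX_ON | CSR6_TX_ON))) {
        if (chip->polls_until_idle > 0)
            chip->polls_until_idle--;
        else
            chip->csr5 &= ~(uint32_t)(CSR5_RX_STATE | CSR5_TX_STATE);
    }
    return chip->csr5;
}

/* Disable Rx/Tx, then poll until both DMA engines report idle.
 * Returns true if they went idle within max_polls reads. */
static bool stop_rxtx_and_wait(struct fake_tulip *chip, unsigned max_polls)
{
    assert(chip != NULL);
    chip->csr6 &= ~(uint32_t)(CSR6_RX_ON | CSR6_TX_ON);

    for (unsigned i = 0; i < max_polls; i++) {
        if (!(read_csr5(chip) & (CSR5_RX_STATE | CSR5_TX_STATE)))
            return true;
        /* A real driver would delay briefly here between polls. */
    }
    return false; /* DMA still active: not yet safe to free the rings */
}
```

The bounded loop is a deliberate choice: a caller can tell "DMA has stopped" apart from "gave up waiting" and avoid releasing buffers the device may still be writing to.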

thanks,
grant

iod00d
grant.grundler

Version-Release number of selected component (if applicable):
2.4.21-20.EL

How reproducible:
Always

Steps to Reproduce:
1. from system A, generate large packets
   (e.g. pktgen, "ping -b -f -s 1492")

2. on system B (HP ia64):
    while :
    do
       date
       ifconfig eth4 down
       sleep 5
       ifconfig eth4 up
       sleep 10
    done

eth4 is a tulip device.
 

Actual Results:  The HP ia64 machine will MCA within minutes, if not seconds.

Expected Results:  system should not MCA.

Additional info:

Comment 1 John W. Linville 2004-11-08 14:45:01 UTC
Patch #1 is already in RHEL3 U4...
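
For reference, the ordering that item 1 of the description asks for can be modelled in user space as below. This is only a sketch under assumptions: struct fake_nic and the nic_*() helpers are hypothetical, the "registered" flag stands in for register_netdev()/unregister_netdev(), and a heap buffer stands in for the memory released by pci_free_consistent(). The presumed hazard is that callbacks can still reach the driver until the device is unregistered, so the buffer must outlive that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a network device; not the driver's real types. */
struct fake_nic {
    bool registered; /* stands in for netdev registration state */
    int *ring;       /* stands in for the DMA-consistent descriptor rings */
};

/* Allocate the ring and mark the device registered. */
static bool nic_probe(struct fake_nic *nic, size_t ring_len)
{
    nic->ring = calloc(ring_len, sizeof *nic->ring);
    nic->registered = (nic->ring != NULL);
    return nic->registered;
}

/* A callback from "the stack". It touches the ring only while the
 * device is registered; returns false if the call was refused. */
static bool nic_stack_callback(struct fake_nic *nic)
{
    if (!nic->registered)
        return false;
    assert(nic->ring != NULL); /* would be a use-after-free otherwise */
    nic->ring[0]++;
    return true;
}

/* Teardown in the safe order: unregister first, free memory second. */
static void nic_remove(struct fake_nic *nic)
{
    nic->registered = false; /* 1. analogue of unregister_netdev() */
    free(nic->ring);         /* 2. analogue of pci_free_consistent() */
    nic->ring = NULL;
}
```

With the two steps in nic_remove() swapped, a callback arriving between them would dereference freed memory, which is the class of failure the reordering avoids.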

Patch #2 has not been accepted upstream (yet) -- will have to
investigate before I can push it in RHEL3...

Comment 2 John W. Linville 2004-11-17 20:28:20 UTC
Patches for #2 posted internally and upstream on 11/17...

Comment 3 Larry Troan 2004-11-18 02:03:06 UTC
HP has asked that this be fixed in rc1 if possible -- it is NOT a blocker.

Comment 4 Larry Troan 2004-11-18 02:08:39 UTC
Argh! Wrong release. HP asks if this can be in RHEL3 U4. It's too
late for that, so making this a U5 blocker.

Comment 10 Ernie Petrides 2004-12-10 02:53:25 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.3.EL).


Comment 11 Tim Powers 2005-05-18 13:28:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html