Bug 114723 - [RHEL3][PATCH]Olympic NIC rt0 cause Kernel Panics.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: high
Severity: high
Assigned To: Neil Horman
QA Contact: Brian Brock
Blocks: 123574
 
Reported: 2004-02-01 11:38 EST by Stas Goshtein
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2004-12-20 15:54:51 EST

Attachments
patch to correct oops on lobe failure/ring removal in olympic tokenring cards (6.87 KB, patch)
2004-08-05 15:32 EDT, Neil Horman
new patch to perform same function using keventd, rather than tasklet (7.36 KB, patch)
2004-08-06 16:18 EDT, Neil Horman
new patch for olympic issue (5.00 KB, patch)
2004-08-09 15:24 EDT, Neil Horman
more clean up on my last patch (5.52 KB, patch)
2004-08-11 14:17 EDT, Neil Horman
3rd pass at fixing this bug. (6.63 KB, patch)
2004-08-12 14:06 EDT, Neil Horman

Description Stas Goshtein 2004-02-01 11:38:18 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030922

Description of problem:
I have RHEL3 installed on an IBM x305 machine with an Olympic NIC,
configured as tr0. While the media is connected, there is no problem
running rmmod and insmod on the olympic module. If the media is
disconnected while the NIC is up, the driver recognizes this and prints
a message on the screen, which is fine. But if you then try to restart
the network service, the kernel panics when it tries to bring tr0 up.

I know there was a problem with the previous 4.0.1.EL kernels, but it was
supposed to be fixed in the 9.EL kernel, which is currently installed on
the machine.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Configure tr0 and bring it up so that it is working.
2. Disconnect the network media from tr0.
3. Issue "service network restart".
    

Actual Results:  KERNEL PANIC

Expected Results:  An error message for tr0 (for example, an error
determining the network configuration), like the one shown when eth0
fails to reach DHCP.

Additional info:
Comment 3 Neil Horman 2004-08-05 15:32:52 EDT
Created attachment 102469 [details]
patch to correct oops on lobe failure/ring removal in olympic tokenring cards

The problem turned out to be the attempted freeing of an irq from within the
interrupt handler of that irq.  I'm not sure why this driver frees the
interrupt on fatal error at all, but since I didn't have a hardware data
sheet/users guide handy to explain it to me, I tried to be as non-disruptive as
possible.  This patch corrects the failure by setting up a tasklet to call
free_irq after the handler has completed.  Also, while I was in there, I noted
that the adapter was not resetting properly on ifdown/ifup, so I added a call
to olympic_init from olympic_open to correct that.
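
For readers following along, here is a rough userspace model of the deferral idea described above. It is not the attached patch and not kernel code: every name (sim_irq, sim_free_irq, sim_handler, sim_run_deferred) is made up for illustration, and the -1 return merely stands in for the unsafe case of releasing an irq from inside its own handler.

```c
#include <assert.h>

/* Userspace model only: none of these names are kernel or driver symbols. */
struct sim_irq {
    int registered;    /* 1 while a handler is installed */
    int in_handler;    /* 1 while the handler is running */
    int free_pending;  /* set by the handler, consumed afterwards */
};

/* Stand-in for releasing the irq: refuses to run inside the handler. */
static int sim_free_irq(struct sim_irq *irq)
{
    if (irq->in_handler)
        return -1;     /* models the unsafe case described above */
    irq->registered = 0;
    return 0;
}

/* Handler for an adapter event: on a fatal error, only request the free. */
static void sim_handler(struct sim_irq *irq, int fatal)
{
    irq->in_handler = 1;
    if (fatal)
        irq->free_pending = 1;  /* defer instead of freeing here */
    irq->in_handler = 0;
}

/* Deferred step, meant to run only after the handler has returned. */
static int sim_run_deferred(struct sim_irq *irq)
{
    assert(!irq->in_handler);
    if (!irq->free_pending)
        return 0;
    irq->free_pending = 0;
    return sim_free_irq(irq);
}
```

In this model the handler never releases anything itself; the release happens in sim_run_deferred, which plays the role of the deferred work.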
Comment 4 Neil Horman 2004-08-06 16:18:35 EDT
Created attachment 102490 [details]
new patch to perform same function using keventd, rather than tasklet

This patch has some corrections in it.  It adds a spinlock to the linked list,
and uses keventd to free the irq rather than a tasklet, so that the free occurs
in a process context.
Comment 5 Neil Horman 2004-08-09 15:24:47 EDT
Created attachment 102535 [details]
new patch for olympic issue

This version of the patch replaces the deferred irq de-registration mechanism
with direct disabling of the affected interrupts on the card.  This
is more in line with the original design of the driver, but much safer than
calling free_irq from an interrupt handler directly.  It also fixes a
subsequent oops I hit that is caused by an skbuff double free that happens when
you try to ifdown an interface that has been closed by the adapter.
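
The double free mentioned above is a general hazard when two paths can both release the same buffers. A minimal userspace sketch of one common guard follows, assuming a fixed-size receive ring; the names sim_adapter, sim_fill_ring and sim_release_ring and the ring size are hypothetical, and plain malloc/free stand in for skbuff allocation. It is not taken from the attached patch.

```c
#include <assert.h>
#include <stdlib.h>

#define SIM_RX_RING 4  /* hypothetical ring size for this sketch */

struct sim_adapter {
    void *rx_buf[SIM_RX_RING];  /* stands in for the driver's buffer pointers */
    int frees;                  /* counts real frees, for illustration */
};

/* Allocate one buffer per ring slot; returns -1 if an allocation fails. */
static int sim_fill_ring(struct sim_adapter *a)
{
    for (int i = 0; i < SIM_RX_RING; i++) {
        a->rx_buf[i] = malloc(64);
        if (!a->rx_buf[i])
            return -1;
    }
    return 0;
}

/* Safe to call twice: each slot is cleared as soon as it is freed. */
static void sim_release_ring(struct sim_adapter *a)
{
    assert(a != NULL);
    for (int i = 0; i < SIM_RX_RING; i++) {
        if (a->rx_buf[i]) {
            free(a->rx_buf[i]);
            a->rx_buf[i] = NULL;
            a->frees++;
        }
    }
}
```

With the pointer cleared after each free, a second release (for example, an ifdown after the adapter has already shut itself down) finds nothing left to free.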
Comment 6 Neil Horman 2004-08-11 14:17:06 EDT
Created attachment 102624 [details]
more clean up on my last patch

cleaning up a few details on my last patch
Comment 7 Neil Horman 2004-08-12 14:06:46 EDT
Created attachment 102664 [details]
3rd pass at fixing this bug.

This patch solves the same problem by removing the free_irq calls from the
interrupt path entirely, allowing olympic_close to do that work. All requisite
memory leak/cleanup problems addressed (as far as I could see), and card
re-init code kept in place.
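
As a rough picture of the approach described above (interrupt path does no freeing, close does the cleanup), here is a small userspace model. It is only a sketch under assumptions: sim_nic and its fields, sim_open, sim_interrupt and sim_close are invented names, and the real driver's state and register handling are not represented.

```c
#include <assert.h>

/* Userspace model; field and function names are hypothetical. */
struct sim_nic {
    int irq_registered;  /* 1 between open and close */
    int irq_masked;      /* 1 once the card's interrupts are disabled */
    int dead;            /* 1 after a fatal event such as lobe failure */
    int free_calls;      /* how many times the irq was released */
};

/* Open path: register the irq and start from a clean adapter state. */
static void sim_open(struct sim_nic *n)
{
    n->irq_registered = 1;
    n->irq_masked = 0;
    n->dead = 0;         /* models re-initializing the card on open */
}

/* Interrupt path: record the failure and quiet the card, nothing more. */
static void sim_interrupt(struct sim_nic *n, int fatal)
{
    assert(n->irq_registered);  /* handler only runs while registered */
    if (fatal) {
        n->irq_masked = 1;
        n->dead = 1;
    }
}

/* Close path: the only place the irq is released, and only once. */
static void sim_close(struct sim_nic *n)
{
    if (n->irq_registered) {
        n->irq_registered = 0;
        n->free_calls++;
    }
}
```

In this model a fatal event followed by ifdown and ifup maps to sim_interrupt, sim_close, sim_open: the irq is released exactly once, and the reopened interface starts clean.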
Comment 8 Ernie Petrides 2004-09-03 20:50:18 EDT
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.3.EL).
Comment 9 John Flanagan 2004-12-20 15:54:51 EST
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html
