Description of problem:
I've reproduced this locally. The crash is ultimately
due to an assumption in e100_free_tcb_pool() that the
pool is, indeed, allocated upon entry to the function.
Or, it could be said that the calling function is at
fault for calling to free the pool when none is
allocated. Either way, e100 tries to dereference a
NULL pointer, and bad things happen.
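To illustrate the failure mode described above, here is a minimal userspace sketch of a free routine that tolerates a pool that was never allocated. It is not the driver's code: struct demo_priv, struct demo_tcb, demo_alloc_tcb_pool() and demo_free_tcb_pool() are hypothetical stand-ins, and the real e100_free_tcb_pool() works on DMA memory rather than calloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real e100 data layout. */
struct demo_tcb {
    int in_use;
};

struct demo_priv {
    struct demo_tcb *tcb_pool;  /* NULL when no pool is allocated */
    size_t tcb_count;
};

/* Allocate the pool. Returns 0 on success, -1 on failure. */
static int demo_alloc_tcb_pool(struct demo_priv *p, size_t count)
{
    assert(p != NULL && count > 0);
    p->tcb_pool = calloc(count, sizeof(*p->tcb_pool));
    if (p->tcb_pool == NULL) {
        p->tcb_count = 0;
        return -1;
    }
    p->tcb_count = count;
    return 0;
}

/*
 * Free the pool. The early return is the guard: without it, the loop
 * below would dereference a NULL pool whenever tcb_count is stale.
 * Returns 1 if a pool was freed, 0 if there was nothing to free.
 */
static int demo_free_tcb_pool(struct demo_priv *p)
{
    size_t i;

    if (p == NULL || p->tcb_pool == NULL)
        return 0;

    for (i = 0; i < p->tcb_count; i++)
        p->tcb_pool[i].in_use = 0;  /* per-entry teardown touches the pool */

    free(p->tcb_pool);
    p->tcb_pool = NULL;
    p->tcb_count = 0;
    return 1;
}
```

Checking in the callee, as here, makes a second call harmless. The alternative named above, making the caller responsible, would instead mean testing the pointer before each call.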
The ethtool implementation in this generation of
e100 has some problems dealing with failures. The
e100_open() function can fail if one of its memory
allocations fails, and most ethtool commands that
change parameters will do a "down up" cycle to free
and reallocate the ethtool-modifiable parameters, in
this case, the tx ring.
I've generated a patch that eliminates the panic, and
adds some error returns for a few ethtool commands
(-G tx being one). This is still not quite right,
though: an "ethtool -g eth0" after such a failure
will report as many tx buffers as were requested in
the previous (failing) -G request, not the number
actually allocated. The device also will not function,
even though it's nominally up and running.
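One possible way to handle the remaining inconsistency is to remember the old ring size and restore it when the "down up" cycle fails. The sketch below is a userspace illustration only, not what the attached patch does: struct demo_nic, demo_up(), demo_down(), demo_set_tx_ring() and the fail_next_alloc hook are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the adapter state; not the real e100 layout. */
struct demo_nic {
    unsigned int tx_ring_size;  /* what a query such as "ethtool -g" would report */
    void *tx_ring;              /* NULL while the device is down */
    int fail_next_alloc;        /* test hook: force the next allocation to fail */
};

/* Bring the device up by allocating the tx ring. Returns 0 or -ENOMEM. */
static int demo_up(struct demo_nic *nic)
{
    if (nic->fail_next_alloc) {
        nic->fail_next_alloc = 0;
        nic->tx_ring = NULL;
        return -ENOMEM;
    }
    nic->tx_ring = calloc(nic->tx_ring_size, 64);
    return nic->tx_ring ? 0 : -ENOMEM;
}

/* Take the device down; safe to call when the ring is already gone. */
static void demo_down(struct demo_nic *nic)
{
    free(nic->tx_ring);
    nic->tx_ring = NULL;
}

/*
 * Change the tx ring size with a down/up cycle. On failure the old
 * size is restored so the reported value matches reality, one more
 * bring-up is attempted, and the error is returned to the caller.
 */
static int demo_set_tx_ring(struct demo_nic *nic, unsigned int new_size)
{
    unsigned int old_size;
    int err;

    assert(nic != NULL);
    if (new_size == 0)
        return -EINVAL;

    old_size = nic->tx_ring_size;
    demo_down(nic);
    nic->tx_ring_size = new_size;

    err = demo_up(nic);
    if (err) {
        nic->tx_ring_size = old_size;
        (void)demo_up(nic);  /* best effort; the ring stays NULL if this fails too */
        return err;
    }
    return 0;
}
```

If the second bring-up also fails, the device is left down with a NULL ring, so any later teardown path still has to cope with an unallocated ring.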
This driver is outdated; the current driver from Intel is a complete
I'm not sure how much effort we want to put into fixing this one.
The patch is for the RH 2.4.21-18.EL kernel.
(See the IT issue for a discussion of whether the current
Intel driver can be used to replace this older version.)
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run "ethtool -G eth* tx 1024" on an e100 card
Actual Results: Sometimes the kernel panics
Expected Results: Should never panic
Note - I didn't recreate this - the person who filed the issue did.
Created attachment 107227
Patch supplied with Issue
I think the attachment is busted -- it looks like nothing but HTML to
Created attachment 107250
The patch (really!)
Try this. Operator error.
As you say, the current version (U4) of the driver is quite different
-- e100_free_tcb_pool() doesn't even exist anymore. The attached
patch won't apply to the current sources.
While I appreciate the patch, I'm going to have to close this as
Since U4 is not the next release (RHEL4 is), and since U4 is not
actually released yet (it's still in beta), I'm reverting this
bug to MODIFIED state. The upgrade of the e100 driver (committed
in kernel version 2.4.21-20.11.EL) has presumably resolved this bug,
which will be set to CLOSED/ERRATA automatically when U4 is released.
Re-opening due to likely back-rev of e100 driver in RHEL3...
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-27.EL).
The fix was applied to the back-rev'ed (to 2.3.43-k1) e100 driver.
An erratum has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.