Bug 220725 - e1000 driver thinks EEPROM checksum is invalid
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
 
Reported: 2006-12-24 12:22 EST by Ralf Ertzinger
Modified: 2008-06-04 15:29 EDT

Fixed In Version: F9
Doc Type: Bug Fix
Last Closed: 2008-06-04 15:29:34 EDT

Description Ralf Ertzinger 2006-12-24 12:22:21 EST
Description of problem:
Since kernel-2.6.19-1.2891.fc7 the e1000 driver thinks that the EEPROM checksum
of my notebook's network controller is invalid. kernel-2.6.19-1.2890.fc7 detects
and drives the card just fine.

2890 boot log:
e1000: 0000:02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:16:d3:32:bc:a6
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Half Duplex
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO

2891 boot log:
e1000: 0000:02:00.0: e1000_probe: The EEPROM Checksum Is Not Valid
e1000: probe of 0000:02:00.0 failed with error -5
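
For context on the 2891 message: as far as I understand the e1000 source, the probe sums the first 64 16-bit EEPROM words (0x00 through 0x3F) and expects the result to be 0xBABA; a mismatch makes the probe fail with -5 (-EIO). Below is a minimal Python sketch of that rule only. It is not the driver code, the function names are made up for illustration, and any chip-specific handling in the real driver is left out.

```python
# Sketch of the e1000 EEPROM checksum rule (illustrative, not driver code).
EEPROM_CHECKSUM_REG = 0x3F  # index of the word that holds the checksum
EEPROM_SUM = 0xBABA         # expected 16-bit sum of words 0x00..0x3F


def eeprom_checksum_ok(words):
    """Return True if words 0x00..0x3F sum to 0xBABA modulo 2**16."""
    if len(words) <= EEPROM_CHECKSUM_REG:
        raise ValueError("need at least 64 EEPROM words")
    total = sum(words[:EEPROM_CHECKSUM_REG + 1]) & 0xFFFF
    return total == EEPROM_SUM


def with_fixed_checksum(words):
    """Return a copy of the image with the checksum word recomputed."""
    fixed = list(words)
    partial = sum(fixed[:EEPROM_CHECKSUM_REG]) & 0xFFFF
    fixed[EEPROM_CHECKSUM_REG] = (EEPROM_SUM - partial) & 0xFFFF
    return fixed


if __name__ == "__main__":
    image = [0x1234] * 64              # hypothetical EEPROM contents
    print(eeprom_checksum_ok(image))
    print(eeprom_checksum_ok(with_fixed_checksum(image)))
```

Note that a failed check does not by itself prove the stored contents are corrupt: any EEPROM read that returns wrong words produces the same message.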


Version-Release number of selected component (if applicable):
kernel-2.6.19-1.2891.fc7

How reproducible:
Always

Steps to Reproduce:
1. Update to 2891, boot
  
Actual results:
No network

Expected results:
Network

Additional info:
Comment 1 Ralf Ertzinger 2006-12-24 13:01:40 EST
Turns out this is not deterministic. The driver has found the card after a reboot.
Comment 2 Tom London 2006-12-25 20:11:28 EST
I see this problem on a ThinkPad X60 with various kernels,
but only on a 'restart'.

Once this occurs, it repeats until I do a poweroff/poweron cycle. [Last night
had about 5 restarts in a row with this problem.]

lspci entry:
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller

Any other info that would be useful?
Comment 3 Ralf Ertzinger 2006-12-26 08:19:40 EST
Thinkpad X60s here. The lspci output is the same.
Comment 4 Ralf Ertzinger 2007-02-04 06:23:48 EST
Most probably the BIOS does not fully reset the card on reboot (or botches
something up).
Maybe the driver could do an additional hard reset of the card.
Comment 5 Chuck Ebbert 2007-02-05 11:52:08 EST
There is some good information about this problem at:

http://www.thinkwiki.org/wiki/Problem_with_e1000:_EEPROM_Checksum_Is_Not_Valid

Comment 6 Tom London 2007-02-23 10:32:22 EST
Problem still exists with kernel-2.6.20-1.2940.fc7
Comment 7 Tom London 2007-03-10 14:25:37 EST
Running the script described in #5 seems to have fixed this for me.
Comment 8 Mikko Huhtala 2007-05-18 09:18:59 EDT
I'm using a ThinkPad R50 and kernel 2.6.21-1.3163.fc7. Cold start (power off and
on) works ok, but after issuing the command 'reboot', booting ends with a kernel
panic and the following message:

  EIP: [<f8a8944f>] e1000_clean+0x1d6/0x23a [e1000] SS:ESP 0068:c0762fa8
  Kernel panic - not syncing: Fatal exception in interrupt

This is reproducible: cold start works every time and reboot fails every time.

lspci says that the NIC is Intel 82540P, rev 03

I'm not sure if this is the same problem as the one #5 refers to. Will try the
Lenovo script.

Comment 9 Mikko Huhtala 2007-05-18 09:24:06 EDT
I had a typo in #8, the NIC is 82540EP, rev 03

The Lenovo script says it does not apply to this NIC.

Should I file a separate bug?
Comment 10 Chuck Ebbert 2007-05-18 18:43:25 EDT
(In reply to comment #9)
> I had a typo in #8, the NIC is 82540EP, rev 03
> 
> The Lenovo script says it does not apply to this NIC.
> 
> Should I file a separate bug?
> 

See #240339
Comment 11 Lubomir Kundrak 2007-10-04 04:07:05 EDT
This morning my T60 claimed that its internal e1000 was gone and that it would
delay initialization. It turned out to be exactly this problem:


Oct  4 09:37:28 localhost kernel: Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
Oct  4 09:37:28 localhost kernel: Copyright (c) 1999-2006 Intel Corporation.
Oct  4 09:37:28 localhost kernel: ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
Oct  4 09:37:28 localhost kernel: e1000: 0000:02:00.0: e1000_probe: The EEPROM Checksum Is Not Valid
Oct  4 09:37:28 localhost kernel: ACPI: PCI interrupt for device 0000:02:00.0 disabled
Oct  4 09:37:28 localhost kernel: e1000: probe of 0000:02:00.0 failed with error -5
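
To check older logs for the same failure, something like the following Python sketch could be used. It only matches the two message formats quoted in this report, collapses whitespace first so that hard-wrapped pastes still match, and uses a made-up function name; it is an illustration, not a supported tool.

```python
import re

# Message formats taken from the log excerpts in this bug report.
CHECKSUM_MSG = re.compile(
    r"e1000: (\S+): e1000_probe: The EEPROM Checksum Is Not Valid")
PROBE_FAIL = re.compile(r"e1000: probe of (\S+) failed with error (-?\d+)")


def find_e1000_failures(log_text):
    """Return one dict per failed e1000 probe found in log_text."""
    # Collapse all whitespace so messages folded across lines still match.
    flat = " ".join(log_text.split())
    bad_checksum = {m.group(1) for m in CHECKSUM_MSG.finditer(flat)}
    failures = []
    for m in PROBE_FAIL.finditer(flat):
        failures.append({
            "device": m.group(1),
            "errno": int(m.group(2)),
            "bad_checksum": m.group(1) in bad_checksum,
        })
    return failures


if __name__ == "__main__":
    sample = (
        "kernel: e1000: 0000:02:00.0: e1000_probe: The EEPROM\n"
        "Checksum Is Not Valid\n"
        "kernel: e1000: probe of 0000:02:00.0 failed with error -5\n"
    )
    for failure in find_e1000_failures(sample):
        print(failure)
```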

I was not able to do anything about that. I tried both warm and cold reboots
without success. In horror, I wanted to search the internet for something that
would repair an EEPROM, but when I tried to connect from my wife's machine I
found that the network was down. It was Chuck [1] who had disconnected the
network switch's power cord.

[1] http://skosi.org/~lkundrak/misc/cica.jpeg

Once I reconnected the switch, the driver no longer refused to load (I did not
reboot, just rmmod/modprobe). I am not able to reproduce this, but looking at
the log I see that it happened once before, a couple of days ago (I was running
the same kernel then as now).

My kernel is:

Linux gopher 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 23:10:59 EDT 2007 i686 i686 i386 GNU/Linux

And the ethernet adapter is:

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
02:00.0 0200: 8086:109a
Comment 12 Jesse Brandeburg 2008-01-08 17:34:30 EST
This is the ASPM issue; it is well documented and a fix has been submitted upstream.

https://bugzilla.redhat.com/show_bug.cgi?id=400561
Comment 13 Jesse Brandeburg 2008-01-08 17:37:00 EST
Oops, I was referring to comment 10 in bug #400561, which has this patch:
https://bugzilla.redhat.com/attachment.cgi?id=272011

That patch should fix this issue on the T60/T61; the two bugs are otherwise unrelated.
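
To see whether ASPM is currently enabled on the NIC's PCIe link, the LnkCtl line of `lspci -vv` output can be inspected. The Python sketch below parses text that has already been captured; the function names are hypothetical and the sample line in it is an illustrative example of the LnkCtl format, not output from any machine in this report.

```python
import re

# lspci -vv prints a line such as "LnkCtl: ASPM L0s L1 Enabled; RCB ..."
LNKCTL = re.compile(r"LnkCtl:\s*ASPM\s+([^;]+);")


def aspm_state(lspci_vv_text):
    """Return the ASPM field of the first LnkCtl line, or None if absent."""
    m = LNKCTL.search(lspci_vv_text)
    return m.group(1).strip() if m else None


def aspm_enabled(lspci_vv_text):
    """Return True if the LnkCtl line reports any ASPM state as enabled."""
    state = aspm_state(lspci_vv_text)
    return state is not None and state.endswith("Enabled")


if __name__ == "__main__":
    # Illustrative sample only, not taken from this bug report.
    sample = "\t\tLnkCtl:\tASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+\n"
    print(aspm_state(sample), aspm_enabled(sample))
```

Reading the full capability list usually requires running lspci as root.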
Comment 14 Chuck Ebbert 2008-01-08 17:47:12 EST
davej reports getting the invalid checksum error even after the ASPM disable
patch had been applied.
Comment 15 Bug Zapper 2008-05-13 22:31:35 EDT
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 16 Dave Jones 2008-05-28 21:35:51 EDT
This should be fixed now, without the need for module options or the like. Ralf?
Comment 17 Ralf Ertzinger 2008-05-29 04:22:19 EDT
It definitely works for me. I cannot say why, though, since I also once
tried Lenovo's solution (which, I think, used ethtool to modify the
card's EEPROM).
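
For anyone examining the EEPROM with ethtool: writes via `ethtool -E` need a `magic` value, which for e1000 is, to my knowledge, the PCI device ID in the upper 16 bits and the vendor ID in the lower 16 bits. The Python sketch below derives it from the `vendor:device` pair that `lspci -n` prints (8086:109a in comment 11). The function name is made up, and the offsets and values written by Lenovo's script are deliberately not reproduced here.

```python
def ethtool_magic(pci_id):
    """Build the e1000 ethtool magic from a 'vendor:device' hex pair."""
    vendor_str, device_str = pci_id.split(":")
    vendor = int(vendor_str, 16)
    device = int(device_str, 16)
    if not (0 <= vendor <= 0xFFFF and 0 <= device <= 0xFFFF):
        raise ValueError("vendor and device IDs must be 16-bit values")
    # Device ID goes in the high half, vendor ID in the low half.
    return (device << 16) | vendor


if __name__ == "__main__":
    # 8086:109a is the 82573L reported by lspci in this bug.
    print(format(ethtool_magic("8086:109a"), "#010x"))
```

Dumping the EEPROM with `ethtool -e` first and keeping a copy is a sensible precaution before any write.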
