Red Hat Bugzilla – Bug 60592
eeprom checksum corrupted on CS20/DS20L
Last modified: 2007-04-18 12:40:42 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4)
Description of problem:
Upon installing beta2 kit on DS20L (CS20), system comes up and network is
configured and one can ping nodes on the network.
Entering ifconfig on the serial line console displays the information,
but the console then hangs and nothing more can be typed.
One can power off the console terminal and log back in,
but sooner or later the console hangs again.
Entering ifconfig guarantees the serial line will lock up quickly.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. install 7.2 for Alpha beta2 on DS20L (CS20) on serial line
2. boot system with active network
3. type ifconfig on serial console
Actual Results: the console hangs.
Expected Results: the serial console continues to operate.
This happened with beta2 on 2 DS20L systems.
It appears that something corrupts the eeprom on the DS20L,
because upon reboot, the h/w diagnostics complain of an eeprom checksum error.
John, does this also happen on a DS10, or is it DS20-specific? (It's just easier
for me to get my hands on a DS10.) I'm wondering whether the interrupt
routing isn't initialized correctly (the same sort of bug the 4100 patch was
trying to address); by the description, it sounds like it
This has only happened on the CS20, not the DS10.
That's the CS20, which is to be called the DS20L.
Created attachment 47163 [details]
I can't reproduce this bug on the CS20D we have in Toronto. Attached session
It appears that the eepro100-diag program modified the eeprom.
Diego, did you power down the CS20 and do a fresh install (not upgrade)?
As a result of these installs, we now have 2 CS20 that have bad checksums on the eeprom.
At boot time, message:
intializing GCT/FRU at 1d400
*** Error (eib0.0.0.4.0), Warning, Bad Checksum on eeprom
so at SRM>>> show dev only shows eia0
Upon booting beta2, Entering eepro100-diag -f -ee shows
***** The EEPROM checksum is INCORRECT| *****
the checksum is 0xA125, it should be 0xBABA
Re-installed 7.1, but 7.1 did not have eepro100-diag, so checksum still bad.
Help! How can we reset the eeprom?
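For context on the 0xBABA figure: as far as I know, the Intel 8255x convention is that all 16-bit words of the EEPROM, including a final checksum word, sum to 0xBABA modulo 2^16. Below is a minimal Python sketch of that arithmetic, assuming a 64-word image already held in a list. The function names and the sample contents are hypothetical; it does not touch hardware and is not a description of what eepro100-diag does internally.

```python
EEPROM_MAGIC_SUM = 0xBABA  # expected 16-bit sum of all EEPROM words


def eeprom_sum(words):
    """Return the 16-bit wrapping sum of all EEPROM words."""
    return sum(words) & 0xFFFF


def with_fixed_checksum(words):
    """Return a copy whose last word makes the total sum equal 0xBABA."""
    body = list(words[:-1])
    checksum = (EEPROM_MAGIC_SUM - sum(body)) & 0xFFFF
    return body + [checksum]


# Hypothetical 64-word image; real contents (MAC address, PHY settings) differ.
image = [0x0008, 0xC7A1, 0x1234] + [0x0000] * 61
fixed = with_fixed_checksum(image)
print(hex(eeprom_sum(image)), "->", hex(eeprom_sum(fixed)))
```

Only the last word changes, so the station address and configuration words are left alone. Writing a corrected image back to the device is a separate step that this sketch does not cover.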
Note: eepro100-diag with beta2 is based on v2.05; on Scyld it's v2.07
Note - the serial line problem went away with a different terminal.
the summary should be eeprom checksum corrupted on CS20/DS20L
I had a quick chat with arjan this morning about this.
He'd like us to try using the current cvs version of the eepro100-diag program
I'm attaching the srpm
Created attachment 47464 [details]
Updated kernel-utils package (rpm --rebuild kernel-utils-2.4-3.6.4.src.rpm)
Tried to reset the eeprom with the eepro100-diag that was in the rpm attachment,
but still no progress. The command used was:
# /usr/sbin/eepro100-diag -E -f
# /usr/sbin/eepro100-diag -ee -f | less
and again for
Index #1 checksum is 0x8FBD, it should be 0xBABA!
Index #2 checksum is 0xFF00, it should be 0xBABA!
Upon power up, SRM still complains about bad checksum and
show dev still doesn't show eib0
What else can we try to reset the eeprom?
Will be fixed in kudzu-0.99.34-1; however, that won't reset the EEPROM for you.
You can try running eepro100-diag -f -G3 -w -w ; this *should* sanitize the
eeprom, barring a bug in eepro100-diag.
Tried the command suggested - eepro100-diag -f -G3 -w -w
still results in eeprom checksum incorrect.
and same SRM problem eib0 - not enabled
Created attachment 47960 [details]
Output of 'show dev' after power cycling the machine
I still can't seem to reproduce the problem. I power cycled the machine and
everything came up normally.
Anything else I could try?
Ok, talked to Bill about this.
Seems that eepro100-diag is now pulled from the kudzu package so this should
not be a future issue.
(so for now zap /usr/sbin/eepro100-diag)
As for what to do in the meantime on the machine that's got the corrupted eeprom,
I don't have an answer.
I'm going to guess this is an onboard controller too so not something that you
can lift off onto an intel box.
It appears that the DS20L was manufactured with two different ethernet chip sets,
the Intel 82550 and the Intel 82559. This probably explains why we see the problem
in Nashua and you do not see it in Toronto. We should probably compare chip
00:03.0 SCSI storage controller: LSI Logic / Symbios Logic (formerly NCR)
53c1010 66MHz Ultra3 SCSI Adapter (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:07.0 ISA bridge: Acer Laboratories Inc. [ALi] M1533 PCI to ISA Bridge
[Aladdin IV] (rev c3)
00:10.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c2)
00:11.0 Non-VGA unclassified device: Acer Laboratories Inc. [ALi] M7101 PMU
01:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
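Both controllers in the listing above are labelled "82557", so the name alone may not tell an 82559 from an 82550; my understanding is that these parts share a PCI device ID and differ in the revision field. The Python sketch below pulls the revision out of lspci-style text and maps it to a chip name. The revision table is quoted from memory of the Linux e100 driver and should be treated as an assumption to verify against the driver source; the function name and sample text are hypothetical.

```python
import re

# Revision IDs as recalled from the Linux e100 driver; an assumption,
# not an authoritative table. Verify before relying on it.
REV_TO_CHIP = {
    0x00: "82557", 0x01: "82557", 0x02: "82557",
    0x04: "82558", 0x05: "82558",
    0x08: "82559", 0x09: "82559",
    0x0C: "82550", 0x0D: "82550",
}

# Matches unwrapped lspci lines such as
# "00:04.0 Ethernet controller: ... (rev 08)".
LINE_RE = re.compile(
    r"^(\S+) Ethernet controller: .*\(rev\s+([0-9a-fA-F]{2})\)",
    re.MULTILINE,
)


def ethernet_chips(lspci_text):
    """Map each Ethernet controller's PCI slot to a guessed chip name."""
    chips = {}
    for slot, rev in LINE_RE.findall(lspci_text):
        chips[slot] = REV_TO_CHIP.get(int(rev, 16), "unknown")
    return chips


sample = """\
00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:10.0 IDE interface: Acer Laboratories Inc. [ALi] M5229 IDE (rev c2)
01:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
"""
print(ethernet_chips(sample))
```

Running the same function on output captured from a Nashua machine and a Toronto machine would give a quick side-by-side of the revisions, provided the lspci lines are not wrapped.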
fixed in next release