Bug 204916 - tg3: Could not obtain valid ethernet address, aborting.
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: John W. Linville
QA Contact: Brian Brock
Duplicates: 216871
Blocks: 229570
Reported: 2006-09-01 07:53 EDT by Juanjo Villaplana
Modified: 2010-10-22 01:55 EDT (History)
Fixed In Version: 42.0.8
Doc Type: Bug Fix
Last Closed: 2007-02-13 14:02:33 EST

Attachments
/var/log/messages (31.13 KB, application/octet-stream), 2006-09-01 07:53 EDT, Juanjo Villaplana
Console log (11.75 KB, text/plain), 2006-09-13 03:13 EDT, Juanjo Villaplana
Console log (9.98 KB, text/plain), 2006-09-13 03:27 EDT, Juanjo Villaplana

Description Juanjo Villaplana 2006-09-01 07:53:12 EDT
Description of problem:

After installing RHEL4 U4 on an HP ProLiant DL380 G3, initialization of eth0
(the first integrated BCM5703X) fails intermittently.

Kernel messages for a successful startup look like:

Aug 31 13:02:29 test02 kernel: tg3.c:v3.52-rh (Mar 06, 2006)
Aug 31 13:02:29 test02 kernel: ACPI: PCI interrupt 0000:02:01.0[A] -> GSI 29
(level, low) -> IRQ 193
Aug 31 13:02:29 test02 kernel: eth0: Tigon3 [partno(TBD) rev 1002 PHY(5703)]
(PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0b:cd:69:ee:79
Aug 31 13:02:29 test02 kernel: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0]
Split[0] WireSpeed[1] TSOcap[1]
Aug 31 13:02:29 test02 kernel: eth0: dma_rwctrl[769f4000]
Aug 31 13:02:29 test02 kernel: ACPI: PCI interrupt 0000:02:02.0[A] -> GSI 31
(level, low) -> IRQ 201
Aug 31 13:02:29 test02 kernel: eth1: Tigon3 [partno(TBD) rev 1002 PHY(5703)]
(PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0b:cd:69:ee:78
Aug 31 13:02:29 test02 kernel: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0]
Split[0] WireSpeed[1] TSOcap[1]
Aug 31 13:02:29 test02 kernel: eth1: dma_rwctrl[769f4000]

but sometimes startup fails with the following kernel messages:

Aug 31 13:21:26 test02 kernel: tg3.c:v3.52-rh (Mar 06, 2006)
Aug 31 13:21:26 test02 kernel: ACPI: PCI interrupt 0000:02:01.0[A] -> GSI 29
(level, low) -> IRQ 193
Aug 31 13:21:26 test02 kernel: tg3: Could not obtain valid ethernet address,
aborting.
Aug 31 13:21:26 test02 kernel: tg3: probe of 0000:02:01.0 failed with error -22
Aug 31 13:21:26 test02 kernel: ACPI: PCI interrupt 0000:02:02.0[A] -> GSI 31
(level, low) -> IRQ 201
Aug 31 13:21:26 test02 kernel: eth0: Tigon3 [partno(TBD) rev 1002 PHY(5703)]
(PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0b:cd:69:ee:78
Aug 31 13:21:26 test02 kernel: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0]
Split[0] WireSpeed[1] TSOcap[1]
Aug 31 13:21:26 test02 kernel: eth0: dma_rwctrl[769f4000]
Aug 31 13:21:26 test02 kernel: ACPI: PCI interrupt 0000:00:0f.2[A] -> GSI 10
(level, low) -> IRQ 10

and the initialization of eth0 fails because eth0 is now the second integrated NIC:

ifup: Device eth0 has different MAC address than expected, ignoring.
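
The ifup message above is what appears when the MAC address configured for eth0 does not match the address of the device that received that name. A minimal Python sketch of that comparison follows. It assumes the configuration pins eth0 to 00:0b:cd:69:ee:79 (the address eth0 reports in the successful boot log; the actual ifcfg-eth0 file is not shown in this report), and the function names are hypothetical, not part of the initscripts.

```python
def normalize_mac(mac: str) -> str:
    """Return a MAC address in lowercase, colon-separated form."""
    return mac.strip().lower().replace("-", ":")


def mac_matches(expected: str, actual: str) -> bool:
    """True when the configured and observed addresses are the same."""
    return normalize_mac(expected) == normalize_mac(actual)


# Assumed configured address for eth0 (taken from the successful boot log).
EXPECTED_ETH0 = "00:0B:CD:69:EE:79"

# Address of the device named eth0 in each boot, from the kernel messages.
good_boot_eth0 = "00:0b:cd:69:ee:79"
failed_boot_eth0 = "00:0b:cd:69:ee:78"

for label, actual in (("good boot", good_boot_eth0),
                      ("failed boot", failed_boot_eth0)):
    if mac_matches(EXPECTED_ETH0, actual):
        status = "ok"
    else:
        status = "different MAC address than expected"
    print(f"{label}: eth0 {actual} -> {status}")
```

In the failed boot the first NIC never registers, so the second NIC (ending in :78) takes the name eth0 and the comparison fails.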


Version-Release number of selected component (if applicable):

kernel-smp-2.6.9-42.EL and kernel-smp-2.6.9-42.0.2.EL


How reproducible:

Reboot the server several times; the results follow a success, failure,
success, failure sequence.

Steps to Reproduce:
1. Reboot the server
2. Watch the network startup and /var/log/messages
  
Actual results:

eth0 is not always initialized.
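
A boot can be checked for this failure by scanning the kernel messages for the tg3 probe error. The Python sketch below is only an illustration: the function name and the regular expression are hypothetical, and it assumes each message sits on a single line (the copies quoted in this report are wrapped at 80 columns). It also maps the reported error code to its errno name.

```python
import errno
import re

# Matches e.g. "tg3: probe of 0000:02:01.0 failed with error -22"
PROBE_FAILED = re.compile(
    r"tg3: probe of (?P<pci>[0-9a-f:.]+) failed with error -(?P<code>\d+)"
)


def find_tg3_probe_failures(lines):
    """Return (pci_address, errno_name) for each failed tg3 probe."""
    failures = []
    for line in lines:
        match = PROBE_FAILED.search(line)
        if match:
            code = int(match.group("code"))
            # Fall back to the raw number if the code has no known name.
            name = errno.errorcode.get(code, str(code))
            failures.append((match.group("pci"), name))
    return failures


# Sample lines taken from the failing boot, joined onto single lines.
sample = [
    "Aug 31 13:21:26 test02 kernel: tg3: Could not obtain valid ethernet address, aborting.",
    "Aug 31 13:21:26 test02 kernel: tg3: probe of 0000:02:01.0 failed with error -22",
    "Aug 31 13:21:26 test02 kernel: eth0: dma_rwctrl[769f4000]",
]

for pci, name in find_tg3_probe_failures(sample):
    print(f"tg3 probe failed on {pci}: {name}")
```

To run it against a real log, pass an open file such as `open("/var/log/messages")` as `lines`.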


Expected results:

eth0 should always be initialized.

Additional info:

We have tried tg3.ko from kernel-smp-2.6.9-34.0.2.EL and the bcm5700-8.3.17c-1
driver provided by HP; both work fine.
Comment 1 Juanjo Villaplana 2006-09-01 07:53:12 EDT
Created attachment 135373
/var/log/messages
Comment 2 John W. Linville 2006-09-12 08:00:28 EDT
A later tg3 update is available in the test kernels here:

   http://people.redhat.com/linville/kernels/rhel4/

Please give those a try and post the results here...thanks!
Comment 3 Juanjo Villaplana 2006-09-13 03:13:26 EDT
Created attachment 136138
Console log

Hi John,

I was unable to boot the server with this test kernel. As you will see in the
attached console log, the cciss driver didn't initialize correctly.
Comment 4 Juanjo Villaplana 2006-09-13 03:27:47 EDT
Created attachment 136139
Console log

This is the console log for a successful boot with 2.6.9-42.0.2.EL.

The issue seems to be related to PCI interrupts. This is the lspci output; I
hope it helps:

00:00.0 Host bridge: Broadcom CMIC-WS Host Bridge (GC-LE chipset) (rev 13)
00:00.1 Host bridge: Broadcom CMIC-WS Host Bridge (GC-LE chipset)
00:00.2 Host bridge: Broadcom CMIC-LE
00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:04.0 System peripheral: Compaq Computer Corporation Integrated Lights Out
Controller (rev 01)
00:04.2 System peripheral: Compaq Computer Corporation Integrated Lights Out 
Processor (rev 01)
00:0f.0 ISA bridge: Broadcom CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 Host bridge: Broadcom CSB5 LPC bridge
00:10.0 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
00:10.2 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
00:11.0 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
00:11.2 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
01:03.0 RAID bus controller: Compaq Computer Corporation Smart Array 5i/532
(rev 01)
02:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit
Ethernet (rev 02)
02:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit
Ethernet (rev 02)
06:01.0 RAID bus controller: Compaq Computer Corporation Smart Array 5i/532
(rev 01)
06:1e.0 PCI Hot-plug controller: Compaq Computer Corporation PCI Hotplug
Controller (rev 14)
Comment 5 Andy Gospodarek 2006-10-09 12:58:46 EDT
Please test the kernels listed here:

http://people.redhat.com/agospoda/#rhel4

and let me know if they resolve the issue.  You *may* encounter the same problem
with this build that you had with Linville's; if you do, see the instructions
below for rebuilding it:

http://kbase.redhat.com/faq/FAQ_80_4969.shtm
Comment 7 Juanjo Villaplana 2006-10-13 05:18:25 EDT
This kernel works fine.
Comment 8 Chris Verhoef 2006-10-13 09:53:11 EDT
I had the same problem with an HP xw6000 workstation:

05:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5702X Gigabit
Ethernet (rev 02)

Tried the 2.6.9-42.15.EL.gsstest.100320060 kernel. This also works fine.
Comment 10 Chuck Berg 2006-11-02 18:34:04 EST
I had the same problem on an HP DL380 G3 running update 4. But 2.6.9-42.22.EL
works. I noticed this after a re-install to upgrade from RHEL3.

A different DL380 G3, running 2.4.21-15.ELsmp (upgrading to -47 did not help),
had the same problem, but on both interfaces, and it worked after a cold boot.
I switched to HP's bcm5700 driver on this machine.

I have other DL380s that do not have the same problem. These two problem
machines were fine for a couple of years (and many reboots).

Is it known why this happens? Should I expect my DL380s to come back up without
networking at any random reboot?
Comment 11 Ettore Virzi 2006-11-14 12:29:25 EST
I had the same problem with an HP DL380 G3 running RHEL 4.4 in production.
When will it be fixed in the distributed RH kernel?

It has completely blocked our RH cluster, and I don't want to recompile all the
cluster modules (GFS, etc.) for each test kernel I use.

Comment 12 John W. Linville 2006-11-14 12:59:51 EST
Did you test any of the kernels in the previous comments?  Do they fix the 
issues you are seeing?
Comment 13 Ettore Virzi 2006-11-20 04:56:39 EST
Yes, the 2.6.9-42.15.EL.gsstest.100320060 kernel is OK.
No problems with it.
Thanks.
Comment 14 John W. Linville 2006-12-05 10:13:44 EST
*** Bug 216871 has been marked as a duplicate of this bug. ***
Comment 16 Dennixx 2007-01-04 15:16:16 EST
Any idea when a fixed errata kernel will be released?
