Bug 75267
| Summary: | Tigon3 (3C996B-T) NIC does not start properly | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Linux | Reporter: | petr |
| Component: | kernel | Assignee: | Jeff Garzik <jgarzik> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Priority: | medium |
| Version: | 8.0 | CC: | anne.possoz, davem, ola, peterm |
| Hardware: | i686 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2003-03-04 20:14:09 UTC |
Description
petr
2002-10-06 15:49:26 UTC
Could you try increasing the timeout (say, doubling it) in the code that tries to find the link? This is the "sleep" command in the file /etc/sysconfig/network-scripts/network-functions on line 136 (the check_link_down function).

OK, I increased sleep 5 to sleep 10, and now 8 of 10 boots are successful and 2 are unsuccessful. Increasing the delay further would probably help, but testing that takes time. Still, the delay seems to be much longer than in Red Hat Linux 6.2 or 7.3.

We also have similar problems with the tg3 driver. Even if the interface comes up, it goes down again within a couple of hours. System: IBM x305. NIC: built-in Broadcom.

    Oct 13 19:46:29 ns kernel: tg3.c:v1.0 (Jul 19, 2002)
    Oct 13 19:46:29 ns kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
    Oct 13 19:46:29 ns kernel: tg3: eth0: Flow control is on for TX and on for RX.
    Oct 13 19:46:29 ns kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
    Oct 13 19:46:29 ns kernel: tg3: eth1: Flow control is on for TX and on for RX.
    Oct 13 19:46:29 ns kernel: tg3: eth1: Link is down.
    Oct 13 19:46:29 ns kernel: tg3: eth0: Link is down.
    Oct 13 19:46:29 ns kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
    Oct 13 19:46:29 ns kernel: tg3: eth1: Flow control is on for TX and on for RX.
    Oct 13 19:46:29 ns kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
    Oct 13 19:46:29 ns kernel: tg3: eth0: Flow control is on for TX and on for RX.
    Oct 15 11:59:27 ns kernel: tg3: eth0: Link is down.
    Oct 15 12:34:06 ns kernel: tg3: eth1: Link is down.
    Oct 15 12:34:09 ns kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
    Oct 15 12:34:09 ns kernel: tg3: eth1: Flow control is off for TX and off for RX.
    Oct 15 12:51:59 ns kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
    Oct 15 12:51:59 ns kernel: tg3: eth1: Flow control is off for TX and off for RX.
    Oct 15 12:51:59 ns kernel: tg3: eth1: Link is down.
    Oct 15 12:52:02 ns kernel: tg3: eth1: Link is up at 10 Mbps, half duplex.
    Oct 15 12:52:02 ns kernel: tg3: eth1: Flow control is off for TX and off for RX.
    Oct 15 16:08:44 ns kernel: tg3.c:v1.0 (Jul 19, 2002)
    Oct 15 16:08:46 ns kernel: tg3: eth1: Link is up at 100 Mbps, full duplex.
    Oct 15 16:08:46 ns kernel: tg3: eth1: Flow control is off for TX and off for RX.
    Oct 15 16:08:47 ns kernel: tg3: eth1: Link is down.
    Oct 15 16:08:49 ns kernel: tg3: eth1: Link is up at 10 Mbps, half duplex.
    Oct 15 16:08:49 ns kernel: tg3: eth1: Flow control is off for TX and off for RX.

Same problem on RH 7.3 with the stock kernel and the latest errata kernel. I also suggest you upgrade the priority and take the IBM x305 (and others with the same gigabit chip) off the certified hardware list.

The erratum kernel we released yesterday has major tg3 updates. Do they fix the issue for you?

kernel-2.4.18-17.8.0.i686.rpm: the behavior changed, but it is worse. With sleep 10 the network no longer starts up as it did with the previous kernel; the delay has to be increased to at least sleep 15. This applies to a reboot of the whole system; /etc/init.d/network stop and start works correctly even with sleep 1.

I tried replacing the tg3 driver with the bcm5700 driver and decreased the timeout to its original value of 5 seconds, and everything works perfectly. Maybe the tg3 driver should be removed from the autoconfiguration? And what is the difference? Why are there two drivers for the same NIC?

In version 2.2.30 of the Broadcom bcm5700 driver, the following is written:

    Note 2: If loading the driver on Red Hat 7.3, Red Hat 2.1 AS, and other newer Red Hat kernels and patches, it is necessary to unload the tg3 driver first if it is loaded. While tg3 is a fully functioning driver written by Red Hat et al, Broadcom recommends users to use the bcm5700 driver written and tested by Broadcom. Use ifconfig to bring down all eth# interfaces used by tg3 and do the following to unload the tg3 driver:

    rmmod tg3

    It may also be necessary to manually edit the file /etc/modules.conf to change interface alias names from tg3 to bcm5700. Example:

    alias eth0 tg3

    Replace tg3 with bcm5700:

    alias eth0 bcm5700

And it is true: at least in my case, bcm5700 behaves much better than tg3.

Actually, it's perfectly valid for a link to take 10 to 15 seconds to come up. How long this process takes depends heavily on the hardware and the driver. So the real bug is in the mechanisms used by the dhcp client we currently ship. We have to wait until it actually completes the query to the dhcp server, and this means waiting for however long is the longest a link could take to come up. A proper dhcp client implementation would work asynchronously and therefore work regardless of how long the link takes to come up. The tg3 driver is just fine; it works, it just takes longer to bring the link up, which is perfectly valid behavior.

As a side note, it is unfortunate that after Broadcom agreed to help us with the tg3 driver, which we will continue to ship and use by default, they instead recommend de-installing our driver in their documentation. Thanks for bringing this to our attention; it was news to us.

OK, some of these reports have actually been fixed in more recently posted rpms. Just to get everybody on the same page, please use the "aragorn2" test rpms, posted at http://people.redhat.com/jgarzik/pub/

This is the latest Red Hat errata kernel for 7.x/8.x, with the recent tg3 bug fixes.
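
For illustration only, here is a rough sketch of the idea behind the timeout discussion in this thread: instead of one fixed sleep before checking the link, poll the link state until a deadline, so a fast link is used immediately and a slow one still gets its 10 to 15 seconds. This is not the code in network-functions and not the asynchronous dhcp client design suggested above; it is just a smaller step in that direction. The names wait_for_link and sysfs_carrier are hypothetical, and the sysfs carrier file is an assumption about newer Linux kernels, not something the kernels in this report are known to provide.

```python
import time
from typing import Callable


def wait_for_link(
    link_is_up: Callable[[], bool],
    timeout: float = 15.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll link_is_up() until it returns True or `timeout` seconds pass.

    Hypothetical helper: returns True as soon as the probe reports link,
    False if the deadline is reached first.
    """
    deadline = clock() + timeout
    while True:
        if link_is_up():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


def sysfs_carrier(iface: str) -> bool:
    """Hypothetical probe (assumption): read the carrier flag from sysfs.

    Any read error (missing interface, interface administratively down)
    is treated as "no link".
    """
    try:
        with open(f"/sys/class/net/{iface}/carrier") as fh:
            return fh.read().strip() == "1"
    except OSError:
        return False
```

A caller could then use something like wait_for_link(lambda: sysfs_carrier("eth0"), timeout=15.0) before starting the dhcp client. The 15 second default only mirrors the value reported as working in this thread; it is not a recommendation.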