Red Hat Bugzilla – Bug 136482
DHCP timeouts during Kickstart
Last modified: 2007-11-30 17:07:04 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3)
Description of problem:
As of RHEL 3 Update 2 I cannot Kickstart via DHCP.
This problem is due to the way network interfaces are being brought
up. Prior to RHEL 3 Update 2, anaconda would not bring the network
interface down and then back up in order to initiate a DHCP request;
it would simply do a "hot-reconfiguration" of the Kickstart interface.
In other words, the interface didn't lose its link with the switch to
get the address.
Now, when kickstart requests a DHCP address it completely downs the
interface and then brings it back up. This is a problem for us because
the DHCP timeout is shorter than the time it takes our switch ports
(Cisco 2900) to go into a forwarding state. As a result, our servers
are never able to get their DHCP lease.
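To make the timing concrete, here is a rough sketch of the race described above. It assumes the switch runs with the default 802.1D forward delay of 15 seconds (a port spends that long in listening and again in learning before it forwards). The 20 and 45 second DHCP timeouts are hypothetical values for illustration, not numbers taken from anaconda or dhclient.

```python
# Default 802.1D timer: a port spends forward_delay seconds in each of the
# listening and learning states before it forwards traffic.
STP_FORWARD_DELAY_S = 15


def stp_time_to_forwarding(forward_delay_s=STP_FORWARD_DELAY_S):
    """Seconds from link-up until the port forwards (listening + learning)."""
    return 2 * forward_delay_s


def dhcp_can_succeed(dhcp_timeout_s, forward_delay_s=STP_FORWARD_DELAY_S):
    """True if the DHCP client is still trying when the port starts forwarding."""
    return dhcp_timeout_s > stp_time_to_forwarding(forward_delay_s)


if __name__ == "__main__":
    # 20 s and 45 s are hypothetical installer timeouts, chosen for illustration.
    for timeout in (20, 45):
        print(timeout, dhcp_can_succeed(timeout))
```

With default timers the port needs about 30 seconds after every link reset, so any DHCP timeout below that fails each time the interface is downed and brought back up.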
I'm not sure if this problem is in anaconda, dhclient, initscripts, or
the tg3 driver itself.
This problem is in RHEL 3 Updates 2 & 3, as well as RHEL 4 Beta 1.
Version-Release number of selected component (if applicable):
RHEL 3 Updates 2 & 3
Steps to Reproduce:
1. Request a new DHCP address via Anaconda/Kickstart
Actual Results: The interface is disabled entirely, then re-enabled,
which causes the switchport to be reset every time.
Expected Results: The interface should not be completely turned off
then on to get a DHCP address.
This is a new behavior for Red Hat Linux. In previous releases (RHEL 3
Update 1 and before) it could get a DHCP address without resetting the
Update 2 actually didn't change the behavior at all, but some drivers
changed and seem to exacerbate it. Update 3 adds some fixes and
Update 4 (beta to be released soon) adds another set.
I know that when I use the boot.iso from the initial release of RHEL 3
and Update 1 that I don't have this problem. I never lose the link
between my NIC and the switch during DHCP requests. However, on
Updates 2 and 3, I do. This problem persists on RHEL 4 Beta 1.
I observed the same symptoms with U2, U3, and RH4 Beta 1 on a new HP
DL585. If I used a static IP and was willing to cycle through
the "Can't find server" message a few times (1-3), it would go ahead
and install. I don't have access to the switch to tell what it was
02:06.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704
Gigabit Ethernet (rev 10)
02:06.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704
Gigabit Ethernet (rev 10)
This is the same bug as Bug#15896 which was marked WONTFIX many years ago.
has more info -- it's related to spanning tree convergence time, which
exceeds the DHCP retry timeout period for the second DHCP sequence --
the one where anaconda is just about to mount the NFS install media.
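One way to picture a client-side fix is a retry loop whose total budget outlasts spanning tree convergence. The sketch below is only an illustration, not what anaconda or dhclient does: `acquire_lease` and the `try_dhcp` callable are hypothetical names, and it assumes the interface is left up between attempts, since resetting the link restarts the switch port's delay.

```python
import time


def acquire_lease(try_dhcp, total_budget_s, attempt_timeout_s,
                  clock=time.monotonic):
    """Retry DHCP until a lease is returned or the total budget runs out.

    try_dhcp is a hypothetical callable: it blocks for at most
    attempt_timeout_s seconds and returns a lease object, or None on timeout.
    The interface is assumed to stay up between attempts.
    Returns a (lease_or_None, attempts_made) tuple.
    """
    deadline = clock() + total_budget_s
    attempts = 0
    while clock() < deadline:
        attempts += 1
        lease = try_dhcp(attempt_timeout_s)
        if lease is not None:
            return lease, attempts
    return None, attempts
```

The total budget would need to exceed the time the port takes to reach forwarding; a short per-attempt timeout is harmless as long as the attempts keep coming without touching the link.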
I need to disagree with the WONTFIX of the other bug. I have this problem, and
it's very pervasive on Cisco hardware.
The workaround for this on the network side is to turn on 'spanning-tree
portfast' on an IOS-based switch. However, this is not viable in all network
topologies or with all network administration practices.
The purpose of portfast is to cause a port to go into the STP forwarding state
immediately when link comes up, rather than listening for BPDUs and then
deciding to forward. With portfast turned on, a loop in the network is not
caught before the port starts forwarding: for instance, if someone hooks a
switch up to the port with two uplinks into the layer 2 infrastructure, you
have a loop.
Mass-closing lots of old bugs which are in MODIFIED (and thus presumed to be
fixed). If any of these are still a problem, please reopen or file a new bug
against the release which they're occurring in so they can be properly tracked.