From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.5) Gecko/20011014

Description of problem:
Using PXELINUX to grab the kernel and initrd via bootp from a kickstart server, anaconda is unable to pick up an IP address from the DHCP server and falls back to manual install. Once in manual install mode, you may request a dynamic IP a second time, and anaconda will successfully get an address. Looking at the DHCP server logs, it is clear that the server is responding with an OFFER to both of pump's DISCOVERs, but for some reason the pump portion of anaconda never acknowledges the first response. Note that a minor change to loader.c to make it try several times to get a DHCP address seems to solve the problem (the address is picked up on the second attempt).

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Set up dhcpd to provide bootp booting services.
2. Install PXELINUX in the tftp dir on the server.
3. Set the NIC in the server to boot from the network.
4. Select the install label, and go. Wait for the DHCP request to grab the ks.cfg.

Actual Results:
The DHCP request fails, which keeps loader from picking up the ks.cfg from the kickstart server (via NFS), and the installer drops back into interactive mode.

Expected Results:
The DHCP request works and the installer doesn't bug me until the post-install. :-)

Additional info:
PXELINUX version: 1.63
isc-dhcpd version: tried both dhcp-2.0pl5-8 and dhcp-3.0
DHCP/kickstart server running RHL 6.2
All servers use Intel Pro 100/S NICs (e100 driver on the dhcp server, eepro100 on 7.2 from 2.4.7-10)

No other systems (a wide variety of OSes and archs) have problems picking up a DHCP address, including floppy-based kickstarts for RHL 6.2 to 7.2.

A patch that makes anaconda try sending multiple DHCP requests (4) should be attached and seems to solve the problem: the IP address is acknowledged on the second attempt.
Created attachment 37986: patch to anaconda to make multiple DHCP reqs for network kickstart
This is strange, because we do kickstart PXE boot installs all the time. Any ideas here, Matt?
The big thing I'm concerned about is getting anaconda to give DHCP a couple of tries before giving up on a DHCP server. Missing one round of DHCPOFFERs isn't worth dropping out of a kickstart for, IMO. Regarding the actual cause, I think the big problem is that we're using PXELINUX (an offshoot of syslinux), which is being served up from bootp/dhcpd rather than a true PXE server. It seems as if pump is getting confused by something PXELINUX is doing, and that once pumpDhcpRun() has run the first time, the problem clears. I haven't been able to replicate it under any other circumstances.
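For illustration, the retry idea described above could be sketched roughly as follows. This is not the attached patch and not code from loader.c: `dhcp_attempt_fn`, `dhcp_with_retries`, and `fake_second_try_ok` are hypothetical names, and the callback merely stands in for whatever performs one DHCP exchange (for example a wrapper around pumpDhcpRun(), whose real signature is not shown in this report).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: performs one DHCP attempt.
 * Returns 0 if a lease was obtained, nonzero otherwise. */
typedef int (*dhcp_attempt_fn)(void *ctx);

/* Call `attempt` up to `max_attempts` times.
 * Returns the 1-based number of the attempt that succeeded,
 * or 0 if every attempt failed. */
int dhcp_with_retries(dhcp_attempt_fn attempt, void *ctx, int max_attempts)
{
    assert(attempt != NULL && max_attempts > 0);

    for (int i = 1; i <= max_attempts; i++) {
        if (attempt(ctx) == 0)
            return i;
    }
    return 0;
}

/* Stand-in for a real DHCP call (assumption for this sketch only):
 * fails on the first call and succeeds on the second, mimicking the
 * behaviour reported here. `ctx` points to an int call counter. */
int fake_second_try_ok(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return (*calls >= 2) ? 0 : 1;
}
```

With a counter initialised to zero and a limit of 4 (the count used in the attached patch), `dhcp_with_retries(fake_second_try_ok, &calls, 4)` stops after the second call. Note that every additional attempt adds one full per-attempt timeout to the worst case when no server answers at all.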
Not that I'm necessarily against the idea of adding a retry, but we use pxelinux here as well. If you take a look at the dhcp packets on the network, is there anything "strange" about them?
Any more info here?
Yeah, but given the security updates of the past week and some other stuff that has to get done before the holidays, I haven't had a chance to gather detailed info. I do recall that the "breq" field wasn't matching up with "bresp" on the first try... I couldn't tell if they matched on the second try, since the info scrolls off the screen. Example: the installer thinks it's listening for "0xb7b34f54", while the dhcp server sees "0xb748514d" and responds accordingly (the card's MAC is 00:02:b3:48:51:4d). It didn't look like anything strange or unexpected was happening on the network's side of things. Hopefully, before the end of the week I will have tcpdump logs and have figured out some way to capture all the info pump is spitting out. Please let me know if there's some debugging code I can turn on that might help.
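As a rough illustration of why such a mismatch would matter: a BOOTP/DHCP client is expected to ignore replies whose transaction ID (xid) differs from the one it sent. The sketch below assumes the two hex values quoted above are xids, which the report does not confirm. It is not pump's code; `struct bootp_fields` and `reply_matches_request` are hypothetical names, and the struct is a simplified view of the relevant fields, not the on-wire packet layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOTREPLY 2 /* BOOTP op code for a server reply (RFC 951) */

/* Simplified view of the fields used for matching (hypothetical);
 * this is NOT the on-wire BOOTP packet layout. */
struct bootp_fields {
    uint8_t  op;        /* 1 = request, 2 = reply */
    uint32_t xid;       /* transaction ID chosen by the client */
    uint8_t  chaddr[6]; /* client hardware (MAC) address */
};

/* Accept a reply only if it is a BOOTREPLY, carries the same xid as
 * the request, and is addressed to the same hardware address. */
bool reply_matches_request(const struct bootp_fields *req,
                           const struct bootp_fields *resp)
{
    assert(req != NULL && resp != NULL);

    if (resp->op != BOOTREPLY)
        return false;
    if (resp->xid != req->xid)
        return false;
    return memcmp(resp->chaddr, req->chaddr, sizeof req->chaddr) == 0;
}
```

With a request xid of 0xb7b34f54 and a reply xid of 0xb748514d, this check rejects the reply even though the hardware address matches, which would be consistent with the client appearing never to see the OFFER.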
It seems the problem was in the version of pxelinux being used. Using version 1.65 (recent release) as well as 1.52 from the RH 7.2 CDROMs, everything is cool (response from DHCP server picked up right away), but trying a kickstart using 1.63 still seems to trip it up. It'd still be nice to have anaconda not give up so easily. :-)
Multiple passes will take a very long time to time out. Normally everything works fine, but PXELINUX was known to have some bugs that caused things like this.