Red Hat Bugzilla – Bug 586324
qemu pxe boot fails - dhcp timeout
Last modified: 2013-01-09 06:33:21 EST
Description of problem:
After upgrade to F-13 I can no longer boot from network
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. qemu-kvm -hda rhel5.x86_64.img -net nic -net tap -m 1024 -boot n
Actual results:
DHCP(net0 blah blah blah)........ Connection timed out (0x4c106035)
Expected results:
working network boot
Additional info:
this was working in F12
If I hit ctrl-b to configure gpxe boot, wait for 5 seconds, and then type exit, it works fine.
Looking into this, it appears to be a problem with dhcp in general using certain network configurations.
Same problem over here.
Go to your QEMU console and do a system_reset, your system will PXE boot just fine.
tested on QEMU 0.13.50
I guess you're all using a bridge on the host to connect the guest's network interface. At least I saw this issue with such a setup. Reducing the forward delay of the bridge solves it for me, e.g.
# brctl setfd <bridge name> 4.0
The default setting seems to be 15 seconds, which is too long for the guest's PXE to get its DHCP requests delivered the first time.
Forward delay should be set to 0 for bridges in a virt setup. See example ifcfg-br0 config for Fedora here
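The linked example is not reproduced in this thread. As a rough sketch only, such a file might look like the following, assuming a leaf bridge named br0 that gets its address via DHCP. DELAY and STP are the keys the Fedora network scripts read for TYPE=Bridge devices; the other values are illustrative assumptions.

```shell
# /etc/sysconfig/network-scripts/ifcfg-br0 (illustrative values)
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
# Forward delay in seconds; 0 lets a newly added port forward immediately.
DELAY=0
# Only turn STP off on a leaf bridge with a single physical uplink.
STP=off
```

Restart the network service after editing the file so the bridge is recreated with the new settings.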
I have a setup here where multiple servers each have two physical network interfaces, which are connected to two different switches for redundancy. STP is enabled and the root bridge is none of the Linux servers, so each one inherits the "forward delay" from the root bridge via STP: even if I change the forwarding delay to some lower value, it is reset to the network-wide default value of 15 seconds again after some time.
Creating bridges within bridges is not possible with Linux (one for the physical network cards, one for the virtual guests, both connected to each other).
Changing the PXE-timeout to >=15 would be the easiest work-around, is there some config to do that?
(Alternatively bonding+bridging might work for my setup, but changing a production system is critical for me.)
Why would the bridge delay default to a non-0 value? That seems unintuitive. Anyway, lowering the delay to 0 seems to fix the problem for me.
(In reply to comment #6)
> Why would the bridge delay default to a non-0 value? That seems unintuitive.
> Anyway, lowering the delay to 0 seems to fix the problem for me.
The bridge needs time to learn whether enabling forwarding on the port would destroy the tree property of the network and create a loop: if a packet sent out that port arrives again later on any other port, then you have a loop and must not enable forwarding on that port, because that would lead to all packets circulating endlessly in the network.
For virtualization your physical servers are normally a leaf node and you KNOW that it will never create a loop, since you often have only one physical network interface and multiple virtual interfaces, one for each virtual instance. In this case you can safely disable the delay.
But in my case my physical server has multiple physical network interfaces and (potentially) is a bridge between the different networks connected to each NIC. Therefore it's essential that forwarding is ONLY enabled if the root bridge decides via the ST-protocol that this host should do the forwarding between the two networks.
Optimally I would have two different bridges:
1. one for the physical NICs with forwarding handled via the ST-protocol
2. a second one for the virtual instances, configured as a leaf bridge with forwarding delay set to 0.
Both bridges must be connected, but as far as I know this is not possible with Linux's standard bridge.
Philipp, thanks for the clarification. In your case, I think that increasing the DHCP timeout would be the only reasonable solution, and it wouldn't hurt those of us who are using bridges with only one physical interface.
In my case, I've manually set the bridge delay on my system, but I don't know how I would have found out that the bridge should have 0 delay. It seems like the kernel could determine that if there were only one physical network interface, then the delay should be 0. Of course, this would not be a simple one-line patch, so it wouldn't be a quick fix.
*** Bug 632712 has been marked as a duplicate of this bug. ***
*** Bug 638735 has been marked as a duplicate of this bug. ***
It sounds like the bridge delay only needs to be some large number like 15 to accommodate corner-case setups.
Most users are going to need the bridge delay to be shorter than the dhcp timeout.
Why have we not lowered the bridge delay to something small? Something less than 4.
Has anyone confirmed that this is still a problem in Fedora 15?
(In reply to comment #12)
> Has anyone confirmed that this is still a problem in Fedora 15?
I've removed the workaround DELAY=0 from my configuration, restarted the network, and tried booting qemu via PXE, which failed. There was no difference compared to comment #0.
Don't know about F15, but the problem is WORSE in F14 than discussed previously.
My setting of DELAY=0 in ifcfg-br0 is not being honored. Similarly, "brctl setfd br0 0" doesn't seem to have any effect (the forward delay is stuck at 15 seconds).
As a workaround, I was able to turn off STP entirely.
I have run "yum update" today, in case the question of old versions of the software arises.
In any event, the fact that STP is hard to control is a separate bug. The bottom line is that the stock gPXE simply doesn't try very long (4 tries total, with 1, 2, and 4 second delays). That is certainly not like much real hardware, which generally retries PXE forever, or else restarts from the beginning of its boot device list and thus eventually gets back to PXE.
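To make the timing mismatch concrete, here is a small arithmetic sketch. The function names are hypothetical, and the model is deliberately simplified: it only sums the delays between attempts as described above (1, 2, and 4 seconds) and compares them with a caller-supplied time until the bridge port forwards. It does not model whether the port needs one or several forward-delay intervals.

```shell
# Sum the delays (in seconds) between successive DHCP attempts.
total_wait() {
    sum=0
    for d in "$@"; do
        sum=$((sum + d))
    done
    echo "$sum"
}

# Exit status 0 if the client's last attempt goes out before the port forwards.
# $1 = seconds until the bridge port forwards (assumed known),
# remaining arguments = delays between attempts.
gives_up_too_early() {
    forwarding_after=$1
    shift
    [ "$(total_wait "$@")" -lt "$forwarding_after" ]
}

if gives_up_too_early 15 1 2 4; then
    echo "default delay: last attempt is sent before the port forwards"
fi
```

With the 15 second default the last attempt goes out about 7 seconds in, well before the port forwards. With the 4.0 value from the brctl workaround, the later attempts fall after the delay, which is consistent with the reports that the workaround helps.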
(In reply to comment #14)
> My setting of DELAY=0 in ifcfg-br0 is not being honored. Similary, "brctl
> setfd br0 0" doesn't seem to have any effect (the forward delay is stuck at 15
Just FYI, since I had to read the kernel source to learn this some time ago:
If you have a network with multiple bridges, all non-root bridges copy over the forwarding delay value as broadcast by the root bridge. You can only change the FD for a short time, until the next STP packet is received by your bridge, which resets the FD value to the one advertised by the root bridge. So if you really know that your bridge is a leaf bridge with only one physical network link, then you can disable STP and force FD=0.
If you have two or more physical links with other hosts also having bridges between these links, then you must not disable STP.
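For the leaf-bridge case only, the two steps could be sketched as below. The helper name is hypothetical and it just prints the brctl commands rather than running them, so they can be reviewed first; br0 is an assumed bridge name.

```shell
# Print the commands that turn STP off and set the forward delay to 0.
# Only appropriate for a leaf bridge with a single physical uplink.
leaf_bridge_cmds() {
    bridge=$1
    echo "brctl stp $bridge off"
    echo "brctl setfd $bridge 0"
}

# Review the output, then run it as root, e.g.:
#   leaf_bridge_cmds br0 | sh
leaf_bridge_cmds br0
```

Printing instead of executing keeps the sketch harmless on hosts where disabling STP would be unsafe.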
Any updates on this?
In cases where STP is required and the forwarding delay learned from the root bridge is more than about 10 seconds, the guest will fail at the pxe/dhcp phase. Since a Cisco network default results in a 30-second forwarding delay, this is going to be a general problem.
Is there some configurable way, or a script we can hack, to add some delay between when the new vnet interface is created and when the client is actually started? Or, can we increase the wait time in the gPXE dhcp client?
Ideally the process of creating the vnet interface and adding it to the bridge would query the bridge forwarding delay, and wait a couple seconds longer than that delay before it returns. This would ensure the new interface is forwarding before any further steps in creating the new guest take place.
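A minimal sketch of that idea, which could be hacked into a wrapper script that runs before the guest starts. The function names and the 2 second margin are assumptions. The sysfs file reports the forward delay in hundredths of a second, and with STP enabled a port may need more than one forward-delay interval, so the margin may have to be larger in practice.

```shell
# Convert a sysfs forward_delay value (hundredths of a second) into whole
# seconds to wait, rounding up and adding a safety margin (default 2).
wait_seconds() {
    centisecs=$1
    margin=${2:-2}
    echo $(( (centisecs + 99) / 100 + margin ))
}

# Hypothetical helper: sleep until a port just added to the bridge
# should have left the forward-delay phase.
wait_for_bridge() {
    bridge=$1
    fd_file="/sys/class/net/$bridge/bridge/forward_delay"
    [ -r "$fd_file" ] || { echo "no such bridge: $bridge" >&2; return 1; }
    sleep "$(wait_seconds "$(cat "$fd_file")")"
}
```

Calling `wait_for_bridge br0` between creating the vnet interface and launching the guest would give the first DHCP request a chance to be forwarded.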
Jeff, you summarized the problem correctly.
The current work-around is to set the bridge delay to a lower number.
brctl setfd <bridge name> 4.0
I put this line in my startup scripts.
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
F15 is end of life in a month, so I wouldn't want to change a timeout now and have some unexpected side effect. Can someone confirm this is still an issue in F16 or later? If so, we can move this to gpxe/ipxe and consider changing the default DHCP timeout.
Tested with F16 updated as of today, the problem is unchanged. The startup sequence is:
1: Host creates bridge interface
2: Bridge enters listening state
3: Host creates guest
4: Guest attempts to pxe-boot
5: Bridge enters learning state
6: PXE times out
7: Bridge enters forwarding state
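To watch the sequence above on a host, the port state can be read from sysfs while the guest boots. This is a sketch with hypothetical function names; the numeric codes are the kernel bridge port states, and the bridge and port names passed in are up to the caller.

```shell
# Map the kernel's numeric bridge port state to a readable name.
state_name() {
    case $1 in
        0) echo disabled ;;
        1) echo listening ;;
        2) echo learning ;;
        3) echo forwarding ;;
        4) echo blocking ;;
        *) echo unknown ;;
    esac
}

# Print the current state of one port on a bridge.
# $1 = bridge name, $2 = port name (e.g. the guest's tap or vnet interface)
port_state() {
    f="/sys/class/net/$1/brif/$2/state"
    [ -r "$f" ] && state_name "$(cat "$f")"
}
```

Running `port_state br0 vnet0` in a loop while the guest starts should show the port moving through listening and learning before it reaches forwarding.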
Most servers will attempt to pxe-boot repeatedly if there is no other boot device available. Suggest simply making the pxe bios loop if pxe-boot fails the first time. Perhaps set some limit on the number of loops, say 5 or 10, to prevent a worthless guest from consuming host resources.
seabios 1.7.1 actually does restart the failed boot process after 60 seconds; I think that would fix things here (though it's a bit heavy-handed). Moving this bug to F17 (since F16 is EOL in a couple of months and not worth the risk of backporting the new seabios).
seabios-1.7.1-1.fc17 has been submitted as an update for Fedora 17.
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing seabios-1.7.1-1.fc17'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
seabios-1.7.1-1.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.