Bug 586324

Summary: qemu pxe boot fails - dhcp timeout
Product: [Fedora] Fedora
Reporter: Michal Hlavinka <mhlavink>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 17
CC: amcnabb, amit.shah, berrange, charles.butterfield, crobinso, dwmw2, frank.arnold, gcosta, gordan, hahn, itamar, jan.smets, jaswinder, jeff, jforbes, jyundt, knoel, mgregg, scottt.tw, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-14 21:26:32 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:

Description Michal Hlavinka 2010-04-27 06:05:47 EDT
Description of problem:
After upgrade to F-13 I can no longer boot from network

Version-Release number of selected component (if applicable):
qemu-0.12.3-8.fc13.x86_64

How reproducible:
always

Steps to Reproduce:
1. qemu-kvm -hda rhel5.x86_64.img -net nic -net tap -m 1024 -boot n
  
Actual results:
DHCP(net0 blah blah blah)........ Connection timed out (0x4c106035)

Expected results:
working network boot

Additional info:
this was working in F12

If I hit ctrl-b to configure gpxe boot, wait for 5 seconds and then type exit, it works fine.
Comment 1 Justin M. Forbes 2010-04-30 10:54:27 EDT
Looking into this, it appears to be a problem with dhcp in general using certain network configurations.
Comment 2 Smets Jan 2010-05-05 11:08:04 EDT
Same problem over here.

Go to your QEMU console and do a system_reset; your system will PXE boot just fine.

tested on QEMU 0.13.50
Comment 3 Frank Arnold 2010-11-08 07:24:36 EST
I guess you're all using a bridge on the host to connect the guest's network interface. At least I saw this issue with such a setup. Reducing the forward delay of the bridge solves it for me, e.g.

 # brctl setfd <bridge name> 4.0

The default setting seems to be 15 seconds, which is too long for the guest's PXE to get its DHCP requests delivered the first time.
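
To check the current value before changing it, here is a minimal Python sketch that reads the forward delay from sysfs. It assumes the kernel exposes the value at /sys/class/net/<bridge name>/bridge/forward_delay in hundredths of a second (so 1500 would mean 15 seconds). The function name and the sysfs_root parameter are made up for illustration.

```python
from pathlib import Path


def read_forward_delay_seconds(bridge, sysfs_root="/sys/class/net"):
    """Return the forward delay of `bridge` in seconds.

    Assumption: the sysfs file holds an integer in hundredths of a second.
    """
    path = Path(sysfs_root) / bridge / "bridge" / "forward_delay"
    return int(path.read_text().strip()) / 100.0
```

On a host with a bridge called br0, read_forward_delay_seconds("br0") should report the same delay that brctl setfd changes.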
Comment 4 Daniel Berrange 2010-11-08 07:27:34 EST
Forward delay should be set to 0 for bridges in a virt setup. See example ifcfg-br0 config for Fedora here

  http://wiki.libvirt.org/page/Networking#Creating_network_initscripts
Comment 5 Philipp Hahn 2010-11-16 10:07:54 EST
I have a setup here where multiple servers each have two physical network interfaces, which are connected to two different switches for redundancy. STP is enabled and the root bridge is none of the Linux servers, so each one inherits the "forward delay" from the root bridge via STP: even if I change the forwarding delay to some lower value, it is reset to the network-wide default value of 15 seconds after some time.
Creating bridges in bridges is not possible with Linux (one for the physical network cards, one for the virtual guests, both connected to each other).

Changing the PXE timeout to >=15 would be the easiest work-around. Is there some config to do that?

(Alternatively bonding+bridging might work for my setup, but changing a production system is risky for me.)
Comment 6 Andrew McNabb 2010-11-18 20:31:46 EST
Why would the bridge delay default to a non-0 value?  That seems unintuitive.  Anyway, lowering the delay to 0 seems to fix the problem for me.
Comment 7 Philipp Hahn 2010-11-19 02:15:44 EST
(In reply to comment #6)
> Why would the bridge delay default to a non-0 value?  That seems unintuitive. 
> Anyway, lowering the delay to 0 seems to fix the problem for me.

The bridge needs time to learn whether enabling forwarding on the port would destroy the tree property of the network and create a loop: if a packet sent out of that port arrives again later on any other port, then you have a loop and must not enable forwarding on the port, which would lead to all packets circulating endlessly in the network.

For virtualization your physical servers are normally a leaf node and you KNOW, that it will never create a loop, since you often have only one physical network interface and multiple virtual interfaces, one for each virtual instance. In this case you can safely disable the delay.

But in my case my physical server has multiple physical network interfaces and is (potentially) a bridge between the different networks connected to each NIC. Therefore it's essential that forwarding is ONLY enabled if the root bridge decides via the ST protocol that this host should do the forwarding between the two networks.

Optimally I would have two different bridges:
1. one for the physical NICs, with forwarding handled via the ST protocol
2. a second one for the virtual instances, configured as a leaf bridge with forwarding delay set to 0.
Both bridges must be connected, but as far as I know this is not possible with Linux's standard bridge.
Comment 8 Andrew McNabb 2010-11-19 11:05:22 EST
Philipp, thanks for the clarification.  In your case, I think that increasing the DHCP timeout would be the only reasonable solution, and it wouldn't hurt those of us who are using bridges with only one physical interface.

In my case, I've manually set the bridge delay on my system, but I don't know how I would have found out that the bridge should have 0 delay.  It seems like the kernel could determine that if there were only one physical network interface, then the delay should be 0.  Of course, this would not be a simple one-line patch, so it wouldn't be a quick fix.
Comment 9 Bill Burns 2010-11-20 08:12:18 EST
*** Bug 632712 has been marked as a duplicate of this bug. ***
Comment 10 Michael Gregg 2011-03-04 16:53:30 EST
*** Bug 638735 has been marked as a duplicate of this bug. ***
Comment 11 Michael Gregg 2011-03-04 17:05:47 EST
It sounds like the bridge delay only needs to be a large value such as 15 to accommodate corner-case setups.

Most users are going to need the bridge delay to be shorter than the dhcp timeout. 

Why have we not lowered the bridge delay to something small? Something less than 4.
Comment 12 Andrew McNabb 2011-05-02 17:35:13 EDT
Has anyone confirmed that this is still a problem in Fedora 15?
Comment 13 Michal Hlavinka 2011-05-03 04:16:22 EDT
(In reply to comment #12)
> Has anyone confirmed that this is still a problem in Fedora 15?

I've removed the DELAY=0 workaround from my configuration, restarted the network and tried booting qemu via pxe, which failed. There was no difference compared to comment #0.
Comment 14 Charles Butterfield 2011-06-17 16:36:53 EDT
Don't know about F15, but the problem is WORSE in F14 than discussed previously.

My setting of DELAY=0 in ifcfg-br0 is not being honored.  Similarly, "brctl setfd br0 0" doesn't seem to have any effect (the forward delay is stuck at 15 seconds).

As a workaround, I was able to turn off STP entirely.

I have run "yum update" today, in case the question of old versions of the software arises.
Comment 15 Charles Butterfield 2011-06-18 22:02:54 EDT
In any event, the fact that STP is hard to control is a separate bug. The bottom line is that the stock gPXE simply doesn't try very long (4 tries total, with 1, 2 and 4 second delays). That is certainly unlike much real hardware, which generally retries PXE forever or else restarts from the beginning of its boot device list and thus eventually gets back to PXE.
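
To make the timing concrete, here is a rough Python sketch of the retry schedule compared with the time the bridge port needs to start forwarding. It reads the delays as the gaps between consecutive tries, which is an assumption, and it ignores how long the last try waits for a reply, so real gPXE timing may differ somewhat. The function names are hypothetical.

```python
from itertools import accumulate


def dhcp_send_times(retry_delays=(1, 2, 4)):
    """Times (seconds after boot) at which each DHCP request is sent.

    The first try goes out at t=0; each later try follows the given delay.
    """
    return [0] + list(accumulate(retry_delays))


def first_delivered_try(time_to_forwarding, retry_delays=(1, 2, 4)):
    """Return the 1-based number of the first try sent once the bridge port
    is forwarding, or None if every try is sent too early."""
    for number, sent_at in enumerate(dhcp_send_times(retry_delays), start=1):
        if sent_at >= time_to_forwarding:
            return number
    return None
```

Under this model the tries go out at 0, 1, 3 and 7 seconds, so a port that needs 15 seconds to reach forwarding never sees a request it can pass on, while one that needs 4 seconds lets the last try through.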
Comment 16 Philipp Hahn 2011-06-21 03:22:36 EDT
(In reply to comment #14)
> My setting of DELAY=0 in ifcfg-br0 is not being honored.  Similary, "brctl
> setfd br0 0" doesn't seem to have any effect (the forward delay is stuck at 15
> seconds.)

Just FYI, since I had to read the kernel source to learn this some time ago:
If you have a network with multiple bridges, all non-root bridges copy the forwarding delay value broadcast by the root bridge. You can only change the FD for a short time, until your bridge receives the next STP packet, which resets the FD value to the one advertised by the root bridge. So if you really know that your bridge is a leaf bridge with only one physical network link, then you can disable STP and force FD=0.
If you have two or more physical links with other hosts also having bridges between these links, then you must not disable STP.
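
The behaviour described above can be pictured with a toy Python model. This is only an illustration of the description in this comment, not the kernel code; the class, attribute and method names are invented.

```python
class Bridge:
    """Toy model of a non-root bridge and its forward delay (FD)."""

    def __init__(self, configured_delay=15.0, stp_enabled=True):
        self.stp_enabled = stp_enabled
        self.configured_delay = configured_delay  # what the admin asked for
        self.effective_delay = configured_delay   # what the bridge really uses

    def set_forward_delay(self, seconds):
        """Model of a local change such as 'brctl setfd': applies at once."""
        self.configured_delay = seconds
        self.effective_delay = seconds

    def receive_root_bpdu(self, root_forward_delay):
        """With STP on, the value advertised by the root bridge wins again."""
        if self.stp_enabled:
            self.effective_delay = root_forward_delay
```

In this model a local setting of 0 only lasts until the next STP packet arrives, unless STP is disabled on the bridge.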
Comment 17 Jeff Thomas 2012-02-09 17:56:06 EST
Any updates on this?

In cases where STP is required and the forwarding delay learned from the root bridge is more than about 10 seconds, the guest will fail at the pxe/dhcp phase. Since a Cisco network default results in a 30-second forwarding delay, this is going to be a general problem.

Is there some configurable way, or a script we can hack, to add some delay between when the new vnet interface is created and when the client is actually started?  Or, can we increase the wait time in the gPXE dhcp client?

Ideally the process of creating the vnet interface and adding it to the bridge would query the bridge forwarding delay, and wait a couple seconds longer than that delay before it returns. This would ensure the new interface is forwarding before any further steps in creating the new guest take place.
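
A rough Python sketch of such a wait follows. Instead of sleeping for the forwarding delay plus a margin, it polls the port state until the port is forwarding, which is a swapped-in variant of the idea above. It assumes the kernel exposes the port state in a sysfs file such as /sys/class/net/<bridge>/brif/<port>/state and that the value 3 means forwarding; the function name and its parameters are hypothetical.

```python
import time
from pathlib import Path

BR_STATE_FORWARDING = 3  # assumption: value reported for a forwarding port


def wait_until_forwarding(state_file, timeout=35.0, poll_interval=0.5,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll `state_file` until the port is forwarding.

    Returns True once the port forwards, False if `timeout` seconds pass.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if int(Path(state_file).read_text().strip()) == BR_STATE_FORWARDING:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

A script that creates the vnet interface could call this before starting the guest and give up with an error if it returns False.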
Comment 18 Michael Gregg 2012-02-09 18:27:58 EST
Jeff, you summarized the problem correctly. 

The current work-around is to set the bridge delay to a lower number.

Like this:

brctl setfd <bridge name> 4.0

I put this line in my startup scripts.
Comment 19 Fedora Admin XMLRPC Client 2012-03-15 13:55:27 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 20 Cole Robinson 2012-05-28 20:20:20 EDT
F15 is end of life in a month, so I wouldn't want to change a timeout now and have some unexpected side effect. Can someone confirm this is still an issue in f16 or later? If so, we can move this to gpxe/ipxe and consider changing the default DHCP timeout.
Comment 21 Jeff Thomas 2012-05-29 16:24:31 EDT
Tested with F16 updated as of today, the problem is unchanged.  The startup sequence is:

1: Host creates bridge interface
2: Bridge enters listening state
3: Host creates guest
4: Guest attempts to pxe-boot
5: Bridge enters learning state
6: PXE times out
7: Bridge enters forwarding state

Most servers will attempt to pxe-boot repeatedly if there is no other boot device available. Suggest simply making the pxe bios loop if pxe-boot fails the first time. Perhaps set some limit on the number of loops, 5 or 10 perhaps, to prevent a worthless guest from consuming host resources.
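
The suggested loop could look roughly like the Python sketch below. It is only meant to show the bounded-retry idea; try_pxe_boot stands in for whatever the bios does for one PXE attempt, and both names are hypothetical.

```python
def boot_with_retries(try_pxe_boot, max_loops=5):
    """Call try_pxe_boot() until it returns True or max_loops is used up.

    Returns the 1-based loop on which the boot succeeded, or None if every
    loop failed, so a guest that can never boot stops consuming resources.
    """
    for loop in range(1, max_loops + 1):
        if try_pxe_boot():
            return loop
    return None
```

With a bridge that only starts forwarding after the first attempt has timed out, the second or third loop would succeed instead of leaving the guest dead.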
Comment 22 Cole Robinson 2012-10-27 20:46:38 EDT
seabios 1.7.1 actually does restart the failed boot process after 60 seconds. I think that would fix things here (though it's a bit heavy-handed). Moving this bug to F17 (since F16 is EOL in a couple of months and not worth the risk of backporting the new seabios).
Comment 23 Fedora Update System 2012-10-27 21:28:47 EDT
seabios-1.7.1-1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/seabios-1.7.1-1.fc17
Comment 24 Fedora Update System 2012-10-29 23:49:56 EDT
Package seabios-1.7.1-1.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing seabios-1.7.1-1.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-17201/seabios-1.7.1-1.fc17
then log in and leave karma (feedback).
Comment 25 Fedora Update System 2012-11-14 21:26:35 EST
seabios-1.7.1-1.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.