Description of problem:
PXE booting of KVM VMs doesn't work in any mode (private, NAT or bridged), with any of the emulated NICs (RTL, Intel, PCNET, virtio). DHCP fails with "No IP address" message. However, once the machine is set up, DHCP works fine on it, so the problem appears to be related to PXE.
Version-Release number of selected component (if applicable):
etherboot-zroms-kvm-5.4.4-10.el5
(also tried, just to make sure: etherboot-zroms-kvm-5.4.4-10.el5.0.sl, etherboot-zroms-kvm-5.4.4-10.el5.centos)
kmod-kvm-83-105.el5_4.9
kvm-83-105.el5_4.9
libvirt-python-0.6.3-20.1.el5_4
virt-manager-0.6.1-8.el5
python-virtinst-0.400.3-5.el5
libvirt-0.6.3-20.1.el5_4
How reproducible:
Every time.
Steps to Reproduce:
1. Set up a new virtual machine image in virt-manager
2. Set it to PXE boot
3. It will fail to obtain the IP address from an external DHCP server on the network.
Actual results:
PXE booting fails at DHCP address acquisition stage with "No IP address".
Expected results:
PXE booting acquires IP address via DHCP.
Additional info:
DHCP server is another RHEL5 machine (not the host machine). I saw a similar bug report for FC11, but the FC11 etherboot package source rpm doesn't build on RHEL5. The problem seems specific to PXE booting as DHCP does work on the guest machine when accessed from the guest OS.
Can you please send the relevant tcpdump of the tap device?
I'm not sure what you are referring to here. No tap device is used. I'm using bridged networking and there are no tap interfaces. tcpdumps from the host:
# tcpdump -i eth0 | grep -i dhcp
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
13:56:51.115525 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
13:56:51.116693 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
13:56:53.571498 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
13:56:53.572388 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
# tcpdump -i br0 | grep -i dhcp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 96 bytes
14:02:22.976581 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
14:02:22.997652 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
14:02:25.299680 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
14:02:25.300533 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
# tcpdump -i vnet0 | grep -i dhcp
tcpdump: WARNING: vnet0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vnet0, link-type EN10MB (Ethernet), capture size 96 bytes
14:06:11.098615 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
14:06:11.997217 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
14:06:13.370596 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
14:06:13.371924 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
The following lines are present in the host's sysctl.conf:
net.ipv4.ip_forward = 1
net.ipv4.conf.default.proxy_arp = 1
net.ipv4.conf.br0.proxy_arp = 1
When the guest VM is set to PXE boot via DHCP, this doesn't work. However, if the VM boots off a local disk image, DHCP requests for an IP address do work, and the VM's interface gets an IP address assigned as expected.
What's the qemu cmdline? What's vnet0? Is it the tap device? Can you turn STP off for the bridge (brctl stp BRIDGE_NAME off) and set the forwarding delay to 0.1 (brctl setfd BRIDGE_NAME 0.1)?
qemu command line:
/usr/libexec/qemu-kvm -S -M pc -m 512 -smp 1 -name OpenVZ-OSR1 -uuid 455dc246-42ac-bd20-08e7-301f2fbd24cb -no-kvm-pit-reinjection -monitor pty -pidfile /var/run/libvirt/qemu//OpenVZ-OSR1.pid -boot n -drive file=/var/lib/libvirt/images/OpenVZ-OSR1.img,if=ide,index=0 -drive file=/var/lib/libvirt/images/OpenVZ-OSR.img,if=ide,index=1 -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:55:b9:47,vlan=0,model=e1000 -net tap,fd=15,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb -vnc 127.0.0.1:0 -k en-gb
According to that, vnet0 does, indeed, appear to be a tap device. After setting:
# brctl stp br0 off
# brctl setfd br0 0.1
there is no difference in behaviour. The VM still doesn't PXE boot. It just gets stuck indefinitely looking for an IP:
Search for server (DHCP)....No IP address .No IP address .No IP address .No IP address
That's weird, since it does work for others. What's the version of the PXE server? Can you retry using a legitimate MAC address, or at least a 'locally administered address', meaning the first byte should be, for example, '02' (02:52:00:55:b9:47)?
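For readers unsure what "locally administered" means here: in an IEEE 802 MAC address, bit 0x02 of the first octet marks the address as locally administered rather than vendor-assigned. The following is only a minimal sketch for checking that bit; the function name is made up for illustration and is not part of any tool mentioned in this report.

```sh
# Hypothetical helper: succeeds (exit 0) if the MAC address given as $1
# has the locally-administered bit (0x02) set in its first octet.
is_locally_administered() {
    # The first octet is everything before the first colon.
    _first_octet=${1%%:*}
    # Interpret the octet as hex and test bit 0x02.
    [ $(( 0x$_first_octet & 2 )) -ne 0 ]
}

# Compare the MAC from the original report with the suggested one.
for mac in 54:52:00:55:b9:47 02:52:00:55:b9:47; do
    if is_locally_administered "$mac"; then
        echo "$mac: locally administered"
    else
        echo "$mac: universally administered"
    fi
done
```

With the two addresses from this thread, the first is reported as universally administered and the second as locally administered.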
Changing the MAC address made no difference. When you say PXE server, do you mean DHCP server? dhcp-3.0.5-21.el5_4.1
The bug originator reports that user mode networking PXE boot doesn't work, which I found strange, since it's been working for me under RHEL 5.X flawlessly. I have found a similar issue (PXE boot failing) on upstream kvm builds, but it's restricted to TAP networking. I haven't tested PXE boot with TAP networking under RHEL5.X, so this bug might proceed. But with user mode, I am pretty sure it works.
I forgot to mention: Since TAP mode is what most people will want to do when using KVM in the field, it's potentially a serious bug.
Hi, I was having the exact same problem. I found a work-around. My KVM XML definition showed that the guest's network interface was using the virtio driver. Here is what I did:
* I copied the guest's XML file.
* Undefined the guest using 'virsh undefine'.
* Replaced the virtio interface in the XML file with the 'rtl8139' driver.
* Defined the guest using 'virsh define <xmlfile>'.
* Started the guest using 'virsh start'.
I hope this information is useful to you. kind regards, Egon
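The XML edit in the work-around above could be scripted roughly as follows. This is a sketch only: the function name and the guest name "myguest" are placeholders, and it assumes the interface model appears in the domain XML as <model type='virtio'/> with single quotes, which is how libvirt usually writes it.

```sh
# Hypothetical helper: read domain XML on stdin and replace NIC model $1
# with model $2, writing the result to stdout.
swap_nic_model() {
    sed "s|<model type='$1'/>|<model type='$2'/>|g"
}

# Possible usage, following the steps above ("myguest" is a placeholder):
#   virsh dumpxml myguest > myguest.xml
#   swap_nic_model virtio rtl8139 < myguest.xml > myguest-rtl8139.xml
#   virsh undefine myguest
#   virsh define myguest-rtl8139.xml
#   virsh start myguest
```

Keeping the original XML file untouched makes it easy to switch back to virtio once the guest is installed.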
Hi, I hit the same issue; this worked for me:
iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
As can be seen in: http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Virtualization_Guide/sect-Virtualization-Network_Configuration-Bridged_networking_with_libvirt.html
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
We've had a similar problem, and ended up discovering it was a problem with the way the bridge is set up. We no longer see PXE boot problems when the bridge is set up like this:
/usr/sbin/brctl setfd [bridge-name] 0
/usr/sbin/brctl stp [bridge-name] off
Would the originator try to set up his bridge like that and let us know the results?
How would you set that in the ifcfg file?
Whatever script creates your bridge, you have to make sure it executes the brctl commands right after the bridge is created. In comment #4, you mention that you executed those commands, except that you used 0.1 instead of 0 in the 'brctl setfd' command. Try again with 0; this might solve the problem.
Reproducible with RHEL6Server 6.0 (qemu-kvm-0.12.1.2-2.113.el6_0.3.x86_64): "brctl setfd bridge_intra 0" is required to PXE boot. Additional info:
- no DHCP request is received on the DHCP server, hence no IP address is assigned to the DomU;
- when executing 'dhcp' at the gPXE prompt within the DomU, a DHCP request is received and an IP address is subsequently assigned to the DomU.
Is it working with setfd 0? If that is the case, this is not a bug at all.
I am reasonably confident now that this is actually a duplicate of this Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=586324 Setting DELAY=0 in the bridge's ifcfg configuration file cures the problem.
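As an illustration of the DELAY=0 setting, a bridge ifcfg file might look something like the fragment below. Only DELAY=0 comes from this report; the bridge name br0 is the one used earlier in the thread, and the remaining values (ONBOOT, BOOTPROTO, STP=off) are assumptions about a typical setup that should be adapted to the host.

```sh
# /etc/sysconfig/network-scripts/ifcfg-br0
# Sketch only: apart from DELAY=0, these values are assumptions.
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
# Assumed addressing method; use static settings if the host needs them.
BOOTPROTO=dhcp
# Forwarding delay of 0 seconds, the setting reported to cure the problem.
DELAY=0
# Assumed equivalent of 'brctl stp br0 off'.
STP=off
```

After changing the file, the bridge has to be brought down and up again (or the network service restarted) for the settings to take effect.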
If that is the case, it is not a bug, since bridges have a forwarding delay and packets are paused/dropped until it passes. Can you please close the bug as worksforme?
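To see whether a host is affected, the bridge's current forwarding delay can be read back from sysfs. The sketch below assumes the Linux bridge exposes it at /sys/class/net/<bridge>/bridge/forward_delay in hundredths of a second, which is my understanding of the interface; the function name and the optional second argument (an alternative sysfs root, handy for testing) are made up for illustration.

```sh
# Hypothetical helper: print the forwarding delay of bridge $1 in seconds.
# $2 optionally overrides the sysfs root (defaults to /sys/class/net).
bridge_forward_delay() {
    _root=${2:-/sys/class/net}
    # Assumption: the value in sysfs is in hundredths of a second.
    _raw=$(cat "$_root/$1/bridge/forward_delay") || return 1
    printf '%d.%02d\n' $(( $_raw / 100 )) $(( $_raw % 100 ))
}

# Possible usage on the host from this report:
#   bridge_forward_delay br0
```

A non-zero result (often 15 seconds by default) means a freshly attached guest interface will not forward traffic for that long, which can outlast the DHCP retries of the PXE ROM.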
I don't think I have permissions to close bugs. The bridge delay issue on VM hosts is quite real - it should at least be documented in big red letters, since it stops PXE booting working in KVM. But there are different bugs open for that. :)
worked for me! thanks
brctl setfd br0 0
brctl stp br0 off