Bug 533684
| Summary: | PXE booting of KVM VMs doesn't work | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Gordan Bobic <gordan> |
| Component: | etherboot | Assignee: | Glauber Costa <gcosta> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | ||
| Version: | 5.4 | CC: | cpelland, d.bz-redhat, fabella.wesley, lmr, redhat2, tburke |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2010-11-25 15:32:52 UTC | Type: | --- |
| Bug Blocks: | 580948 | ||
Description
Gordan Bobic
2009-11-08 13:58:55 UTC
Can you please send the relevant tcpdump of the tap device?

I'm not sure what you are referring to here. No tap device is used. I'm using bridged networking and there are no tap interfaces.

tcpdumps from the host:

    # tcpdump -i eth0 | grep -i dhcp
    tcpdump: WARNING: eth0: no IPv4 address assigned
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
    13:56:51.115525 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    13:56:51.116693 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
    13:56:53.571498 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    13:56:53.572388 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300

    # tcpdump -i br0 | grep -i dhcp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on br0, link-type EN10MB (Ethernet), capture size 96 bytes
    14:02:22.976581 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    14:02:22.997652 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
    14:02:25.299680 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    14:02:25.300533 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300

    # tcpdump -i vnet0 | grep -i dhcp
    tcpdump: WARNING: vnet0: no IPv4 address assigned
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on vnet0, link-type EN10MB (Ethernet), capture size 96 bytes
    14:06:11.098615 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    14:06:11.997217 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300
    14:06:13.370596 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 54:52:00:55:b9:47 (oui Unknown), length: 548
    14:06:13.371924 IP sentinel.internal.net.bootps > 10.2.252.210.bootpc: BOOTP/DHCP, Reply, length: 300

The following lines are present in host's sysctl.conf:

    net.ipv4.ip_forward = 1
    net.ipv4.conf.default.proxy_arp = 1
    net.ipv4.conf.br0.proxy_arp = 1

When the guest VM is set to PXE boot via DHCP, this doesn't work. However, if the VM boots off a local disk image, DHCP requests for an IP address do work, and the VM's interface gets an IP address assigned as expected.

What's the qemu cmdline? What's vnet0? Is it the tap device? Can you turn STP off for the bridge (brctl stp BRIDGE_NAME off) and set forwarding delay to 0.1 (brctl setfd BRIDGE 0.1)

qemu command line:

    /usr/libexec/qemu-kvm -S -M pc -m 512 -smp 1 -name OpenVZ-OSR1 -uuid 455dc246-42ac-bd20-08e7-301f2fbd24cb -no-kvm-pit-reinjection -monitor pty -pidfile /var/run/libvirt/qemu//OpenVZ-OSR1.pid -boot n -drive file=/var/lib/libvirt/images/OpenVZ-OSR1.img,if=ide,index=0 -drive file=/var/lib/libvirt/images/OpenVZ-OSR.img,if=ide,index=1 -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:55:b9:47,vlan=0,model=e1000 -net tap,fd=15,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb -vnc 127.0.0.1:0 -k en-gb

According to that, vnet0 does, indeed, appear to be a tap device. After setting:

    # brctl stp br0 off
    # brctl setfd br0 0.1

there is no difference in behaviour. The VM still doesn't PXE boot. It just gets stuck indefinitely looking for an IP:

    Search for server (DHCP)....No IP address .No IP address .No IP address .No IP address

That's weird since it does work for others. What's the version of the pxe server? Can you retry it using a legitimate MAC address, or at least use a 'locally administered address', meaning the first byte should be for example '02' (02:52:00:55:b9:47)

Changing the MAC address made no difference.
When you say PXE server, do you mean DHCP server? dhcp-3.0.5-21.el5_4.1

The bug originator reports that user mode networking PXE boot doesn't work, which I found strange, since it's been working for me under RHEL 5.X flawlessly. I have found a similar issue (PXE boot failing) on upstream kvm builds, but it's restricted to TAP networking. I haven't tested PXE boot with TAP networking under RHEL5.X, so this bug might proceed. But with user mode, I am pretty sure it works.

I forgot to mention: Since TAP mode is what most people will want to do when using KVM in the field, it's potentially a serious bug.

Hi, I was having the exact same problem. I found a work-around. My KVM XML definition showed that the guest's network-interface was using the virtio driver. Here is what I did:

* I copied the guest's XML file.
* Undefined the guest using 'virsh undefine'.
* Replaced the virtio interface in the XML-file with the 'rtl8139' driver.
* Defined the guest using 'virsh define <xmlfile>'.
* Started the guest using 'virsh start'.

I hope this information is useful to you.

kind regards, Egon

I hit the same issue; this worked for me:

    iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT

As can be seen in: http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Virtualization_Guide/sect-Virtualization-Network_Configuration-Bridged_networking_with_libvirt.html

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

We've had a similar problem, and ended up discovering it was a problem with the way the bridge is set up.
We no longer see PXE boot problems when the bridge is set up like this:

    /usr/sbin/brctl setfd [bridge-name] 0
    /usr/sbin/brctl stp [bridge-name] off

Would the originator try setting up his bridge like that and let us know the results?

How would you set that in the ifcfg file?

Whatever script creates your bridge, you have to make sure it executes the brctl commands right after the bridge is created. In comment#4, you mention that you've executed those commands, except that you used 0.1 instead of 0 in the 'brctl setfd' command. Try again with 0; this might solve the problem.

Reproducible with RHEL6Server 6.0 (qemu-kvm-0.12.1.2-2.113.el6_0.3.x86_64): "brctl setfd bridge_intra 0" is required to PXE boot. Additional info:

- no DHCP request is received on the DHCP server, hence no IP address is assigned to the DomU;
- when executing 'dhcp' at the gPXE prompt within the DomU, a DHCP request is received and an IP address is subsequently assigned to the DomU.

Is it working with setfd 0? If that is the case, this is not a bug at all.

I am reasonably confident now that this is actually a duplicate of this Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=586324 Setting DELAY=0 in the bridge's ifcfg configuration file cures the problem.

If that is the case, it is not a bug, since bridges have a forwarding delay and packets are paused/dropped until it passes. Can you please close the bug as worksforme?

I don't think I have permissions to close bugs. The bridge delay issue on VM hosts is quite real - it should at least be documented in big red letters, since it stops PXE booting working in KVM. But there are different bugs open for that. :)

worked for me! thanks

    brctl setfd br0 0
    brctl stp br0 off
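
For anyone landing here with the same question about the ifcfg file: the thread reports that DELAY=0 in the bridge's ifcfg file cures the problem. Below is a minimal sketch of what such a file might look like. It assumes the bridge is called br0 (as on the reporter's host) and that the distribution's network scripts translate STP and DELAY into the equivalent brctl calls when the bridge is brought up. The file path, the STP line, and the ONBOOT line are assumptions that are not confirmed anywhere in this bug, and any existing addressing lines (BOOTPROTO, IPADDR and so on) would need to be kept as they are.

```shell
# Hypothetical /etc/sysconfig/network-scripts/ifcfg-br0 (path and name assumed)
# Only the bridge-related settings are shown; keep your own addressing lines.
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
# Assumed equivalent of "brctl stp br0 off"
STP=off
# Reported in this bug to cure the PXE problem; equivalent of "brctl setfd br0 0"
DELAY=0
```

After restarting the network service, `brctl showstp br0` should show whether the forward delay actually ended up at 0; if it did not, the scripts on that release may not honour these variables and the brctl commands from the comments above would have to be run by hand.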