Bug 1462351
| Summary: | UEFI enabled VMs cannot boot via PXE | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Dan Yasny <dyasny> |
| Component: | ovmf | Assignee: | Laszlo Ersek <lersek> |
| Status: | CLOSED NOTABUG | QA Contact: | FuXiangChun <xfu> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.4 | CC: | chayang, dyasny, juzhang, lersek, michen, mlammon, sasha |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-06-23 16:13:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Dan Yasny
2017-06-16 19:51:47 UTC
Hi Dan,

OVMF (and AAVMF, in aarch64 guests) can perfectly well PXE boot, we have verified that on several independent occasions. In the majority of cases, PXE boot failure with OVMF/AAVMF can be tracked to incorrect DHCP/PXE server configuration.

At the moment, I see three config issues in the information you provided. The first two are unrelated to the PXE issue, but I'll mention them for completeness.

(1) When using OVMF virtual machines, it is strongly preferred to employ <boot order='N'/> elements in the domain XML under the specific <interface> and <disk> elements, over the legacy <boot dev='network'/> style elements under <os>. You can read about the background for example in bug 1323085.

(BTW the per-device <boot order='N'/> elements are preferred by the libvirt docs as well; it's generally superior to the <boot> elements under <os>. You may have multiple instances of the same type of device, and only the per-device elements allow you to specify their boot order uniquely.)

(2) Your snippet

  <os>
    <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/OVMF_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <smm state='on'/>
  </features>

seems outdated.

(2a) The machine type should be pc-q35-rhel7.4.0 (matching the first RHEL7 release where OVMF will be fully supported). pc-q35-rhel7.3.0 will likely work, but SMI broadcast won't be available to the guest, which can lead to performance and stability problems. So please stick with pc-q35-rhel7.4.0 or later. Correspondingly, please use the latest qemu-kvm-rhev-2.9.0-* package.

(Relatedly -- no need for installing OVMF from the Supplementary channel; starting with RHEL-7.4, OVMF will be part of Server/x86_64. Please refer to bug 1329559 comment 41 and onward.)

(2b) Your <loader> element does not spell out the @secure='yes' attribute.
Without this, QEMU won't actually restrict pflash chip accesses to code that runs in SMM, therefore a malicious guest kernel can overwrite authenticated UEFI variables with direct hardware access, without going through the firmware-level verification. This defeats Secure Boot. The latest version of virt-manager / virt-install should produce @secure='yes' automatically, please refer to bug 1387479 (especially the comments near the end).

(2c) Your <nvram> element doesn't really look like it was filled in by libvirtd. Using the default libvirt configuration (namely having a commented-out "nvram" stanza in /etc/libvirt/qemu.conf), libvirt will automatically know on RHEL-7.4 where to look for the varstore template that matches the "OVMF_CODE.secboot.fd" firmware binary. It will instantiate the VM's private varstore file from that, and place its pathname in the <nvram> element too.

If you have multiple OVMF firmware binaries on your system, then customizing the "nvram" stanza (and restarting libvirtd) makes sense, but even in that case, hand-editing the <nvram> element shouldn't be necessary. In general, the auto-generated pathnames in <nvram> have the following format:

  <nvram>/var/lib/libvirt/qemu/nvram/<GUEST_NAME>_VARS.fd</nvram>

(3) Now, about the PXE failure itself. You mention that "pxelinux.0" is not downloaded (or maybe it is downloaded, but never launched.) Either of those outcomes is actually right. "pxelinux.0" is a boot loader binary that is meant for legacy BIOS systems, it is not suitable for UEFI systems -- it is not an EFI executable. So even if it was downloaded, it could not be started by OVMF.

Regarding the question why it may not have been downloaded at all -- if the DHCP server is configured correctly, then it recognizes the architecture identifier of the DHCP client, and responds with a boot file pathname that matches that architecture.
IOW, a well-configured DHCP server will never report "pxelinux.0" as the bootfile to an EFI DHCP client; that would be an architecture mismatch.

Please refer to the RHEL7 installation guide, for example, about setting up TFTP for UEFI clients: <https://da.gd/Rh7EfiPxe>. (See under "21.1.2. Configuring a PXE Server for UEFI-based AMD64 and Intel 64 Clients".) This configuration works. In particular, note

  if option architecture-type = 00:07 {
    filename "shim.efi";
  } else {
    filename "pxelinux/pxelinux.0";
  }

You can read more about the arch IDs in question at <https://da.gd/DhcpArchId>.

(4) Bonus advice: whenever reporting OVMF issues, please capture the OVMF debug log, and attach it to the report. Instructions can be found for example in bug 1450345 comment 2 bullet (5).

(5) Some more background on iPXE as it relates to OVMF. (I see you mentioned iPXE above.) In the "ipxe-roms-qemu" package, we provide such virtual NIC option ROMs that are built from the iPXE project in a *stripped down* manner. These ROMs are "combined" (i.e., legacy BIOS + UEFI) PCI Expansion ROMs, and the UEFI half of each oprom contains only a minimal Simple Network Protocol driver from iPXE. In other words, iPXE only provides the lowest level NIC driver to the EFI environment, and all the DHCP and PXE booting logic comes from the edk2 project modules that are built into OVMF.

If you are using the virtio-net NIC, you can entirely disable iPXE, because OVMF contains a built-in Simple Network Protocol driver for virtio-net. To do this, add the following to your <interface> element:

  <rom bar='off'/>

Can you please confirm whether with the above updates (as necessary) PXE boot works for you too?

Thanks!
Laszlo

Thank you for the detailed reply. I rebuilt my machine with 7.4 from scratch and built another reproducer.
I must still be doing something wrong though, because I still cannot pxeboot domxml for the client (tried with and without the rom bar option): <domain type='kvm' id='20'> <name>test</name> <uuid>092bd7a7-017e-4606-9cf8-0ad4a81f3c42</uuid> <memory unit='KiB'>4194304</memory> <currentMemory unit='KiB'>4194304</currentMemory> <vcpu placement='static'>4</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type> <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader> <nvram>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram> <bootmenu enable='no'/> </os> <features> <acpi/> <apic/> <vmport state='off'/> <smm state='on'/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>Skylake-Client</model> <vendor>Intel</vendor> <feature policy='disable' name='ds'/> <feature policy='disable' name='acpi'/> <feature policy='require' name='ss'/> <feature policy='disable' name='ht'/> <feature policy='disable' name='tm'/> <feature policy='disable' name='pbe'/> <feature policy='disable' name='dtes64'/> <feature policy='disable' name='monitor'/> <feature policy='disable' name='ds_cpl'/> <feature policy='require' name='vmx'/> <feature policy='disable' name='smx'/> <feature policy='disable' name='est'/> <feature policy='disable' name='tm2'/> <feature policy='disable' name='xtpr'/> <feature policy='disable' name='pdcm'/> <feature policy='disable' name='dca'/> <feature policy='disable' name='osxsave'/> <feature policy='require' name='tsc_adjust'/> <feature policy='require' name='pdpe1gb'/> <feature policy='disable' name='mpx'/> <feature policy='disable' name='xsavec'/> <feature policy='disable' name='xgetbv1'/> <feature policy='require' name='hypervisor'/> </cpu> <clock offset='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> 
<on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <pm> <suspend-to-mem enabled='no'/> <suspend-to-disk enabled='no'/> </pm> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/lib/libvirt/images/40g.img'/> <backingStore/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/> </disk> <controller type='usb' index='0' model='ich9-ehci1'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x7'/> </controller> <controller type='usb' index='0' model='ich9-uhci1'> <alias name='usb'/> <master startport='0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x0' multifunction='on'/> </controller> <controller type='usb' index='0' model='ich9-uhci2'> <alias name='usb'/> <master startport='2'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x1'/> </controller> <controller type='usb' index='0' model='ich9-uhci3'> <alias name='usb'/> <master startport='4'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x2'/> </controller> <controller type='sata' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/> </controller> <controller type='pci' index='0' model='pcie-root'> <alias name='pcie.0'/> </controller> <controller type='pci' index='1' model='pcie-root-port'> <model name='ioh3420'/> <target chassis='1' port='0x10'/> <alias name='pci.1'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/> </controller> <controller type='pci' index='2' model='dmi-to-pci-bridge'> <model name='i82801b11-bridge'/> <alias name='pci.2'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/> </controller> <controller type='pci' index='3' model='pci-bridge'> <model name='pci-bridge'/> <target chassisNr='3'/> 
<alias name='pci.3'/> <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> </controller> <controller type='pci' index='4' model='pcie-root-port'> <model name='ioh3420'/> <target chassis='4' port='0x11'/> <alias name='pci.4'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/> </controller> <controller type='pci' index='5' model='pcie-root-port'> <model name='ioh3420'/> <target chassis='5' port='0x12'/> <alias name='pci.5'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/> </controller> <controller type='pci' index='6' model='pcie-root-port'> <model name='ioh3420'/> <target chassis='6' port='0x13'/> <alias name='pci.6'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/> </controller> <controller type='pci' index='7' model='pcie-root-port'> <model name='ioh3420'/> <target chassis='7' port='0x14'/> <alias name='pci.7'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/> </controller> <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> </controller> <interface type='network'> <mac address='52:54:00:df:59:fe'/> <source network='default' bridge='virbr0'/> <target dev='vnet1'/> <model type='virtio'/> <boot order='1'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/3'/> <target port='0'/> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/3'> <source path='/dev/pts/3'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-20-test/org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> 
<channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel1'/> <address type='virtio-serial' controller='0' bus='0' port='2'/> </channel> <input type='tablet' bus='usb'> <alias name='input0'/> <address type='usb' bus='0' port='1'/> </input> <input type='mouse' bus='ps2'> <alias name='input1'/> </input> <input type='keyboard' bus='ps2'> <alias name='input2'/> </input> <graphics type='spice' port='5900' autoport='yes' listen='127.0.0.1'> <listen type='address' address='127.0.0.1'/> <image compression='off'/> </graphics> <sound model='ich6'> <alias name='sound0'/> <address type='pci' domain='0x0000' bus='0x03' slot='0x01' function='0x0'/> </sound> <video> <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/> </video> <redirdev bus='usb' type='spicevmc'> <alias name='redir0'/> <address type='usb' bus='0' port='2'/> </redirdev> <redirdev bus='usb' type='spicevmc'> <alias name='redir1'/> <address type='usb' bus='0' port='3'/> </redirdev> <memballoon model='virtio'> <stats period='5'/> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/> </memballoon> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c68,c194</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c68,c194</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> Versions on the host: [root@sealusa5 ~]# rpm -q OVMF OVMF-20170228-5.gitc325e41585e3.el7.noarch [root@sealusa5 ~]# rpm -q libvirt libvirt-3.2.0-10.el7.x86_64 [root@sealusa5 ~]# rpm -qa |grep qemu qemu-kvm-common-rhev-2.6.0-28.el7_3.9.x86_64 libvirt-daemon-driver-qemu-3.2.0-10.el7.x86_64 qemu-img-rhev-2.6.0-28.el7_3.9.x86_64 
ipxe-roms-qemu-20170123-1.git4e85b27.el7.noarch qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64 Capabilities: <capabilities> <host> <uuid>00000000-0000-0000-0000-0cc47ad2918e</uuid> <cpu> <arch>x86_64</arch> <model>Broadwell</model> <vendor>Intel</vendor> <topology sockets='1' cores='10' threads='2'/> <feature name='vme'/> <feature name='ds'/> <feature name='acpi'/> <feature name='ss'/> <feature name='ht'/> <feature name='tm'/> <feature name='pbe'/> <feature name='dtes64'/> <feature name='monitor'/> <feature name='ds_cpl'/> <feature name='vmx'/> <feature name='smx'/> <feature name='est'/> <feature name='tm2'/> <feature name='xtpr'/> <feature name='pdcm'/> <feature name='dca'/> <feature name='osxsave'/> <feature name='f16c'/> <feature name='rdrand'/> <feature name='arat'/> <feature name='tsc_adjust'/> <feature name='cmt'/> <feature name='xsaveopt'/> <feature name='mbm_total'/> <feature name='mbm_local'/> <feature name='pdpe1gb'/> <feature name='abm'/> <feature name='invtsc'/> <pages unit='KiB' size='4'/> <pages unit='KiB' size='2048'/> <pages unit='KiB' size='1048576'/> </cpu> <power_management> <suspend_mem/> <suspend_disk/> <suspend_hybrid/> </power_management> <migration_features> <live/> <uri_transports> <uri_transport>tcp</uri_transport> <uri_transport>rdma</uri_transport> </uri_transports> </migration_features> <topology> <cells num='2'> <cell id='0'> <memory unit='KiB'>134104160</memory> <pages unit='KiB' size='4'>33526040</pages> <pages unit='KiB' size='2048'>0</pages> <pages unit='KiB' size='1048576'>0</pages> <distances> <sibling id='0' value='10'/> <sibling id='1' value='21'/> </distances> <cpus num='20'> <cpu id='0' socket_id='0' core_id='0' siblings='0,20'/> <cpu id='1' socket_id='0' core_id='1' siblings='1,21'/> <cpu id='2' socket_id='0' core_id='2' siblings='2,22'/> <cpu id='3' socket_id='0' core_id='3' siblings='3,23'/> <cpu id='4' socket_id='0' core_id='4' siblings='4,24'/> <cpu id='5' socket_id='0' core_id='8' siblings='5,25'/> <cpu id='6' socket_id='0' 
core_id='9' siblings='6,26'/> <cpu id='7' socket_id='0' core_id='10' siblings='7,27'/> <cpu id='8' socket_id='0' core_id='11' siblings='8,28'/> <cpu id='9' socket_id='0' core_id='12' siblings='9,29'/> <cpu id='20' socket_id='0' core_id='0' siblings='0,20'/> <cpu id='21' socket_id='0' core_id='1' siblings='1,21'/> <cpu id='22' socket_id='0' core_id='2' siblings='2,22'/> <cpu id='23' socket_id='0' core_id='3' siblings='3,23'/> <cpu id='24' socket_id='0' core_id='4' siblings='4,24'/> <cpu id='25' socket_id='0' core_id='8' siblings='5,25'/> <cpu id='26' socket_id='0' core_id='9' siblings='6,26'/> <cpu id='27' socket_id='0' core_id='10' siblings='7,27'/> <cpu id='28' socket_id='0' core_id='11' siblings='8,28'/> <cpu id='29' socket_id='0' core_id='12' siblings='9,29'/> </cpus> </cell> <cell id='1'> <memory unit='KiB'>134217728</memory> <pages unit='KiB' size='4'>33554432</pages> <pages unit='KiB' size='2048'>0</pages> <pages unit='KiB' size='1048576'>0</pages> <distances> <sibling id='0' value='21'/> <sibling id='1' value='10'/> </distances> <cpus num='20'> <cpu id='10' socket_id='1' core_id='0' siblings='10,30'/> <cpu id='11' socket_id='1' core_id='1' siblings='11,31'/> <cpu id='12' socket_id='1' core_id='2' siblings='12,32'/> <cpu id='13' socket_id='1' core_id='3' siblings='13,33'/> <cpu id='14' socket_id='1' core_id='4' siblings='14,34'/> <cpu id='15' socket_id='1' core_id='8' siblings='15,35'/> <cpu id='16' socket_id='1' core_id='9' siblings='16,36'/> <cpu id='17' socket_id='1' core_id='10' siblings='17,37'/> <cpu id='18' socket_id='1' core_id='11' siblings='18,38'/> <cpu id='19' socket_id='1' core_id='12' siblings='19,39'/> <cpu id='30' socket_id='1' core_id='0' siblings='10,30'/> <cpu id='31' socket_id='1' core_id='1' siblings='11,31'/> <cpu id='32' socket_id='1' core_id='2' siblings='12,32'/> <cpu id='33' socket_id='1' core_id='3' siblings='13,33'/> <cpu id='34' socket_id='1' core_id='4' siblings='14,34'/> <cpu id='35' socket_id='1' core_id='8' siblings='15,35'/> 
<cpu id='36' socket_id='1' core_id='9' siblings='16,36'/> <cpu id='37' socket_id='1' core_id='10' siblings='17,37'/> <cpu id='38' socket_id='1' core_id='11' siblings='18,38'/> <cpu id='39' socket_id='1' core_id='12' siblings='19,39'/> </cpus> </cell> </cells> </topology> <secmodel> <model>selinux</model> <doi>0</doi> <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel> <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel> </secmodel> <secmodel> <model>dac</model> <doi>0</doi> <baselabel type='kvm'>+107:+107</baselabel> <baselabel type='qemu'>+107:+107</baselabel> </secmodel> </host> <guest> <os_type>hvm</os_type> <arch name='i686'> <wordsize>32</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine> <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='240'>pc-q35-rhel7.3.0</machine> <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> </domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> <pae/> <nonpae/> </features> </guest> <guest> <os_type>hvm</os_type> <arch name='x86_64'> <wordsize>64</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine> <machine 
maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='240'>pc-q35-rhel7.3.0</machine> <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> </domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> </features> </guest> </capabilities> qemu.conf snip: nvram = [ "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd", "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.fd", "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd" ] DHCP server config: find /var/lib/tftpboot/ /var/lib/tftpboot/ /var/lib/tftpboot/pxelinux /var/lib/tftpboot/pxelinux/pxelinux.cfg /var/lib/tftpboot/pxelinux/pxelinux.cfg/default /var/lib/tftpboot/pxelinux/pxelinux.cfg/efidefault /var/lib/tftpboot/pxelinux/centos7 /var/lib/tftpboot/pxelinux/centos7/vmlinuz /var/lib/tftpboot/pxelinux/centos7/initrd.img /var/lib/tftpboot/pxelinux/BOOTX64.EFI /var/lib/tftpboot/pxelinux/grub.cfg /var/lib/tftpboot/pxelinux/msgs /var/lib/tftpboot/pxelinux/msgs/boot.msg /var/lib/tftpboot/pxelinux/pxelinux.0 /var/lib/tftpboot/uefi /var/lib/tftpboot/uefi/vmlinuz /var/lib/tftpboot/uefi/initrd.img /var/lib/tftpboot/centos7 /var/lib/tftpboot/centos7/vmlinuz /var/lib/tftpboot/centos7/initrd.img dhcpd.conf: option arch code 93 = unsigned integer 16; # RFC4578 allow booting; allow bootp; subnet 192.168.122.0 netmask 255.255.255.0 { option routers 192.168.122.1; range 192.168.122.100 
192.168.122.200; class "pxeclients" { match if substring (option vendor-class-identifier, 0, 9) = "PXEClient"; next-server 192.168.122.2; if option arch = 00:07 { filename "pxelinux/BOOTX64.EFI"; } else { filename "pxelinux/pxelinux.0"; } } } ----------------------- Could you please provide the same information for me to compare? There either is a bug, or I'm missing some small detail Thanks This looks like a good config on the surface. (Personally I don't use dhcpd but dnsmasq, as configured by libvirt, for netbooting; but it's also possible to make it work with dhcpd.) What is your "pxelinux/BOOTX64.EFI" exactly? Is it a copy of "shim.efi"? What symptoms do you see? The same as before? Generally I use tcpdump and the dhcp/tftp server logs at this point to see what happens. For reference, my config is: $ virsh net-dumpxml --inactive default <network> <name>default</name> <uuid>4ae8659d-38be-4f65-93b2-fd2758bfdc61</uuid> <forward mode='nat'/> <bridge name='virbr0' stp='on' delay='0'/> <mac address='52:54:00:c3:97:eb'/> <dns> <host ip='192.168.122.1'> <hostname>nfsserver</hostname> </host> </dns> <ip address='192.168.122.1' netmask='255.255.255.0'> <tftp root='/var/lib/dnsmasq'/> <dhcp> <range start='192.168.122.2' end='192.168.122.254'/> <bootp file='shim.efi'/> </dhcp> </ip> </network> $ ls -l /var/lib/dnsmasq/ -rw-r--r--. 1 root root 209 2015-04-10 11:29:25 +0200 grub.cfg -rw-r--r--. 1 root root 1069520 2014-12-01 11:47:40 +0100 grubx64.efi -rw-r--r--. 1 root root 37500320 2014-12-01 11:50:19 +0100 initrd.img -rw-r--r--. 1 root root 1285808 2014-12-01 11:47:58 +0100 shim.efi -rw-r--r--. 1 root root 5024976 2014-12-01 11:48:16 +0100 vmlinuz $ cat /var/lib/dnsmasq/grub.cfg set timeout=600 menuentry 'install RHEL-7.1-20141127.0-Server over NFS' { linuxefi vmlinuz ip=dhcp inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-dvd1.iso initrdefi initrd.img } IOW I have both the dhcp and tftp servers on the same machine. 
Perhaps using a separate "next-server" in your dhcpd.conf exposes a real issue; I'm not so sure about that. Just to narrow it down, can you put both daemons on the same server? Thanks.

Also, since you are using a libvirt guest with virtual networking:

  <interface type='network'>

  http://libvirt.org/formatdomain.html#elementsNICSVirtual
  http://libvirt.org/formatnetwork.html#examplesNAT

am I correct to think that you have a DHCP server conflict on the 192.168.122.x subnet? Namely,

- one DHCP server is provided by libvirt itself: it generates a config file for "dnsmasq" from the network definition in "/etc/libvirt/qemu/networks/default.xml", and launches dnsmasq,
- and the second DHCP server on the same subnet is your own dhcpd.

I think this should not be done. One subnet should be served by a single dhcp server + tftp server pair. A manually configured dhcp server is appropriate if you use bridged networking for the guest:

  <interface type='bridge'>

  http://libvirt.org/formatdomain.html#elementsNICSBridge
  http://libvirt.org/formatnetwork.html#examplesBridge

(In reply to Laszlo Ersek from comment #5)
> This looks like a good config on the surface. (Personally I don't use dhcpd
> but dnsmasq, as configured by libvirt, for netbooting; but it's also
> possible to make it work with dhcpd.)
>
> What is your "pxelinux/BOOTX64.EFI" exactly? Is it a copy of "shim.efi"?
>

a copy of EFI/BOOT/grubx64.efi

> What symptoms do you see? The same as before? Generally I use tcpdump and
> the dhcp/tftp server logs at this point to see what happens.

There must be another bug, because once I switch to EFI I lose the VM console.
So far I try to monitor things from the PXE machine side

>
> For reference, my config is:
>
> $ virsh net-dumpxml --inactive default
> <network>
>   <name>default</name>
>   <uuid>4ae8659d-38be-4f65-93b2-fd2758bfdc61</uuid>
>   <forward mode='nat'/>
>   <bridge name='virbr0' stp='on' delay='0'/>
>   <mac address='52:54:00:c3:97:eb'/>
>   <dns>
>     <host ip='192.168.122.1'>
>       <hostname>nfsserver</hostname>
>     </host>
>   </dns>
>   <ip address='192.168.122.1' netmask='255.255.255.0'>
>     <tftp root='/var/lib/dnsmasq'/>
>     <dhcp>
>       <range start='192.168.122.2' end='192.168.122.254'/>
>       <bootp file='shim.efi'/>
>     </dhcp>
>   </ip>
> </network>
>
> $ ls -l /var/lib/dnsmasq/
>
> -rw-r--r--. 1 root root 209 2015-04-10 11:29:25 +0200 grub.cfg
> -rw-r--r--. 1 root root 1069520 2014-12-01 11:47:40 +0100 grubx64.efi
> -rw-r--r--. 1 root root 37500320 2014-12-01 11:50:19 +0100 initrd.img
> -rw-r--r--. 1 root root 1285808 2014-12-01 11:47:58 +0100 shim.efi
> -rw-r--r--. 1 root root 5024976 2014-12-01 11:48:16 +0100 vmlinuz
>
> $ cat /var/lib/dnsmasq/grub.cfg
>
> set timeout=600
> menuentry 'install RHEL-7.1-20141127.0-Server over NFS' {
> linuxefi vmlinuz ip=dhcp
> inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-
> dvd1.iso
> initrdefi initrd.img
> }
>
> IOW I have both the dhcp and tftp servers on the same machine. Perhaps using
> a separate "next-server" in your dhcpd.conf exposes a real issue; I'm not so
> sure about that. Just to narrow it down, can you put both daemons on the
> same server?

192.168.122.2 is in fact the same machine. I added the next-server option later on because I saw it used in a manual, but with no changes in the result.

The main difference is that you seem to be using the hypervisor as the PXE host, while in my case I create a separate VM to perform that role, and cancel DHCP on the libvirt network.

>
> Thanks.
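When comparing domain XML against the checklist in comment 3, the three domain-level items (legacy <boot dev=.../> under <os>, a Q35 machine type older than pc-q35-rhel7.4.0, and a Secure Boot pflash <loader> without secure='yes') can be checked mechanically. The following is a minimal Python sketch, not part of libvirt or any Red Hat tooling: the function name `check_ovmf_domain` and the warning strings are made up for illustration, and treating "secboot" in the loader path as the sign of a Secure Boot firmware build is only a heuristic assumption.

```python
import re
import xml.etree.ElementTree as ET
from typing import List


def check_ovmf_domain(xml_text: str) -> List[str]:
    """Return warnings for a libvirt domain XML string (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    warnings: List[str] = []

    # (1) Legacy <boot dev='...'/> under <os>; per-device <boot order='N'/>
    #     elements are preferred with OVMF.
    if root.find("./os/boot") is not None:
        warnings.append("legacy <boot dev=.../> under <os>")

    # (2a) Q35 machine type older than pc-q35-rhel7.4.0.
    type_elem = root.find("./os/type")
    machine = type_elem.get("machine", "") if type_elem is not None else ""
    match = re.fullmatch(r"pc-q35-rhel(\d+)\.(\d+)\.(\d+)", machine)
    if match and tuple(int(g) for g in match.groups()) < (7, 4, 0):
        warnings.append("machine type %s is older than pc-q35-rhel7.4.0" % machine)

    # (2b) Secure Boot firmware in pflash without secure='yes'.
    #      Assumption: a "secboot" substring in the path marks such firmware.
    loader = root.find("./os/loader")
    if loader is not None and loader.get("type") == "pflash":
        path = (loader.text or "").strip()
        if "secboot" in path and loader.get("secure") != "yes":
            warnings.append("<loader> lacks secure='yes'")

    return warnings


if __name__ == "__main__":
    # Reduced version of the snippet quoted in comment 3.
    snippet = """
    <domain type='kvm'>
      <os>
        <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
        <boot dev='network'/>
      </os>
    </domain>
    """
    for warning in check_ovmf_domain(snippet):
        print(warning)
```

The sketch only looks at the <os> element. It says nothing about the host package versions or the DHCP/TFTP side, which still have to be compared by hand.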
(In reply to Dan Yasny from comment #7) > (In reply to Laszlo Ersek from comment #5) > There must be another bug, because once I switch to EFI I lose the VM > console. Ugh, that's very strange, I've never encountered such a problem. Anyway... can you capture the OVMF debug log please? Please see point (4) in comment 3. > The main difference is that you seem to be using the hypervisor as the PXE > host, while in my case I create a separate VM to perform that role, and > cancel DHCP on the libvirt network. Give me a few minutes to try to replicate this on my laptop. Thanks. OK, so I managed to set up the cross-VM DHCP and PXE boot. Netboot works fine. I'll capture the relevant bits here. (01) My laptop runs RHEL-7.4 Beta, and the VM providing the DHCP and PXE server also runs RHEL-7.4 Beta. (02) On my laptop, I created a new virtual network, with DHCP disabled (so that DHCP would not be served from the host side to guests on this subnet): > $ virsh net-dumpxml --inactive cross-vm-dhcp > <network> > <name>cross-vm-dhcp</name> > <uuid>3104697e-70f1-4a3f-b9ea-c53bdeb2beb1</uuid> > <forward mode='nat'/> > <bridge name='virbr2' stp='on' delay='0'/> > <mac address='52:54:00:18:64:e7'/> > <ip address='192.168.124.1' netmask='255.255.255.0'> > </ip> > </network> (03) I modified one of my preexistent RHEL-7.4 Beta guests ("ovmf.rhel7.q35") so that it would become the DHCP & TFTP server on this new network. Namely, I added the following XML element to its domain XML: > <interface type='network'> > <mac address='52:54:00:0a:04:6c'/> > <source network='cross-vm-dhcp'/> > <model type='virtio'/> > <address type='pci' domain='0x0000' bus='0x02' slot='0x03' > function='0x0'/> > </interface> Here bus='0x02' identifies the legacy PCI bridge that hangs off of the DMI-to-PCI bridge [*], since this is a Q35 board. And slot='0x03' is simply a free slot on bus='0x02'. 
[*] For reference (note: this is preexistent config): > <controller type='pci' index='0' model='pcie-root'/> > <controller type='pci' index='1' model='dmi-to-pci-bridge'> > <model name='i82801b11-bridge'/> > <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' > function='0x0'/> > </controller> > <controller type='pci' index='2' model='pci-bridge'> > <model name='pci-bridge'/> > <target chassisNr='2'/> > <address type='pci' domain='0x0000' bus='0x01' slot='0x01' > function='0x0'/> > </controller> (04) I booted "ovmf.rhel7.q35", logged in, and with the usual "nm-connection-editor" GUI, I configured the new interface: - disable (ignore) IPv6 on the new NIC - give a static IPv4 address (192.168.124.2) to the new NIC - as gateway, use 192.168.124.1 (refer to point (02)) (05) Still in the DHCP / PXE server guest, I assigned metric 110 to the new interface, just to be sure: > nmcli connection > [prints list of connections, with names and UUIDs] > > nmcli connection modify uuid 'ba149df2-20f8-4883-89c6-e1ab08369d0c' \ > ipv4.route-metric 110 (06) At this point, performed the steps under <https://da.gd/Rh7EfiPxe>. Some notes: (a) The "firewall-cmd --add-service=tftp" command didn't stick. I had to invoke "firewall-config" as root from the GUI, and enable "tftp" for both the Permanent and Runtime configs. 
(b) The contents of my "/etc/dhcp/dhcpd.conf" (using backslashes for readability):

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) = \
>       "PXEClient";
>     next-server 192.168.124.2;
>     if option architecture-type = 00:07 {
>       filename "shim.efi";
>     } else {
>       filename "pxelinux/pxelinux.0";
>     }
>   }
> }

(c) The "shim.efi" and "grubx64.efi" binaries were simply copied from under "/boot/efi/EFI/redhat", to "/var/lib/tftpboot".

(d) I populated the "/var/lib/tftpboot/images/RHEL-7.4-20170616.3/" directory from

  http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/images/pxeboot/

downloading the files "initrd.img" and "vmlinuz".

(e) Contents of "/var/lib/tftpboot/grub.cfg" (using backslashes here for readability):

> set timeout=60
> menuentry 'install RHEL-7.4-20170616.3 from HTTP' {
>   linuxefi images/RHEL-7.4-20170616.3/vmlinuz ip=dhcp \
>     inst.repo=http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/
>   initrdefi images/RHEL-7.4-20170616.3/initrd.img
> }

(f) Ultimately this is the directory structure (only regular files listed):

> /var/lib/tftpboot/grub.cfg
> /var/lib/tftpboot/grubx64.efi
> /var/lib/tftpboot/images/RHEL-7.4-20170616.3/initrd.img
> /var/lib/tftpboot/images/RHEL-7.4-20170616.3/vmlinuz
> /var/lib/tftpboot/shim.efi

(g) Importantly, once all files were in place, I ran:

  chmod -cR u=rwX,g=rX,o=rX /var/lib/tftpboot/
  restorecon -FvvR /var/lib/tftpboot/

(h) Regarding the "tftp.socket" and "tftp.service" systemd units, DO NOT enable or start those (contrary to the Installation Guide instructions). Dan told me on IRC that his setup included xinetd, and systemd's said units conflict with that. So, it's best to explicitly disable and stop these systemd units.
(07) This is my "/etc/xinetd.d/tftp" file:

> service tftp
> {
>   disable     = no
>   socket_type = dgram
>   protocol    = udp
>   wait        = yes
>   user        = root
>   server      = /usr/sbin/in.tftpd
>   server_args = -s /var/lib/tftpboot
>   per_source  = 11
>   cps         = 100 2
>   flags       = IPv4
> }

(08) After this, make sure both dhcpd and xinetd are enabled and running, with "systemctl is-enabled", "systemctl status", and also "journalctl". (In particular, xinetd's log messages should confirm that "tftp" is enabled.) Again, systemd's own "tftp.socket" and "tftp.service" units should be disabled & stopped with systemctl.

(09) I created a new guest ("ovmf.rhel7.dhclient.q35") as DHCP / PXE client. These are the XML elements worth mentioning:

> <domain type='kvm'>
>   <os>
>     <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
>     <loader readonly='yes' secure='yes'
>      type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
>     <nvram
>      >/var/lib/libvirt/qemu/nvram/ovmf.rhel7.dhclient.q35_VARS.fd</nvram>
>     <bootmenu enable='yes' timeout='3000'/>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>     <pae/>
>     <smm state='on'/>
>   </features>
>   <devices>
>     <interface type='network'>
>       <mac address='52:54:00:1b:10:3e'/>
>       <source network='cross-vm-dhcp'/>
>       <model type='virtio'/>
>       <boot order='1'/>
>       <address type='pci' domain='0x0000' bus='0x02' slot='0x03'
>                function='0x0'/>
>     </interface>
>   </devices>
> </domain>

(10) After launching "ovmf.rhel7.dhclient.q35", it net-booted (ultimately) to the Anaconda welcome GUI.

So, the use case works fine for me, it's just that setting up the DHCP / TFTP server is quite tedious.

... The packages that had to be installed were: "tftp-server", "xinetd", and "dhcp".

It looks like you've been using straight PXE boot, not iPXE, any chance you still have the configuration working and you could confirm iPXE also works?

Hi Dan,

In comment 3 I wrote,

> (5) Some more background on iPXE as it relates to OVMF. (I see you
> mentioned iPXE above.)
> In the "ipxe-roms-qemu" package, we provide such
> virtual NIC option ROMs that are built from the iPXE project in a
> *stripped down* manner. These ROMs are "combined" (i.e., legacy BIOS +
> UEFI) PCI Expansion ROMs, and the UEFI half of each oprom contains only a
> minimal Simple Network Protocol driver from iPXE. In other words, iPXE
> only provides the lowest level NIC driver to the EFI environment, and all
> the DHCP and PXE booting logic comes from the edk2 project modules that
> are built into OVMF.

This means that, when using OVMF with ipxe-roms-qemu, the UEFI iPXE option ROMs will *not* provide the full-blown iPXE capabilities that you may be used to.

This is done *deliberately*. The history behind the decision is somewhat sordid. I will provide you with links below, for background. The end result remains, if you want the full iPXE capabilities chained from OVMF PXE boot, then you will have to PXE-boot a standalone iPXE UEFI executable from OVMF (specifying "ipxe.efi" in place of "shim.efi", as "filename" in "dhcpd.conf"). The "ipxe.efi" binary is available from the "ipxe-bootimgs" package; under pathname "/usr/share/ipxe/ipxe.efi".

Any bugs that occur in such a setup, after OVMF netboots "ipxe.efi", should be reported against the "ipxe" Bugzilla component.
References:

- http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html
- http://lists.ipxe.org/pipermail/ipxe-devel/2015-April/004085.html
- http://lists.ipxe.org/pipermail/ipxe-devel/2015-July/004295.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04440.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04442.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04626.html
- http://git.qemu.org/?p=qemu.git;a=commit;h=cf2b4b5b77a7
- bug 1084561
- bug 1181980
- bug 1295673
- http://wiki.qemu-project.org/IpxeDownstreamForQemu

(In reply to Laszlo Ersek from comment #12)
> If you want the full iPXE capabilities chained from OVMF PXE boot, then
> you will have to PXE-boot a standalone iPXE UEFI executable from OVMF
> (specifying "ipxe.efi" in place of "shim.efi", as "filename" in
> "dhcpd.conf"). The "ipxe.efi" binary is available from the "ipxe-bootimgs"
> package; under pathname "/usr/share/ipxe/ipxe.efi".

This is called "chainloading" and the config bits are documented on the ipxe.org website:

- http://ipxe.org/howto/chainloading
- http://ipxe.org/howto/dhcpd#pxe_chainloading

I still have my test env, and I can try testing this for you. (Such a setup is a first for me though.)

(In reply to Dan Yasny from comment #11)
> It looks like you've been using straight PXE boot, not iPXE, any chance
> you still have the configuration working and you could confirm iPXE also
> works?

OK, so this is what I managed to do, when including iPXE. I have two scenarios:

(1) The first scenario is where iPXE is *added* to shim.efi and grubx64.efi. This scenario proceeds quite well with netbooting, but it fails ultimately, and I couldn't get it to work.

(2) The second scenario is where iPXE *replaces* shim.efi and grubx64.efi, loads its own script file, and boots the kernel + initrd directly.
(0) Common steps, in the VM that runs xinetd+tftp / dhcpd ("ovmf.rhel7.q35"), on top of the steps described in comment 9:

(0.1) Install the standalone iPXE UEFI binary, so that it can be served via TFTP:

> yum install ipxe-bootimgs
> cp -aiv /usr/share/ipxe/ipxe.efi /var/lib/tftpboot/
> restorecon -Fvv /var/lib/tftpboot/ipxe.efi

(0.2) For completeness of "dhcpd.conf" below, also install the standalone iPXE BIOS binary:

> cp -aiv /usr/share/ipxe/undionly.kpxe /var/lib/tftpboot/
> restorecon -Fvv /var/lib/tftpboot/undionly.kpxe

(1) For the first scenario:

(1.1) Edit "/etc/dhcp/dhcpd.conf" as documented under <http://ipxe.org/howto/chainloading> and <http://ipxe.org/howto/dhcpd#pxe_chainloading>:

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) =
>       "PXEClient";
>     next-server 192.168.124.2;
>     if exists user-class and option user-class = "iPXE" {
>       # second stage: iPXE is booting, serve shim or pxelinux
>       if option architecture-type = 00:07 {
>         filename "shim.efi";
>       } else {
>         filename "pxelinux/pxelinux.0";
>       }
>     } else {
>       # first stage, firmware is booting, serve iPXE
>       if option architecture-type = 00:07 {
>         filename "ipxe.efi";
>       } else {
>         filename "undionly.kpxe";
>       }
>     }
>   }
> }

(1.2) Restart dhcpd:

> systemctl restart dhcpd.service

(1.3) Launch the VM that PXE-boots.

(1.4) With this configuration, the following happens:

- OVMF successfully downloads and runs "ipxe.efi",
- "ipxe.efi" successfully downloads and runs "shim.efi",
- "shim.efi" successfully downloads and runs "grubx64.efi",
- unfortunately, "grubx64.efi" fails to download its config file, "grub.cfg", and bombs out to the grub shell.
- This behavior is unchanged if "dhcpd.conf" specifies "grubx64.efi" rather than "shim.efi", for the second stage.
  In that case, "ipxe.efi" successfully downloads and runs "grubx64.efi", but the latter fails the exact same way.

(2) For the second scenario:

(2.1) Create the following ipxe command script, in "/var/lib/tftpboot/ipxe.cfg" (lines broken up with backslashes here for readability):

> #!ipxe
>
> kernel images/RHEL-7.4-20170616.3/vmlinuz initrd=initrd.img ip=dhcp \
>   inst.repo=http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/
> initrd images/RHEL-7.4-20170616.3/initrd.img
> boot

NOTE: the "initrd=initrd.img" kernel parameter is *required*. It must match the last pathname component ("initrd.img") from the "initrd" iPXE command. Otherwise the kernel will not find the initial ramdisk.

(2.2) Run

> restorecon -Fvv /var/lib/tftpboot/ipxe.cfg

(2.3) Edit "/etc/dhcp/dhcpd.conf" as follows:

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) =
>       "PXEClient";
>     next-server 192.168.124.2;
>     if exists user-class and option user-class = "iPXE" {
>       # second stage: iPXE is booting, serve command script
>       filename "ipxe.cfg";
>     } else {
>       # first stage, firmware is booting, serve iPXE
>       if option architecture-type = 00:07 {
>         filename "ipxe.efi";
>       } else {
>         filename "undionly.kpxe";
>       }
>     }
>   }
> }

(2.4) Restart dhcpd:

> systemctl restart dhcpd.service

(2.5) Launch the VM that PXE-boots.

(2.6) In this configuration,

- OVMF successfully downloads and runs "ipxe.efi",
- "ipxe.efi" downloads and interprets the command script "ipxe.cfg",
- "ipxe.efi" downloads the kernel and the initrd over TFTP, and launches the kernel.
- The kernel finds the initial ramdisk because we explicitly tell it under what name to look for the ramdisk, with the "initrd=..." option.
  Grub sets this cmdline option automatically, with the "linuxefi" and "initrdefi" commands, but when using iPXE on an EFI system, the option has to be passed manually. Hat tip to <https://doc.rogerwhittaker.org.uk/ipxe-installation-and-EFI/> for the reminder.

To summarize, OVMF + ipxe.efi work fine as well, but you have to exclude grub from that scenario -- instead, iPXE must download an ipxe command script, and boot the kernel and the initrd based on that command script. Also, don't forget the initrd=... kernel cmdline option, which must match the last pathname component of the "initrd" iPXE command.

(... Depending on your use case, you might even consider the replacement of shim and grub with iPXE -- and a flexible, customized iPXE command script -- a bonus.)
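Since the "initrd=" parameter has to track the last pathname component of the "initrd" command, one way to keep the two from drifting apart is to generate the command script rather than hand-edit it. The following is a minimal sketch under that assumption; "gen_ipxe_script" is a hypothetical helper (not part of iPXE or of any package mentioned here), and it emits only the "kernel", "initrd" and "boot" commands used above.

```shell
#!/bin/sh
# Sketch: emit an iPXE command script whose "initrd=" kernel parameter is
# derived from the initrd path given to the "initrd" command.
# "gen_ipxe_script" is a hypothetical helper name.
#
# Usage: gen_ipxe_script KERNEL_PATH INITRD_PATH [KERNEL_ARG...]
gen_ipxe_script() {
    kernel=$1
    initrd=$2
    shift 2
    # Last pathname component of the initrd path, e.g. "initrd.img".
    initrd_name=${initrd##*/}
    printf '#!ipxe\n\n'
    # Remaining arguments are passed through as kernel parameters.
    printf 'kernel %s initrd=%s %s\n' "$kernel" "$initrd_name" "$*"
    printf 'initrd %s\n' "$initrd"
    printf 'boot\n'
}
```

For example, "gen_ipxe_script images/RHEL-7.4-20170616.3/vmlinuz images/RHEL-7.4-20170616.3/initrd.img ip=dhcp > /var/lib/tftpboot/ipxe.cfg" would write a script equivalent to the one in (2.1), minus the inst.repo argument, which can be appended as a further parameter. The "restorecon" step from (2.2) still applies to the generated file.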