Bug 1462351 - UEFI enabled VMs cannot boot via PXE
Summary: UEFI enabled VMs cannot boot via PXE
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ovmf
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laszlo Ersek
QA Contact: FuXiangChun
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-06-16 19:51 UTC by Dan Yasny
Modified: 2017-07-11 06:57 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-23 16:13:59 UTC
Target Upstream Version:
Embargoed:



Description Dan Yasny 2017-06-16 19:51:47 UTC
Description of problem:
I have set up a KVM VM on RHEL 7.4 with a PXE service (I tried a vanilla dhcp/tftp/pxe stack as well as a full-fledged Ironic, with the same results).

A set of other VMs is supposed to be installed via the PXE service in the first VM.

Installing a regular, BIOS-based VM with <boot dev='network'/> in the domxml works fine.

I installed OVMF from the RHEL 7.4 beta supplementary channel, enabled it, and tried to deploy while watching the logs on the PXE machine: DHCP is picked up, but the tftp download fails.

I also installed the upstream ovmf from kraxel.org, with the same result.

I also tried this in a libvirt NAT network, on a real bridge with a physical interface, and with OVS.

Configuration details:
RHEL 7.4 OVMF: OVMF-20170228-5.gitc325e41585e3.el7.noarch
Upstream OVMF: edk2.git-ovmf-x64-0-20170613.b2757.g46e2632.noarch

qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64
qemu-kvm-common-rhev-2.6.0-28.el7_3.10.x86_64
libvirt-daemon-driver-qemu-3.2.0-9.el7.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7.noarch
qemu-img-rhev-2.6.0-28.el7_3.10.x86_64

Also tried with the most current ipxe from ipxe.org

libvirt-daemon-kvm-3.2.0-9.el7.x86_64
libvirt-daemon-driver-network-3.2.0-9.el7.x86_64
libvirt-daemon-driver-lxc-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-rbd-3.2.0-9.el7.x86_64
libvirt-daemon-driver-nodedev-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-mpath-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-scsi-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-iscsi-3.2.0-9.el7.x86_64
libvirt-3.2.0-9.el7.x86_64
libvirt-libs-3.2.0-9.el7.x86_64
libvirt-daemon-driver-interface-3.2.0-9.el7.x86_64
libvirt-python-3.2.0-3.el7.x86_64
libvirt-client-3.2.0-9.el7.x86_64
libvirt-daemon-driver-qemu-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-logical-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-3.2.0-9.el7.x86_64
libvirt-daemon-driver-nwfilter-3.2.0-9.el7.x86_64
libvirt-glib-1.0.0-1.el7.x86_64
libvirt-daemon-config-network-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-core-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-disk-3.2.0-9.el7.x86_64
libvirt-daemon-driver-secret-3.2.0-9.el7.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-9.el7.x86_64
libvirt-daemon-3.2.0-9.el7.x86_64
libvirt-daemon-config-nwfilter-3.2.0-9.el7.x86_64


VM config snips:
RHEL7.4 OVMF VM:
<os>
    <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/OVMF_VARS.fd</nvram>
</os>
  <features>
    <acpi/>
    <apic/>
    <smm state='on'/>
  </features>


Upstream tianomode configured VM:
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <loader type='rom'>/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
  </os>



Version-Release number of selected component (if applicable):
see above

How reproducible:
always

Steps to Reproduce:
1. install RHEL7.4, create VMs and try to PXEboot them

Actual results:
DHCP goes through, but according to the PXE server logs the tftp request for pxelinux.0 never arrives.

Expected results:
Should be able to pxeboot both EFI and BIOS VMs.

Comment 3 Laszlo Ersek 2017-06-16 22:52:18 UTC
Hi Dan,

OVMF (and AAVMF, in aarch64 guests) can perfectly well PXE boot; we have
verified that on several independent occasions. In the majority of cases,
PXE boot failure with OVMF/AAVMF can be traced to incorrect DHCP/PXE server
configuration.

At the moment, I see three config issues in the information you provided.
The first two are unrelated to the PXE issue, but I'll mention them for
completeness.

(1) When using OVMF virtual machines, it is strongly preferred to employ

  <boot order='N'/>

elements in the domain XML under the specific <interface> and <disk>
elements, over the legacy

  <boot dev='network'/>

style elements under <os>. You can read about the background for example in
bug 1323085.

(BTW the per-device <boot order='N'/> elements are preferred by the libvirt
docs as well; it's generally superior to the <boot> elements under <os>. You
may have multiple instances of the same type of device, and only the
per-device elements allow you to specify their boot order uniquely.)
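
For illustration only, a minimal sketch of the per-device style could look as
follows. The network name and the image path are assumptions rather than
values taken from this report; the NIC is tried first and the disk second.

```xml
<devices>
  <!-- Hypothetical NIC: first in the boot order, so PXE is attempted first -->
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <boot order='1'/>
  </interface>
  <!-- Hypothetical disk: second in the boot order, used once an OS is installed -->
  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/var/lib/libvirt/images/example.qcow2'/>
    <target dev='vda' bus='virtio'/>
    <boot order='2'/>
  </disk>
</devices>
```

Note that libvirt does not accept per-device <boot order='N'/> elements
together with <boot dev='...'/> elements under <os> in the same domain, so
the latter have to be removed when switching styles.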

(2) Your snippet

  <os>
    <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type>
    <loader readonly='yes' type='pflash'
     >/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/OVMF_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <smm state='on'/>
  </features>

seems outdated.

(2a) The machine type should be pc-q35-rhel7.4.0 (matching the first RHEL7
release where OVMF will be fully supported). pc-q35-rhel7.3.0 will likely
work, but SMI broadcast won't be available to the guest, which can lead to
performance and stability problems. So please stick with pc-q35-rhel7.4.0 or
later. Correspondingly, please use the latest qemu-kvm-rhev-2.9.0-* package.

(Relatedly -- no need for installing OVMF from the Supplementary channel;
starting with RHEL-7.4, OVMF will be part of Server/x86_64. Please refer to
bug 1329559 comment 41 and onward.)

(2b) Your <loader> element does not spell out the @secure='yes' attribute.
Without this, QEMU won't actually restrict pflash chip accesses to code that
runs in SMM, therefore a malicious guest kernel can overwrite authenticated
UEFI variables with direct hardware access, without going through the
firmware-level verification. This defeats Secure Boot.

The latest version of virt-manager / virt-install should produce
@secure='yes' automatically, please refer to bug 1387479 (especially the
comments near the end).

(2c) Your <nvram> element doesn't really look like it was filled in by
libvirtd. Using the default libvirt configuration (namely having a
commented-out "nvram" stanza in /etc/libvirt/qemu.conf), libvirt will
automatically know on RHEL-7.4 where to look for the varstore template that
matches the "OVMF_CODE.secboot.fd" firmware binary. It will instantiate the
VM's private varstore file from that, and place its pathname in the <nvram>
element too.

If you have multiple OVMF firmware binaries on your system, then customizing
the "nvram" stanza (and restarting libvirtd) makes sense, but even in that
case, hand-editing the <nvram> element shouldn't be necessary. In general,
the auto-generated pathnames in <nvram> have the following format:

  <nvram>/var/lib/libvirt/qemu/nvram/<GUEST_NAME>_VARS.fd</nvram>
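
Taken together, points (2a) through (2c) suggest <os> and <features> stanzas
along the following lines. This is a sketch, not a verified configuration: it
assumes that the pc-q35-rhel7.4.0 machine type is available on the host, and
it leaves out <nvram> on the assumption that libvirtd generates it as
described in (2c).

```xml
<os>
  <!-- (2a) machine type matching the first RHEL7 release with full OVMF support -->
  <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
  <!-- (2b) secure='yes' restricts pflash access to code running in SMM -->
  <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
  <!-- (2c) no hand-written nvram element: libvirtd is expected to add it -->
</os>
<features>
  <acpi/>
  <apic/>
  <smm state='on'/>
</features>
```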

(3) Now, about the PXE failure itself. You mention that "pxelinux.0" is not
downloaded (or maybe it is downloaded, but never launched.)

Either of those outcomes is actually right. "pxelinux.0" is a boot loader
binary that is meant for legacy BIOS systems; it is not suitable for UEFI
systems because it is not an EFI executable. So even if it were downloaded,
it could not be started by OVMF.

Regarding the question why it may not have been downloaded at all -- if the
DHCP server is configured correctly, then it recognizes the architecture
identifier of the DHCP client, and responds with a boot file pathname that
matches that architecture. IOW, a well-configured DHCP server will never
report "pxelinux.0" as the bootfile to an EFI DHCP client; that would be an
architecture mismatch.

Please refer to the RHEL7 installation guide, for example, about setting up
TFTP for UEFI clients: <https://da.gd/Rh7EfiPxe>. (See under "21.1.2.
Configuring a PXE Server for UEFI-based AMD64 and Intel 64 Clients".) This
configuration works. In particular, note

  if option architecture-type = 00:07 {
    filename "shim.efi";
  } else {
    filename "pxelinux/pxelinux.0";
  }

You can read more about the arch IDs in question at
<https://da.gd/DhcpArchId>.

(4) Bonus advice: whenever reporting OVMF issues, please capture the OVMF
debug log, and attach it to the report. Instructions can be found for
example in bug 1450345 comment 2 bullet (5).

(5) Some more background on iPXE as it relates to OVMF. (I see you mentioned
iPXE above.) In the "ipxe-roms-qemu" package, we provide virtual NIC option
ROMs that are built from the iPXE project in a *stripped down* manner. These
ROMs are "combined" (i.e., legacy BIOS + UEFI) PCI Expansion ROMs, and the
UEFI half of each oprom contains only a minimal Simple Network Protocol
driver from iPXE. In other words, iPXE only provides the lowest level NIC
driver to the EFI environment, and all the DHCP and PXE booting logic comes
from the edk2 project modules that are built into OVMF.

If you are using the virtio-net NIC, you can entirely disable iPXE, because
OVMF contains a built-in Simple Network Protocol driver for virtio-net. To
do this, add the following to your <interface> element:

  <rom bar='off'/>
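
In context, the element sits directly under <interface>. A sketch, assuming a
virtio-net NIC attached to the libvirt "default" network (both are
assumptions made for illustration):

```xml
<interface type='network'>
  <source network='default'/>
  <model type='virtio'/>
  <!-- Do not expose the iPXE option ROM; OVMF's built-in virtio-net driver is used -->
  <rom bar='off'/>
  <boot order='1'/>
</interface>
```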


Can you please confirm whether with the above updates (as necessary) PXE
boot works for you too?

Thanks!
Laszlo

Comment 4 Dan Yasny 2017-06-20 17:40:06 UTC
Thank you for the detailed reply. I rebuilt my machine with 7.4 from scratch and built another reproducer. I must still be doing something wrong, though, because I still cannot PXE boot.

domxml for the client (tried with and without the rom bar option):
<domain type='kvm' id='20'>
  <name>test</name>
  <uuid>092bd7a7-017e-4606-9cf8-0ad4a81f3c42</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel7.3.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/test_VARS.fd</nvram>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
    <smm state='on'/>
  </features>
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Client</model>
    <vendor>Intel</vendor>
    <feature policy='disable' name='ds'/>
    <feature policy='disable' name='acpi'/>
    <feature policy='require' name='ss'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='tm'/>
    <feature policy='disable' name='pbe'/>
    <feature policy='disable' name='dtes64'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='ds_cpl'/>
    <feature policy='require' name='vmx'/>
    <feature policy='disable' name='smx'/>
    <feature policy='disable' name='est'/>
    <feature policy='disable' name='tm2'/>
    <feature policy='disable' name='xtpr'/>
    <feature policy='disable' name='pdcm'/>
    <feature policy='disable' name='dca'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='mpx'/>
    <feature policy='disable' name='xsavec'/>
    <feature policy='disable' name='xgetbv1'/>
    <feature policy='require' name='hypervisor'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/40g.img'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <alias name='usb'/>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <alias name='usb'/>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <alias name='usb'/>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x2'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='1' port='0x10'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='dmi-to-pci-bridge'>
      <model name='i82801b11-bridge'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='3'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='4' port='0x11'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='5' port='0x12'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='6' port='0x13'/>
      <alias name='pci.6'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='7' port='0x14'/>
      <alias name='pci.7'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:df:59:fe'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <boot order='1'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/3'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-20-test/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0' state='disconnected'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'>
      <alias name='input1'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input2'/>
    </input>
    <graphics type='spice' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
      <image compression='off'/>
    </graphics>
    <sound model='ich6'>
      <alias name='sound0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x01' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'>
      <alias name='redir0'/>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <alias name='redir1'/>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <memballoon model='virtio'>
      <stats period='5'/>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c68,c194</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c68,c194</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>

Versions on the host:
[root@sealusa5 ~]# rpm -q OVMF
OVMF-20170228-5.gitc325e41585e3.el7.noarch
[root@sealusa5 ~]# rpm -q libvirt
libvirt-3.2.0-10.el7.x86_64
[root@sealusa5 ~]# rpm -qa |grep qemu
qemu-kvm-common-rhev-2.6.0-28.el7_3.9.x86_64
libvirt-daemon-driver-qemu-3.2.0-10.el7.x86_64
qemu-img-rhev-2.6.0-28.el7_3.9.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64

Capabilities:
<capabilities>

  <host>
    <uuid>00000000-0000-0000-0000-0cc47ad2918e</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Broadwell</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='10' threads='2'/>
      <feature name='vme'/>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
      <feature name='pbe'/>
      <feature name='dtes64'/>
      <feature name='monitor'/>
      <feature name='ds_cpl'/>
      <feature name='vmx'/>
      <feature name='smx'/>
      <feature name='est'/>
      <feature name='tm2'/>
      <feature name='xtpr'/>
      <feature name='pdcm'/>
      <feature name='dca'/>
      <feature name='osxsave'/>
      <feature name='f16c'/>
      <feature name='rdrand'/>
      <feature name='arat'/>
      <feature name='tsc_adjust'/>
      <feature name='cmt'/>
      <feature name='xsaveopt'/>
      <feature name='mbm_total'/>
      <feature name='mbm_local'/>
      <feature name='pdpe1gb'/>
      <feature name='abm'/>
      <feature name='invtsc'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='1048576'/>
    </cpu>
    <power_management>
      <suspend_mem/>
      <suspend_disk/>
      <suspend_hybrid/>
    </power_management>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
        <uri_transport>rdma</uri_transport>
      </uri_transports>
    </migration_features>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <memory unit='KiB'>134104160</memory>
          <pages unit='KiB' size='4'>33526040</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='21'/>
          </distances>
          <cpus num='20'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0,20'/>
            <cpu id='1' socket_id='0' core_id='1' siblings='1,21'/>
            <cpu id='2' socket_id='0' core_id='2' siblings='2,22'/>
            <cpu id='3' socket_id='0' core_id='3' siblings='3,23'/>
            <cpu id='4' socket_id='0' core_id='4' siblings='4,24'/>
            <cpu id='5' socket_id='0' core_id='8' siblings='5,25'/>
            <cpu id='6' socket_id='0' core_id='9' siblings='6,26'/>
            <cpu id='7' socket_id='0' core_id='10' siblings='7,27'/>
            <cpu id='8' socket_id='0' core_id='11' siblings='8,28'/>
            <cpu id='9' socket_id='0' core_id='12' siblings='9,29'/>
            <cpu id='20' socket_id='0' core_id='0' siblings='0,20'/>
            <cpu id='21' socket_id='0' core_id='1' siblings='1,21'/>
            <cpu id='22' socket_id='0' core_id='2' siblings='2,22'/>
            <cpu id='23' socket_id='0' core_id='3' siblings='3,23'/>
            <cpu id='24' socket_id='0' core_id='4' siblings='4,24'/>
            <cpu id='25' socket_id='0' core_id='8' siblings='5,25'/>
            <cpu id='26' socket_id='0' core_id='9' siblings='6,26'/>
            <cpu id='27' socket_id='0' core_id='10' siblings='7,27'/>
            <cpu id='28' socket_id='0' core_id='11' siblings='8,28'/>
            <cpu id='29' socket_id='0' core_id='12' siblings='9,29'/>
          </cpus>
        </cell>
        <cell id='1'>
          <memory unit='KiB'>134217728</memory>
          <pages unit='KiB' size='4'>33554432</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <pages unit='KiB' size='1048576'>0</pages>
          <distances>
            <sibling id='0' value='21'/>
            <sibling id='1' value='10'/>
          </distances>
          <cpus num='20'>
            <cpu id='10' socket_id='1' core_id='0' siblings='10,30'/>
            <cpu id='11' socket_id='1' core_id='1' siblings='11,31'/>
            <cpu id='12' socket_id='1' core_id='2' siblings='12,32'/>
            <cpu id='13' socket_id='1' core_id='3' siblings='13,33'/>
            <cpu id='14' socket_id='1' core_id='4' siblings='14,34'/>
            <cpu id='15' socket_id='1' core_id='8' siblings='15,35'/>
            <cpu id='16' socket_id='1' core_id='9' siblings='16,36'/>
            <cpu id='17' socket_id='1' core_id='10' siblings='17,37'/>
            <cpu id='18' socket_id='1' core_id='11' siblings='18,38'/>
            <cpu id='19' socket_id='1' core_id='12' siblings='19,39'/>
            <cpu id='30' socket_id='1' core_id='0' siblings='10,30'/>
            <cpu id='31' socket_id='1' core_id='1' siblings='11,31'/>
            <cpu id='32' socket_id='1' core_id='2' siblings='12,32'/>
            <cpu id='33' socket_id='1' core_id='3' siblings='13,33'/>
            <cpu id='34' socket_id='1' core_id='4' siblings='14,34'/>
            <cpu id='35' socket_id='1' core_id='8' siblings='15,35'/>
            <cpu id='36' socket_id='1' core_id='9' siblings='16,36'/>
            <cpu id='37' socket_id='1' core_id='10' siblings='17,37'/>
            <cpu id='38' socket_id='1' core_id='11' siblings='18,38'/>
            <cpu id='39' socket_id='1' core_id='12' siblings='19,39'/>
          </cpus>
        </cell>
      </cells>
    </topology>
    <secmodel>
      <model>selinux</model>
      <doi>0</doi>
      <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel>
      <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel>
    </secmodel>
    <secmodel>
      <model>dac</model>
      <doi>0</doi>
      <baselabel type='kvm'>+107:+107</baselabel>
      <baselabel type='qemu'>+107:+107</baselabel>
    </secmodel>
  </host>

  <guest>
    <os_type>hvm</os_type>
    <arch name='i686'>
      <wordsize>32</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine>
      <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine>
      <machine maxCpus='240'>rhel6.3.0</machine>
      <machine maxCpus='240'>rhel6.4.0</machine>
      <machine maxCpus='240'>rhel6.0.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine>
      <machine maxCpus='240'>pc-q35-rhel7.3.0</machine>
      <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine>
      <machine maxCpus='240'>rhel6.5.0</machine>
      <machine maxCpus='240'>rhel6.6.0</machine>
      <machine maxCpus='240'>rhel6.1.0</machine>
      <machine maxCpus='240'>rhel6.2.0</machine>
      <domain type='qemu'/>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <disksnapshot default='on' toggle='no'/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
      <pae/>
      <nonpae/>
    </features>
  </guest>

  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine>
      <machine canonical='pc-i440fx-rhel7.3.0' maxCpus='240'>pc</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine>
      <machine maxCpus='240'>rhel6.3.0</machine>
      <machine maxCpus='240'>rhel6.4.0</machine>
      <machine maxCpus='240'>rhel6.0.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine>
      <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine>
      <machine maxCpus='240'>pc-q35-rhel7.3.0</machine>
      <machine canonical='pc-q35-rhel7.3.0' maxCpus='240'>q35</machine>
      <machine maxCpus='240'>rhel6.5.0</machine>
      <machine maxCpus='240'>rhel6.6.0</machine>
      <machine maxCpus='240'>rhel6.1.0</machine>
      <machine maxCpus='240'>rhel6.2.0</machine>
      <domain type='qemu'/>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <disksnapshot default='on' toggle='no'/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>

</capabilities>


qemu.conf snip:
nvram = [
   "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd",
   "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.fd",
   "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd"
]


DHCP server config:
find /var/lib/tftpboot/
/var/lib/tftpboot/
/var/lib/tftpboot/pxelinux
/var/lib/tftpboot/pxelinux/pxelinux.cfg
/var/lib/tftpboot/pxelinux/pxelinux.cfg/default
/var/lib/tftpboot/pxelinux/pxelinux.cfg/efidefault
/var/lib/tftpboot/pxelinux/centos7
/var/lib/tftpboot/pxelinux/centos7/vmlinuz
/var/lib/tftpboot/pxelinux/centos7/initrd.img
/var/lib/tftpboot/pxelinux/BOOTX64.EFI
/var/lib/tftpboot/pxelinux/grub.cfg
/var/lib/tftpboot/pxelinux/msgs
/var/lib/tftpboot/pxelinux/msgs/boot.msg
/var/lib/tftpboot/pxelinux/pxelinux.0
/var/lib/tftpboot/uefi
/var/lib/tftpboot/uefi/vmlinuz
/var/lib/tftpboot/uefi/initrd.img
/var/lib/tftpboot/centos7
/var/lib/tftpboot/centos7/vmlinuz
/var/lib/tftpboot/centos7/initrd.img

dhcpd.conf:
option arch code 93 = unsigned integer 16; # RFC4578
allow booting;
allow bootp;
subnet 192.168.122.0 netmask 255.255.255.0 {
    option routers 192.168.122.1;
    range 192.168.122.100 192.168.122.200;
    class "pxeclients" {
        match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
        next-server 192.168.122.2;

        if option arch = 00:07 {
            filename "pxelinux/BOOTX64.EFI";
            }
        else {
            filename "pxelinux/pxelinux.0";
        }

    }
}



-----------------------


Could you please provide the same information for me to compare? Either there is a bug, or I'm missing some small detail.

Thanks

Comment 5 Laszlo Ersek 2017-06-20 18:06:55 UTC
This looks like a good config on the surface. (Personally I don't use dhcpd but dnsmasq, as configured by libvirt, for netbooting; but it's also possible to make it work with dhcpd.)

What is your "pxelinux/BOOTX64.EFI" exactly? Is it a copy of "shim.efi"?

What symptoms do you see? The same as before? Generally I use tcpdump and the dhcp/tftp server logs at this point to see what happens.

For reference, my config is:

$ virsh net-dumpxml --inactive default
<network>
  <name>default</name>
  <uuid>4ae8659d-38be-4f65-93b2-fd2758bfdc61</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:c3:97:eb'/>
  <dns>
    <host ip='192.168.122.1'>
      <hostname>nfsserver</hostname>
    </host>
  </dns>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <tftp root='/var/lib/dnsmasq'/>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
      <bootp file='shim.efi'/>
    </dhcp>
  </ip>
</network>

$ ls -l /var/lib/dnsmasq/

-rw-r--r--. 1 root root      209 2015-04-10 11:29:25 +0200 grub.cfg
-rw-r--r--. 1 root root  1069520 2014-12-01 11:47:40 +0100 grubx64.efi
-rw-r--r--. 1 root root 37500320 2014-12-01 11:50:19 +0100 initrd.img
-rw-r--r--. 1 root root  1285808 2014-12-01 11:47:58 +0100 shim.efi
-rw-r--r--. 1 root root  5024976 2014-12-01 11:48:16 +0100 vmlinuz

$ cat /var/lib/dnsmasq/grub.cfg

set timeout=600
menuentry 'install RHEL-7.1-20141127.0-Server over NFS' {
  linuxefi  vmlinuz ip=dhcp inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-dvd1.iso
  initrdefi initrd.img
}

IOW I have both the dhcp and tftp servers on the same machine. Perhaps using a separate "next-server" in your dhcpd.conf exposes a real issue; I'm not so sure about that. Just to narrow it down, can you put both daemons on the same server?

Thanks.

Comment 6 Laszlo Ersek 2017-06-20 18:19:37 UTC
Also, since you are using a libvirt guest with virtual networking:

  <interface type='network'>

  http://libvirt.org/formatdomain.html#elementsNICSVirtual
  http://libvirt.org/formatnetwork.html#examplesNAT

am I correct to think that you have a DHCP server conflict on the 192.168.122.x subnet?

Namely,
- one DHCP server is provided by libvirt itself: it generates a config file for
  "dnsmasq" from the network definition in
  "/etc/libvirt/qemu/networks/default.xml", and launches dnsmasq,

- and the second DHCP server on the same subnet is your own dhcpd.

I think this should not be done. One subnet should be served by a single dhcp server + tftp server pair. A manually configured dhcp server is appropriate if you use bridged networking for the guest:

  <interface type='bridge'>

  http://libvirt.org/formatdomain.html#elementsNICSBridge
  http://libvirt.org/formatnetwork.html#examplesBridge
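
One way to check for the overlap described above is to look for a <dhcp> element in the libvirt network definition. The following is a minimal sketch, not a definitive test: the helper names "net_has_dhcp" and "report_dhcp" are made up for illustration, and the check only pattern-matches the XML text rather than parsing it.

```shell
# net_has_dhcp: hypothetical helper. Reads a libvirt network XML on stdin and
# succeeds (exit 0) if it contains a <dhcp> element, that is, if libvirt's own
# dnsmasq would hand out leases on that network.
net_has_dhcp() {
    grep -q '<dhcp[[:space:]/>]'
}

# report_dhcp NAME: hypothetical helper. Reads the XML on stdin and prints who
# is expected to serve DHCP on the network called NAME.
report_dhcp() {
    if net_has_dhcp; then
        echo "$1: libvirt (dnsmasq) serves DHCP; do not add a second server"
    else
        echo "$1: no libvirt DHCP; a guest-side dhcpd can own this subnet"
    fi
}
```

Usage would be "virsh net-dumpxml --inactive default | report_dhcp default".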

Comment 7 Dan Yasny 2017-06-20 18:20:47 UTC
(In reply to Laszlo Ersek from comment #5)
> This looks like a good config on the surface. (Personally I don't use dhcpd
> but dnsmasq, as configured by libvirt, for netbooting; but it's also
> possible to make it work with dhcpd.)
> 
> What is your "pxelinux/BOOTX64.EFI" exactly? Is it a copy of "shim.efi"?
> 

a copy of EFI/BOOT/grubx64.efi 

> What symptoms do you see? The same as before? Generally I use tcpdump and
> the dhcp/tftp server logs at this point to see what happens.

There must be another bug, because once I switch to EFI I lose the VM console. So far I have been trying to monitor things from the PXE machine side.

> 
> For reference, my config is:
> 
> $ virsh net-dumpxml --inactive default
> <network>
>   <name>default</name>
>   <uuid>4ae8659d-38be-4f65-93b2-fd2758bfdc61</uuid>
>   <forward mode='nat'/>
>   <bridge name='virbr0' stp='on' delay='0'/>
>   <mac address='52:54:00:c3:97:eb'/>
>   <dns>
>     <host ip='192.168.122.1'>
>       <hostname>nfsserver</hostname>
>     </host>
>   </dns>
>   <ip address='192.168.122.1' netmask='255.255.255.0'>
>     <tftp root='/var/lib/dnsmasq'/>
>     <dhcp>
>       <range start='192.168.122.2' end='192.168.122.254'/>
>       <bootp file='shim.efi'/>
>     </dhcp>
>   </ip>
> </network>
> 
> $ ls -l /var/lib/dnsmasq/
> 
> -rw-r--r--. 1 root root      209 2015-04-10 11:29:25 +0200 grub.cfg
> -rw-r--r--. 1 root root  1069520 2014-12-01 11:47:40 +0100 grubx64.efi
> -rw-r--r--. 1 root root 37500320 2014-12-01 11:50:19 +0100 initrd.img
> -rw-r--r--. 1 root root  1285808 2014-12-01 11:47:58 +0100 shim.efi
> -rw-r--r--. 1 root root  5024976 2014-12-01 11:48:16 +0100 vmlinuz
> 
> $ cat /var/lib/dnsmasq/grub.cfg
> 
> set timeout=600
> menuentry 'install RHEL-7.1-20141127.0-Server over NFS' {
>   linuxefi  vmlinuz ip=dhcp
> inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-
> dvd1.iso
>   initrdefi initrd.img
> }
> 
> IOW I have both the dhcp and tftp servers on the same machine. Perhaps using
> a separate "next-server" in your dhcpd.conf exposes a real issue; I'm not so
> sure about that. Just to narrow it down, can you put both daemons on the
> same server?

192.168.122.2 is in fact the same machine. I added the next-server option later on because I saw it used in a manual, but with no changes in the result. 


The main difference is that you seem to be using the hypervisor as the PXE host, while in my case I create a separate VM to perform that role, and cancel DHCP on the libvirt network. 

> 
> Thanks.

Comment 8 Laszlo Ersek 2017-06-20 18:36:48 UTC
(In reply to Dan Yasny from comment #7)
> (In reply to Laszlo Ersek from comment #5)

> There must be another bug, because once I switch to EFI I lose the VM
> console.

Ugh, that's very strange, I've never encountered such a problem.

Anyway... can you capture the OVMF debug log please? Please see point (4) in comment 3.

> The main difference is that you seem to be using the hypervisor as the PXE
> host, while in my case I create a separate VM to perform that role, and
> cancel DHCP on the libvirt network. 

Give me a few minutes to try to replicate this on my laptop. Thanks.

Comment 9 Laszlo Ersek 2017-06-20 21:55:36 UTC
OK, so I managed to set up the cross-VM DHCP and PXE boot. Netboot works
fine. I'll capture the relevant bits here.

(01) My laptop runs RHEL-7.4 Beta, and the VM providing the DHCP and PXE
     server also runs RHEL-7.4 Beta.

(02) On my laptop, I created a new virtual network, with DHCP disabled (so
     that DHCP would not be served from the host side to guests on this
     subnet):

> $ virsh net-dumpxml --inactive cross-vm-dhcp
> <network>
>   <name>cross-vm-dhcp</name>
>   <uuid>3104697e-70f1-4a3f-b9ea-c53bdeb2beb1</uuid>
>   <forward mode='nat'/>
>   <bridge name='virbr2' stp='on' delay='0'/>
>   <mac address='52:54:00:18:64:e7'/>
>   <ip address='192.168.124.1' netmask='255.255.255.0'>
>   </ip>
> </network>

(03) I modified one of my preexistent RHEL-7.4 Beta guests
     ("ovmf.rhel7.q35") so that it would become the DHCP & TFTP server on
     this new network. Namely, I added the following XML element to its
     domain XML:

> <interface type='network'>
>   <mac address='52:54:00:0a:04:6c'/>
>   <source network='cross-vm-dhcp'/>
>   <model type='virtio'/>
>   <address type='pci' domain='0x0000' bus='0x02' slot='0x03'
>    function='0x0'/>
> </interface>

     Here bus='0x02' identifies the legacy PCI bridge that hangs off of the
     DMI-to-PCI bridge [*], since this is a Q35 board. And slot='0x03' is
     simply a free slot on bus='0x02'.

     [*] For reference (note: this is preexistent config):

> <controller type='pci' index='0' model='pcie-root'/>
> <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>   <model name='i82801b11-bridge'/>
>   <address type='pci' domain='0x0000' bus='0x00' slot='0x1e'
>    function='0x0'/>
> </controller>
> <controller type='pci' index='2' model='pci-bridge'>
>   <model name='pci-bridge'/>
>   <target chassisNr='2'/>
>   <address type='pci' domain='0x0000' bus='0x01' slot='0x01'
>    function='0x0'/>
> </controller>

(04) I booted "ovmf.rhel7.q35", logged in, and with the usual
     "nm-connection-editor" GUI, I configured the new interface:

     - disable (ignore) IPv6 on the new NIC
     - give a static IPv4 address (192.168.124.2) to the new NIC
     - as gateway, use 192.168.124.1 (refer to point (02))

(05) Still in the DHCP / PXE server guest, I assigned metric 110 to the new
     interface, just to be sure:

> nmcli connection
> [prints list of connections, with names and UUIDs]
>
> nmcli connection modify uuid 'ba149df2-20f8-4883-89c6-e1ab08369d0c' \
>   ipv4.route-metric 110

(06) At this point, I performed the steps under <https://da.gd/Rh7EfiPxe>.
     Some notes:

     (a) The "firewall-cmd --add-service=tftp" command didn't stick. I had
         to invoke "firewall-config" as root from the GUI, and enable "tftp"
         for both the Permanent and Runtime configs.

     (b) The contents of my "/etc/dhcp/dhcpd.conf" (using backslashes for
         readability):

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) = \
>       "PXEClient";
>     next-server 192.168.124.2;
>     if option architecture-type = 00:07 {
>       filename "shim.efi";
>     } else {
>       filename "pxelinux/pxelinux.0";
>     }
>   }
> }

     (c) The "shim.efi" and "grubx64.efi" binaries were simply copied from
         under "/boot/efi/EFI/redhat", to "/var/lib/tftpboot".

     (d) I populated the "/var/lib/tftpboot/images/RHEL-7.4-20170616.3/"
         directory from

     http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/images/pxeboot/

         downloading the files "initrd.img" and "vmlinuz".

     (e) Contents of "/var/lib/tftpboot/grub.cfg" (using backslashes here
         for readability):

> set timeout=60
> menuentry 'install RHEL-7.4-20170616.3 from HTTP' {
>   linuxefi images/RHEL-7.4-20170616.3/vmlinuz ip=dhcp \
>     inst.repo=http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/
>   initrdefi images/RHEL-7.4-20170616.3/initrd.img
> }

     (f) ultimately this is the directory structure (only regular files
         listed):

> /var/lib/tftpboot/grub.cfg
> /var/lib/tftpboot/grubx64.efi
> /var/lib/tftpboot/images/RHEL-7.4-20170616.3/initrd.img
> /var/lib/tftpboot/images/RHEL-7.4-20170616.3/vmlinuz
> /var/lib/tftpboot/shim.efi

     (g) Importantly, once all files were in place, I ran:

         chmod -cR u=rwX,g=rX,o=rX /var/lib/tftpboot/
         restorecon -FvvR /var/lib/tftpboot/

     (h) Regarding the "tftp.socket" and "tftp.service" systemd units, DO
         NOT enable or start those (contrary to the Installation Guide
         instructions). Dan told me on IRC that his setup included xinetd,
         and systemd's said units conflict with that. So, it's best to
         explicitly disable and stop these systemd units.

(07) This is my "/etc/xinetd.d/tftp" file:

> service tftp
> {
>         disable                 = no
>         socket_type             = dgram
>         protocol                = udp
>         wait                    = yes
>         user                    = root
>         server                  = /usr/sbin/in.tftpd
>         server_args             = -s /var/lib/tftpboot
>         per_source              = 11
>         cps                     = 100 2
>         flags                   = IPv4
> }

(08) After this, make sure both dhcpd and xinetd are enabled and running,
     with "systemctl is-enabled", "systemctl status", and also "journalctl".
     (In particular, xinetd's log messages should confirm that "tftp" is
     enabled.) Again, systemd's own "tftp.socket" and "tftp.service" units
     should be disabled & stopped with systemctl.

(09) I created a new guest ("ovmf.rhel7.dhclient.q35") as DHCP / PXE client.
     These are the XML elements worth mentioning:

> <domain type='kvm'>
>   <os>
>     <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
>     <loader readonly='yes' secure='yes'
>      type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
>     <nvram
>      >/var/lib/libvirt/qemu/nvram/ovmf.rhel7.dhclient.q35_VARS.fd</nvram>
>     <bootmenu enable='yes' timeout='3000'/>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>     <pae/>
>     <smm state='on'/>
>   </features>
>   <devices>
>     <interface type='network'>
>       <mac address='52:54:00:1b:10:3e'/>
>       <source network='cross-vm-dhcp'/>
>       <model type='virtio'/>
>       <boot order='1'/>
>       <address type='pci' domain='0x0000' bus='0x02' slot='0x03'
>        function='0x0'/>
>     </interface>
>   </devices>
> </domain>

(10) After launching "ovmf.rhel7.dhclient.q35", it net-booted (ultimately)
     to the Anaconda welcome GUI.

So, the use case works fine for me, it's just that setting up the DHCP /
TFTP server is quite tedious.
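
Since most of the tedium is in getting the small details right, a quick sanity check of the server side can help. This is a sketch under assumptions, not part of the documented procedure: the helper names "missing_tftp_files" and "xinetd_disable_values" are made up for illustration, and the file list is simply the one from step (06)(f).

```shell
# missing_tftp_files ROOT: hypothetical helper. Prints each file from the
# listing in (06)(f) that is not present as a regular file under ROOT.
missing_tftp_files() (
    root=$1
    for f in grub.cfg grubx64.efi shim.efi \
             images/RHEL-7.4-20170616.3/initrd.img \
             images/RHEL-7.4-20170616.3/vmlinuz; do
        [ -f "$root/$f" ] || echo "$f"
    done
)

# xinetd_disable_values FILE: hypothetical helper. Prints every value assigned
# to "disable" in an xinetd service file, one per line, in file order.
xinetd_disable_values() {
    awk '$1 == "disable" && $2 == "=" { print $3 }' "$1"
}
```

With the setup from this comment, "missing_tftp_files /var/lib/tftpboot" should print nothing, and "xinetd_disable_values /etc/xinetd.d/tftp" should print a single "no".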

Comment 10 Laszlo Ersek 2017-06-20 22:13:31 UTC
... The packages that had to be installed were: "tftp-server", "xinetd", and "dhcp".

Comment 11 Dan Yasny 2017-06-22 13:35:02 UTC
It looks like you've been using straight PXE boot, not iPXE, any chance you still have the configuration working and you could confirm iPXE also works?

Comment 12 Laszlo Ersek 2017-06-22 14:19:09 UTC
Hi Dan,

In comment 3 I wrote,

> (5) Some more background on iPXE as it relates to OVMF. (I see you
> mentioned iPXE above.) In the "ipxe-roms-qemu" package, we provide such
> virtual NIC option ROMs that are built from the iPXE project in a
> *stripped down* manner. These ROMs are "combined" (i.e., legacy BIOS +
> UEFI) PCI Expansion ROMs, and the UEFI half of each oprom contains only a
> minimal Simple Network Protocol driver from iPXE. In other words, iPXE
> only provides the lowest level NIC driver to the EFI environment, and all
> the DHCP and PXE booting logic comes from the edk2 project modules that
> are built into OVMF.

This means that, when using OVMF with ipxe-roms-qemu, the UEFI iPXE option
ROMs will *not* provide the full-blown iPXE capabilities that you may be
used to. This is done *deliberately*.

The history behind the decision is somewhat sordid. I will provide you with
links below, for background. The end result remains, if you want the full
iPXE capabilities chained from OVMF PXE boot, then you will have to PXE-boot
a standalone iPXE UEFI executable from OVMF (specifying "ipxe.efi" in place
of "shim.efi", as "filename" in "dhcpd.conf"). The "ipxe.efi" binary is
available from the "ipxe-bootimgs" package; under pathname
"/usr/share/ipxe/ipxe.efi".

Any bugs that occur in such a setup, after OVMF netboots "ipxe.efi", should
be reported against the "ipxe" Bugzilla component.

References:
- http://lists.ipxe.org/pipermail/ipxe-devel/2015-February/003979.html
- http://lists.ipxe.org/pipermail/ipxe-devel/2015-April/004085.html
- http://lists.ipxe.org/pipermail/ipxe-devel/2015-July/004295.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04440.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04442.html
- http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04626.html
- http://git.qemu.org/?p=qemu.git;a=commit;h=cf2b4b5b77a7
- bug 1084561
- bug 1181980
- bug 1295673
- http://wiki.qemu-project.org/IpxeDownstreamForQemu

Comment 14 Laszlo Ersek 2017-06-22 14:40:45 UTC
(In reply to Laszlo Ersek from comment #12)

> If you want the full iPXE capabilities chained from OVMF PXE boot, then
> you will have to PXE-boot a standalone iPXE UEFI executable from OVMF
> (specifying "ipxe.efi" in place of "shim.efi", as "filename" in
> "dhcpd.conf"). The "ipxe.efi" binary is available from the "ipxe-bootimgs"
> package; under pathname "/usr/share/ipxe/ipxe.efi".

This is called "chainloading" and the config bits are documented on the
ipxe.org website:
- http://ipxe.org/howto/chainloading
- http://ipxe.org/howto/dhcpd#pxe_chainloading

I still have my test env, and I can try testing this for you. (Such a setup
is a first for me though.)

Comment 15 Laszlo Ersek 2017-06-22 19:12:23 UTC
(In reply to Dan Yasny from comment #11)

> It looks like you've been using straight PXE boot, not iPXE, any chance
> you still have the configuration working and you could confirm iPXE also
> works?

OK, so this is what I managed to do, when including iPXE. I have two
scenarios:

(1) The first scenario is where iPXE is *added* to shim.efi and grubx64.efi.
    This scenario proceeds quite well with netbooting, but it fails
    ultimately, and I couldn't get it to work.

(2) The second scenario is where iPXE *replaces* shim.efi and grubx64.efi,
    loads its own script file, and boots the kernel + initrd directly.


(0) Common steps, in the VM that runs xinetd+tftp / dhcpd
    ("ovmf.rhel7.q35"), on top of the steps described in comment 9:

(0.1) Install the standalone iPXE UEFI binary, so that it can be served via
      TFTP:

> yum install ipxe-bootimgs
> cp -aiv /usr/share/ipxe/ipxe.efi /var/lib/tftpboot/
> restorecon -Fvv /var/lib/tftpboot/ipxe.efi

(0.2) For completeness of "dhcpd.conf" below, also install the standalone
      iPXE BIOS binary:

> cp -aiv /usr/share/ipxe/undionly.kpxe /var/lib/tftpboot/
> restorecon -Fvv /var/lib/tftpboot/undionly.kpxe


(1) For the first scenario:

(1.1) Edit "/etc/dhcp/dhcpd.conf" as documented under
      <http://ipxe.org/howto/chainloading> and
      <http://ipxe.org/howto/dhcpd#pxe_chainloading>:

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) =
>       "PXEClient";
>     next-server 192.168.124.2;
>     if exists user-class and option user-class = "iPXE" {
>       # second stage: iPXE is booting, serve shim or pxelinux
>       if option architecture-type = 00:07 {
>         filename "shim.efi";
>       } else {
>         filename "pxelinux/pxelinux.0";
>       }
>     } else {
>       # first stage, firmware is booting, serve iPXE
>       if option architecture-type = 00:07 {
>         filename "ipxe.efi";
>       } else {
>         filename "undionly.kpxe";
>       }
>     }
>   }
> }

(1.2) Restart dhcpd:

> systemctl restart dhcpd.service

(1.3) Launch the VM that PXE-boots.

(1.4) With this configuration, the following happens:

      - OVMF successfully downloads and runs "ipxe.efi",
      - "ipxe.efi" successfully downloads and runs "shim.efi",
      - "shim.efi" successfully downloads and runs "grubx64.efi",
      - unfortunately, "grubx64.efi" fails to download its config file,
        "grub.cfg", and bombs out to the grub shell.
      - This behavior is unchanged if "dhcpd.conf" specifies "grubx64.efi"
        rather than "shim.efi", for the second stage. In that case,
        "ipxe.efi" successfully downloads and runs "grubx64.efi", but the
        latter fails the exact same way.


(2) For the second scenario:

(2.1) Create the following ipxe command script, in
      "/var/lib/tftpboot/ipxe.cfg" (lines broken up with backslashes here
      for readability):

> #!ipxe
>
> kernel images/RHEL-7.4-20170616.3/vmlinuz initrd=initrd.img ip=dhcp \
>   inst.repo=http://.../RHEL-7.4-20170616.3/compose/Server/x86_64/os/
> initrd images/RHEL-7.4-20170616.3/initrd.img
> boot

      NOTE: the "initrd=initrd.img" kernel parameter is *required*. It must
      match the last pathname component ("initrd.img") from the "initrd"
      iPXE command. Otherwise the kernel will not find the initial ramdisk.

(2.2) Run

> restorecon -Fvv /var/lib/tftpboot/ipxe.cfg

(2.3) Edit "/etc/dhcp/dhcpd.conf" as follows:

> option architecture-type code 93 = unsigned integer 16;
>
> subnet 192.168.124.0 netmask 255.255.255.0 {
>   option routers 192.168.124.1;
>   option domain-name-servers 192.168.124.1;
>   range 192.168.124.100 192.168.124.200;
>   class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) =
>       "PXEClient";
>     next-server 192.168.124.2;
>     if exists user-class and option user-class = "iPXE" {
>       # second stage: iPXE is booting, serve command script
>       filename "ipxe.cfg";
>     } else {
>       # first stage, firmware is booting, serve iPXE
>       if option architecture-type = 00:07 {
>         filename "ipxe.efi";
>       } else {
>         filename "undionly.kpxe";
>       }
>     }
>   }
> }

(2.4) Restart dhcpd:

> systemctl restart dhcpd.service

(2.5) Launch the VM that PXE-boots.

(2.6) In this configuration,

      - OVMF successfully downloads and runs "ipxe.efi",
      - "ipxe.efi" downloads and interprets the command script "ipxe.cfg",
      - "ipxe.efi" downloads the kernel and the initrd over TFTP, and
        launches the kernel.
      - The kernel finds the initial ramdisk because we explicitly tell it
        under what name to look for the ramdisk, with the "initrd=..."
        option. Grub sets this cmdline option automatically, with the
        "linuxefi" and "initrdefi" commands, but when using iPXE on an EFI
        system, the option has to be passed manually. Hat tip to
        <https://doc.rogerwhittaker.org.uk/ipxe-installation-and-EFI/> for
        the reminder.
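
To make the two-stage logic of the "dhcpd.conf" in (2.3) easier to follow, it can be modeled as a small shell function. This is only an illustration of the decision the config expresses, not how dhcpd itself is implemented; the name "pick_filename" and its argument convention are made up.

```shell
# pick_filename ARCH USER_CLASS: hypothetical model of the "filename" choice
# made by the dhcpd.conf in (2.3).
#   ARCH        DHCP option 93 as a decimal number (7 is the 00:07 case)
#   USER_CLASS  DHCP user-class string, empty if the client sent none
pick_filename() {
    if [ "$2" = "iPXE" ]; then
        # second stage: iPXE is asking, hand it the command script
        echo "ipxe.cfg"
    elif [ "$1" -eq 7 ]; then
        # first stage, architecture 00:07: serve the iPXE UEFI binary
        echo "ipxe.efi"
    else
        # first stage, any other architecture: serve the iPXE BIOS binary
        echo "undionly.kpxe"
    fi
}
```

Tracing the OVMF guest through it: the firmware's request corresponds to "pick_filename 7 ''" and gets "ipxe.efi"; the follow-up request from the running iPXE corresponds to "pick_filename 7 iPXE" and gets "ipxe.cfg".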

Comment 16 Laszlo Ersek 2017-06-22 19:23:36 UTC
To summarize, OVMF + ipxe.efi work fine as well, but you have to exclude grub from that scenario -- instead, iPXE must download an ipxe command script, and boot the kernel and the initrd based on that command script. Also, don't forget the initrd=... kernel cmdline option, which must match the last pathname component of the "initrd" iPXE command.

(... Dependent on your use case, you might even consider the replacement of shim and grub with iPXE -- and a flexible, customized iPXE command script -- a bonus.)
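
Because the initrd=... mismatch is easy to miss, a small check of the command script may be worth having. The following is a sketch under assumptions: the helper name "check_initrd_param" is made up, and it expects the "kernel" command to sit on a single line in the real file (the backslashes in comment 15 were only for readability).

```shell
# check_initrd_param FILE: hypothetical helper. Succeeds (exit 0) if the
# "kernel" line of an iPXE script carries initrd=NAME, where NAME equals the
# last pathname component of the argument given to the "initrd" command.
check_initrd_param() {
    awk '
        $1 == "kernel" {
            # look for the initrd=... word among the kernel arguments
            for (i = 2; i <= NF; i++)
                if ($i ~ /^initrd=/) want = substr($i, 8)
        }
        $1 == "initrd" {
            # keep only the last pathname component of the initrd path
            n = split($2, parts, "/")
            have = parts[n]
        }
        END { exit !(want != "" && want == have) }
    ' "$1"
}
```

Usage would be "check_initrd_param /var/lib/tftpboot/ipxe.cfg && echo OK".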

