Description of problem:
A virtio-net OVMF guest took 6 minutes to PXE boot the RHEL 7 kernel.

Version-Release number of selected component (if applicable):
ipxe-roms-qemu-20130517-6.gitc4bce43.el7.noarch
ipxe.git-1.0.0-1553.b203.gabf875a.x86_64
qemu-kvm-rhev-2.1.2-18.el7.x86_64
ipxe-bootimgs-20130517-6.gitc4bce43.el7.noarch
ipxe-roms-20130517-6.gitc4bce43.el7.noarch
OVMF-20140822-4.git9ece15a.el7.x86_64
kernel-3.10.0-221.el7.x86_64
PXE kernel -> from the RHEL 7.0 GA ISO

How reproducible:
1/1

Steps to Reproduce:
1. Deploy a UEFI-capable PXE server. Detailed configuration: https://bugzilla.redhat.com/show_bug.cgi?id=1181934#c0
2. Boot an OVMF guest using the EFI virtio-net driver (that is, with the ROM file disabled):

2015-01-14 08:38:30.244+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name rhel7.0 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -cpu Opteron_G5 -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/rhel7.0_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid bc4e2b8d-d62a-40d5-bcc4-8fecbfd56ebe -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.0.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot menu=on,strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/home/uefi-pxe-grub2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:18:aa:b2,bus=pci.0,addr=0x3,romfile=,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5902,addr=0.0.0.0,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=8,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
char device redirected to /dev/pts/4 (label charserial0)
main_channel_link: add main channel client
main_channel_handle_parsed: net test: invalid values, latency 0 roundtrip 1296. assuming highbandwidth
red_dispatcher_set_cursor_peer:
inputs_connect: inputs channel client create
Domain id=11 is tainted: custom-monitor

3. Boot into the EFI environment and select boot from the NIC.

Actual results:
The VM took 6 minutes to finish transferring vmlinuz and initrd.img from the TFTP server.

Expected results:
A faster transfer; a SeaBIOS-based VM can finish within 1 minute.

-r--r--r--. 2 root root 34M Jan 13 18:04 initrd.img
-r-xr-xr-x.
2 root root 4.7M Jan 13 18:04 vmlinuz Additional info: virsh dumpxml rhel7.0 <domain type='kvm' id='11'> <name>rhel7.0</name> <uuid>bc4e2b8d-d62a-40d5-bcc4-8fecbfd56ebe</uuid> <memory unit='KiB'>2097152</memory> <currentMemory unit='KiB'>2097152</currentMemory> <vcpu placement='static'>2</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type> <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader> <nvram>/var/lib/libvirt/qemu/nvram/rhel7.0_VARS.fd</nvram> <bootmenu enable='yes'/> </os> <features> <acpi/> <apic/> <pae/> </features> <cpu mode='custom' match='exact'> <model fallback='allow'>Opteron_G5</model> </cpu> <clock offset='utc'> <timer name='rtc' tickpolicy='catchup'/> <timer name='pit' tickpolicy='delay'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <pm> <suspend-to-mem enabled='no'/> <suspend-to-disk enabled='no'/> </pm> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/home/uefi-pxe-grub2.qcow2'/> <backingStore/> <target dev='vda' bus='virtio'/> <boot order='2'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </disk> <controller type='usb' index='0' model='ich9-ehci1'> <alias name='usb0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/> </controller> <controller type='usb' index='0' model='ich9-uhci1'> <alias name='usb0'/> <master startport='0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/> </controller> <controller type='usb' index='0' model='ich9-uhci2'> <alias name='usb0'/> <master startport='2'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/> </controller> <controller type='usb' index='0' model='ich9-uhci3'> <alias 
name='usb0'/> <master startport='4'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </controller> <interface type='bridge'> <mac address='52:54:00:18:aa:b2'/> <source bridge='uefi-pxe'/> <target dev='vnet2'/> <model type='virtio'/> <boot order='1'/> <alias name='net0'/> <rom file=''/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/4'/> <target port='0'/> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <input type='tablet' bus='usb'> <alias name='input0'/> </input> <input type='mouse' bus='ps2'/> <input type='keyboard' bus='ps2'/> <graphics type='spice' port='5902' autoport='yes' listen='0.0.0.0'> <listen type='address' address='0.0.0.0'/> </graphics> <sound model='ich6'> <alias name='sound0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </sound> <video> <model type='qxl' ram='65536' vram='65536' vgamem='8192' heads='1'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <redirdev bus='usb' type='spicevmc'> <alias name='redir0'/> </redirdev> <redirdev bus='usb' type='spicevmc'> <alias name='redir1'/> </redirdev> <redirdev bus='usb' type='spicevmc'> <alias name='redir2'/> </redirdev> <redirdev bus='usb' type='spicevmc'> <alias name='redir3'/> </redirdev> <memballoon model='virtio'> <alias name='balloon0'/> 
<address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </memballoon> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c213,c426</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c213,c426</imagelabel> </seclabel> </domain>
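
For scale, the file sizes in the report (34M for initrd.img plus 4.7M for vmlinuz) and the two timings (6 minutes with OVMF, within 1 minute with SeaBIOS) imply roughly the following average TFTP throughput. This is only a back-of-the-envelope sketch: it assumes the `ls` sizes are MiB and ignores protocol overhead, and the function name is made up for illustration.

```python
def avg_throughput_kib_s(size_mib: float, seconds: float) -> float:
    """Average throughput in KiB/s for size_mib MiB moved in the given time."""
    return size_mib * 1024 / seconds


# Sizes taken from the directory listing in the report, assumed to be MiB.
total_mib = 34 + 4.7

ovmf = avg_throughput_kib_s(total_mib, 6 * 60)  # observed: about 6 minutes
seabios = avg_throughput_kib_s(total_mib, 60)   # expected: at most 1 minute

print(f"OVMF:    about {ovmf:.0f} KiB/s")
print(f"SeaBIOS: at least {seabios:.0f} KiB/s")
```

Under those assumptions the OVMF transfer averages about 110 KiB/s, against at least 660 KiB/s for the SeaBIOS case.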
I hope firefox developers will burn in a very special corner of hell; I *AGAIN* lost a fucking bugzilla comment because I opened another tab that froze firefox completely.
So, here's that comment again. Much shorter version because I lost my patience.

This is not an OVMF bug. It's a grub2 bug that is known. OVMF only downloads the grub image that you prepare with grub-mkstandalone. The vmlinuz and initrd files are downloaded by grub2, using the virtio-net SimpleNetworkProtocol that OVMF provides.

The grub2 bug is that grub2 doesn't reopen the network with exclusive access, therefore the edk2 ARP service remains running, and steals packets from grub2, throwing off the TFTP transfer.

I thought however that this grub2 bug was fixed (in Fedora at least). What grub2 version are you exactly using? CC'ing pjones.
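
To make the packet-stealing mechanism concrete, here is a toy model. It is not grub2 or edk2 code: `simulate_transfer` and `steal_every` are hypothetical names, and the model assumes a lock-step TFTP transfer in which every DATA block drained by the other consumer costs the downloader one timeout before the server resends it.

```python
def simulate_transfer(blocks: int, steal_every: int) -> int:
    """Return how many timeouts a lock-step download of `blocks` blocks suffers.

    Toy assumption: on every `steal_every`-th tick a second consumer (standing
    in for the firmware's still-running network service) drains the receive
    queue before the downloader polls it. steal_every=0 models exclusive access.
    """
    timeouts = 0
    tick = 0
    next_block = 1
    while next_block <= blocks:
        tick += 1
        stolen = steal_every > 0 and tick % steal_every == 0
        if stolen:
            timeouts += 1      # block never reaches the downloader; wait, retry
        else:
            next_block += 1    # block received and acknowledged
    return timeouts


print(simulate_transfer(10, 0))  # exclusive access
print(simulate_transfer(10, 3))  # every third poll loses the race
```

Because real TFTP timeouts are long compared with the time a block needs on a local bridge, even a modest loss rate in a model like this would dominate the total transfer time.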
(In reply to Laszlo Ersek from comment #2)
> I hope firefox developers will burn in a very special corner of hell; I
> *AGAIN* lost a fucking bugzilla comment because I opened another tab that
> froze firefox completely.

Take it easy; Thunderbird is even more unstable, it crashes once per day (sometimes more) on my PC.

I think this could be a grub bug too, so here are the versions:

grub2-tools-2.02-0.16.el7.x86_64
grubby-8.28-11.el7.x86_64
grub2-efi-modules-2.02-0.16.el7.x86_64
grub2-efi-2.02-0.16.el7.x86_64

The file grub2x64-with-cfg.efi used for PXE booting was created with the packages above, using this command:

grub2-mkstandalone -d /usr/lib/grub/x86_64-efi -O x86_64-efi --fonts="unicode" -o grub2x64-with-cfg.efi boot/grub/grub.cfg
(In reply to Laszlo Ersek from comment #3)
> So, here's that comment again. Much shorter version because I lost my
> patience.
>
> This is not an OVMF bug. It's a grub2 bug that is known. OVMF only downloads
> the grub image that you prepare with grub-mkstandalone. The vmlinuz and
> initrd files are downloaded by grub2, using the virtio-net
> SimpleNetworkProtocol that OVMF provides.
>
> The grub2 bug is that grub2 doesn't reopen the network with exclusive
> access, therefore the edk2 ARP service remains running, and steals packets
> from grub2, throwing off the TFTP transfer.
>
> I thought however that this grub2 bug was fixed (in Fedora at least). What
> grub2 version are you exactly using? CC'ing pjones.

Aha, when I was submitting my comment as comment 3, I hit a mid-air conflict; I then resubmitted it after your comment 3, so mine became comment 4.

The versions are in comment 4; that is, grub2-efi-2.02-0.16.el7.x86_64.

Moving to grub2.
I have no clue if that version includes the fix. Looking at rhpkg, dist-git commit 1861df67 dumped about 150 patches into grub2, one of which contains the bugfix.

Can you please retry without "grub2-mkstandalone"? Just copy the grubx64.efi and shim.efi binaries from the normal packages to the TFTP directory, and use shim.efi as "bootfile" in the DHCP config. That's what our own RHEL-7 Installation Guide says as well.

A very simple grub.cfg file in the TFTP directory will suffice, like:

set timeout=60
menuentry 'install RHEL-7.1-20141127.0-Server over NFS' {
  linuxefi vmlinuz ip=dhcp inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-dvd1.iso
  initrdefi initrd.img
}

See also:
http://post-office.corp.redhat.com/archives/virt-arm/2014-December/msg00004.html
Changing the BZ title to reflect the component change.
(In reply to Laszlo Ersek from comment #6)
> I have no clue if that version includes the fix. Looking at rhpkg, dist-git
> commit 1861df67 dumped about 150 patches into grub2, one of which contains
> the bugfix.
>
> Can you please retry without "grub2-mkstandalone"? Just copy the grubx64.efi
> and shim.efi binaries from the normal packages to the TFTP directory, and
> use shim.efi as "bootfile" in the DHCP config. That's what our own RHEL-7
> Installation Guide says as well.

I tried the setup from the official doc first; unfortunately, it lies. I followed this doc to the letter and it just doesn't work, which is why I had to make a scratch build of the grub image.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/chap-installation-server-setup.html

> A very simple grub.cfg file in the TFTP directory will suffice, like:

I tried a similar cfg about 100 times yesterday; grub2 won't read the cfg over TFTP. Then I tried embedding it, and that works:
https://bugzilla.redhat.com/show_bug.cgi?id=1181980#c4

> set timeout=60
> menuentry 'install RHEL-7.1-20141127.0-Server over NFS' {
>   linuxefi vmlinuz ip=dhcp inst.repo=nfs:nfsserver:/mnt/data/isos/RHEL-7.1-20141127.0-Server-x86_64-dvd1.iso
>   initrdefi initrd.img
> }
>
> See also:
> http://post-office.corp.redhat.com/archives/virt-arm/2014-December/msg00004.html
(In reply to Laszlo Ersek from comment #6)
> See also:
> http://post-office.corp.redhat.com/archives/virt-arm/2014-December/msg00004.html

The ISO compose in that URL was built around 20141017, so it should contain the same version as mine:

rpm -q grub2-efi -i | grep -i build
Build Date : Fri 10 Oct 2014 04:53:02 AM CST

According to the brew system, the one I am using is the latest:
grub2-efi-2.02-0.16.el7.x86_64
https://bugzilla.redhat.com/show_bug.cgi?id=1181980#c4
I don't think the installation guide lies, you just need to load shim.efi first, not grubx64.efi directly. Grub works differently when it is loaded by shim. Peter explained this once on an internal list, I can't recall the details; but you *really* need to load shim.efi first (ie. name it in the "bootfile" setting).
(In reply to Laszlo Ersek from comment #10)
> I don't think the installation guide lies, you just need to load shim.efi
> first, not grubx64.efi directly. Grub works differently when it is loaded by
> shim. Peter explained this once on an internal list, I can't recall the
> details; but you *really* need to load shim.efi first (ie. name it in the
> "bootfile" setting).

I tried that yesterday and it failed; it just won't load grub2.
I will try again...

cd /tftproot
ls shim.efi grubx64.efi grub.cfg

Snippet of /etc/dhcp/dhcpd.conf:
==========
range 192.168.0.2 192.168.0.250;

class "pxeclients" {
    match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 192.168.0.1;
    if option arch = 00:07 {
        filename "grub2x64-with-cfg.efi"; // failed with shim.efi from ISO
    } else if option arch = 00:09 {
        filename "grub2x64-with-cfg.efi";
    }
}
}
===========
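
The selection logic in the dhcpd.conf snippet above can be sketched as follows. This is only an illustration, not how dhcpd evaluates its configuration: `pick_bootfile` is a hypothetical helper, `arch` is assumed to carry the DHCP client architecture option (option 93, where RFC 4578 lists 7 as EFI BC and 9 as EFI x86-64), and the vendor class strings in the example are sample values in the usual PXEClient format.

```python
from typing import Optional

# Client architecture codes as listed in RFC 4578 (DHCP option 93).
EFI_BC = 7
EFI_X86_64 = 9


def pick_bootfile(vendor_class: str, arch: int) -> Optional[str]:
    """Mirror the class "pxeclients" block: return the filename to offer, if any."""
    # match if substring (option vendor-class-identifier, 0, 9) = "PXEClient"
    if vendor_class[:9] != "PXEClient":
        return None
    # Both UEFI branches in the snippet name the same standalone grub image.
    if arch in (EFI_BC, EFI_X86_64):
        return "grub2x64-with-cfg.efi"
    # Other architectures (e.g. 0, legacy BIOS) get no filename from this block.
    return None


print(pick_bootfile("PXEClient:Arch:00009:UNDI:003010", 9))
print(pick_bootfile("PXEClient:Arch:00000:UNDI:002001", 0))
```

Switching to the documented setup would mean returning "shim.efi" from the UEFI branches instead, which is the change discussed in the comments that follow.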
(In reply to Xiaoqing Wei from comment #11)
> (In reply to Laszlo Ersek from comment #10)
> > I don't think the installation guide lies, you just need to load shim.efi
> > first, not grubx64.efi directly. Grub works differently when it is loaded by
> > shim. Peter explained this once on an internal list, I can't recall the
> > details; but you *really* need to load shim.efi first (ie. name it in the
> > "bootfile" setting).
>
> I tried with failure yesterday, it just wont load grub2.
> I will try that again...
>
> cd /tftproot
> ls shim.efi grubx64.efi grub.cfg
>
>
> snip of /etc/dhcp/dhcpd.conf
> ==========
> range 192.168.0.2 192.168.0.250;
>
> class "pxeclients" {
>     match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
>     next-server 192.168.0.1;
>     if option arch = 00:07 {
>         filename "grub2x64-with-cfg.efi"; // failed with shim.efi from ISO

What do you mean "failed with shim.efi from ISO"?

In my testing (according to the RHEL7 Installation Guide), I took the shim.efi binary from the shim RPM.
(In reply to Laszlo Ersek from comment #12)
> What do you mean "failed with shim.efi from ISO"?
>
> In my testing (according to the RHEL7 Installation Guide), I took the
> shim.efi binary from the shim RPM.

Sorry, my bad, I didn't state it clearly. I used rpm2cpio to unpack the shim (signed version) RPM and the grub2-efi RPM, and took their files.

My first attempt used a cfg similar to the one you posted in 1181980#c6, and it failed :-(
shim loaded grubx64.efi but stopped there, so I treated it as a failure to load a kernel.

After more attempts and some googling, it now works; it turned out that my grub.cfg needed an update :-)

========= working grub.cfg
set timeout=5
menuentry 'Red Hat Enterprise Linux Server release 7.0 GA' --class os {
    insmod net
    insmod efinet
    insmod tftp
    insmod gzio
    insmod part_gpt
    insmod efi_gop
    insmod efi_uga

    # dhcp, tftp server in my network
    set net_default_server=192.168.0.1

    echo 'Network status: '
    net_ls_cards
    net_ls_addr
    net_ls_routes

    echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA kernel ...'
    linuxefi (tftp)/rhel70/vmlinuz ip=dhcp \
        inst.repo=nfs:192.168.0.1:/home/installation_source/RHEL7.0GA/RHEL-7.0-20140507.0-Server-x86_64-dvd1.iso

    echo 'Loading Red Hat Enterprise Linux Server release 7.0 GA initial ramdisk ...'
    initrdefi (tftp)/rhel70/initrd.img
}
=========
(In reply to Xiaoqing Wei from comment #13)

This cfg still doesn't work with the iPXE ROM. It works with romfile="" (that is, using the EFI stack).

I will upload a screen dump of iPXE failing to load later.
Created attachment 980345
iPXE screenshot for https://bugzilla.redhat.com/show_bug.cgi?id=1181980#c14
Very interesting; I don't understand why you need such an elaborate grub.cfg while for me a very simple one works. Might be related to our different network setup, not sure.

Anyway, thank you for confirming that it works with OVMF's builtin virtio-net driver (comment 14); seems like grub2 is not at fault.

The screenshot references <http://ipxe.org/err/7f048002>. The location it gives is "image/efi_image.c (line 204)".

Since upstream git commit c4bce43 (which is our current fork-off point), iPXE has seen a huge number of changes. For the file in question:

$ git log --oneline --reverse c4bce43..master -- src/image/efi_image.c
f473b9c [efi] Disable SNP devices when running iPXE as the application
c3b6ccf [efi] Allow for interception of boot services calls by loaded image
b53d4ae [efi] Unload started images only on failure
79419a1 [efi] Fill in loaded image's DeviceHandle if firmware fails to do so
4a480f1 [efi] Avoid unnecessarily passing pointers to EFI_HANDLEs
3b42ed4 [efi] Provide centralised definitions of commonly-used GUIDs
7b3cc18 [efi] Open device path protocol only at point of use
2bf428c [efi] Move abstract device path and handle functions to efi_utils.c
0cc2f42 [efi] Wrap any images loaded by our wrapped image
5d9fbf3 [efi] Provide dummy device path in efi_image_probe()

From these, the first one (commit f473b9c) looks remotely relevant:

commit f473b9c3f66a2166129e1f60774f56e673423c5a
Author: Michael Brown <mcb30>
Date:   Fri Mar 14 14:16:05 2014 +0000

    [efi] Disable SNP devices when running iPXE as the application

    Some UEFI builds will set up a timer to continuously poll any SNP
    devices. This can drain packets from the network device's receive
    queue before iPXE gets a chance to process them.

    Use netdev_rx_[un]freeze() to explicitly indicate when we expect our
    network devices to be driven via the external SNP API (as we do with
    the UNDI API on the standard BIOS build), and disable the SNP API
    except when receive queue processing is frozen.

    Signed-off-by: Michael Brown <mcb30>

But in this case iPXE doesn't run as the application; grub2 is the application. iPXE only provides SNP (simple network protocol).

In any case I think this bug should be moved to the ipxe component, given comment 14.
(click "Unwrap comments" at the top)

Okay, I reproduced this issue with iPXE (same (ie. most recent) version as reported in comment 0). The symptoms are a bit different -- even shim's download fails at 15% -- but the end result is the same.

Importantly, the OVMF debug log (*) is littered with messages such as:

MnpReceivePacket: Size error, HL:TL = 1073233750:1536.

This is logged by OVMF's MNP (managed network protocol) driver, which sits on top of the SNP provided by iPXE. Function MnpReceivePacket(), file "MdeModulePkg/Universal/Network/MnpDxe/MnpIo.c":

>   //
>   // Receive packet through Snp.
>   //
>   Status = Snp->Receive (Snp, &HeaderSize, &BufLen, BufPtr, NULL, NULL, NULL);
>   if (EFI_ERROR (Status)) {
>     DEBUG_CODE (
>       if (Status != EFI_NOT_READY) {
>         DEBUG ((EFI_D_WARN, "MnpReceivePacket: Snp->Receive() = %r.\n", Status));
>       }
>     );
>
>     return Status;
>   }
>
>   //
>   // Sanity check.
>   //
>   if ((HeaderSize != Snp->Mode->MediaHeaderSize) || (BufLen < HeaderSize)) {
>     DEBUG (
>       (EFI_D_WARN,
>       "MnpReceivePacket: Size error, HL:TL = %d:%d.\n",
>       HeaderSize,
>       BufLen)
>       );
>     return EFI_DEVICE_ERROR;
>   }

The message reports 1073233750 (0x3FF83F56) for HeaderSize, which is obviously its original, uninitialized / indeterminate value from the stack. iPXE's SNP.Receive() implementation doesn't fill in HeaderSize on success, albeit a requirement in the UEFI spec. (Or maybe it does fill it in, with bogus data.)

    HeaderSize  The size, in bytes, of the media header received on the
                network interface. If this parameter is NULL, then the
                media header size will not be returned.

I recommend testing a fresh upstream iPXE build.

(*) configured with:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <qemu:commandline>
    <qemu:arg value='-global'/>
    <qemu:arg value='isa-debugcon.iobase=0x402'/>
    <qemu:arg value='-debugcon'/>
    <qemu:arg value='file:/tmp/ovmf.f20.log'/>
  </qemu:commandline>
</domain>

Note the namespace definition in the root element!
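
As a small cross-check of the analysis above, the sanity check quoted from MnpReceivePacket() can be replayed on the logged values. This is a sketch, not edk2 code: `mnp_size_check` is a hypothetical helper, and the value 14 for Snp->Mode->MediaHeaderSize is an assumption (the size of a plain Ethernet header).

```python
# Assumed value of Snp->Mode->MediaHeaderSize for plain Ethernet framing.
ETHERNET_HEADER_SIZE = 14


def mnp_size_check(header_size: int, buf_len: int,
                   media_header_size: int = ETHERNET_HEADER_SIZE) -> bool:
    """Return True if a received packet passes the quoted sanity check.

    The C code rejects the packet when
    (HeaderSize != MediaHeaderSize) || (BufLen < HeaderSize);
    this is the negation of that condition.
    """
    return header_size == media_header_size and buf_len >= header_size


# Values from the logged message "HL:TL = 1073233750:1536".
logged_hl, logged_tl = 1073233750, 1536

print(hex(logged_hl))                                   # matches 0x3FF83F56
print(mnp_size_check(logged_hl, logged_tl))             # rejected by MNP
print(mnp_size_check(ETHERNET_HEADER_SIZE, logged_tl))  # what a sane header size would give
```

With a header size like the logged one, every received packet is dropped with EFI_DEVICE_ERROR regardless of its contents, which fits the failing downloads.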
(In reply to Laszlo Ersek from comment #17)

Hi Laszlo,

I grabbed the latest iPXE from Fedora, which is ipxe-20140303-3.gitff1e7fc7.fc22.
Is that recent enough for this test?

<interface type='bridge'>
  <mac address='52:54:00:96:79:19'/>
  <source bridge='uefi-pxe'/>
  <target dev='vnet1'/>
  <model type='virtio'/>
  <boot order='1'/>
  <alias name='net0'/>
  <rom file='/usr/share/ipxe.efi/1af41000.rom'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>

--------- snip of 'info qtree'
dev: virtio-net-pci, id "net0"
  addr = 03.0
  romfile = "/usr/share/ipxe.efi/1af41000.rom"
  rombar = 1 (0x1)
  multifunction = false
  command_serr_enable = true
  class Ethernet controller, addr 00:03.0, pci id 1af4:1000 (sub 1af4:0001)
  bar 0: i/o at 0xc100 [0xc11f]
  bar 1: mem at 0x82004000 [0x82004fff]
  bar 6: mem at 0xffffffffffffffff [0x3fffe]
  bus: virtio-bus
    type virtio-pci-bus
    dev: virtio-net-device, id ""
      mac = "52:54:00:96:79:19"
----------

========== text from console
iPXE 1.0.0+ (ff1e7fc7) -- Open Source Network Boot Firmware -- http://ipxe.org
Features: HTTP DNS TFTP EFI Menu

net0: 52:54:00:96:79:19 using virtio-net on PCI00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 52:54:00:96:79:19).............
=========== and it looked identical to the screen output tcpdump shows the iPXE oprom doesn't work, it just keep sending dhcp discover and, ignoring dhcp offer from server 15:27:14.646695 IP (tos 0x0, ttl 64, id 256, offset 0, flags [none], proto UDP (17), length 406) 0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 52:54:00:96:79:19, length 378, xid 0xbdd3fc65, secs 4, Flags [none] (0x0000) Client-Ethernet-Address 52:54:00:96:79:19 Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover MSZ Option 57, length 2: 1472 ARCH Option 93, length 2: 9 NDI Option 94, length 3: 1.3.10 Vendor-Class Option 60, length 32: "PXEClient:Arch:00009:UNDI:003010" CLASS Option 77, length 4: "iPXE" Parameter-Request Option 55, length 22: Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG Hostname, Domain-Name, RP, Vendor-Option Vendor-Class, TFTP, BF, Option 119 Option 128, Option 129, Option 130, Option 131 Option 132, Option 133, Option 134, Option 135 Option 175, Option 203 T175 Option 175, length 27: 177.5.1.26.244.16.0.19.1.1.23.1.1.21.1.1.36.1.1.39.1.1.235.3.1.0.0 Client-ID Option 61, length 7: ether 52:54:00:96:79:19 GUID Option 97, length 17: 0.0.13.193.164.80.172.15.64.176.94.198.41.255.167.160.32 END Option 255, length 0 15:27:14.647096 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.2 tell 192.168.0.1, length 28 15:27:15.648010 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) 192.168.0.1.bootps > 192.168.0.2.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0xbdd3fc65, secs 4, Flags [none] (0x0000) Your-IP 192.168.0.2 Server-IP 192.168.0.1 Client-Ethernet-Address 52:54:00:96:79:19 file "grub2-efi-2.02-0.16.el7.x86_64-with-cfg.efi" Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: 192.168.0.1 Lease-Time Option 51, length 4: 43200 Subnet-Mask Option 1, length 4: 255.255.255.0 END Option 
255, length 0 PAD Option 0, length 0, occurs 38 15:27:15.648663 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.2 tell 192.168.0.1, length 28 15:27:15.720853 IP (tos 0x0, ttl 64, id 515, offset 0, flags [none], proto UDP (17), length 406) 0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 52:54:00:96:79:19, length 378, xid 0xbdd3fc65, secs 8, Flags [none] (0x0000) Client-Ethernet-Address 52:54:00:96:79:19 Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover MSZ Option 57, length 2: 1472 ARCH Option 93, length 2: 9 NDI Option 94, length 3: 1.3.10 Vendor-Class Option 60, length 32: "PXEClient:Arch:00009:UNDI:003010" CLASS Option 77, length 4: "iPXE" Parameter-Request Option 55, length 22: Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG Hostname, Domain-Name, RP, Vendor-Option Vendor-Class, TFTP, BF, Option 119 Option 128, Option 129, Option 130, Option 131 Option 132, Option 133, Option 134, Option 135 Option 175, Option 203 T175 Option 175, length 27: 177.5.1.26.244.16.0.19.1.1.23.1.1.21.1.1.36.1.1.39.1.1.235.3.1.0.0 Client-ID Option 61, length 7: ether 52:54:00:96:79:19 GUID Option 97, length 17: 0.0.13.193.164.80.172.15.64.176.94.198.41.255.167.160.32 END Option 255, length 0 15:27:15.721237 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) 192.168.0.1.bootps > 192.168.0.2.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0xbdd3fc65, secs 8, Flags [none] (0x0000) Your-IP 192.168.0.2 Server-IP 192.168.0.1 Client-Ethernet-Address 52:54:00:96:79:19 file "grub2-efi-2.02-0.16.el7.x86_64-with-cfg.efi" Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: 192.168.0.1 Lease-Time Option 51, length 4: 43200 Subnet-Mask Option 1, length 4: 255.255.255.0 END Option 255, length 0 PAD Option 0, length 0, occurs 38 15:27:16.650691 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
192.168.0.2 tell 192.168.0.1, length 28 15:27:17.840867 IP (tos 0x0, ttl 64, id 773, offset 0, flags [none], proto UDP (17), length 406) 0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 52:54:00:96:79:19, length 378, xid 0xbdd3fc65, secs 12, Flags [none] (0x0000) Client-Ethernet-Address 52:54:00:96:79:19 Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover MSZ Option 57, length 2: 1472 ARCH Option 93, length 2: 9 NDI Option 94, length 3: 1.3.10 Vendor-Class Option 60, length 32: "PXEClient:Arch:00009:UNDI:003010" CLASS Option 77, length 4: "iPXE" Parameter-Request Option 55, length 22: Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG Hostname, Domain-Name, RP, Vendor-Option Vendor-Class, TFTP, BF, Option 119 Option 128, Option 129, Option 130, Option 131 Option 132, Option 133, Option 134, Option 135 Option 175, Option 203 T175 Option 175, length 27: 177.5.1.26.244.16.0.19.1.1.23.1.1.21.1.1.36.1.1.39.1.1.235.3.1.0.0 Client-ID Option 61, length 7: ether 52:54:00:96:79:19 GUID Option 97, length 17: 0.0.13.193.164.80.172.15.64.176.94.198.41.255.167.160.32 END Option 255, length 0 15:27:17.841268 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) 192.168.0.1.bootps > 192.168.0.2.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0xbdd3fc65, secs 12, Flags [none] (0x0000) Your-IP 192.168.0.2 Server-IP 192.168.0.1 Client-Ethernet-Address 52:54:00:96:79:19 file "grub2-efi-2.02-0.16.el7.x86_64-with-cfg.efi" Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: 192.168.0.1 Lease-Time Option 51, length 4: 43200 Subnet-Mask Option 1, length 4: 255.255.255.0 END Option 255, length 0 PAD Option 0, length 0, occurs 38 15:27:22.080972 IP (tos 0x0, ttl 64, id 1030, offset 0, flags [none], proto UDP (17), length 406) 0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 
52:54:00:96:79:19, length 378, xid 0xbdd3fc65, secs 16, Flags [none] (0x0000) Client-Ethernet-Address 52:54:00:96:79:19 Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover MSZ Option 57, length 2: 1472 ARCH Option 93, length 2: 9 NDI Option 94, length 3: 1.3.10 Vendor-Class Option 60, length 32: "PXEClient:Arch:00009:UNDI:003010" CLASS Option 77, length 4: "iPXE" Parameter-Request Option 55, length 22: Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG Hostname, Domain-Name, RP, Vendor-Option Vendor-Class, TFTP, BF, Option 119 Option 128, Option 129, Option 130, Option 131 Option 132, Option 133, Option 134, Option 135 Option 175, Option 203 T175 Option 175, length 27: 177.5.1.26.244.16.0.19.1.1.23.1.1.21.1.1.36.1.1.39.1.1.235.3.1.0.0 Client-ID Option 61, length 7: ether 52:54:00:96:79:19 GUID Option 97, length 17: 0.0.13.193.164.80.172.15.64.176.94.198.41.255.167.160.32 END Option 255, length 0 15:27:22.081323 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) 192.168.0.1.bootps > 192.168.0.2.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0xbdd3fc65, secs 16, Flags [none] (0x0000) Your-IP 192.168.0.2 Server-IP 192.168.0.1 Client-Ethernet-Address 52:54:00:96:79:19 file "grub2-efi-2.02-0.16.el7.x86_64-with-cfg.efi" Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: 192.168.0.1 Lease-Time Option 51, length 4: 43200 Subnet-Mask Option 1, length 4: 255.255.255.0 END Option 255, length 0 PAD Option 0, length 0, occurs 38
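
The pattern in the capture above (repeated Discover packets answered by Offers, with no Request or ACK following) can be confirmed mechanically by tallying the DHCP message types in the `tcpdump -vv` text. The sketch below is illustrative only: `dhcp_message_counts` is a made-up helper, and the sample string is a heavily abbreviated stand-in for the real capture.

```python
import re
from collections import Counter

# tcpdump -vv prints the DHCP message type as "DHCP-Message Option 53, length 1: <type>".
MSG_TYPE = re.compile(r"DHCP-Message Option 53, length 1: (\w+)")


def dhcp_message_counts(capture: str) -> Counter:
    """Count DHCP message types (Discover, Offer, Request, ACK, ...) in tcpdump text."""
    return Counter(MSG_TYPE.findall(capture))


# Abbreviated sample in the shape of the capture above (two of the four rounds).
sample = """\
0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request ... DHCP-Message Option 53, length 1: Discover
192.168.0.1.bootps > 192.168.0.2.bootpc: BOOTP/DHCP, Reply ... DHCP-Message Option 53, length 1: Offer
0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request ... DHCP-Message Option 53, length 1: Discover
192.168.0.1.bootps > 192.168.0.2.bootpc: BOOTP/DHCP, Reply ... DHCP-Message Option 53, length 1: Offer
"""

counts = dhcp_message_counts(sample)
print(dict(counts))
print("handshake completed:", counts["Request"] > 0 and counts["ACK"] > 0)
```

A completed exchange would also show Request and ACK entries; their absence is consistent with the client never processing the Offers it is sent.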
(In reply to Xiaoqing Wei from comment #18)
> ipxe-20140303-3.gitff1e7fc7.fc22.

The files are here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=558967

ipxe-roms-qemu-20140303-3.gitff1e7fc7.fc22.noarch
ipxe-roms-20140303-3.gitff1e7fc7.fc22.noarch
ipxe-bootimgs-20140303-3.gitff1e7fc7.fc22.noarch
Not sure if the iPXE rom you tested is recent enough. In fact, in light of the recent bugs you filed, I'm starting to doubt iPXE works at all.
(In reply to Laszlo Ersek from comment #20)
> Not sure if the iPXE rom you tested is recent enough.
>
> In fact, in light of the recent bugs you filed, I'm starting to doubt iPXE
> works at all.

I doubt that too. I'll try e1000/rtl8139, which don't have a DXE driver, to confirm whether it's iPXE's fault.
Okay, gathering info from the guest console and tcpdump, I think iPXE is at fault.

Here's my understanding: when we boot an OVMF guest (with the network as first boot priority), the sequence is

OVMF -> (DXE driver) or (iPXE oprom) -> DHCP, TFTP (request shim.efi, then grubx64.efi) -> depending on grub.cfg, possibly request vmlinuz and initrd.img

The tcpdump below shows that grubx64.efi is not requested at all when the oprom is used.

Test setup: I dumped the guest console output to a file, used the latest git iPXE from ipxe.org (same behaviour as the RHEL build), built everything with make, and used the all-in-one efirom in the bin-x86_64-efi folder:

<rom file='/root/ipxe/src/bin-x86_64-efi/ipxe.efirom'/>

Booting the VM with an e1000 NIC:

Connected to domain uefi-rhel66
Escape character is ^]

iPXE 1.0.0+ (d38b) -- Open Source Network Boot Firmware -- http://ipxe.org
Features: VLAN HTTP DNS TFTP EFI Menu

Press Ctrl-B for the iPXE command line...
net0: 52:54:00:96:79:19 using 82540em on PCI00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:8 RXE:1]
  [RXE: 1 x "Operation not supported (http://ipxe.org/3c086083)"]
Configuring (net0 52:54:00:96:79:19)...... ok
net0: 192.168.0.2/255.255.255.0
Next server: 192.168.0.1
Filename: shim.efi
tftp://192.168.0.1/shim.efi... ok
Failed to open grubx64.efi - Not Found
------------- iPXE lies starting from this line; tcpdump shows it didn't request this file at all.
Failed to load image grubx64.efi: Not Found
Failed to open MokManager.efi - Not Found
Failed to load image MokManager.efi: Not Found
Could not boot image: Error 0x7f04828e (http://ipxe.org/7f04828e)

Press Ctrl-B for the iPXE command line...

------------------ tcpdump snip, starting at dhcp ack
18:01:44.311613 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    192.168.0.1.bootps > 192.168.0.2.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 300, xid 0x313fb23d, secs 14, Flags [none] (0x0000)
      Your-IP 192.168.0.2
      Server-IP 192.168.0.1
      Client-Ethernet-Address 52:54:00:96:79:19
      file "shim.efi" ************** filename delivered to pxe client
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: ACK
        Server-ID Option 54, length 4: 192.168.0.1
        Lease-Time Option 51, length 4: 43200
        Subnet-Mask Option 1, length 4: 255.255.255.0
        END Option 255, length 0
        PAD Option 0, length 0, occurs 38
18:01:44.317863 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.0.1 tell 192.168.0.2, length 28
18:01:44.317967 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.0.1 is-at 52:54:00:0c:d7:27, length 28
18:01:44.318072 IP (tos 0x0, ttl 64, id 1054, offset 0, flags [none], proto UDP (17), length 66)
    192.168.0.2.8316 > 192.168.0.1.tftp: [udp sum ok] 38 RRQ "shim.efi" octet blksize 1432 tsize 0 ************** pxe client request for shim.efi
18:01:44.319566 IP (tos 0x0, ttl 64, id 2191, offset 0, flags [none], proto UDP (17), length 57)
    192.168.0.1.44354 > 192.168.0.2.8316: [udp sum ok] UDP, length 29
......
----------------------

From there until the end of the tcpdump (up to the screen where iPXE complains that grubx64.efi and the other *.efi files are missing), there is no further "RRQ" request from the PXE client to the PXE server.

I also captured the DXE virtio-net case: there the PXE client did request that file, and the boot continued:
3596 16:19:38.828581 IP (tos 0x0, ttl 64, id 59121, offset 0, flags [none], proto UDP (17), length 60) 3597 192.168.0.3.mc-client > 192.168.0.1.tftp: [udp sum ok] 32 RRQ "grubx64.efi" octet blksize 512
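The comparison made above (which files did the PXE client actually request over TFTP?) can be automated. The following is a minimal sketch, assuming the capture was saved as text from `tcpdump -v`; the function names `requested_files` and `missing_requests` are made up for illustration and are not part of any tool mentioned in this report.

```python
import re

# Matches the filename of a TFTP read request as printed by tcpdump, e.g.
#   ... > 192.168.0.1.tftp: [udp sum ok] 38 RRQ "shim.efi" octet blksize 1432 tsize 0
RRQ_RE = re.compile(r'\bRRQ "([^"]+)"')


def requested_files(tcpdump_text):
    """Return the TFTP RRQ filenames, in order of first appearance."""
    seen = []
    for name in RRQ_RE.findall(tcpdump_text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_requests(tcpdump_text, expected):
    """Return the expected filenames that were never requested."""
    got = set(requested_files(tcpdump_text))
    return [name for name in expected if name not in got]


if __name__ == "__main__":
    # The only RRQ line seen in the oprom capture quoted above.
    oprom_capture = (
        '192.168.0.2.8316 > 192.168.0.1.tftp: [udp sum ok] 38 RRQ '
        '"shim.efi" octet blksize 1432 tsize 0'
    )
    print(missing_requests(oprom_capture, ["shim.efi", "grubx64.efi"]))
```

Run against the oprom capture, the sketch reports grubx64.efi as never requested, which matches the manual reading of the dump.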
Hi Laszlo, I think C#20 is a new issue, an iPXE functional bug (if my analysis is correct). The original report for this BZ in C#0 uses romfile="", which uses purely DXE, and it tracks the slow transfer of vmlinuz/initrd.img. If C#20 is correct, I think it is better to open a new BZ with the C#20 content, and leave this #1181980 to continue tracking the slowness issue. What do you think? Regards, Xiaoqing.
Hi, wrt. comment 23:

(a) I think your analysis in comment 22 is good. Feel free to open another bug for iPXE if you think this is a different iPXE issue (you said "same behaviour for rhel build"). However, it could still be a problem with the PXE server setup you have (I'm not saying it is, just that it might be). *Every* single time I try to set up PXE in a new environment, it requires tweaking the DHCP / PXE server config, because the client & the server never understand each other at first. It usually requires staring at packet captures (like you did) and sometimes even the PXE specification. I can't help with iPXE internals, unfortunately.

(b) I disagree that comment 0 is about a problem with the builtin virtio-net driver. Yes I can see that you used romfile='' in comment 0, but according to comment 14 to comment 17, the original report in comment 0 used a wrong PXE server environment. After fixing up that environment, virtio-net worked okay (which is why we moved this BZ itself over to ipxe).

(c) Please do not use "DXE" as a synonym for the builtin driver; it's very confusing. Both the builtin driver and the oprom are UEFI drivers that run in the DXE phase. Please just say "builtin" and "oprom". Thanks!
(In reply to Laszlo Ersek from comment #24)
> Hi, wrt. comment 23:
>
> (a) I think your analysis in comment 22 is good. Feel free to open another
> bug for iPXE if you think this is a different iPXE issue (you said "same
> behaviour for rhel build"). However, it could still be a problem with the
> PXE server setup you have (I'm not saying it is, just that it might be).
> *Every* single time I try to set up PXE in a new environment, it requires
> tweaking the DHCP / PXE server config, because the client & the server never
> understand each other at first. It usually requires staring at packet
> captures (like you did) and sometimes even the PXE specification.

Yes, DHCP with PXE/TFTP is nasty. After some more thinking, I think it's more likely a grub2 bug (though iPXE could be buggy, too), as iPXE can load shim.efi (or grubx64.efi, depending on how dhcpd.conf is configured). So I filed a bug and CC'ed you:
Bug 1184694 - grub2 network disfunctional under iPXE oprom(in OVMF guest)

> (b) I disagree that comment 0 is about a problem with the builtin virtio-net
> driver. Yes I can see that you used romfile='' in comment 0, but according
> to comment 14 to comment 17, the original report in comment 0 used a wrong
> PXE server environment. After fixing up that environment, virtio-net worked
> okay (which is why we moved this BZ itself over to ipxe).

Yes, understood. The original env was set up using RHEL 7.0 GA files. After switching both grub2-efi and shim.efi to the latest from RHEL 7.1, the transfer became quick. I'll try a fresh setup later; if the problem disappears with the latest build, then we can close this BZ.

> (c) Please do not use "DXE" as a synonym for the builtin driver; it's very
> confusing. Both the builtin driver and the oprom are UEFI drivers that run
> in the DXE phase. Please just say "builtin" and "oprom". Thanks!

Aha, thanks for pointing that out!

Best Regards,
Xiaoqing Wei.
Created attachment 983095 [details] [1/2] Revert "[efi] Add our own EFI_LOAD_FILE_PROTOCOL implementation" This reverts commit c7c3d839fc9120aee28de9aabe452dc85ad91502. This patch drops iPXE's own broken EFI_LOAD_FILE_PROTOCOL implementation, reverting: commit c7c3d839fc9120aee28de9aabe452dc85ad91502 Author: Michael Brown <mcb30> Date: Wed Mar 13 22:42:26 2013 +0000 [efi] Add our own EFI_LOAD_FILE_PROTOCOL implementation When iPXE is used as a UEFI driver, the UEFI PXE base code currently provides the TCP/IP stack, network protocols, and user interface. This represents a substantial downgrade from the standard BIOS iPXE user experience. Fix by installing our own EFI_LOAD_FILE_PROTOCOL implementation which initiates the standard iPXE boot procedure. This upgrades the UEFI iPXE user experience to match the standard BIOS iPXE user experience. Signed-off-by: Michael Brown <mcb30> It is hard to decide which is worse, the brokenness of that patch, or the arrogance of the commit message. According to the UEFI specification: The EFI_LOAD_FILE_PROTOCOL is a simple protocol used to obtain files from arbitrary devices. [...] the firmware [...] attempts to read the file via the EFI_LOAD_FILE_PROTOCOL and the LoadFile() function. In this case the LoadFile() function implements the policy of interpreting the File Path value. It's too bad that the EFI_LOAD_FILE_PROTOCOL implementation added by iPXE commit c7c3d839 explicitly *ignores* the pathname of the file to load (!), it just calls the main ipxe() function. Plus, if the "booting" parameter is false, it reports "SNPDEV %p cannot load non-boot file\n" This means that iPXE's own EFI_LOAD_FILE_PROTOCOL can load but one file (the first file advertised by the DHCP server), which the iPXE author presumably thinks *must* be the main iPXE binary. In addition, iPXE doesn't provide an implementation of EFI_PXE_BASE_CODE_PROTOCOL, which shim.efi uses to chain-load grubx64.efi. 
Obviously, when shim.efi tries to load grubx64.efi from the network with the nonexistent EFI_PXE_BASE_CODE_PROTOCOL, or grubx64.efi tries to load the kernel & initrd images from the network with the broken EFI_LOAD_FILE_PROTOCOL, things fail miserably. (Now go back and read the arrogant commit message about the "iPXE user experience".) By dropping this EFI_LOAD_FILE_PROTOCOL implementation in iPXE, we allow the Intel BDS driver in edk2 / OVMF to find the one provided by "MdeModulePkg/Universal/Network/UefiPxeBcDxe", which actually works for any pathname. Plus, the edk2 driver provides EFI_PXE_BASE_CODE_PROTOCOL too.

Signed-off-by: Laszlo Ersek <lersek>
---
 src/include/ipxe/efi/efi_snp.h |  3 ---
 src/interface/efi/efi_snp.c    | 57 +----------------------------------------
 2 files changed, 1 insertion(+), 59 deletions(-)
Created attachment 983097 [details] [2/2] efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec The efi_snp interface dates back to 2008, when the GetStatus() interface must have been seriously under-specified. The UEFI Specification (2.4) specifies EFI_SIMPLE_NETWORK_PROTOCOL in detail however. In short: - the Transmit() interface is assumed to link (not copy) the SNP client's buffer and return at once (without blocking), taking ownership of the buffer temporarily; - the GetStatus() interface releases one of the completed (transmitted or internally copied) buffers back to the caller. If there are several completed buffers, it is unspecified which one is returned. The EFI build of the grub boot loader actually verifies the buffer address returned by GetStatus(), therefore in efi_snp we must at least fake the queueing of client buffers. This patch doesn't track client buffers together with the internally queued io_buffer structures, we consider a client buffer recyclable as soon as we make a deep copy of it and queue the copy internally. Signed-off-by: Laszlo Ersek <lersek> --- src/include/ipxe/efi/efi_snp.h | 6 +++++ src/interface/efi/efi_snp.c | 50 +++++++++++++++++++++++------------------- 2 files changed, 33 insertions(+), 23 deletions(-)
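The buffer handoff that patch 2/2 describes can be illustrated with a toy model. This is only a sketch of the idea in Python, not the actual C code in efi_snp.c: the class `ToySnp`, its method names, and the use of an integer as a stand-in for a client buffer address are all assumptions made for illustration. The first-in, first-out order is also an arbitrary choice here, since the description above notes that the order is unspecified.

```python
from collections import deque


class ToySnp:
    """Toy model of the Transmit()/GetStatus() handoff (hypothetical names)."""

    def __init__(self):
        self.tx_queue = deque()     # internal deep copies awaiting transmission
        self.recyclable = deque()   # client buffer addresses ready to hand back

    def transmit(self, buf_addr, payload):
        # Deep-copy the client's data and queue the copy internally...
        self.tx_queue.append(bytes(payload))
        # ...so the client's buffer counts as recyclable right away.
        self.recyclable.append(buf_addr)

    def get_status(self):
        # Release one completed client buffer, or None if there is none.
        return self.recyclable.popleft() if self.recyclable else None


if __name__ == "__main__":
    snp = ToySnp()
    snp.transmit(0x1000, b"frame-1")
    snp.transmit(0x2000, b"frame-2")
    # A client that checks the returned address (as the EFI build of grub
    # is said to do) sees each buffer it handed over come back once.
    print(hex(snp.get_status()), hex(snp.get_status()), snp.get_status())
```

The point of the model is that a client which verifies the address returned by GetStatus() gets back exactly the buffers it passed to Transmit(), even though the driver only ever sends its own copies.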
(In reply to Laszlo Ersek from comment #28)

Wow, what a quick fix. With ipxe-roms-qemu-20130517-6.gitc4bce43.el7.efi_fixes_2.noarch, e1000, rtl8139 and virtio-net with the oprom work identically to the builtin driver; the client now requests every file needed to boot.

# grep -i rrq tcpdump-ipxe-roms-qemu-20130517-6.gitc4bce43.el7.efi_fixes_2.noarch-rtl8139.txt
192.168.0.3.slingshot > 192.168.0.1.tftp: [udp sum ok] 38 RRQ "shim.efi" octet tsize 0 blksize 1468
192.168.0.3.jetform > 192.168.0.1.tftp: [udp sum ok] 30 RRQ "shim.efi" octet blksize 1468
192.168.0.3.vdmplay > 192.168.0.1.tftp: [udp sum ok] 32 RRQ "grubx64.efi" octet blksize 512
192.168.0.3.25300 > 192.168.0.1.tftp: [udp sum ok] 61 RRQ "//grub.cfg-01-52-54-00-96-79-19" octet blksize 1024 tsize 0
192.168.0.3.25301 > 192.168.0.1.tftp: [udp sum ok] 49 RRQ "//grub.cfg-C0A80003" octet blksize 1024 tsize 0
192.168.0.3.25302 > 192.168.0.1.tftp: [udp sum ok] 48 RRQ "//grub.cfg-C0A8000" octet blksize 1024 tsize 0
192.168.0.3.25303 > 192.168.0.1.tftp: [udp sum ok] 47 RRQ "//grub.cfg-C0A800" octet blksize 1024 tsize 0
192.168.0.3.25304 > 192.168.0.1.tftp: [udp sum ok] 46 RRQ "//grub.cfg-C0A80" octet blksize 1024 tsize 0
192.168.0.3.25305 > 192.168.0.1.tftp: [udp sum ok] 45 RRQ "//grub.cfg-C0A8" octet blksize 1024 tsize 0
192.168.0.3.25306 > 192.168.0.1.tftp: [udp sum ok] 44 RRQ "//grub.cfg-C0A" octet blksize 1024 tsize 0
192.168.0.3.25307 > 192.168.0.1.tftp: [udp sum ok] 43 RRQ "//grub.cfg-C0" octet blksize 1024 tsize 0
192.168.0.3.25308 > 192.168.0.1.tftp: [udp sum ok] 42 RRQ "//grub.cfg-C" octet blksize 1024 tsize 0
192.168.0.3.25309 > 192.168.0.1.tftp: [udp sum ok] 40 RRQ "//grub.cfg" octet blksize 1024 tsize 0
192.168.0.3.25310 > 192.168.0.1.tftp: [udp sum ok] 40 RRQ "//grub.cfg" octet blksize 1024 tsize 0
192.168.0.3.25311 > 192.168.0.1.tftp: [udp sum ok] 64 RRQ "/EFI/redhat/x86_64-efi/command.lst" octet blksize 1024 tsize 0
192.168.0.3.25312 > 192.168.0.1.tftp: [udp sum ok] 59 RRQ "/EFI/redhat/x86_64-efi/fs.lst" octet blksize 1024 tsize 0
192.168.0.3.25313 > 192.168.0.1.tftp: [udp sum ok] 63 RRQ "/EFI/redhat/x86_64-efi/crypto.lst" octet blksize 1024 tsize 0
192.168.0.3.25314 > 192.168.0.1.tftp: [udp sum ok] 65 RRQ "/EFI/redhat/x86_64-efi/terminal.lst" octet blksize 1024 tsize 0
192.168.0.3.25315 > 192.168.0.1.tftp: [udp sum ok] 40 RRQ "//grub.cfg" octet blksize 1024 tsize 0
192.168.0.3.25316 > 192.168.0.1.tftp: [udp sum ok] 40 RRQ "//grub.cfg" octet blksize 1024 tsize 0
192.168.0.3.25317 > 192.168.0.1.tftp: [udp sum ok] 40 RRQ "//grub.cfg" octet blksize 1024 tsize 0
192.168.0.3.25318 > 192.168.0.1.tftp: [udp sum ok] 47 RRQ "/rhel70ga/vmlinuz" octet blksize 1024 tsize 0
192.168.0.3.25319 > 192.168.0.1.tftp: [udp sum ok] 47 RRQ "/rhel70ga/vmlinuz" octet blksize 1024 tsize 0
192.168.0.3.25320 > 192.168.0.1.tftp: [udp sum ok] 50 RRQ "/rhel70ga/initrd.img" octet blksize 1024 tsize 0
I also observed that the DHCP_ARCH_VENDOR_CLASS_ID has changed in the build you provided. I know this is minor and doesn't affect the DHCP function at all; I'm just asking whether it is intentional. Is it to use the same arch id, so that a system administrator can use one cfg for both iPXE and the builtin driver?

The rhel pkg or upstream pkg has Arch 00009, like:

-----------
Vendor-Class Option 60, length 32: "PXEClient:Arch:00009:UNDI:003010"
-----------

and in the tcpdump of the scratch build, I saw that the id has changed to 00007:

=========== scratch build
GUID Option 97, length 17: 0.0.13.193.164.80.172.15.64.176.94.198.41.255.167.160.32
NDI Option 94, length 3: 0.0.0
ARCH Option 93, length 2: 7
Vendor-Class Option 60, length 32: "PXEClient:Arch:00007:iPXE:000000"
END Option 255, length 0
11:01:52.821353 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 48)
=============

And just out of curiosity: the files you changed, as mentioned in C#26 and C#27, didn't touch this dhcp_arch.h for x86_64, so why did the arch id change?

src/arch/x86_64/include/efi/ipxe/dhcp_arch.h

#ifndef _DHCP_ARCH_H
#define _DHCP_ARCH_H

/** @file
 *
 * Architecture-specific DHCP options
 */

FILE_LICENCE ( GPL2_OR_LATER );

#include <ipxe/dhcp.h>

#define DHCP_ARCH_VENDOR_CLASS_ID \
	DHCP_STRING ( 'P', 'X', 'E', 'C', 'l', 'i', 'e', 'n', 't', ':', \
		      'A', 'r', 'c', 'h', ':', '0', '0', '0', '0', '9', ':', \
		      'U', 'N', 'D', 'I', ':', '0', '0', '3', '0', '1', '0' )
The processor architecture types are listed here:

(1) http://www.iana.org/assignments/dhcpv6-parameters/dhcpv6-parameters.xhtml#processor-architecture
(2) http://tools.ietf.org/html/rfc4578#section-2.1
(3) http://tools.ietf.org/html/rfc5970#section-3.3

There's disagreement between references (1) and (2); they assign the opposite values to "UEFI x64" and "EBC" (EFI Byte Code, an interpreted byte code conceptually similar to Java and Python byte code).

           ref(1)  ref(2)
x64 UEFI   0x0007  0x0009
EBC        0x0009  0x0007

Reference (2) is much older and probably obsolete now; it's an RFC whose status is "Informational". (Reference (3) has status "Proposed Standard", but it doesn't list any architecture types, it just defers to reference (1).) Reference (1) is a live document that is maintained by the IANA.

edk2's PXE client uses the ref(1) assignments, while iPXE's PXE client seems to use the older ref(2) assignments. With patch#1 in comment 26, iPXE's PXE client is not used any longer when netbooting from under OVMF: iPXE only provides the SNP driver and edk2's PXE client is in effect. That's why you see the change in the arch type field.
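To make the disagreement concrete, here is a small sketch that parses a Vendor-Class (option 60) string like the ones quoted earlier and shows how each reference would interpret its architecture field. The two lookup tables contain only the two rows from the table above; the function names are hypothetical and chosen for illustration.

```python
# Only the two rows discussed above; both references define more types.
IANA_ARCH = {7: "x64 UEFI", 9: "EBC"}        # reference (1)
RFC4578_ARCH = {7: "EBC", 9: "x64 UEFI"}     # reference (2)


def parse_vendor_class(value):
    """Split 'PXEClient:Arch:xxxxx:<tag>:yyyzzz' into (arch, tag, version)."""
    parts = value.split(":")
    if len(parts) != 5 or parts[0] != "PXEClient" or parts[1] != "Arch":
        raise ValueError("not a PXEClient vendor class: %r" % value)
    return int(parts[2], 10), parts[3], parts[4]


def interpret(value):
    """Return how reference (1) and reference (2) each read the arch field."""
    arch, _tag, _version = parse_vendor_class(value)
    return {
        "arch": arch,
        "ref1": IANA_ARCH.get(arch, "unknown"),
        "ref2": RFC4578_ARCH.get(arch, "unknown"),
    }


if __name__ == "__main__":
    for vc in ("PXEClient:Arch:00009:UNDI:003010",
               "PXEClient:Arch:00007:iPXE:000000"):
        print(vc, interpret(vc))
```

A DHCP server config that matches on the architecture value therefore has to account for both 7 and 9 if it should serve x64 UEFI clients that follow either reference.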
Note that gPXE (from which iPXE was forked in 2010) used the correct value: http://git.etherboot.org/gpxe.git/blob/HEAD:/src/arch/x86_64/include/efi/gpxe/dhcp_arch.h
Created attachment 983481 [details] upstream version of patch 2/2 (on top of d38bac05)
*** Bug 1183904 has been marked as a duplicate of this bug. ***
*** Bug 1181934 has been marked as a duplicate of this bug. ***
*** Bug 1181938 has been marked as a duplicate of this bug. ***
*** Bug 1183464 has been marked as a duplicate of this bug. ***
(In reply to Laszlo Ersek from comment #37) > Created attachment 983481 [details] > upstream version of patch 2/2 (on top of d38bac05) Posted as <http://thread.gmane.org/gmane.network.ipxe.devel/3799>.
Additional upstream commit to include: 755d2b8 [efi] Ensure drivers are disconnected when ExitBootServices() is called
Qianqian, Could you have a try? Best Regards, Junyi
ipxe rebase (bug 1298313) should fix this.
Retest with:
ipxe-bootimgs-20150821-1.git4e03af8e.el7.rebase73_bz1298313.noarch
ipxe-roms-20150821-1.git4e03af8e.el7.rebase73_bz1298313.noarch
ipxe-roms-qemu-20150821-1.git4e03af8e.el7.rebase73_bz1298313.noarch
OVMF-20151104-1.gitb9ffeab.el7.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
kernel-3.10.0-327.el7.x86_64
libvirt-1.2.17-13.el7.x86_64

Steps:
1. Follow the steps in "21.1.2. Configuring a PXE Server for UEFI-based AMD64 and Intel 64 Clients" on https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/chap-installation-server-setup.html
2. Boot an OVMF guest, select net boot.

Result: Net boot succeeds and is very fast.
Hi Qianqian, Could you reply to comment 65? Best Regards, Junyi
Retest with:
qemu-kvm-rhev-2.3.0-31.el7_2.7.x86_64
ipxe-roms-qemu-20150821-1.git4e03af8e.el7.test.noarch
ipxe-bootimgs-20150821-1.git4e03af8e.el7.test.noarch
ipxe-roms-20150821-1.git4e03af8e.el7.test.noarch
OVMF-20151104-1.gitb9ffeab.el7.noarch
kernel-3.10.0-327.el7.x86_64

Steps:
1. Follow the steps in "21.1.2. Configuring a PXE Server for UEFI-based AMD64 and Intel 64 Clients" on https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/chap-installation-server-setup.html
2. Boot an OVMF guest, select net boot.

Result: Net boot succeeds and is very fast.
(In reply to Laszlo Ersek from comment #24)
> (c) Please do not use "DXE" as a synonym for the builtin driver; it's very
> confusing. Both the builtin driver and the oprom are UEFI drivers that run
> in the DXE phase. Please just say "builtin" and "oprom". Thanks!

Hi Laszlo,
I don't know much about the builtin driver; could you please give a short introduction?

1. From the comments, I understand that there is a builtin ROM for virtio-net in OVMF; if we use "<rom file=''/>" in the domain's XML or "romfile=" on the qemu command line with a virtio model interface, it will use this builtin ROM. Right? The builtin ROM is embedded in "/usr/share/OVMF/OVMF_CODE.fd", right?

2. I have tried to use <rom file=''/> in the XML, but the XML fails to validate, so I must add "--skip-validate". Is this expected?

# virsh edit from142
error: XML document failed to validate against schema: Unable to validate doc against /usr/share/libvirt/schemas/domain.rng
Extra element devices in interleave
Element domain failed to validate content

Failed. Try again? [y,n,i,f,?]:

And I have tried virtio with <rom file=''/>; it works for PXE boot. It failed when I changed to rtl8139 and e1000, which is expected.
(In reply to yalzhang from comment #76)
> (In reply to Laszlo Ersek from comment #24)
>
> > (c) Please do not use "DXE" as a synonym for the builtin driver; it's very
> > confusing. Both the builtin driver and the oprom are UEFI drivers that run
> > in the DXE phase. Please just say "builtin" and "oprom". Thanks!
>
> Hi Laszlo,
> I don't know much about the builtin driver; could you please give a short
> introduction?
> 1. From the comments, I understand that there is a builtin ROM for virtio-net
> in OVMF; if we use "<rom file=''/>" in the domain's XML or "romfile=" on the
> qemu command line with a virtio model interface, it will use this builtin
> ROM. Right? The builtin ROM is embedded in "/usr/share/OVMF/OVMF_CODE.fd",
> right?

Yes, that's correct. The OVMF_CODE.fd file contains a builtin driver for the virtio-net NIC (called VirtioNetDxe).

In addition, the virtio-net NIC may have a PCI expansion ROM BAR, populated with a matching UEFI driver built from the iPXE project.

If the iPXE oprom is present, then it takes precedence over the builtin driver.

The iPXE oprom can be disabled in two ways on the QEMU command line (and each method has a corresponding libvirt method).

The first method is to pass a ",romfile=" property to the virtio-net device (and correspondingly, set <rom file=''/> in the domain XML). However, as far as I remember, this method stopped working with both QEMU directly, and in libvirt as well.

The second method is a bit different. The second method does not just prevent the iPXE driver file from being loaded into the ROM BAR, instead it turns off the ROM BAR completely. On the QEMU command line, this is achieved by passing the ",rombar=0" property. In the domain XML, it is done with "<rom bar='off'/>".

> 2. I have tried to use <rom file=''/> in the XML, but the XML fails to
> validate, so I must add "--skip-validate". Is this expected?
> # virsh edit from142 > error: XML document failed to validate against schema: Unable to validate > doc against /usr/share/libvirt/schemas/domain.rng > Extra element devices in interleave > Element domain failed to validate content > > Failed. Try again? [y,n,i,f,?]: I've seen this error earlier. As far as I remember, it is unrelated to the <rom> element. The XML schema in the "domain.rng" file that is used to validate the domain XMLs has changed in some incompatible way, but it doesn't necessarily cause problems. I don't know exactly what part in the XML schema changed incompatibly. You can see an earlier example in bug 1214664. I'm CC'ing Cole and Jirka; maybe they can advise you on this. > And I have tried virtio with <rom file=''/>, it works for pxe boot. It > failed when I changed to rtl8139 and e1000, which is expected. Right, all of those results are expected. However, this BZ is about iPXE. Thus, after the above sanity-checking with the builtin driver (= VirtioNetDxe), can you please remove the <rom> element altogether from your domain XML, and retest? (One way to ascertain whether iPXE is being used for virtio-net is to search the OVMF debug log for the string "1af41000.efi". If you find it in the debug log, then the virtio-net iPXE oprom is present.)
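The debug log check suggested above is easy to script. The following is a minimal sketch that only looks for the marker string named in the comment; the function name is made up, and the sample lines in the demo are placeholders, not real OVMF output. How the OVMF debug log is captured to a file depends on the local setup and is not covered here.

```python
OPROM_MARKER = "1af41000.efi"   # string named in the comment above


def virtio_net_oprom_present(log_lines):
    """Return True if any debug log line mentions the virtio-net iPXE oprom."""
    return any(OPROM_MARKER in line for line in log_lines)


if __name__ == "__main__":
    # Placeholder lines for demonstration only (not real OVMF log output).
    with_oprom = ["placeholder line", "placeholder mentioning 1af41000.efi"]
    without_oprom = ["placeholder line", "another placeholder line"]
    print(virtio_net_oprom_present(with_oprom))
    print(virtio_net_oprom_present(without_oprom))
```

For a real log, the same function can be fed an open file object, since iterating over a text file yields its lines.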
(In reply to Laszlo Ersek from comment #77) > I've seen this error earlier. As far as I remember, it is unrelated to the > <rom> element. The XML schema in the "domain.rng" file that is used to > validate the domain XMLs has changed in some incompatible way, but it > doesn't necessarily cause problems. Well, the XML schema validation checks that the XML document looks the way libvirt expects. E.g., the validation fails if the XML contains an element that libvirt doesn't understand (often caused by a typo in the element name). Without this check libvirt would happily accept the XML configuration, but it would ignore the unknown element. So if the check fails on a correct XML and skipping validation does what you expect (i.e., libvirt accepts the XML and doesn't ignore anything from it), it is a sign of a bug in the XML schema.
Right, I think this is related to a more-or-less innocuous regression in the XML schema. For example, I have domain XMLs on my laptop that I created long time ago, and RHEL-7.0 libvirt was pretty happy with them. But, at the moment, RHEL-7.3 libvirt complains about them (with the above message) whenever I edit them with "virsh edit". The problem is, the error message doesn't state exactly which element of the domain XML is deemed problematic -- that would help us either fix the domain XML, or fix the schema.
(In reply to Laszlo Ersek from comment #77)
> However, this BZ is about iPXE. Thus, after the above sanity-checking with
> the builtin driver (= VirtioNetDxe), can you please remove the <rom> element
> altogether from your domain XML, and retest?

Laszlo, thank you for the detailed info.
Retested with ipxe-roms-qemu-20160127-1.git6366fa7a.el7.noarch: the guest can PXE boot via UEFI and install the OS successfully. For the different model types (virtio, rtl8139 and e1000), all the results are as expected.
Filed Bug 1348045: virt-xml-validate cannot validate rom file='', which is used to select the OVMF builtin driver for virtio-net.
(In reply to yalzhang from comment #80)
> (In reply to Laszlo Ersek from comment #77)
>
> > However, this BZ is about iPXE. Thus, after the above sanity-checking with
> > the builtin driver (= VirtioNetDxe), can you please remove the <rom> element
> > altogether from your domain XML, and retest?
>
> Laszlo, thank you for the detailed info.
> Retested with ipxe-roms-qemu-20160127-1.git6366fa7a.el7.noarch: the guest can
> PXE boot via UEFI and install the OS successfully. For the different model
> types (virtio, rtl8139 and e1000), all the results are as expected.

Thank you. Do you intend to set this BZ to VERIFIED?
As comment 80 indicated, all the results are as expected. Setting the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2214.html