Created attachment 1156523 [details]
screen log of node-1 loop-booting with the longer chainloaded ipxe config file

Description of problem:
I encountered this issue while trying to boot a qemu virt host via the OpenStack Ironic project. It seems that chainloading a long enough ipxe config file over http causes the boot to hang.

I was able to capture the http and dhcp traffic between host and guest, both with a "hanging" and with a booting guest system. The guest and host are the same in both scenarios: the host is an F23 virtualbox instance and the guest is a qemu guest inside the host (see the attached virsh dumpxml).

From this traffic log my impression is that if the chainloaded configuration file is too big (not fitting in a single http response packet), packets with wrong checksums come from the guest system while it downloads the rest of the config file over http. The traffic is eventually cut off by httpd, leaving the guest waiting in a loop forever. If the config file is reduced in size (so that it fits in a single http response packet), the guest system boots with no issue.

Version-Release number of selected component (if applicable):
qemu-common-2.4.1-8.fc23.x86_64
ipxe-roms-qemu-20150407-3.gitdc795b9f.fc23.noarch
libvirt-daemon-driver-qemu-1.2.18.2-3.fc23.x86_64
qemu-system-x86-2.4.1-8.fc23.x86_64
qemu-img-2.4.1-8.fc23.x86_64
qemu-kvm-2.4.1-8.fc23.x86_64

How reproducible:
Always

Steps to Reproduce:
## through my OpenStack deployment
0. deploy a Devstack env with IRONIC_IPXE_ENABLED=True (see my local.conf attached)
1. ironic node-set-provision-state node-1 manage
2. ironic node-set-provision-state node-1 provide
3. booting node-1 hangs (ironic node-list shows node-1 in clean-wait state forever)

## through manipulating the chainloaded ipxe config file
0. use the attached longer chainloading ipxe file (in my case /opt/stack/data/ironic/httpboot/pxelinux.cfg/52-54-00-45-4a-d6)
1. virsh reset node-1

Actual results:
node-1 loops downloading the chainloaded ipxe file forever

Expected results:
node-1 manages to download a chainloaded ipxe file even if it is larger than a single http response packet

Additional info:
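As a rough illustration of the size threshold described above, the following Python sketch estimates how many TCP segments an http response would need. It is only a back-of-the-envelope check under stated assumptions: a standard 1500-byte Ethernet MTU, IPv4 and TCP headers without options, and headers and body sent back to back. The function names and the example header size are hypothetical and not taken from the captures.

```python
import math

# Assumptions (not measured from the attached pcaps):
ETHERNET_MTU = 1500                    # standard Ethernet MTU in bytes
IP_TCP_HEADERS = 40                    # IPv4 (20) + TCP (20), no options
MSS = ETHERNET_MTU - IP_TCP_HEADERS    # 1460 bytes of TCP payload per segment


def segments_needed(header_bytes: int, body_bytes: int, mss: int = MSS) -> int:
    """Estimate how many TCP segments carry an http response.

    header_bytes: size of the http status line and response headers
    body_bytes:   size of the ipxe config file being served
    """
    total = header_bytes + body_bytes
    # Even an empty response occupies one segment.
    return max(1, math.ceil(total / mss))


def fits_single_segment(header_bytes: int, body_bytes: int, mss: int = MSS) -> bool:
    """True if the whole response would fit in one TCP segment."""
    return segments_needed(header_bytes, body_bytes, mss) == 1


if __name__ == "__main__":
    # Hypothetical numbers: ~300 bytes of response headers.
    for body in (1000, 1200, 4000):
        print(body, segments_needed(300, body), fits_single_segment(300, body))
```

With TCP options such as timestamps enabled, the usable payload per segment is smaller than 1460 bytes, so the real threshold may be somewhat lower than this estimate suggests.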
Created attachment 1156524 [details] tcpdump pcap of the guest failing to chainload longer ipxe config file see traffic around packets 70--90
Created attachment 1156526 [details] tcpdump pcap of the guest able to chainload shorter ipxe config see packet #77
Created attachment 1156528 [details] screen log of node-1 booting with chainloaded shorter ipxe config
Created attachment 1156543 [details] node-1 virsh dumpxml
Created attachment 1156563 [details] my devstack local.conf
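For anyone who wants to double-check the "wrong checksum" packets in the failing capture by hand, here is a minimal Python sketch of the Internet checksum (RFC 1071), which is the one's-complement sum TCP uses. It is only the core sum: verifying a real TCP segment also requires prepending the TCP pseudo-header (source and destination address, protocol, TCP length), which is left out here. The function name is hypothetical. Note also that bad checksums shown by tcpdump can be an artifact of checksum offload rather than real corruption, so this is a diagnostic aid, not proof of the root cause.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit one's-complement Internet checksum (RFC 1071)."""
    # Pad to an even number of bytes, as the algorithm works on 16-bit words.
    if len(data) % 2:
        data += b"\x00"

    total = 0
    for i in range(0, len(data), 2):
        # Combine two bytes into one big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the one's complement of the folded sum.
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Example data from RFC 1071, section 3.
    sample = bytes.fromhex("0001f203f4f5f6f7")
    print(hex(internet_checksum(sample)))
```

A segment whose stored checksum is correct sums to zero when the checksum field is included in the data passed to the function.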
FYI: in RHOSP we had to update the iPXE ROM to 20160127-1.git6366fa7a.el7 to fix numerous similar issues.
Thanks Milan, great report. Would be good to test it on baremetal to see if that also affects the iPXE ROM that we chainload (the one in the /tftpboot) or if it just affects the iPXE QEMU ROMS.
(In reply to Dmitry Tantsur from comment #6)
> FYI: in RHOSP we had to update the iPXE ROM to 20160127-1.git6366fa7a.el7 to
> fix numerous similar issues.

Unfortunately, that version doesn't work for me either.
I just built a newer version of ipxe in rawhide; it should install fine on older Fedora. Can someone give it a spin and see if the issue persists? Grab the RPMs with:

  koji download-build ipxe-20160622-1.git0418631.fc26
This message is a reminder that Fedora 23 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 23. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '23'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 23 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Since the NEEDINFO has gone unanswered for a while, closing as INSUFFICIENT_DATA. If anyone is still hitting this bug on F24+, please try one of the newer ipxe builds as suggested in comment #9.