Description of problem:
After updating to the recent version of libvirt etc. in F18, booting a virtual machine over PXE is unusably slow. Upgrading to F19 doesn't solve the issue. It was okay with the older versions.
Version-Release number of selected component (if applicable):
libvirt-daemon-1.0.5.2-1.fc19.x86_64
How reproducible:
100%
Steps to Reproduce:
1. try to boot a virtual machine over PXE (typically via bridge)
Actual results:
iPXE (TFTP communication) is unusably slow to initialize and fetch kernel and initrd
Expected results:
fast initialization and data fetching
Did you upgrade iPXE / QEMU when you noticed the slowdown? I'm pretty doubtful that libvirt itself can cause guest PXE performance to change in any measurable way.
There was no iPXE update/upgrade, but there were some QEMU updates/upgrades, I think. I don't know what the right component for this bug is, so I started with libvirt and I hope you guys will reassign it properly. Thanks
This seems to be a kernel issue -- works well with kernel-3.8.13-100.fc17 -- reassigning. Did something change in the TFTP stack recently?
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.11.1-200.fc19. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
Still not resolved, maybe even worse than with older kernels. This really is an annoying bug preventing F19 usage for virtualization in combination with PXE.
This turns out to very likely be a bug in iPXE. If I boot the virtual machine via PXE (takes ages) and then run a tftp client in the booted system and fetch vmlinuz and initrd.img from there, everything works as expected. However, it works well with old kernels, so it looks like a kernel change not reflected in iPXE or something like that.
Vratislav, can you please run
    grep . /sys/module/kvm_intel/parameters/*
I suspect this is caused by iPXE's usage of big real mode.
Created attachment 831511: kvm_intel_params from the not working (slow)
Indeed:
    /sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
    /sys/module/kvm_intel/parameters/unrestricted_guest:N
This is a pretty old machine BTW:
    /sys/module/kvm_intel/parameters/flexpriority:N
Comment on attachment 831511: kvm_intel_params from the not working (slow)
Wrong attachment, will put the proper one in comments.
==not working (slow) F19==
3.11.1-200.fc19.x86_64
/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/enable_apicv:N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs:N
/sys/module/kvm_intel/parameters/ept:N
/sys/module/kvm_intel/parameters/eptad:N
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:N
/sys/module/kvm_intel/parameters/nested:N
/sys/module/kvm_intel/parameters/ple_gap:0
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/unrestricted_guest:N
/sys/module/kvm_intel/parameters/vmm_exclusive:Y
/sys/module/kvm_intel/parameters/vpid:N
==working F19 with F17 kernel==
3.8.13-100.fc17.x86_64
/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/ept:N
/sys/module/kvm_intel/parameters/eptad:N
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:N
/sys/module/kvm_intel/parameters/nested:N
/sys/module/kvm_intel/parameters/ple_gap:0
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/unrestricted_guest:N
/sys/module/kvm_intel/parameters/vmm_exclusive:Y
/sys/module/kvm_intel/parameters/vpid:N
==not working (slow) F20==
3.11.7-300.fc20.x86_64
/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/enable_apicv:N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs:N
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/eptad:N
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:Y
/sys/module/kvm_intel/parameters/nested:N
/sys/module/kvm_intel/parameters/ple_gap:0
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/unrestricted_guest:N
/sys/module/kvm_intel/parameters/vmm_exclusive:Y
/sys/module/kvm_intel/parameters/vpid:Y
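For anyone comparing dumps like these by eye: a minimal Python sketch that parses `grep . /sys/module/kvm_intel/parameters/*` output and reports what differs between two hosts. The function names `parse_params` and `diff_params` are made up for illustration and are not part of any tool mentioned in this bug.

```python
PREFIX = "/sys/module/kvm_intel/parameters/"


def parse_params(dump):
    """Turn 'grep .' output into {parameter_name: value}.

    Tokens that are not '<PREFIX><name>:<value>' (kernel version strings,
    '==...==' headings) are skipped.
    """
    params = {}
    for token in dump.split():
        if not token.startswith(PREFIX):
            continue
        name, sep, value = token[len(PREFIX):].partition(":")
        if sep:
            params[name] = value
    return params


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that
    differs; None marks a parameter missing on that side."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

Run against the F19 dumps above (3.8.13 working vs. 3.11.1 slow), the only differences are `enable_apicv` and `enable_shadow_vmcs`, which exist only on 3.11 and are both `N`. So the module parameters alone do not explain the regression on that machine.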
Is there any boot option or something like that I could use to disable the problematic feature?
Yes, you can load KVM with "emulate_invalid_guest_state=N" to work around the problem. I reproduced your scenario with a 2MB vmlinuz and 12MB initrd.img. Here is my pxelinux.cfg/default file:
    default anaconda
    prompt 1
    timeout 10
    label local
    localboot 1
    label anaconda
    kernel kernels/vmlinuz
    initrd kernels/initrd.img
I have:
    unrestricted_guest=Y => ~1s
    emulate_invalid_guest_state=N unrestricted_guest=N => ~1s
    emulate_invalid_guest_state=Y unrestricted_guest=N => 10s
So it's not _unbearably slow_, but it is pretty slow indeed.
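A rough Python sketch for checking whether a host sits in the slow row of those measurements (emulate_invalid_guest_state=Y together with unrestricted_guest=N). The helper names `read_params` and `on_slow_path` are hypothetical, and this is only a heuristic: the 3.8.13 kernel earlier in this bug has the same two values and boots fast, so the check is meaningful only on kernels with the newer emulation code.

```python
from pathlib import Path


def read_params(base="/sys/module/kvm_intel/parameters"):
    """Read every kvm_intel parameter from sysfs into {name: value}.

    Returns an empty dict when the directory is missing, e.g. when the
    module is not loaded or the host is not Intel.
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    return {p.name: p.read_text().strip() for p in root.iterdir() if p.is_file()}


def on_slow_path(params):
    """True for the combination that was measured at about 10s in this bug."""
    return (params.get("emulate_invalid_guest_state") == "Y"
            and params.get("unrestricted_guest") == "N")


if __name__ == "__main__":
    print("slow combination" if on_slow_path(read_params()) else "not the slow combination")
```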
Is there a way to use some boot options instead of reloading the modules after each boot? And compared to your numbers, we have seen much worse cases -- loading vmlinuz and initrd.img in minutes instead of (tens of) seconds.
I have observed this as well trying to install RHEL 5.8 in a guest under Fedora 20. Symptoms were that the 5.8 ISO would boot in the guest and allow selecting the install mode, but the initrd would take many minutes to begin to load, and after the install the system would not accept keyboard input after boot (at least not for a long time -- I eventually got tired of waiting and trying). The same workaround seems to have fixed it: I added
    options kvm_intel emulate_invalid_guest_state=N
to /etc/modprobe.d/kvm_intel.conf, removed and reinserted the module, and the guest now appears to install normally.
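One thing to watch with the modprobe.d approach: the `options` line has to name the module as it appears under /sys/module, which is kvm_intel. As far as I know, a line naming a different module is simply never applied to kvm_intel, with no error. A small Python sketch for sanity-checking such a line; `parse_options_line` and `applies_workaround` are illustrative names, not an existing tool.

```python
def parse_options_line(line):
    """Parse 'options <module> key=value ...' into (module, {key: value}).

    Returns None for comments, blank lines and other directives.
    Dashes in the module name are normalised to underscores, since
    modprobe treats the two as equivalent.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 2 or fields[0] != "options":
        return None
    module = fields[1].replace("-", "_")
    opts = dict(f.split("=", 1) for f in fields[2:] if "=" in f)
    return module, opts


def applies_workaround(line):
    """True if the line sets emulate_invalid_guest_state=N for kvm_intel."""
    parsed = parse_options_line(line)
    if parsed is None:
        return False
    module, opts = parsed
    return module == "kvm_intel" and opts.get("emulate_invalid_guest_state") == "N"
```

After reloading the module, the effective value can be confirmed by reading /sys/module/kvm_intel/parameters/emulate_invalid_guest_state.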
There's a new ipxe update available, ipxe-20140303-1.gitff1e7fc7.fc20, but I'm guessing it doesn't help any. Paolo, is this truly an ipxe issue, or a kernel/kvm issue due to the slowness of emulating big real mode?
The kernel/KVM issue cannot really be solved except by upgrading the host. For Fedora I think this is CANTFIX.
(In reply to Paolo Bonzini from comment #17)
> The kernel/KVM issue cannot be really solved except by upgrading the host.
>
> For Fedora I think this is CANTFIX.
How can this be a CANTFIX if it is a regression? Isn't the big real mode emulation just broken?
> Isn't the big real mode emulation just broken? The older code was not accurate and it broke in other cases. In most cases we could live with the inaccuracy, but not always. I can look at it as an upstream project, but it's not something that the Fedora project can fix with a downstream-only patch. In any case, there's nothing to be fixed in iPXE.
(In reply to Paolo Bonzini from comment #19)
> > Isn't the big real mode emulation just broken?
>
> The older code was not accurate and it broke in other cases. In most cases
> we could live with the inaccuracy, but not always. I can look at it as an
> upstream project, but it's not something that the Fedora project can fix
> with a downstream-only patch.
Agreed.
> In any case, there's nothing to be fixed in iPXE.
I still don't get why a TFTP transfer in the iPXE "session" takes longer than the same transfer in the running system, but I'm not an expert.