Bug 1261244
Summary: | F23 beta tc4 vagrant box yields crashes kvm and tcg qemu | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Dusty Mabe <dustymabe> |
Component: | qemu | Assignee: | Fedora Virtualization Maintainers <virt-maint> |
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Priority: | unspecified |
Version: | 23 | CC: | amit.shah, berrange, cfergeau, clalancette, crobinso, dustymabe, dwmw2, ehabkost, extras-orphan, itamar, jsullivan3, kraxel, markmc, pbonzini, quintela, rjones, virt-maint |
Hardware: | Unspecified | OS: | Unspecified |
Doc Type: | Bug Fix | Type: | Bug |
Last Closed: | 2015-09-16 16:11:29 UTC | | |
Description
Dusty Mabe
2015-09-09 03:28:25 UTC
I can reproduce on f23 host too... I'll poke at it

Non-vagrant reproducer:

    wget https://dl.fedoraproject.org/pub/alt/stage/23_Beta_TC4/Cloud_Images/x86_64/Images/Fedora-Cloud-Base-Vagrant-23_Beta_TC4-20150907.x86_64.vagrant-libvirt.box
    tar -xvf Fedora-Cloud-Base-Vagrant-23_Beta_TC4-20150907.x86_64.vagrant-libvirt.box
    qemu-kvm -machine pc,accel=kvm -m 2048 -display sdl box.img

Fails almost immediately with an 'emulation failure'. However if I up the memory to -m 4096, it doesn't throw an error, but the boot hangs after printing the Syslinux copyright banner.

Using accel=tcg -m 512 crashes with:

    qemu: fatal: Trying to execute code outside RAM or ROM at 0x000000002ef000f8

accel=tcg -m 2048 and -m 4096 seem to hang at the Syslinux banner as well.

All the above commands work for at least starting a kernel boot of the (different) Fedora-Cloud-Base image from here:

    https://dl.fedoraproject.org/pub/alt/stage/23_Beta_TC4/Cloud_Images/x86_64/Images/Fedora-Cloud-Base-23_Beta_TC4-20150907.x86_64.qcow2

Tried with F23 qemu-kvm-2.4.0-2.fc23.x86_64 and the F21 qemu-2.1.3-10.fc21 compiled locally and both reproduced. All f23 vagrant images seem to have the issue, but the f22 GA vagrant images work fine.

So my guess is that either some syslinux change is tickling a bug in qemu or maybe seabios, or the f23 images are actually bogus.

Dusty, how is the vagrant disk image different from the cloud base image? Does the cloud image use syslinux or grub?

Gerd, Paolo, any suggestions how to debug this further?

(In reply to Cole Robinson from comment #2)
> Dusty, how is the vagrant disk image different from the cloud base image?
> Does the cloud image use syslinux or grub?
>

The vagrant image uses the cloud kickstart as a base (so they are very similar, but see changes in links below). Unfortunately in this case it looks like cloud base was switched to use grub for f23 while the vagrant box is still using extlinux. This could be the cause of some of the issues.
https://git.fedorahosted.org/cgit/spin-kickstarts.git/tree/fedora-cloud-base.ks?h=f23
https://git.fedorahosted.org/cgit/spin-kickstarts.git/tree/fedora-cloud-base-vagrant.ks?h=f23

I am going to get a rebuild of the vagrant box with grub rather than extlinux. If that doesn't show any failures then is this issue of concern?

In other words, if we somehow managed to throw garbage at qemu/kvm is it still considered a bug or is it invalid?

(In reply to Dusty Mabe from comment #4)
>
> In other words, if we somehow managed to throw garbage at qemu/kvm is it
> still considered a bug or is it invalid?

I'm not an expert here but it may or may not be a qemu bug... really depends on if what the guest is doing makes sense or not. If the guest is completely legitimate then this is probably a qemu/seabios issue that needs solving at some point, but if the image is messed up or syslinux is busted then qemu/kvm falling over like this might be considered fine.

Would need someone like Gerd or Paolo who understand this stuff better to chime in. But I'm guessing the quickest path to 'fixing' this WRT vagrant usage is to switch to what the cloud images are doing for booting.

> EAX=00000020 EBX=0010e704 ECX=000000fd EDX=2ef000f8
> ESI=00e09241 EDI=00116fe1 EBP=00007b0e ESP=0032afdc
> EIP=d35f676c EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0028 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CS =0020 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0028 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0028 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000

EIP looks like it jumped into nowhere. Paging is not yet enabled, so there is nothing at that address. So kvm most likely barfs on trying to emulate an invalid instruction.

Segments look like the bootloader (i.e. extlinux) is running at that point. Unlikely it is something in qemu.
I'd guess either extlinux does something strange (possibly due to toolchain issues), or the x86 emulation in the kernel is buggy and extlinux trips over it.

Hmm, just seeing tcg fails too, for the same reason (try execute code outside ram). Makes extlinux being buggy more likely. Does the same extlinux version work on real hardware?

So I'm going to say this was an issue on our end. We made the cloud base image use grub and removed the 'helper code for extlinux in our %post in anaconda' but we didn't modify the vagrant image's kickstart (which inherits from cloud base) to use grub so it still used extlinux. This resulted in an image that was configured to use extlinux but didn't have any of the "helper code"[1] in place to make it happen.

I'm going to close this bug as invalid unless anyone thinks it should be left open.

[1] https://git.fedorahosted.org/cgit/spin-kickstarts.git/commit/?h=f23&id=bc1f075e4110c5bad913936036b335bd217f6624

Nah let's close it. Thanks for following up
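
Note on the TCG failure address: the "outside RAM or ROM" address reported above can be compared against the guest memory sizes that were tried with simple arithmetic. This is only a rough sketch and not something posted in the thread. The helper `outside_ram` is hypothetical, and it assumes guest RAM is a single flat range starting at address 0, ignoring ROM regions, the PCI hole, and memory mapped above 4 GiB.

```python
MIB = 1024 * 1024


def outside_ram(addr: int, ram_mib: int) -> bool:
    """Hypothetical helper: True if addr lies at or past the end of a
    flat RAM range [0, ram_mib MiB). Real guest memory maps are more
    complicated (ROM, PCI hole), so treat this as an approximation."""
    return addr >= ram_mib * MIB


# Address taken from the TCG message quoted in the report.
fault_addr = 0x2EF000F8

print(f"fault address is at about {fault_addr // MIB} MiB")
for ram_mib in (512, 2048, 4096):
    print(f"-m {ram_mib}: outside RAM = {outside_ram(fault_addr, ram_mib)}")
```

Under that simplification the address sits at roughly 751 MiB, so it is past the end of a 512 MiB guest but inside a 2048 MiB or 4096 MiB guest. That is at least consistent with the fatal error appearing only at -m 512 while the larger sizes hang at the Syslinux banner, though it does not by itself explain why extlinux jumped there.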