Red Hat Bugzilla – Bug 1250874
F23 Alpha RC2 Cloud Base image doesn't boot
Last modified: 2015-10-19 18:27:54 EDT
Description of problem:
Booting the base F23 Alpha RC2 cloud image doesn't work either on OpenStack or locally.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Attempt to boot cloud image
Instance is "paused"
Logs are here: http://fpaste.org/252090/14388476/
Still need to test on EC2
Proposed as a Blocker for 23-alpha by Fedora user roshi using the blocker tracking app because:
These images won't boot locally or with OpenStack, in v2 qcow or v3. It violates the following criterion: "All release-blocking images must boot in their supported configurations."
I can verify this issue. Strangely, it boots when using qemu-kvm directly, but not with virt-install.
I see this also in the fedora infrastructure private cloud.
# virsh list | grep paus
1754 instance-000068ab paused
# cat /var/log/libvirt/qemu/instance-000068ab.log
2015-08-06 05:28:26.961+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name instance-000068ab -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -cpu SandyBridge,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+dca,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 2c420820-f81f-4ebf-9de1-5d317cd21a8a -smbios type=1,manufacturer=Red Hat,product=OpenStack Compute,version=2014.1.4-5.el7ost,serial=44454c4c-4b00-1050-8057-b8c04f383432,uuid=2c420820-f81f-4ebf-9de1-5d317cd21a8a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000068ab.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/2c420820-f81f-4ebf-9de1-5d317cd21a8a/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=44 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:c3:80:a3,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/2c420820-f81f-4ebf-9de1-5d317cd21a8a/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:19 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
char device redirected to /dev/pts/19 (label charserial1)
KVM: entry failed, hardware error 0x7
EAX=0000ffe1 EBX=00008adb ECX=0000e827 EDX=0000e000
ESI=0000ffff EDI=00116fe1 EBP=00007b0f ESP=0000ffe4
EIP=0000e829 EFL=00000016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 000f0000 0000ffff 00009b00
SS =e000 000e0000 0000ffff 00009300
DS =e000 000e0000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 ffffffff 00c00000
TR =0008 00000580 00000067 00008b00
GDT= 00009400 0000002f
IDT= 00000000 0000ffff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Code=18 d7 00 00 66 e8 b8 a9 ff ff 66 83 c4 20 66 5b 66 c3 b0 e1 <6f> 11 00 00 00 00 00 0f 7b 00 00 48 e8 0f 00 05 e7 10 00 00 00 00 00 5d 00 45 ff 74 74 70
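For readers unfamiliar with these dumps, here is a minimal sketch of how to read the register state above. It only uses values copied from the log (EIP, the CS base, and CR0); the function name `summarize` is made up for illustration. Bit 0 of CR0 is the protection-enable bit, and the linear instruction address is the CS base plus EIP.

```python
import re

# Three lines copied verbatim from the qemu log above.
DUMP = """\
EIP=0000e829 EFL=00000016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
CS =f000 000f0000 0000ffff 00009b00
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
"""

def summarize(dump):
    """Hypothetical helper: extract CPU mode and linear instruction address."""
    eip = int(re.search(r"\bEIP=([0-9a-fA-F]+)", dump).group(1), 16)
    cr0 = int(re.search(r"\bCR0=([0-9a-fA-F]+)", dump).group(1), 16)
    # The CS line holds: selector, base, limit, flags.
    cs = re.search(r"^CS\s*=\s*([0-9a-fA-F]+)\s+([0-9a-fA-F]+)", dump, re.M)
    cs_base = int(cs.group(2), 16)
    return {
        "protected_mode": bool(cr0 & 1),  # CR0.PE is bit 0
        "linear_ip": cs_base + eip,       # segment base + offset
    }

if __name__ == "__main__":
    info = summarize(DUMP)
    print(info["protected_mode"], hex(info["linear_ip"]))
```

With the values in this log, the guest is still in real mode (CR0.PE is clear) and executing at linear address 0xfe829, so the failure happens very early in boot. The dump alone does not say what caused it.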
Did rc1 work? or any of the tc's? That might help us isolate where it started failing...
+1 blocker from me.
Yeah, unfortunately I have to vote +1 blocker as well.
oh yeah, +1 blocker here too. ;(
Answering my question above, none of tc1/tc2/rc1 had cloud images, so we can't use them to isolate it. ;(
Hmmm, "KVM: entry failed, hardware error 0x7" also appears in bug #1016748, but that's for nested virt.
I bet the difference between qemu-kvm and the failure case is in the -cpu or -machine options. Someone want to start bisecting those? :)
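A rough sketch of how that bisection could be started: compare the -cpu and -machine values of a working and a failing command line, then split the failing -cpu feature list in half for each round. The helper names are invented for illustration, and the "working" command line in the usage example is an assumption, since the thread does not show the direct qemu-kvm invocation that booted.

```python
import shlex

def parse_qemu_args(cmdline):
    """Map each '-option' to the list of values it was given."""
    tokens = shlex.split(cmdline)
    opts = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("-"):
            # Treat the next token as this option's value unless it is an option itself.
            if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
                opts.setdefault(tok, []).append(tokens[i + 1])
                i += 2
                continue
            opts.setdefault(tok, []).append("")
        i += 1
    return opts

def diff_options(working, failing, keys=("-cpu", "-machine")):
    """Return only the selected options whose values differ."""
    a, b = parse_qemu_args(working), parse_qemu_args(failing)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

def split_cpu_flags(cpu_value):
    """Split 'Model,+f1,+f2,...' into the model and two halves of the flag list."""
    model, *flags = cpu_value.split(",")
    mid = len(flags) // 2
    return model, flags[:mid], flags[mid:]

if __name__ == "__main__":
    # Hypothetical working invocation; the failing one is abbreviated from the log.
    working = "qemu-kvm -m 1024 -cpu host -machine pc,accel=kvm -nodefaults"
    failing = ("qemu-kvm -m 4096 -cpu SandyBridge,+vmx,+smx,+est,+tm2 "
               "-machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -nodefaults")
    print(diff_options(working, failing))
    print(split_cpu_flags("SandyBridge,+vmx,+smx,+est,+tm2"))
```

Each bisection round would boot the image once with each half of the flag list and keep whichever half still reproduces the failure. The same approach works for the comma-separated -machine properties.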
As described so far, clear +1 blocker.
This sounds a lot like what I have seen here:
Yup, sure does, thanks for the note. More syslinux crap? Fun.
The difference between syslinux 6.03-2.fc22 and 6.03-5.fc23 is more or less GCC 4 vs. GCC 5, the 'Harden All Packages' change, and the fix for 1234653 that we backported.
The CFLAGS for both compiles look to be identical to me.
My suggestion is to switch to grub2 for the alpha, and if extlinux issues can be addressed before beta, put it back to that, but if not, leave as grub2 (and work on reducing grub2 packaging issues like fedora-logos dependency).
So it seems to boot if I manually install grub2. It's a start.
So, we now have grub2-based RC2.2 images, and they work in EC2 and OpenStack.
All my testing indicates things are good to go for Alpha.
Discussed in 2015-08-06 Go/No-Go meeting. Accepted as a blocker.
Setting back to ASSIGNED, but dropping blocker status, as the Alpha RC2.2 images are verified to boot.
Should this be re-assigned to syslinux?
So I did a build of syslinux with a couple of patches from the upstream mailing list, and it makes livecd-iso-to-disk work. It may also solve the cloud case. Does someone want to try building an image using the syslinux packages from this scratch build?
The problems I have reported in https://bugzilla.redhat.com/show_bug.cgi?id=1241159 are fixed with the version of syslinux at
My Fedora 23 VM can now boot again with syslinux (without crashing KVM).
OK, so since we're sticking with grub2 for 23 and we have a fixed syslinux for Rawhide, I figure we can close this; there's no work left.