1250874 – F23 Alpha RC2 Cloud Base image doesn't boot

Summary: F23 Alpha RC2 Cloud Base image doesn't boot

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	syslinux
Sub Component:
Version:	23
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Peter Jones
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-08-06 08:12 UTC by Mike Ruckman
Modified:	2015-10-19 22:27 UTC
CC List:	14 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-10-19 22:27:54 UTC
Type:	Bug
Embargoed:
Dependent Products:

Description Mike Ruckman 2015-08-06 08:12:08 UTC

Description of problem:
Booting the base F23 Alpha RC2 cloud image fails both on OpenStack and locally.

Version-Release number of selected component (if applicable):
Fedora-Cloud-Base-23_Alpha-20150805.x86_64.qcow2

How reproducible:
Always

Steps to Reproduce:
1. Attempt to boot the cloud image

Actual results:
Instance is "paused"

Expected results:
Instance boots

Additional info:

Logs are here: http://fpaste.org/252090/14388476/
Still need to test on EC2

Comment 1 Fedora Blocker Bugs Application 2015-08-06 08:16:19 UTC

Proposed as a Blocker for 23-alpha by Fedora user roshi using the blocker tracking app because:

These images won't boot locally or with OpenStack, in v2 qcow or v3. It violates the following criterion: "All release-blocking images must boot in their supported configurations."

Comment 2 kushaldas@gmail.com 2015-08-06 10:09:28 UTC

I can verify this issue. Strangely it is booting when using qemu-kvm directly, but not with virt-install.

Comment 3 Kevin Fenzi 2015-08-06 14:55:02 UTC

I also see this in the Fedora infrastructure private cloud. :(
# virsh list | grep paus
 1754  instance-000068ab              paused
# cat /var/log/libvirt/qemu/instance-000068ab.log 
2015-08-06 05:28:26.961+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name instance-000068ab -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -cpu SandyBridge,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+dca,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 2c420820-f81f-4ebf-9de1-5d317cd21a8a -smbios type=1,manufacturer=Red Hat,product=OpenStack Compute,version=2014.1.4-5.el7ost,serial=44454c4c-4b00-1050-8057-b8c04f383432,uuid=2c420820-f81f-4ebf-9de1-5d317cd21a8a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000068ab.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/2c420820-f81f-4ebf-9de1-5d317cd21a8a/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=44 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:c3:80:a3,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/2c420820-f81f-4ebf-9de1-5d317cd21a8a/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:19 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
char device redirected to /dev/pts/19 (label charserial1)
KVM: entry failed, hardware error 0x7
EAX=0000ffe1 EBX=00008adb ECX=0000e827 EDX=0000e000
ESI=0000ffff EDI=00116fe1 EBP=00007b0f ESP=0000ffe4
EIP=0000e829 EFL=00000016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 000f0000 0000ffff 00009b00
SS =e000 000e0000 0000ffff 00009300
DS =e000 000e0000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 ffffffff 00c00000
TR =0008 00000580 00000067 00008b00
GDT=     00009400 0000002f
IDT=     00000000 0000ffff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=18 d7 00 00 66 e8 b8 a9 ff ff 66 83 c4 20 66 5b 66 c3 b0 e1 <6f> 11 00 00 00 00 00 0f 7b 00 00 48 e8 0f 00 05 e7 10 00 00 00 00 00 5d 00 45 ff 74 74 70

Did RC1 work, or any of the TCs? That might help us isolate where it started failing.
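
For readers following the register dump above, here is a minimal sketch of how the key fields can be read. It assumes the usual QEMU dump layout (segment lines are selector, base, limit, flags) and that the byte in angle brackets on the Code= line is the one at EIP. The function names are made up for this illustration and are not part of any tool mentioned in this bug.

```python
import re
from typing import Optional

CR0_PE = 1 << 0  # CR0 bit 0: Protection Enable


def in_real_mode(cr0: int) -> bool:
    """The CPU is in real mode when CR0.PE is clear."""
    return (cr0 & CR0_PE) == 0


def faulting_address(cs_base: int, eip: int) -> int:
    """Linear address of the current instruction: CS base plus EIP."""
    return cs_base + eip


def marked_code_byte(code_line: str) -> Optional[int]:
    """Return the byte wrapped in <..> on a 'Code=' line, if present."""
    match = re.search(r"<([0-9a-fA-F]{2})>", code_line)
    return int(match.group(1), 16) if match else None


if __name__ == "__main__":
    # Values copied from the log in this comment.
    cr0, cs_base, eip = 0x00000010, 0x000F0000, 0x0000E829
    print("real mode:", in_real_mode(cr0))
    print("linear address:", hex(faulting_address(cs_base, eip)))
    print("byte at EIP:", hex(marked_code_byte("Code=c3 b0 e1 <6f> 11 00") or 0))
```

With the values from the log, CR0=00000010 has the PE bit clear, and CS base 000f0000 plus EIP 0000e829 gives 0xfe829, which lies in the 0xF0000-0xFFFFF range conventionally occupied by the BIOS. That is consistent with a failure very early in boot, before any kernel code runs, but it does not by itself identify the cause.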

Comment 4 Petr Schindler 2015-08-06 15:02:06 UTC

+1 blocker from me.

Comment 5 Stephen Gallagher 2015-08-06 15:07:59 UTC

Yeah, unfortunately I have to vote +1 blocker as well.

Comment 6 Kevin Fenzi 2015-08-06 15:19:49 UTC

oh yeah, +1 blocker here too. ;( 

Answering my question above: none of TC1/TC2/RC1 had cloud images, so we can't use them to isolate it. ;(

Comment 7 Matthew Miller 2015-08-06 15:35:29 UTC

Hmmm, "KVM: entry failed, hardware error 0x7" also shows up in bug #1016748, but that one is about nested virt.

Comment 8 Matthew Miller 2015-08-06 15:52:29 UTC

I bet the difference between running qemu-kvm directly and the failure case is in the -cpu or -machine options. Someone want to start bisecting those? :)
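
A rough sketch of what bisecting the -cpu flags could look like, for anyone who wants to try it. It assumes that at most one flag is responsible and that the failure reproduces reliably. The `boots` callback is hypothetical: the caller would have to supply something that launches the guest with the given flags and reports whether it came up. Nothing here is taken from an existing tool.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def split_cpu_option(cpu_arg: str) -> Tuple[str, List[str]]:
    """Split a QEMU -cpu value into (model, [feature flags])."""
    model, *flags = cpu_arg.split(",")
    return model, flags


def bisect_flags(flags: Sequence[str],
                 boots: Callable[[Sequence[str]], bool]) -> Optional[str]:
    """Return the single flag that breaks boot, or None if none does.

    Assumes exactly one culprit: any flag set containing it fails,
    and any set without it boots.
    """
    candidates = list(flags)
    if not candidates or boots(candidates):
        return None  # nothing to bisect, or the full set already boots
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        if not boots(first_half):
            candidates = first_half  # culprit is in the first half
        else:
            candidates = candidates[len(first_half):]  # it is in the rest
    return candidates[0]


if __name__ == "__main__":
    # Shortened flag list for illustration; the predicate is a stand-in
    # that pretends "+smx" is the culprit. It is NOT a finding from this bug.
    model, flags = split_cpu_option("SandyBridge,+invpcid,+erms,+smx,+vmx")
    print(model, bisect_flags(flags, lambda fs: "+smx" not in fs))
```

Each round halves the candidate list, so the 30-odd flags in the log above would need about five or six boot attempts. If the guest still fails with no extra flags at all, the -cpu flags are not the cause and the -machine option (or something else entirely) would be the next thing to vary.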

Comment 9 Adam Williamson 2015-08-06 16:05:11 UTC

As described so far, clear +1 blocker.

Comment 10 Adrian Reber 2015-08-06 16:35:46 UTC

This sounds a lot like what I have seen here:

https://bugzilla.redhat.com/show_bug.cgi?id=1241159

Comment 11 Adam Williamson 2015-08-06 16:40:54 UTC

Yup, sure does, thanks for the note. More syslinux crap? Fun.

Comment 12 Adam Williamson 2015-08-06 16:45:32 UTC

The difference between syslinux 6.03-2.fc22 and 6.03-5.fc23 is more or less GCC 4 vs. GCC 5, the 'Harden All Packages' change, and the fix for 1234653 that we backported.

The CFLAGS for both compiles look to be identical to me.

Comment 13 Matthew Miller 2015-08-06 17:15:01 UTC

My suggestion is to switch to grub2 for the alpha, and if extlinux issues can be addressed before beta, put it back to that, but if not, leave as grub2 (and work on reducing grub2 packaging issues like fedora-logos dependency).

Comment 14 Mike Ruckman 2015-08-06 18:12:25 UTC

So it seems to boot if I manually install grub2. It's a start.

Comment 15 Matthew Miller 2015-08-07 15:23:30 UTC

So, we now have grub2-based RC2.2 images, and they work in EC2 and OpenStack.

Comment 16 Mike Ruckman 2015-08-07 16:55:42 UTC

All my testing indicates things are good to go for Alpha.

Comment 17 Mike Ruckman 2015-08-07 17:09:29 UTC

Discussed at the 2015-08-06 Go/No-Go meeting. Accepted as a blocker.

Comment 18 Adam Williamson 2015-08-07 19:36:51 UTC

Setting back to ASSIGNED, but dropping blocker status, as the Alpha RC2.2 images are verified to boot.

Should this be re-assigned to syslinux?

Comment 19 Adam Williamson 2015-10-15 23:18:47 UTC

So I did a build of syslinux with a couple of patches from the upstream mailing list, and it makes livecd-iso-to-disk work. It may also solve the cloud case. Does someone want to try building an image using the syslinux packages from this scratch build?

http://koji.fedoraproject.org/koji/taskinfo?taskID=11466483

Comment 20 Adrian Reber 2015-10-16 06:44:16 UTC

The problems I have reported in https://bugzilla.redhat.com/show_bug.cgi?id=1241159 are fixed with the version of syslinux at

http://koji.fedoraproject.org/koji/taskinfo?taskID=11466483

My Fedora 23 VM can now boot again with syslinux (without crashing KVM).

Comment 21 Adam Williamson 2015-10-19 22:27:54 UTC

OK, so since we're sticking with grub2 for 23 and we have a fixed syslinux for Rawhide, I figure we can close this; there's no work left.
