With qemu-kvm-rhev-1.5.3-60.el7_0.10.x86_64 and "virt_type = qemu" in nova.conf, instances fail to boot. Nova successfully starts a qemu-kvm process:

# ps -fe | grep instance-0000000c
qemu 27714 1 88 16:22 ? 00:01:59 /usr/libexec/qemu-kvm -name instance-0000000c -S -machine pc-i440fx-rhel7.0.0,accel=tcg,usb=off -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c6a76589-2713-4298-911c-dc03bd01e992 -smbios type=1,manufacturer=Fedora Project,product=OpenStack Nova,version=2014.2.1-7.el7ost,serial=622e4ef0-ebb5-48b5-a1d2-d8567c30ea7c,uuid=c6a76589-2713-4298-911c-dc03bd01e992 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-0000000c.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/c6a76589-2713-4298-911c-dc03bd01e992/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=25,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:93:bd:96,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/c6a76589-2713-4298-911c-dc03bd01e992/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

But the kernel gets stuck booting.
The last messages logged to the console are:

# nova console-log test1 | tail
[ 0.184010] Freeing SMP alternatives: 24k freed
[ 0.184010] ACPI: Core revision 20110623
[ 0.204522] ftrace: allocating 27027 entries in 106 pages
[ 0.219204] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.236014] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 0.236014] ...trying to set up timer (IRQ0) through the 8259A ...
[ 0.236014] ..... (found apic 0 pin 2) ...
[ 0.252015] ....... failed.
[ 0.252015] ...trying to set up timer as Virtual Wire IRQ...

After downgrading to qemu-kvm-rhev-1.5.3-60.el7_0.7.x86_64 (with no other changes), everything works as expected.
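When re-testing several qemu builds, it can help to recognize this particular stall automatically instead of eyeballing each console log. The following is only a rough sketch: the function name and the choice of marker string are my own assumptions, based solely on the last console line quoted above. It reports a stall when the final non-empty line of the log is the "Virtual Wire IRQ" timer message.

```python
# Marker taken from the last console line in this report; treating it as
# the signature of the hang is an assumption, not a documented kernel contract.
STALL_MARKER = "trying to set up timer as Virtual Wire IRQ"


def stuck_in_timer_check(console_log: str) -> bool:
    """Return True if the log ends at the Virtual Wire IRQ timer setup message."""
    # Ignore trailing blank lines so a final newline does not hide the marker.
    lines = [line for line in console_log.splitlines() if line.strip()]
    if not lines:
        return False
    return STALL_MARKER in lines[-1]
```

The text passed in would be the output of "nova console-log test1", captured once the instance has had enough time to finish booting.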
The specific failure you describe looks a lot like one that can be worked around by adding no_timer_check to the kernel command line (this is the default in modern kernels, but you don't mention which kernel version this is).
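To confirm whether a given guest actually boots with the parameter, one option is to inspect its kernel command line. A minimal sketch, assuming the command line is available as a string; the helper name has_kernel_param is made up for illustration:

```python
def has_kernel_param(cmdline: str, param: str = "no_timer_check") -> bool:
    """Return True if `param` appears as a whole word on the kernel command line."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than substrings to avoid matching inside another parameter.
    return param in cmdline.split()
```

Inside a running Linux guest the string can be read from /proc/cmdline, for example has_kernel_param(open("/proc/cmdline").read()).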
I'd also like to pimp qemu-sanity-check: http://people.redhat.com/~rjones/qemu-sanity-check/ It only has very minimal dependencies (just gcc, glibc-static and bash) and can test if a kernel is compatible with a qemu.
Ah, the no_timer_check was discussed here:

  https://bugs.launchpad.net/cirros/+bug/1312199

Speaking of no_timer_check, Daniel Berrange once pointed me to this commit[1] in upstream Nova:

  commit 6b86a61fee15ce1237303fab2f7896f8c3bcad47
  Author: Attila Fazekas <afazekas>
  Date:   Wed May 28 09:19:29 2014 +0200

      Use no_timer_check with soft-qemu

      The Linux kernel timer check not working properly when the hypervisor's thread preempted by the host CPU scheduler. The timer check is automatically disabled with other types of hypervisors including the hardware accelerated kvm, but timer_check is not disabled when qemu used without hardware acceleration.

      This issue is frequently mischaracterized as an SSH connectivity issue and causes rechecks and occasional boot failures.

      This change adds no_timer_check kernel parameter when we are using uec images with qemu.

      Closes-Bug: #1312199
      Change-Id: I3cfdfe9048fe219fc12cdac8a399b496f237e55e

[1] https://review.openstack.org/#/c/96090/
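For readers who have not looked at the review, the idea in that commit message can be sketched roughly as below. This is not the actual Nova code: the function name, the boots_external_kernel flag, and the base command line in the test are assumptions made up for illustration. It only shows the described rule: append no_timer_check when the guest runs under plain qemu (no hardware acceleration) and is booted from a separate kernel image, as uec images are.

```python
def build_kernel_cmdline(base_cmdline: str, virt_type: str,
                         boots_external_kernel: bool) -> str:
    """Return the guest kernel command line, adding no_timer_check for soft-qemu.

    Hypothetical helper; it mirrors the rule stated in the commit message,
    not the real implementation.
    """
    tokens = base_cmdline.split()
    needs_param = virt_type == "qemu" and boots_external_kernel
    # Only append once, so calling the helper repeatedly is harmless.
    if needs_param and "no_timer_check" not in tokens:
        tokens.append("no_timer_check")
    return " ".join(tokens)
```

Note that this only helps guests whose kernel command line is supplied by the host. Disk images that boot through their own bootloader need the parameter baked into the image.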
There are already bugs open against our guest images to add the no_timer_check parameter:

- https://bugzilla.redhat.com/show_bug.cgi?id=1144155
- https://bugzilla.redhat.com/show_bug.cgi?id=1147035

So maybe this is CLOSED NOTABUG, but there is a difference in behavior between these two qemu versions.
Miroslav, do you know if there were any changes that might account for this? I don't see anything obvious in the package changelog.
The changelog contains all changes made in qemu-kvm-rhev between the -7 and -10 versions. I suspect the vmstate_xhci_event patches are the culprit, but I do not know how they could cause this. Any ideas, Laszlo?
Nothing seems relevant. I suggest trying each official build in the interval, and then bisecting the "culprit build" patch by patch.
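The same search works at both levels (builds first, then the patches inside the culprit build). A minimal sketch, assuming the candidates are ordered oldest to newest, the first one is known good, the last one is known bad, and the breakage persists once introduced. The first_bad name and the is_bad callback are hypothetical; in practice is_bad would install the candidate, boot a test instance, and check whether the guest hangs.

```python
from typing import Callable, Sequence


def first_bad(candidates: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search for the first bad candidate.

    Assumes candidates[0] is good, candidates[-1] is bad, and that every
    candidate after the first bad one is also bad.
    """
    if len(candidates) < 2:
        raise ValueError("need at least one known-good and one known-bad candidate")
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return candidates[bad]
```

With only a handful of builds between -60.el7_0.7 and -60.el7_0.10, testing each one in order costs about the same. The binary search pays off in the second step, where the culprit build may carry many patches.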