Bug 502058

Summary: qemu -no-kvm guest hangs during timer setup; works with noapic
Product: [Fedora] Fedora
Component: qemu
Version: 11
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Richard W.M. Jones <rjones>
Assignee: Justin M. Forbes <jforbes>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dwmw2, gcosta, itamar, markmc, mgoldman, virt-maint
Keywords: Reopened
Doc Type: Bug Fix
Last Closed: 2011-11-09 15:51:50 EST
Bug Blocks: 480594
Attachments:
Description Flags
build log none

Description Richard W.M. Jones 2009-05-21 13:05:20 EDT
Created attachment 344993 [details]
build log

When you boot the guest, it hangs at the following
point in the boot:

/usr/bin/qemu-kvm /usr/bin/qemu-kvm -drive file=test.img -m 384 -no-reboot -kernel /builddir/build/BUILD/libguestfs-1.0.29/vmlinuz.fedora-12.x86_64 -initrd /builddir/build/BUILD/libguestfs-1.0.29/initramfs.fedora-12.x86_64.img -append 'panic=1 console=ttyS0 guestfs=10.0.2.4:6666 guestfs_verbose=1' -nographic -serial stdio -net channel,6666:unix:/tmp/libguestfsdsLp3s/sock,server,nowait -net user,vlan=0 -net nic,model=virtio,vlan=0
open /dev/kvm: No such file or directory
Could not initialize KVM, will disable KVM support

  [...]

CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/0x0 -> Node 0
SMP alternatives: switching to UP code
ACPI: Core revision 20081204
ftrace: converting mcount calls to 0f 1f 44 00 00
ftrace: allocating 18880 entries in 149 pages
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 0) ...
....... failed.
...trying to set up timer as Virtual Wire IRQ...

guest kernel 2.6.29.3-155.fc11.x86_64
qemu-kvm-0.10-16.fc11.x86_64
bochs-bios 2.3.8-0.6.git04387139e3b.fc11

NB: This is happening with software emulation - on a
machine that doesn't have KVM.
Comment 1 Richard W.M. Jones 2009-05-21 13:09:07 EDT
Note these Ubuntu bugs:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/379000 (in qemu)
https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/320320 (in KVM, fixed)
Comment 2 Mark McLoughlin 2009-05-21 13:50:01 EDT
Could you try with -no-acpi and qemu-kvm-0.10.4-4.fc11 ?

It should be easy enough to reproduce this outside of Koji by e.g. disabling access to /dev/kvm
Comment 3 Mark McLoughlin 2009-05-21 13:52:56 EDT
(In reply to comment #2)

> It should be easy enough to reproduce this outside of Koji by e.g. disabling
> access to /dev/kvm  

Brain fart. Just try with -no-kvm
Comment 4 Richard W.M. Jones 2009-05-21 14:13:44 EDT
(In reply to comment #2)
> Could you try with -no-acpi and qemu-kvm-0.10.4-4.fc11 ?

I haven't tried -no-acpi yet, but I *have* tried booting the
guest kernel with the noapic option (NB: APIC not ACPI).  This
has in fact fixed the problem for me.
Comment 5 Mark McLoughlin 2009-05-25 06:19:40 EDT
See also bug #502440
Comment 6 Mark McLoughlin 2009-05-25 06:58:51 EDT
Okay, I can only reproduce this by running qemu -no-kvm inside a KVM guest. Race condition perhaps?
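
Since the failure looks timing-dependent, it may help to measure how often it occurs rather than rely on a single boot. The following is a minimal sketch, not something taken from this bug: count_failures is a hypothetical helper that runs any command a fixed number of times and prints how many runs exited non-zero. It assumes you supply your own wrapper command that boots the guest once and returns non-zero on a hang or panic.

```sh
# count_failures N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print the number of runs
# that exited with a non-zero status. Output of CMD is discarded.
count_failures() {
    n="$1"
    shift
    fails=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each non-zero exit counts as one failed boot attempt.
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    printf '%s\n' "$fails"
}
```

For example, "count_failures 20 ./boot-guest-once.sh" would report how many of 20 boots failed, where boot-guest-once.sh is an assumed script that launches qemu with -no-kvm and exits non-zero if the guest does not come up.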
Comment 7 Bug Zapper 2009-06-09 12:16:20 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 8 Richard W.M. Jones 2009-07-15 07:17:48 EDT
This affects running libguestfs/guestfish in Amazon EC2
instances.

The problem is that Amazon EC2 instances run inside Xen,
so KVM acceleration is not available and they hit this
bug in QEMU's TCG software emulation.

Workaround:
export LIBGUESTFS_APPEND=noapic

(Reported by Marek Goldmann, confirmed by RWMJ).
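
The workaround can be applied only where it is needed. Below is a rough sketch under stated assumptions: kvm_available and guestfs_append_for are hypothetical helper names, and the check simply tests whether the KVM device node (default /dev/kvm, the path qemu failed to open in the description) is readable and writable. It assumes noapic is still the appropriate option for the guest kernel in use.

```sh
# Hypothetical helper: succeed if the KVM device node is usable.
# The optional argument overrides the default path /dev/kvm.
kvm_available() {
    dev="${1:-/dev/kvm}"
    [ -r "$dev" ] && [ -w "$dev" ]
}

# Hypothetical helper: print the extra guest kernel options to use.
# Prints an empty line when KVM is usable, "noapic" when qemu would
# fall back to software emulation.
guestfs_append_for() {
    if kvm_available "$1"; then
        printf '%s\n' ""
    else
        printf '%s\n' "noapic"
    fi
}
```

Usage would look like: LIBGUESTFS_APPEND=$(guestfs_append_for); export LIBGUESTFS_APPEND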
Comment 9 Richard W.M. Jones 2009-07-15 07:20:19 EDT
$ ssh oddthesis@****.compute-1.amazonaws.com
oddthesis@****.compute-1.amazonaws.com's password: 
Last login: Wed Jul 15 07:01:35 2009 from ****
Appliance:  JBoss Cloud appliance build environment
Version:    1.0.0.Beta6-1


[oddthesis@**** ~]$ cat /etc/fedora-release 
Fedora release 11 (Leonidas)
Comment 10 Fedora Admin XMLRPC Client 2010-03-09 11:53:45 EST
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 11 Fedora Admin XMLRPC Client 2010-03-09 12:17:25 EST
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 12 Bug Zapper 2010-04-27 10:26:41 EDT
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 13 Bug Zapper 2010-06-28 08:38:34 EDT
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 14 Richard W.M. Jones 2011-09-25 06:34:20 EDT
Reopening this.  In order to work around bug 723822, which
causes noapic to fail, we removed this option (so APIC is
supposed to be enabled).

The failures described in this bug occur intermittently
(about 1 time in 10).

[00441ms] /usr/bin/qemu-kvm \
    -drive file=../images/test.iso,snapshot=on,if=virtio \
    -nodefconfig \
    -machine pc,accel=kvm:tcg \
    -nodefaults \
    -nographic \
    -m 500 \
    -no-reboot \
    -no-hpet \
    -device virtio-serial \
    -serial stdio \
    -chardev socket,path=../libguestfszIyAyK/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -kernel ../.guestfs-419/kernel.22990 \
    -initrd ../.guestfs-419/initrd.22990 \
    -append 'panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux ' \
    -drive file=../.guestfs-419/root.22990,snapshot=on,if=virtio,cache=unsafe
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not open option rom 'sgabios.bin': No such file or directory
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.1.0-0.rc7.git0.2.fc17.x86_64 (mockbuild@x86-04.phx2.fedoraproject.org) (gcc version 4.6.1 20110824 (Red Hat 4.6.1-8) (GCC) ) #1 SMP Thu Sep 22 01:59:29 UTC 2011
[    0.000000] Command line: panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux 
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
[    0.000000]  BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000001f3fd000 (usable)
[    0.000000]  BIOS-e820: 000000001f3fd000 - 000000001f400000 (reserved)
[    0.000000]  BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI 2.4 present.
[    0.000000] No AGP bridge found
[    0.000000] last_pfn = 0x1f3fd max_arch_pfn = 0x400000000
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] found SMP MP-table at [ffff8800000fdaf0] fdaf0
[    0.000000] init_memory_mapping: 0000000000000000-000000001f3fd000
[    0.000000] RAMDISK: 1f2ec000 - 1f3f0000
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at 0000000000000000-000000001f3fd000
[    0.000000] Initmem setup node 0 0000000000000000-000000001f3fd000
[    0.000000]   NODE_DATA [000000001f2d7000 - 000000001f2ebfff]
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000010 -> 0x00001000
[    0.000000]   DMA32    0x00001000 -> 0x00100000
[    0.000000]   Normal   empty
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[2] active PFN ranges
[    0.000000]     0: 0x00000010 -> 0x0000009b
[    0.000000]     0: 0x00000100 -> 0x0001f3fd
[    0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[    0.000000] Intel MultiProcessor Specification v1.4
[    0.000000] MPTABLE: OEM ID: BOCHSCPU
[    0.000000] MPTABLE: Product ID: 0.1         
[    0.000000] MPTABLE: APIC at: 0xFEE00000
[    0.000000] Processor #0 (Bootup-CPU)
[    0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
[    0.000000] Processors: 1
[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: 000000000009b000 - 000000000009c000
[    0.000000] PM: Registered nosave memory: 000000000009c000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
[    0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
[    0.000000] Allocating PCI resources starting at 1f400000 (gap: 1f400000:e0bc0000)
[    0.000000] Booting paravirtualized kernel on bare hardware
[    0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:1 nr_node_ids:1
[    0.000000] PERCPU: Embedded 476 pages/cpu @ffff88001f000000 s1918464 r8192 d23040 u2097152
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 125875
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux 
[    0.000000] Disabling memory control group subsystem
[    0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Memory: 473228k/511988k available (5185k kernel code, 468k absent, 38292k reserved, 6577k data, 2784k init)
[    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000]         RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000]         RCU lockdep checking is enabled.
[    0.000000] NR_IRQS:33024 nr_irqs:256 16
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [ttyS0] enabled
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.000000] ... MAX_LOCK_DEPTH:          48
[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
[    0.000000] ... CLASSHASH_SIZE:          4096
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     16384
[    0.000000] ... MAX_LOCKDEP_CHAINS:      32768
[    0.000000] ... CHAINHASH_SIZE:          16384
[    0.000000]  memory used by lock dependency info: 6367 kB
[    0.000000]  per task-struct memory footprint: 2688 bytes
[    0.000000] Fast TSC calibration using PIT
[    0.000000] Detected 2480.298 MHz processor.
[    0.000490] Calibrating delay loop (skipped), value calculated using timer frequency.. 4960.59 BogoMIPS (lpj=2480298)
[    0.000999] pid_max: default: 32768 minimum: 301
[    0.000999] Security Framework initialized
[    0.000999] SELinux:  Disabled at boot.
[    0.000999] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.000999] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.000999] Mount-cache hash table entries: 256
[    0.000999] Initializing cgroup subsys cpuacct
[    0.000999] Initializing cgroup subsys memory
[    0.000999] Initializing cgroup subsys devices
[    0.000999] Initializing cgroup subsys freezer
[    0.000999] Initializing cgroup subsys net_cls
[    0.000999] Initializing cgroup subsys blkio
[    0.000999] Initializing cgroup subsys perf_event
[    0.000999] mce: CPU supports 10 MCE banks
[    0.000999] SMP alternatives: switching to UP code
[    0.000999] Freeing SMP alternatives: 12k freed
[    0.000999] ftrace: allocating 25829 entries in 102 pages
[    0.000999] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.000999] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[    0.000999] ...trying to set up timer (IRQ0) through the 8259A ...
[    0.000999] ..... (found apic 0 pin 2) ...
[    0.000999] ....... failed.
[    0.000999] ...trying to set up timer as Virtual Wire IRQ...
[    0.000999] ..... failed.
[    0.000999] ...trying to set up timer as ExtINT IRQ...
[    0.000999] ..... failed :(.
[    0.000999] Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug and send a report.  Then try booting with the 'noapic' option.
[    0.000999] 
[    0.000999] Pid: 1, comm: swapper Not tainted 3.1.0-0.rc7.git0.2.fc17.x86_64 #1
[    0.000999] Call Trace:
[    0.000999]  [<ffffffff814f9d6d>] panic+0xa0/0x1b9
[    0.000999]  [<ffffffff81d636a5>] setup_IO_APIC+0x2df/0x761
[    0.000999]  [<ffffffff81d602f5>] native_smp_prepare_cpus+0x2e2/0x356
[    0.000999]  [<ffffffff81d53c48>] kernel_init+0x8b/0x159
[    0.000999]  [<ffffffff8150dc04>] kernel_thread_helper+0x4/0x10
[    0.000999]  [<ffffffff81505074>] ? retint_restore_args+0x13/0x13
[    0.000999]  [<ffffffff81d53bbd>] ? start_kernel+0x3ea/0x3ea
[    0.000999]  [<ffffffff8150dc00>] ? gs_change+0x13/0x13
[    0.000999] Rebooting in 1 seconds..

2:qemu-kvm-0.15.0-4.fc17.x86_64
kernel-3.1.0-0.rc7.git0.2.fc17.x86_64

This is with TCG, not KVM, in case that isn't clear.
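
To spot this failure in captured console output, the messages in the log above can be matched directly. The following is a minimal sketch: classify_boot_log is a hypothetical helper that reads a boot log on stdin and prints "panic" if the IO-APIC timer panic is present, "suspect" if only the MP-BIOS timer warning appears (as in the original hang), and "ok" otherwise. The three labels are my own choice, not anything the kernel or libguestfs defines.

```sh
# Hypothetical helper: classify a guest boot log read from stdin.
# The matched strings are copied from the kernel messages quoted in
# this bug; the output labels are illustrative only.
classify_boot_log() {
    log=$(cat)
    case "$log" in
        *"IO-APIC + timer doesn't work"*)
            echo panic ;;
        *"8254 timer not connected to IO-APIC"*)
            echo suspect ;;
        *)
            echo ok ;;
    esac
}
```

For example, "classify_boot_log < console.log" prints one of the three labels for a saved console log (console.log is an assumed file name).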
Comment 15 Richard W.M. Jones 2011-09-26 08:32:19 EDT
After some examination of the code, this turns out to
be a known problem with the code that tests for buggy
timers.  This code is not necessary when running in
qemu, and it gets confused because it tries to do accurate
timing checks which sometimes fail in virt.  For more
information, see:

https://bugzilla.redhat.com/show_bug.cgi?id=698842#c8

Adding the kernel no_timer_check option appears to fix the
problem for me, but I am still testing.
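
For anyone building the guest kernel command line by hand, here is a small sketch of adding the option without duplicating it. add_kernel_option is a hypothetical helper of my own, not part of libguestfs: it takes the existing command line and one option, and prints the command line with the option appended only if it is not already present as a whole word.

```sh
# add_kernel_option CMDLINE OPTION
# Hypothetical helper: print CMDLINE with OPTION appended, unless
# OPTION already appears in it as a separate space-delimited word.
add_kernel_option() {
    cmdline="$1"
    opt="$2"
    case " $cmdline " in
        *" $opt "*)
            # Already present: leave the command line unchanged.
            printf '%s\n' "$cmdline" ;;
        *)
            # Append, inserting a separating space only if needed.
            printf '%s\n' "${cmdline:+$cmdline }$opt" ;;
    esac
}
```

The result can then be passed to qemu's -append, for example: -append "$(add_kernel_option 'panic=1 console=ttyS0' no_timer_check)"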
Comment 16 Richard W.M. Jones 2011-09-26 11:11:24 EDT
Added this commit to libguestfs to work around this issue:

http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=322106521f546d7c70c5a38255db7d243a456a6b
Comment 17 John Poelstra 2011-11-09 15:02:46 EST
Okay to close this?
Comment 18 Richard W.M. Jones 2011-11-09 15:51:50 EST
Yup, I'll close it, thanks.

Worth remembering that ALL code in the kernel that tries
to test timers or calibrate timing loops is suspect in a
virt context!