Bug 502058
Summary: | qemu -no-kvm guest hangs during timer setup; works with noapic | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> | ||||
Component: | qemu | Assignee: | Justin M. Forbes <jforbes> | ||||
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 11 | CC: | dwmw2, gcosta, itamar, markmc, mgoldman, virt-maint | ||||
Target Milestone: | --- | Keywords: | Reopened | ||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2011-11-09 20:51:50 UTC | Type: | --- | ||||
Bug Blocks: | 480594 | ||||||
Attachments: |
|
Note these Ubuntu bugs:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/379000 (in qemu)
https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/320320 (in KVM, fixed)

Could you try with -no-acpi and qemu-kvm-0.10.4-4.fc11 ?
It should be easy enough to reproduce this outside of Koji by e.g. disabling access to /dev/kvm

(In reply to comment #2)
> It should be easy enough to reproduce this outside of Koji by e.g. disabling
> access to /dev/kvm
Brain fart. Just try with -no-kvm

(In reply to comment #2)
> Could you try with -no-acpi and qemu-kvm-0.10.4-4.fc11 ?
I haven't tried -no-acpi yet, but I *have* tried booting the guest kernel with the noapic option (NB: APIC not ACPI). This has in fact fixed the problem for me.

See also bug #502440

Okay, I can only reproduce this by running qemu -no-kvm inside a KVM guest. Race condition perhaps?

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

This affects running libguestfs/guestfish in Amazon EC2 instances. The problem is that because Amazon EC2 instances run inside Xen, KVM acceleration is not available, and so they hit this bug in QEMU tcg soft emulation.
Workaround: export LIBGUESTFS_APPEND=noapic
(Reported by Marek Goldmann, confirmed by RWMJ).
$ ssh oddthesis@****.compute-1.amazonaws.com
oddthesis@****.compute-1.amazonaws.com's password:
Last login: Wed Jul 15 07:01:35 2009 from ****
Appliance: JBoss Cloud appliance build environment
Version: 1.0.0.Beta6-1
[oddthesis@**** ~]$ cat /etc/fedora-release
Fedora release 11 (Leonidas)

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

This message is a reminder that Fedora 11 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 11. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '11'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 11's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 11 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.

Reopening this. In order to workaround bug 723822 which causes noapic to fail, we removed this option (so APIC is supposed to be enabled). The failures described in this bug occur intermittently (about 1 time in 10).
[00441ms] /usr/bin/qemu-kvm \
    -drive file=../images/test.iso,snapshot=on,if=virtio \
    -nodefconfig \
    -machine pc,accel=kvm:tcg \
    -nodefaults \
    -nographic \
    -m 500 \
    -no-reboot \
    -no-hpet \
    -device virtio-serial \
    -serial stdio \
    -chardev socket,path=../libguestfszIyAyK/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -kernel ../.guestfs-419/kernel.22990 \
    -initrd ../.guestfs-419/initrd.22990 \
    -append 'panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux ' \
    -drive file=../.guestfs-419/root.22990,snapshot=on,if=virtio,cache=unsafe
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not open option rom 'sgabios.bin': No such file or directory
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.1.0-0.rc7.git0.2.fc17.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.6.1 20110824 (Red Hat 4.6.1-8) (GCC) ) #1 SMP Thu Sep 22 01:59:29 UTC 2011
[ 0.000000] Command line: panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
[ 0.000000] BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000001f3fd000 (usable)
[ 0.000000] BIOS-e820: 000000001f3fd000 - 000000001f400000 (reserved)
[ 0.000000] BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.4 present.
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x1f3fd max_arch_pfn = 0x400000000
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] found SMP MP-table at [ffff8800000fdaf0] fdaf0
[ 0.000000] init_memory_mapping: 0000000000000000-000000001f3fd000
[ 0.000000] RAMDISK: 1f2ec000 - 1f3f0000
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at 0000000000000000-000000001f3fd000
[ 0.000000] Initmem setup node 0 0000000000000000-000000001f3fd000
[ 0.000000] NODE_DATA [000000001f2d7000 - 000000001f2ebfff]
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000010 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal empty
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000010 -> 0x0000009b
[ 0.000000] 0: 0x00000100 -> 0x0001f3fd
[ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[ 0.000000] Intel MultiProcessor Specification v1.4
[ 0.000000] MPTABLE: OEM ID: BOCHSCPU
[ 0.000000] MPTABLE: Product ID: 0.1
[ 0.000000] MPTABLE: APIC at: 0xFEE00000
[ 0.000000] Processor #0 (Bootup-CPU)
[ 0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] Processors: 1
[ 0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: 000000000009b000 - 000000000009c000
[ 0.000000] PM: Registered nosave memory: 000000000009c000 - 00000000000a0000
[ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
[ 0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
[ 0.000000] Allocating PCI resources starting at 1f400000 (gap: 1f400000:e0bc0000)
[ 0.000000] Booting paravirtualized kernel on bare hardware
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 476 pages/cpu @ffff88001f000000 s1918464 r8192 d23040 u2097152
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 125875
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux
[ 0.000000] Disabling memory control group subsystem
[ 0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 473228k/511988k available (5185k kernel code, 468k absent, 38292k reserved, 6577k data, 2784k init)
[ 0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000]     RCU dyntick-idle grace-period acceleration is enabled.
[ 0.000000]     RCU lockdep checking is enabled.
[ 0.000000] NR_IRQS:33024 nr_irqs:256 16
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [ttyS0] enabled
[ 0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.000000] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.000000] ... MAX_LOCK_DEPTH: 48
[ 0.000000] ... MAX_LOCKDEP_KEYS: 8191
[ 0.000000] ... CLASSHASH_SIZE: 4096
[ 0.000000] ... MAX_LOCKDEP_ENTRIES: 16384
[ 0.000000] ... MAX_LOCKDEP_CHAINS: 32768
[ 0.000000] ... CHAINHASH_SIZE: 16384
[ 0.000000] memory used by lock dependency info: 6367 kB
[ 0.000000] per task-struct memory footprint: 2688 bytes
[ 0.000000] Fast TSC calibration using PIT
[ 0.000000] Detected 2480.298 MHz processor.
[ 0.000490] Calibrating delay loop (skipped), value calculated using timer frequency.. 4960.59 BogoMIPS (lpj=2480298)
[ 0.000999] pid_max: default: 32768 minimum: 301
[ 0.000999] Security Framework initialized
[ 0.000999] SELinux: Disabled at boot.
[ 0.000999] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.000999] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.000999] Mount-cache hash table entries: 256
[ 0.000999] Initializing cgroup subsys cpuacct
[ 0.000999] Initializing cgroup subsys memory
[ 0.000999] Initializing cgroup subsys devices
[ 0.000999] Initializing cgroup subsys freezer
[ 0.000999] Initializing cgroup subsys net_cls
[ 0.000999] Initializing cgroup subsys blkio
[ 0.000999] Initializing cgroup subsys perf_event
[ 0.000999] mce: CPU supports 10 MCE banks
[ 0.000999] SMP alternatives: switching to UP code
[ 0.000999] Freeing SMP alternatives: 12k freed
[ 0.000999] ftrace: allocating 25829 entries in 102 pages
[ 0.000999] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.000999] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 0.000999] ...trying to set up timer (IRQ0) through the 8259A ...
[ 0.000999] ..... (found apic 0 pin 2) ...
[ 0.000999] ....... failed.
[ 0.000999] ...trying to set up timer as Virtual Wire IRQ...
[ 0.000999] ..... failed.
[ 0.000999] ...trying to set up timer as ExtINT IRQ...
[ 0.000999] ..... failed :(.
[ 0.000999] Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
[ 0.000999]
[ 0.000999] Pid: 1, comm: swapper Not tainted 3.1.0-0.rc7.git0.2.fc17.x86_64 #1
[ 0.000999] Call Trace:
[ 0.000999] [<ffffffff814f9d6d>] panic+0xa0/0x1b9
[ 0.000999] [<ffffffff81d636a5>] setup_IO_APIC+0x2df/0x761
[ 0.000999] [<ffffffff81d602f5>] native_smp_prepare_cpus+0x2e2/0x356
[ 0.000999] [<ffffffff81d53c48>] kernel_init+0x8b/0x159
[ 0.000999] [<ffffffff8150dc04>] kernel_thread_helper+0x4/0x10
[ 0.000999] [<ffffffff81505074>] ? retint_restore_args+0x13/0x13
[ 0.000999] [<ffffffff81d53bbd>] ? start_kernel+0x3ea/0x3ea
[ 0.000999] [<ffffffff8150dc00>] ? gs_change+0x13/0x13
[ 0.000999] Rebooting in 1 seconds..
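
Since the failure is intermittent, it can help to detect it mechanically when collecting console output from many boots. The following is only a minimal sketch: the marker strings are copied from the log above, while the function name `find_timer_failures` and the timestamp pattern are assumptions made for illustration and are not part of libguestfs or qemu.

```python
import re
from typing import List

# Marker strings copied from the console output quoted in this bug.
TIMER_FAILURE_MARKERS = (
    "MP-BIOS bug: 8254 timer not connected to IO-APIC",
    "IO-APIC + timer doesn't work!",
)

# Matches a leading printk timestamp such as "[    0.000999] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def find_timer_failures(console_log: str) -> List[str]:
    """Return the log lines (timestamps removed) that contain a known marker."""
    hits = []
    for raw_line in console_log.splitlines():
        line = _TIMESTAMP.sub("", raw_line.strip())
        if any(marker in line for marker in TIMER_FAILURE_MARKERS):
            hits.append(line)
    return hits
```

A boot whose console output yields a non-empty list hit the timer setup problem; an empty list only means these particular messages were not seen.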
2:qemu-kvm-0.15.0-4.fc17.x86_64
kernel-3.1.0-0.rc7.git0.2.fc17.x86_64

This is with TCG, not KVM, in case that isn't clear.

After some examination of the code, this turns out to be a known problem with the code that tests for buggy timers. This code is not necessary when running in qemu, and it gets confused because it tries to do accurate timing checks which sometimes fail in virt. For more information, see: https://bugzilla.redhat.com/show_bug.cgi?id=698842#c8

Adding kernel no_timer_check option appears to fix the problem for me, but I am still doing testing.

Added this commit to libguestfs to work around this issue: http://git.annexia.org/?p=libguestfs.git;a=commitdiff;h=322106521f546d7c70c5a38255db7d243a456a6b

Okay to close this?

Yup, I'll close it, thanks. Worth remembering that ALL code in the kernel that tries to test timers / calibrate timing loops, is suspect in a virt context! |
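
For anyone launching qemu by hand, the no_timer_check workaround from the comments above amounts to adding one word to the guest kernel command line. The sketch below is not the libguestfs commit linked above; `add_kernel_option` is a hypothetical helper that adds an option to an `-append` string without duplicating it.

```python
def add_kernel_option(append: str, option: str = "no_timer_check") -> str:
    """Return the kernel command line with `option` added once."""
    words = append.split()
    if option not in words:
        words.append(option)
    return " ".join(words)


if __name__ == "__main__":
    # Command line taken from the qemu invocation quoted in this bug.
    base = ("panic=1 console=ttyS0 udevtimeout=300 acpi=off printk.time=1 "
            "cgroup_disable=memory selinux=0 guestfs_verbose=1 TERM=linux")
    print(add_kernel_option(base))
```

The resulting string would be passed as the argument to qemu's -append option. The same helper works for the earlier noapic workaround by passing `option="noapic"`.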
Created attachment 344993 [details]
build log

When you boot the guest, it hangs at the following point in the boot:

/usr/bin/qemu-kvm
/usr/bin/qemu-kvm -drive file=test.img -m 384 -no-reboot -kernel /builddir/build/BUILD/libguestfs-1.0.29/vmlinuz.fedora-12.x86_64 -initrd /builddir/build/BUILD/libguestfs-1.0.29/initramfs.fedora-12.x86_64.img -append 'panic=1 console=ttyS0 guestfs=10.0.2.4:6666 guestfs_verbose=1' -nographic -serial stdio -net channel,6666:unix:/tmp/libguestfsdsLp3s/sock,server,nowait -net user,vlan=0 -net nic,model=virtio,vlan=0
open /dev/kvm: No such file or directory
Could not initialize KVM, will disable KVM support
[...]
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/0x0 -> Node 0
SMP alternatives: switching to UP code
ACPI: Core revision 20081204
ftrace: converting mcount calls to 0f 1f 44 00 00
ftrace: allocating 18880 entries in 149 pages
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 0) ...
....... failed.
...trying to set up timer as Virtual Wire IRQ...

guest kernel 2.6.29.3-155.fc11.x86_64
qemu-kvm-0.10-16.fc11.x86_64
bochs-bios 2.3.8-0.6.git04387139e3b.fc11

NB: This is happening with software emulation - on a machine that doesn't have KVM.
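
Because the hang only shows up under software emulation, it may be useful to know in advance whether a host will fall back to TCG. The following is a rough sketch only: it approximates the check by testing whether the /dev/kvm device node is readable and writable, which is an assumption and not what qemu itself does (qemu opens the device and talks to the kernel module). The function names are hypothetical.

```python
import os


def kvm_usable(device: str = "/dev/kvm") -> bool:
    """True if the device node exists and is readable and writable by us."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


def expected_accelerator(device: str = "/dev/kvm") -> str:
    """Guess which accelerator qemu would end up using on this host."""
    return "kvm" if kvm_usable(device) else "tcg"


if __name__ == "__main__":
    print("expected accelerator:", expected_accelerator())
```

On hosts where this reports "tcg" (for example the Koji builders or Xen-based EC2 instances mentioned in this report), the timer setup failure described here is possible.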