Bug 857026
| Summary: | qemu on i686 hangs at boot; APIC emulation in qemu (or kernel) is broken | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> |
| Component: | qemu | Assignee: | Fedora Virtualization Maintainers <virt-maint> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 18 | CC: | amit.shah, berrange, cfergeau, crobinso, dwmw2, gleb, itamar, knoel, mtosatti, pbonzini, rjones, scottt.tw, virt-maint |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-06-23 15:20:58 UTC | Type: | Bug |
Description
Richard W.M. Jones
2012-09-13 12:16:05 UTC
I just realized I was running the older qemu. However I get the same problem with an updated qemu:

qemu-1.2.0-3.fc18.i686

There's also a bug in libvirt where it's running the wrong qemu binary. Dan is fixing that. I'll use his fix and see if it makes any difference.

Same problem happens with qemu-system-i386.

Adding "noapic" to the kernel command line fixes it. Kernel bug? Interaction between APIC emulation in qemu and the kernel? /me generally wishes 32 bit would go away.

Could well be a SeaBIOS issue. Might want to try updating to SeaBIOS 1.7.1, or downgrading to an older version (1.6.3), to see if anything changes.

libvirt-0.10.1-4.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/FEDORA-2012-13853/libvirt-0.10.1-4.fc18

Setting back to NEW state. This bug is something to do with the kernel or qemu, and probably not a bug in libvirt.

libvirt-0.10.1-4.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.

This seems to still be a problem even with the latest Rawhide as of today. In libguestfs we are working around it by passing noapic on the kernel command line (i386 only).

Marcelo, gleb, any of this sound familiar?

I haven't tested this since we started to use the lpj= option. Setting NEEDINFO to me only.

(In reply to comment #11)
> Marcelo, gleb, any of this sound familiar?

First of all, check with -enable-kvm and drop -no-kvm. Second, what's at arch/x86/kernel/apic/apic.c:1337 in this specific kernel?

[ 0.000000] tsc: Fast TSC calibration failed
[ 0.000000] tsc: Unable to calibrate against PIT
[ 0.000000] tsc: No reference (HPET/PMTIMER) available
[ 0.000000] tsc: Marking TSC unstable due to could not calculate TSC khz
[ 0.017997] Calibrating delay loop... 179.96 BogoMIPS (lpj=89984)

Does that differ significantly from the host? If so, then that is likely the issue.

Warning is:

            if (queued) {
                if (cpu_has_tsc) {
                    rdtscll(ntsc);
                    max_loops = (cpu_khz << 10) - (ntsc - tsc);
                } else
                    max_loops--;
            }
        } while (queued && max_loops > 0);
        WARN_ON(max_loops <= 0);

BTW, Richard, do you still need the lpj= setting with guest or host using different HZ values? (that is broken, as you noted, BTW).

[Note this is a Fedora 32 bit TCG-related bug and hence not relevant to RHEL, or very much at all since few people use 32 bit]

I've just done another test build under Koji, 32 bit, TCG without the noapic flag, and it's still hanging, so the bug is still present. We are now passing lpj= on the kernel command line, and it seems to make no difference.

(In reply to comment #14)
> [ 0.000000] tsc: Fast TSC calibration failed
> [ 0.000000] tsc: Unable to calibrate against PIT
> [ 0.000000] tsc: No reference (HPET/PMTIMER) available
> [ 0.000000] tsc: Marking TSC unstable due to could not calculate TSC khz
> [ 0.017997] Calibrating delay loop... 179.96 BogoMIPS (lpj=89984)
>
> Does that differ significantly from the host? If so, then that is likely the
> issue

lpj being passed is the same as on the host (but note that "host" here is the L1 hypervisor, since everything is nested and using TCG).

> BTW, Richard, do you still need the lpj= setting with guest or host using
> different HZ
> values? (that is broken, as you noted, BTW).

It would still be useful to have lpj exported in /proc/cpuinfo. In the libguestfs case mostly the host kernel is identical to the appliance kernel, so HZ values would normally be identical.

Can you please provide output of failure with lpj= ?

I've verified this is fixed now.
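
For anyone comparing the guest's calibration line with the host's, here is a minimal Python sketch of how the BogoMIPS figure relates to lpj (loops per jiffy) and the kernel's HZ setting. It mirrors the integer arithmetic the kernel is understood to use when printing the "Calibrating delay loop" line. The function names `bogomips` and `infer_hz` are hypothetical, and the HZ values are assumptions: the thread never states HZ, and the candidate list is just a set of common settings.

```python
def bogomips(lpj: int, hz: int) -> str:
    """Format BogoMIPS from loops-per-jiffy using integer arithmetic.

    Assumption: matches the kernel's printed form, i.e.
    lpj / (500000 / HZ) for the whole part and
    (lpj / (5000 / HZ)) % 100 for the two fractional digits.
    """
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return f"{whole}.{frac:02d}"


def infer_hz(lpj: int, reported: str, candidates=(100, 250, 300, 1000)):
    """Return the candidate HZ values consistent with a reported BogoMIPS string.

    The candidate list is an assumption (common kernel HZ settings),
    not something taken from the bug report.
    """
    return [hz for hz in candidates if bogomips(lpj, hz) == reported]


if __name__ == "__main__":
    # Values quoted in the thread: 179.96 BogoMIPS (lpj=89984)
    print(bogomips(89984, 1000))
    print(infer_hz(89984, "179.96"))
```

With the values quoted above, only HZ=1000 among the candidates reproduces the printed 179.96. This matters for the lpj= discussion: the same lpj value means a different delay-loop speed under a different HZ, so passing the host's lpj to a guest is only meaningful when both kernels use the same HZ.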