Bug 744426
Summary: | Big slowdown, kernel reports "Clocksource tsc unstable", with Linux 3.1-rc9 when run in qemu | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> |
Component: | kernel | Assignee: | Marcelo Tosatti <mtosatti> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mishu, mtosatti |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-11-19 22:45:54 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description

Richard W.M. Jones 2011-10-08 15:44:38 UTC

Did you ever bisect this? Does it still happen with 3.1.0-7.fc16? Do you have a command line to recreate this using TCG?

I'm not sure why you think this is the kernel's fault. TCG, as I understand it, is dynamic translation of instructions on the fly. If the instruction stream changes, it seems perfectly reasonable that performance can vary.

This is still happening with kernel 3.2.0-0.rc0.git4.1.fc17 from a few days ago. It's definitely a bug, although probably one in qemu, because something shouldn't go from taking no time to taking several minutes without us understanding exactly why. I've not had time to bisect this.

I cannot reproduce this locally (only in Koji, where it happens all the time), but if it were reproducible then something like this should do it:

```
cd /tmp
cat > sleep.c <<'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main (void)
{
  write (2, "sleeping\n", 9);
  sleep (60);
  write (2, "exiting\n", 8);
  return 0;
}
EOF
gcc -static sleep.c -o init
echo 'init' | cpio -o -c > initrd
qemu-system-x86_64 -nodefconfig -machine accel=tcg -nodefaults \
  -nographic -m 500 -no-reboot -no-hpet -device virtio-serial \
  -serial stdio \
  -kernel /boot/vmlinuz-3.1.0-0.rc9.git0.0.fc17.x86_64 \
  -initrd /tmp/initrd \
  -append 'panic=1 console=ttyS0 no_timer_check acpi=off'
```

From: "Richard W.M. Jones" <rjones>

But why do we have all this timer detection code running in the virt path? It makes no sense for Linux guests to have flaky timing loops and calculations when all of this information is already known by the hypervisor and could simply be passed up to the guest. Why don't we just put all of this stuff in the DMI information about what timers are available and what speeds they run at, pass the whole lot over to Linux, and be done with it?

*** Bug 870042 has been marked as a duplicate of this bug. ***

Richard, please inform us whether the lpj= setting resolves the problem for you.

I'm going to close this one because it's old and the slowdown no longer occurs.
However I am going to try your lpj= suggestion to see if it improves the clocksource stability problems under TCG.

I tested this again and it does seem to have gone away. I have also pushed a patch to libguestfs so it will try to pass the lpj=... parameter (TCG only):

https://github.com/libguestfs/libguestfs/commit/aeea803ad0fafe1ed4c7f8e781dfe4fdc150cac0
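As a rough sketch of what the lpj= workaround amounts to (the value 4194304 below is a placeholder, not a value taken from the commit above; a real value would come from the host's own boot-time delay-loop calibration, visible as "lpj=" in the host's dmesg): precomputing loops-per-jiffy and appending it to the guest kernel command line lets the guest skip its unreliable calibration loop under TCG.

```shell
# Hypothetical sketch: pass a precomputed lpj= value to the guest kernel
# so it skips delay-loop calibration. LPJ here is a placeholder value.
LPJ=4194304
APPEND="panic=1 console=ttyS0 no_timer_check acpi=off lpj=$LPJ"
echo "$APPEND"
```

With this, the qemu invocation in the reproducer above would use `-append "$APPEND"` in place of the literal quoted string.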