Description of problem: Using qemu and kvm generates lots of syscalls, about 9000 per second generating lots of wakeups of the CPU (about 2500 per second according to powertop), consuming more power and increasing the heat. This is a bit impractical since autonomy gets severely compromised when working on battery. I made a simple test installing Windows XP SP3 using qemu-kvm through virt-manager. Performance is actually good (really good), the VM is idle using between 0 to 4% of the VCPU according to Windows but on the host side qemu-kvm is constantly using 20% of the CPU. Version-Release number of selected component (if applicable): etherboot-roms-kvm-5.4.4-4.fc10.i386 kvm-74-10.fc10.i386 qemu-launcher-1.7.4-4.fc9.noarch qemu-img-0.9.1-10.fc10.i386 qemu-0.9.1-10.fc10.i386 kernel-2.6.27.9-159.fc10.i686 How reproducible: I used Win XP because that's what I had access to right now, I'll be repeating the test with Fedora 10 itself as a guess. Expected results: Syscalls and host CPU usage should not be so high when idle. Wakeups should be reduced too. Additional info: I don't really know if this is a qemu problem or a kvm problem, please switch components if appropriate.
Created attachment 327891 [details] strace log from qemu-kvm I generated this log with the following command: strace -tt -p 7621 &> qemu.log Note that there is almost 9000 syscalls in just one second.
I think I should have given my CPU information: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 14 model name : Genuine Intel(R) CPU T2400 @ 1.83GHz stepping : 8 cpu MHz : 1000.000 cache size : 2048 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr bogomips : 3657.45 clflush size : 64 power management:
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Dropping the usb tablet device should reduce wakeups.
Yeah, now I'm testing on a Fedora 12 host and a Fedora 12 guest. Removing the tablet device reduced the wakeups from 2060 to 580 in powertop and reduced the CPU consumption in the host from 20% to 7%. Using strace the syscalls were reduced from 16 thousand per second to 6300 per second. Do you have any other advice? If I just need a keyboard and a mouse, is there any problem removing the tablet device? Do you know of any other improvement being worked on in KVM to reduce the wakeups and CPU consumption?? I'm a really fan of KVM, I am using it heavily for web development testing and soon in a production environment.
Is the guest really idle? Is it running gnome? Please run 'strace -c -p pid -p pid -p pid' and report. Note that qemu uses threads, each of them needs to be a parameter to strace.
Yep, it's completely idle, I deactivated many unneeded services except for sendmail and I'm now running in init level 3 so gnome is not running. PowerTOP is always green between 2.5 and 6 wake ups per second (in the guest). top shows 99% of idle CPU being top the only one using the 1% remaining. As of now I'm using just one guest VM with F12 so I issued the following command: [root@xpsm1210 ~]# strace -c -p 25878 Process 25878 attached - interrupt to quit ^CProcess 25878 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 85.47 0.054323 9 6322 select 5.54 0.003521 0 27365 clock_gettime 4.51 0.002869 0 21097 10545 read 1.94 0.001235 0 8770 write 1.10 0.000699 0 4370 timer_settime 0.76 0.000482 0 6128 timer_gettime 0.67 0.000429 0 4241 rt_sigaction 0.00 0.000000 0 18 ioctl 0.00 0.000000 0 18 recvmsg ------ ----------- ----------- --------- --------- ---------------- 100.00 0.063558 78329 10545 total The other problem I'm seeing is that ksmd kicks in after I turn on another virtual machine (with Debian) and stays there consuming about 32% of the CPU all the time. It stays there even long after I turned off the Debian guest: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 32 root 25 5 0 0 0 R 32.9 0.0 23:02.98 ksmd Note that it has 23 minutes of CPU accumulated time so it have been running for a long time (and what's up with the 0 resident size?). Do you want me to strace it or something? I have sysprof installed if that's useful.
Created attachment 378038 [details] Just in case this info is useful
(In reply to comment #9) > The other problem I'm seeing is that ksmd kicks in after I turn on another > virtual machine (with Debian) and stays there consuming about 32% of the CPU > all the time. It stays there even long after I turned off the Debian guest: > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 32 root 25 5 0 0 0 R 32.9 0.0 23:02.98 ksmd > > Note that it has 23 minutes of CPU accumulated time so it have been running for > a long time (and what's up with the 0 resident size?). Do you want me to > strace it or something? I have sysprof installed if that's useful. ksmd has 0 resident size because it is a kernel thread, not a userspace application. That said, is the ksmtuned service started? Essentually ksm is using CPU to scan pages. The ksmtuned service will idle ksm when a new VM load has settled and start it back up when a new VM is started. Of course you can also turn off ksm. There are 2 services associated with KSM. The first (ksm) simply turns on the ksm feature and sets the number of pages it can associate to a sane default for your system. The second (ksmtuned) will actively watch system activity and start/stop ksm based on system usage.
Please attach to all the qemu threads at once.
(In reply to comment #12) > Please attach to all the qemu threads at once. Avi, there is only one qemu-kvm thread because there is only one virtual machine running. Do you need strace data with more virtual machines?? how many? is 2 good enough? I'll post them when I get home tonight.
Each virtual machine consists of multiple threads (at least two). Please trace all of them. Look in /proc/$pid/task.
(In reply to comment #14) > Each virtual machine consists of multiple threads (at least two). Please trace > all of them. Look in /proc/$pid/task. Ok, that's new to me but now I think I got it right. The qemu-kvm pid is 10769, so "ls /proc/10769/task/" shows the following: 10769 10770 Doing "strace -c -p 10769 -p 10770" for about 40 seconds as root shows this: Process 10769 attached - interrupt to quit Process 10770 attached - interrupt to quit ^CProcess 10769 detached Process 10770 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 84.17 0.257319 9 27136 select 5.26 0.016094 0 121942 clock_gettime 5.10 0.015587 0 93174 46573 read 2.44 0.007459 0 40312 write 1.19 0.003624 0 28190 timer_gettime 1.04 0.003194 0 20077 timer_settime 0.80 0.002450 0 19515 rt_sigaction 0.00 0.000000 0 80 ioctl 0.00 0.000000 0 1 1 rt_sigtimedwait 0.00 0.000000 0 80 recvmsg ------ ----------- ----------- --------- --------- ---------------- 100.00 0.305727 350507 46574 total Is this good enough? please don't hesitate to ask anything if you need something.
(In reply to comment #11) > ksmd has 0 resident size because it is a kernel thread, not a userspace > application. That said, is the ksmtuned service started? Essentually ksm is > using CPU to scan pages. The ksmtuned service will idle ksm when a new VM load > has settled and start it back up when a new VM is started. Of course you can > also turn off ksm. There are 2 services associated with KSM. The first (ksm) > simply turns on the ksm feature and sets the number of pages it can associate > to a sane default for your system. The second (ksmtuned) will actively watch > system activity and start/stop ksm based on system usage. Hi Justin. Yes, ksmtuned is active and running. In fact, ksmd is using the CPU again (with the same behaviour explained in comment #9) but I am using just one virtual machine not two. My idea is not to turn off ksm. I'd like to know what is going on with ksmd that keeps using the CPU without stopping. I have a question: Is ksm just processing virtual machines memory or is it merging identical pages on the whole system? If it is the former, how can i configure it to make it scan the whole system? Is there any howto?
ksm is only scanning the memory from the running virtual machine. The way to make it scan a given chunk of memory is to have the application using that memory to madvise the memory it allocates as MADV_MERGEABLE. It doesn't make sense for most applications to do this, and right now kvm is the only application which does so. The ksmtuned service should fire up ksm as needed (when a new VM is started) and stop the ksm kernel thread once it has scanned enough. Even with a single VM you will find some pages get merged.
Makes sense, that's really interesting. How can I tell if pages have been merged? what files should I look?
Yes, good enough. Looks like we have ~700 wakeups per second (high, but believable), which gets amplified about 12x. At least some of that should be easy to prune.
Oh, a kvm_stat snapshot would be welcome too with the same guest under the same conditions.
Oppss... I didn't notice this bug was in NeedInfo. Avi, could you please explain a little more how to obtain the information you need. I'm an advanced linux user but I'm new to KVM.
kvm_stat is a part of the qemu-kvm source, I think packaged in fedora as qemu-kvm-tools. Simply run it (plus kvm_stat --batch) to see what the guest is doing.
*** Bug 553462 has been marked as a duplicate of this bug. ***
Here's the output of kvm_stat while the guest was sitting at the boot menu (the scenario described in bug 553462). No other guests have been booted on this host since the host was booted. The behavior described in 553462 is 100% reproducible, so it's easy to gather more info if you need anything. [root@localhost ~]# kvm_stat --batch efer_reload 0 0 exits 214928 1767 fpu_reload 116649 111 halt_exits 35 0 halt_wakeup 2 0 host_state_reload 127090 111 hypercalls 0 0 insn_emulation 21239 0 insn_emulation_fail 0 0 invlpg 0 0 io_exits 109156 18 irq_exits 84105 1742 irq_injections 853 18 irq_window 276 6 largepages 0 0 mmio_exits 120 0 mmu_cache_miss 59 0 mmu_flooded 0 0 mmu_pde_zapped 0 0 mmu_pte_updated 0 0 mmu_pte_write 0 0 mmu_recycled 0 0 mmu_shadow_zapped 92 0 mmu_unsync 0 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 137 0 pf_guest 0 0 remote_tlb_flush 1 0 request_irq 0 0 signal_exits 0 0 tlb_flush 28 0 [root@localhost ~]#
Sorry for the late response, this is my output for kvm_stat --batch, the VM was completely idle in the console logged in as root, powertop showed about 1.5 to 2.5 wakeups per second in the VM (powertop was not running during kvm_stat): [root@xpsm1210 ~]# kvm_stat --batch efer_reload 0 0 exits 2103142 28 fpu_reload 265701 0 halt_exits 17702 8 halt_wakeup 1803 0 host_state_reload 327253 8 hypercalls 483875 0 insn_emulation 600579 20 insn_emulation_fail 0 0 invlpg 87304 0 io_exits 302063 0 irq_exits 44934 0 irq_injections 46988 8 irq_window 8846 0 largepages 0 0 mmio_exits 8066 0 mmu_cache_miss 37477 0 mmu_flooded 13373 0 mmu_pde_zapped 19555 0 mmu_pte_updated 360060 0 mmu_pte_write 554385 0 mmu_recycled 0 0 mmu_shadow_zapped 41998 0 mmu_unsync 6 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 221292 0 pf_guest 236321 0 remote_tlb_flush 18 0 request_irq 0 0 signal_exits 7 0 tlb_flush 539907 0 I hope this helps.
Fedora 13 pre-beta, kernel 2.6.33.1-19.fc13.x86_64 Running 1 guest, a RHEL 5.4 kernel 2.6.18-164.el5. >3000 wakeups/sec seen by powertop in the host. # kvm_stat --batch efer_reload 0 0 exits 346164446 7109 fpu_reload 2630935 0 halt_exits 67231715 2007 halt_wakeup 35545422 1003 host_state_reload 189796146 5077 hypercalls 0 0 insn_emulation 197764345 5026 insn_emulation_fail 2 0 invlpg 5005754 0 io_exits 8141340 56 irq_exits 5320202 14 irq_injections 76817243 2011 irq_window 6982277 3 largepages 0 0 mmio_exits 114512130 3015 mmu_cache_miss 2258952 0 mmu_flooded 934960 0 mmu_pde_zapped 3562527 0 mmu_pte_updated 80 0 mmu_pte_write 4575891 0 mmu_recycled 30713 0 mmu_shadow_zapped 2256672 0 mmu_unsync 2072 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 35582237 1 pf_guest 20851551 0 remote_tlb_flush 152 0 request_irq 0 0 signal_exits 0 0 tlb_flush 10467625 0
Hi Matt, does it help to remove the usb tablet device from the VM as suggested by Avi in comment #6 ?
Created attachment 403500 [details] rhel5.xml I have no USB tablet device installed in the VM. Please see the attached rhel5.xml file.
In case it was a guest kernel thing fixed in RHEL 5.5, I upgraded the guest to kernel 2.6.18-194.el5, and ran this with the guest in runlevel 1. Some improvement... efer_reload 0 0 exits 19127061 4019 fpu_reload 1433637 0 halt_exits 1512565 2005 halt_wakeup 797225 1003 host_state_reload 6237489 2005 hypercalls 0 0 insn_emulation 9414464 2007 insn_emulation_fail 0 0 invlpg 354398 0 io_exits 1847206 0 irq_exits 549333 7 irq_injections 2227379 2006 irq_window 223478 2 largepages 0 0 mmio_exits 2874007 0 mmu_cache_miss 205679 0 mmu_flooded 69976 0 mmu_pde_zapped 268710 0 mmu_pte_updated 8 0 mmu_pte_write 867255 0 mmu_recycled 0 0 mmu_shadow_zapped 206041 0 mmu_unsync 50 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 3492959 0 pf_guest 1662079 0 remote_tlb_flush 317 0 request_irq 0 0 signal_exits 1 0 tlb_flush 824489 0
I wonder if this [1] fix can help fix the problem. (At least the usb tablet part of the problem) - Gilboa [1] http://lists.gnu.org/archive/html/qemu-devel/2010-04/msg00159.html
I wonder if there is any news about this? I'm using Fedora 13 with the latest updates in both, the host and the guest and I still see a somewhat high number of wakeups per second (544) in powertop for an idle guest (6 wakeups/sec in init 3). This is the output of "strace -c -p 7046 -p 7048" for about 10 seconds: Process 7046 attached - interrupt to quit Process 7048 attached - interrupt to quit ^CProcess 7046 detached Process 7048 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 78.28 0.042906 8 5071 select 5.29 0.002902 0 27104 clock_gettime 4.80 0.002632 0 19316 9644 read 3.25 0.001783 0 9586 write 3.12 0.001710 0 12279 gettimeofday 2.59 0.001422 0 6454 timer_settime 1.60 0.000878 0 7208 timer_gettime 1.05 0.000578 0 4707 rt_sigaction 0.00 0.000000 0 10 ioctl 0.00 0.000000 0 1 rt_sigpending 0.00 0.000000 0 1 1 rt_sigtimedwait 0.00 0.000000 0 130 recvmsg ------ ----------- ----------- --------- --------- ---------------- 100.00 0.054811 91867 9645 total And this is the output of "kvm_stat --batch": efer_reload 0 0 exits 3976310 31 fpu_reload 887620 0 halt_exits 10703 8 halt_wakeup 2158 0 host_state_reload 1181477 8 hypercalls 0 0 insn_emulation 2182043 22 insn_emulation_fail 0 0 invlpg 0 0 io_exits 999410 0 irq_exits 48939 1 irq_injections 39173 8 irq_window 4996 0 largepages 0 0 mmio_exits 228426 0 mmu_cache_miss 4738 0 mmu_flooded 0 0 mmu_pde_zapped 0 0 mmu_pte_updated 0 0 mmu_pte_write 44720 0 mmu_recycled 0 0 mmu_shadow_zapped 4430 0 mmu_unsync 0 0 nmi_injections 0 0 nmi_window 0 0 pf_fixed 258020 0 pf_guest 0 0 remote_tlb_flush 104 0 request_irq 0 0 signal_exits 3 0 tlb_flush 0 0 Is there any room for improvement?
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.