Bug 478317 - almost 9 thousand syscalls per second while idle
Summary: almost 9 thousand syscalls per second while idle
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 12
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Justin M. Forbes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F13VirtImportant
TreeView+ depends on / blocked
 
Reported: 2008-12-27 22:10 UTC by William Lovaton
Modified: 2013-01-09 11:26 UTC (History)
14 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-05 07:03:34 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
strace log from qemu-kvm (700.59 KB, text/plain)
2008-12-27 22:13 UTC, William Lovaton
no flags Details
Just in case this info is useful (7.39 KB, image/png)
2009-12-13 17:59 UTC, William Lovaton
no flags Details
rhel5.xml (1.31 KB, text/xml)
2010-03-30 14:00 UTC, Matt Domsch
no flags Details

Description William Lovaton 2008-12-27 22:10:36 UTC
Description of problem:
Using qemu and kvm generates lots of syscalls, about 9000 per second generating lots of wakeups of the CPU (about 2500 per second according to powertop), consuming more power and increasing the heat.  This is a bit impractical since autonomy gets severely compromised when working on battery.

I made a simple test installing Windows XP SP3 using qemu-kvm through virt-manager.  Performance is actually good (really good), the VM is idle using between 0 to 4% of the VCPU according to Windows but on the host side qemu-kvm is constantly using 20% of the CPU.

Version-Release number of selected component (if applicable):
etherboot-roms-kvm-5.4.4-4.fc10.i386
kvm-74-10.fc10.i386
qemu-launcher-1.7.4-4.fc9.noarch
qemu-img-0.9.1-10.fc10.i386
qemu-0.9.1-10.fc10.i386
kernel-2.6.27.9-159.fc10.i686

How reproducible:
I used Win XP because that's what I had access to right now, I'll be repeating the test with Fedora 10 itself as a guess.

Expected results:
Syscalls and host CPU usage should not be so high when idle.  Wakeups should be reduced too.

Additional info:
I don't really know if this is a qemu problem or a kvm problem, please switch components if appropriate.

Comment 1 William Lovaton 2008-12-27 22:13:59 UTC
Created attachment 327891 [details]
strace log from qemu-kvm

I generated this log with the following command:
strace -tt -p 7621 &> qemu.log

Note that there is almost 9000 syscalls in just one second.

Comment 2 William Lovaton 2008-12-28 19:00:38 UTC
I think I should have given my CPU information:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           T2400  @ 1.83GHz
stepping	: 8
cpu MHz		: 1000.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr
bogomips	: 3657.45
clflush size	: 64
power management:

Comment 3 Fedora Admin XMLRPC Client 2009-05-07 12:12:50 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 4 Fedora Admin XMLRPC Client 2009-05-07 16:38:23 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 5 Bug Zapper 2009-11-18 10:35:15 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Avi Kivity 2009-11-22 15:08:34 UTC
Dropping the usb tablet device should reduce wakeups.

Comment 7 William Lovaton 2009-12-13 13:42:36 UTC
Yeah, now I'm testing on a Fedora 12 host and a Fedora 12 guest.  Removing the tablet device reduced the wakeups from 2060 to 580 in powertop and reduced the CPU consumption in the host from 20% to 7%.

Using strace the syscalls were reduced from 16 thousand per second to 6300 per second.  Do you have any other advice?

If I just need a keyboard and a mouse, is there any problem removing the tablet device?

Do you know of any other improvement being worked on in KVM to reduce the wakeups and CPU consumption?? I'm a really fan of KVM, I am using it heavily for web development testing and soon in a production environment.

Comment 8 Avi Kivity 2009-12-13 16:01:12 UTC
Is the guest really idle?  Is it running gnome?

Please run 'strace -c -p pid -p pid -p pid' and report.  Note that qemu uses threads, each of them needs to be a parameter to strace.

Comment 9 William Lovaton 2009-12-13 17:55:53 UTC
Yep, it's completely idle, I deactivated many unneeded services except for sendmail and I'm now running in init level 3 so gnome is not running.  PowerTOP is always green between 2.5 and 6 wake ups per second (in the guest). top shows 99% of idle CPU being top the only one using the 1% remaining.

As of now I'm using just one guest VM with F12 so I issued the following command:

[root@xpsm1210 ~]# strace -c -p 25878
Process 25878 attached - interrupt to quit
^CProcess 25878 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 85.47    0.054323           9      6322           select
  5.54    0.003521           0     27365           clock_gettime
  4.51    0.002869           0     21097     10545 read
  1.94    0.001235           0      8770           write
  1.10    0.000699           0      4370           timer_settime
  0.76    0.000482           0      6128           timer_gettime
  0.67    0.000429           0      4241           rt_sigaction
  0.00    0.000000           0        18           ioctl
  0.00    0.000000           0        18           recvmsg
------ ----------- ----------- --------- --------- ----------------
100.00    0.063558                 78329     10545 total


The other problem I'm seeing is that ksmd kicks in after I turn on another virtual machine (with Debian) and stays there consuming about 32% of the CPU all the time.  It stays there even long after I turned off the Debian guest:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
   32 root      25   5     0    0    0 R 32.9  0.0  23:02.98 ksmd               

Note that it has 23 minutes of CPU accumulated time so it have been running for a long time (and what's up with the 0 resident size?).  Do you want me to strace it or something? I have sysprof installed if that's useful.

Comment 10 William Lovaton 2009-12-13 17:59:15 UTC
Created attachment 378038 [details]
Just in case this info is useful

Comment 11 Justin M. Forbes 2009-12-15 18:32:23 UTC
(In reply to comment #9)

> The other problem I'm seeing is that ksmd kicks in after I turn on another
> virtual machine (with Debian) and stays there consuming about 32% of the CPU
> all the time.  It stays there even long after I turned off the Debian guest:
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
>    32 root      25   5     0    0    0 R 32.9  0.0  23:02.98 ksmd               
> 
> Note that it has 23 minutes of CPU accumulated time so it have been running for
> a long time (and what's up with the 0 resident size?).  Do you want me to
> strace it or something? I have sysprof installed if that's useful.  

ksmd has 0 resident size because it is a kernel thread, not a userspace application.  That said, is the ksmtuned service started?  Essentually ksm is using CPU to scan pages.  The ksmtuned service will idle ksm when a new VM load has settled and start it back up when a new VM is started.   Of course you can also turn off ksm.  There are 2 services associated with KSM.  The first (ksm) simply turns on the ksm feature and sets the number of pages it can associate to a sane default for your system. The second (ksmtuned) will actively watch system activity and start/stop ksm based on system usage.

Comment 12 Avi Kivity 2009-12-15 20:08:22 UTC
Please attach to all the qemu threads at once.

Comment 13 William Lovaton 2009-12-17 20:04:35 UTC
(In reply to comment #12)
> Please attach to all the qemu threads at once.  

Avi, there is only one qemu-kvm thread because there is only one virtual machine running.  Do you need strace data with more virtual machines?? how many? is 2 good enough?  I'll post them when I get home tonight.

Comment 14 Avi Kivity 2009-12-20 12:09:33 UTC
Each virtual machine consists of multiple threads (at least two).  Please trace all of them.  Look in /proc/$pid/task.

Comment 15 William Lovaton 2009-12-22 01:16:49 UTC
(In reply to comment #14)
> Each virtual machine consists of multiple threads (at least two).  Please trace
> all of them.  Look in /proc/$pid/task.  

Ok, that's new to me but now I think I got it right.  The qemu-kvm pid is 10769, so "ls /proc/10769/task/" shows the following: 10769  10770

Doing "strace -c -p 10769 -p 10770" for about 40 seconds as root shows this:

Process 10769 attached - interrupt to quit
Process 10770 attached - interrupt to quit
^CProcess 10769 detached
Process 10770 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 84.17    0.257319           9     27136           select
  5.26    0.016094           0    121942           clock_gettime
  5.10    0.015587           0     93174     46573 read
  2.44    0.007459           0     40312           write
  1.19    0.003624           0     28190           timer_gettime
  1.04    0.003194           0     20077           timer_settime
  0.80    0.002450           0     19515           rt_sigaction
  0.00    0.000000           0        80           ioctl
  0.00    0.000000           0         1         1 rt_sigtimedwait
  0.00    0.000000           0        80           recvmsg
------ ----------- ----------- --------- --------- ----------------
100.00    0.305727                350507     46574 total

Is this good enough? please don't hesitate to ask anything if you need something.

Comment 16 William Lovaton 2009-12-22 01:39:15 UTC
(In reply to comment #11)
> ksmd has 0 resident size because it is a kernel thread, not a userspace
> application.  That said, is the ksmtuned service started?  Essentually ksm is
> using CPU to scan pages.  The ksmtuned service will idle ksm when a new VM load
> has settled and start it back up when a new VM is started.   Of course you can
> also turn off ksm.  There are 2 services associated with KSM.  The first (ksm)
> simply turns on the ksm feature and sets the number of pages it can associate
> to a sane default for your system. The second (ksmtuned) will actively watch
> system activity and start/stop ksm based on system usage.  

Hi Justin.  Yes, ksmtuned is active and running.  In fact, ksmd is using the CPU again (with the same behaviour explained in comment #9) but I am using just one virtual machine not two.

My idea is not to turn off ksm.  I'd like to know what is going on with ksmd that keeps using the CPU without stopping.

I have a question: Is ksm just processing virtual machines memory or is it merging identical pages on the whole system? If it is the former, how can i configure it to make it scan the whole system?

Is there any howto?

Comment 17 Justin M. Forbes 2009-12-22 16:02:42 UTC
ksm is only scanning the memory from the running virtual machine.  The way to make it scan a given chunk of memory is to have the application using that memory to madvise the memory it allocates as MADV_MERGEABLE.  It doesn't make sense for most applications to do this, and right now kvm is the only application which does so.
The ksmtuned service should fire up ksm as needed (when a new VM is started) and stop the ksm kernel thread once it has scanned enough.  Even with a single VM you will find some pages get merged.

Comment 18 William Lovaton 2009-12-22 22:57:24 UTC
Makes sense, that's really interesting.

How can I tell if pages have been merged? what files should I look?

Comment 19 Avi Kivity 2009-12-23 13:13:50 UTC
Yes, good enough.  Looks like we have ~700 wakeups per second (high, but believable), which gets amplified about 12x.  At least some of that should be easy to prune.

Comment 20 Avi Kivity 2009-12-23 13:15:30 UTC
Oh, a kvm_stat snapshot would be welcome too with the same guest under the same conditions.

Comment 21 William Lovaton 2010-01-15 03:42:30 UTC
Oppss... I didn't notice this bug was in NeedInfo.

Avi, could you please explain a little more how to obtain the information you need.  I'm an advanced linux user but I'm new to KVM.

Comment 22 Avi Kivity 2010-01-18 12:47:47 UTC
kvm_stat is a part of the qemu-kvm source, I think packaged in fedora as qemu-kvm-tools.  Simply run it (plus kvm_stat --batch) to see what the guest is doing.

Comment 23 Justin M. Forbes 2010-02-03 16:12:40 UTC
*** Bug 553462 has been marked as a duplicate of this bug. ***

Comment 24 Dave Allan 2010-02-03 16:31:58 UTC
Here's the output of kvm_stat while the guest was sitting at the boot menu (the scenario described in bug 553462).  No other guests have been booted on this host since the host was booted.  The behavior described in 553462 is 100% reproducible, so it's easy to gather more info if you need anything.

[root@localhost ~]# kvm_stat --batch
efer_reload                    0         0
exits                     214928      1767
fpu_reload                116649       111
halt_exits                    35         0
halt_wakeup                    2         0
host_state_reload         127090       111
hypercalls                     0         0
insn_emulation             21239         0
insn_emulation_fail            0         0
invlpg                         0         0
io_exits                  109156        18
irq_exits                  84105      1742
irq_injections               853        18
irq_window                   276         6
largepages                     0         0
mmio_exits                   120         0
mmu_cache_miss                59         0
mmu_flooded                    0         0
mmu_pde_zapped                 0         0
mmu_pte_updated                0         0
mmu_pte_write                  0         0
mmu_recycled                   0         0
mmu_shadow_zapped             92         0
mmu_unsync                     0         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                     137         0
pf_guest                       0         0
remote_tlb_flush               1         0
request_irq                    0         0
signal_exits                   0         0
tlb_flush                     28         0
[root@localhost ~]#

Comment 25 William Lovaton 2010-02-04 03:29:29 UTC
Sorry for the late response, this is my output for kvm_stat --batch, the VM was completely idle in the console logged in as root, powertop showed about 1.5 to 2.5 wakeups per second in the VM (powertop was not running during kvm_stat):

[root@xpsm1210 ~]# kvm_stat --batch
efer_reload                    0         0
exits                    2103142        28
fpu_reload                265701         0
halt_exits                 17702         8
halt_wakeup                 1803         0
host_state_reload         327253         8
hypercalls                483875         0
insn_emulation            600579        20
insn_emulation_fail            0         0
invlpg                     87304         0
io_exits                  302063         0
irq_exits                  44934         0
irq_injections             46988         8
irq_window                  8846         0
largepages                     0         0
mmio_exits                  8066         0
mmu_cache_miss             37477         0
mmu_flooded                13373         0
mmu_pde_zapped             19555         0
mmu_pte_updated           360060         0
mmu_pte_write             554385         0
mmu_recycled                   0         0
mmu_shadow_zapped          41998         0
mmu_unsync                     6         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                  221292         0
pf_guest                  236321         0
remote_tlb_flush              18         0
request_irq                    0         0
signal_exits                   7         0
tlb_flush                 539907         0

I hope this helps.

Comment 26 Fedora Admin XMLRPC Client 2010-03-09 17:17:43 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 27 Matt Domsch 2010-03-30 12:03:18 UTC
Fedora 13 pre-beta, kernel 2.6.33.1-19.fc13.x86_64
Running 1 guest, a RHEL 5.4 kernel 2.6.18-164.el5.
>3000 wakeups/sec seen by powertop in the host.

# kvm_stat --batch
efer_reload                    0         0
exits                  346164446      7109
fpu_reload               2630935         0
halt_exits              67231715      2007
halt_wakeup             35545422      1003
host_state_reload      189796146      5077
hypercalls                     0         0
insn_emulation         197764345      5026
insn_emulation_fail            2         0
invlpg                   5005754         0
io_exits                 8141340        56
irq_exits                5320202        14
irq_injections          76817243      2011
irq_window               6982277         3
largepages                     0         0
mmio_exits             114512130      3015
mmu_cache_miss           2258952         0
mmu_flooded               934960         0
mmu_pde_zapped           3562527         0
mmu_pte_updated               80         0
mmu_pte_write            4575891         0
mmu_recycled               30713         0
mmu_shadow_zapped        2256672         0
mmu_unsync                  2072         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                35582237         1
pf_guest                20851551         0
remote_tlb_flush             152         0
request_irq                    0         0
signal_exits                   0         0
tlb_flush               10467625         0

Comment 28 William Lovaton 2010-03-30 12:48:22 UTC
Hi Matt, does it help to remove the usb tablet device from the VM as suggested by Avi in comment #6 ?

Comment 29 Matt Domsch 2010-03-30 14:00:19 UTC
Created attachment 403500 [details]
rhel5.xml

I have no USB tablet device installed in the VM.  Please see the attached rhel5.xml file.

Comment 30 Matt Domsch 2010-04-01 12:20:01 UTC
In case it was a guest kernel thing fixed in RHEL 5.5, I upgraded the guest to kernel 2.6.18-194.el5, and ran this with the guest in runlevel 1.  Some improvement...

efer_reload                    0         0
exits                   19127061      4019
fpu_reload               1433637         0
halt_exits               1512565      2005
halt_wakeup               797225      1003
host_state_reload        6237489      2005
hypercalls                     0         0
insn_emulation           9414464      2007
insn_emulation_fail            0         0
invlpg                    354398         0
io_exits                 1847206         0
irq_exits                 549333         7
irq_injections           2227379      2006
irq_window                223478         2
largepages                     0         0
mmio_exits               2874007         0
mmu_cache_miss            205679         0
mmu_flooded                69976         0
mmu_pde_zapped            268710         0
mmu_pte_updated                8         0
mmu_pte_write             867255         0
mmu_recycled                   0         0
mmu_shadow_zapped         206041         0
mmu_unsync                    50         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                 3492959         0
pf_guest                 1662079         0
remote_tlb_flush             317         0
request_irq                    0         0
signal_exits                   1         0
tlb_flush                 824489         0

Comment 31 Gilboa Davara 2010-04-12 05:49:20 UTC
I wonder if this [1] fix can help fix the problem. (At least the usb tablet part of the problem)

- Gilboa
[1] http://lists.gnu.org/archive/html/qemu-devel/2010-04/msg00159.html

Comment 32 William Lovaton 2010-07-19 04:17:49 UTC
I wonder if there is any news about this?

I'm using Fedora 13 with the latest updates in both, the host and the guest and I still see a somewhat high number of wakeups per second (544) in powertop for an idle guest (6 wakeups/sec in init 3).

This is the output of "strace -c -p 7046 -p 7048" for about 10 seconds:
Process 7046 attached - interrupt to quit
Process 7048 attached - interrupt to quit
^CProcess 7046 detached
Process 7048 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.28    0.042906           8      5071           select
  5.29    0.002902           0     27104           clock_gettime
  4.80    0.002632           0     19316      9644 read
  3.25    0.001783           0      9586           write
  3.12    0.001710           0     12279           gettimeofday
  2.59    0.001422           0      6454           timer_settime
  1.60    0.000878           0      7208           timer_gettime
  1.05    0.000578           0      4707           rt_sigaction
  0.00    0.000000           0        10           ioctl
  0.00    0.000000           0         1           rt_sigpending
  0.00    0.000000           0         1         1 rt_sigtimedwait
  0.00    0.000000           0       130           recvmsg
------ ----------- ----------- --------- --------- ----------------
100.00    0.054811                 91867      9645 total


And this is the output of "kvm_stat --batch":
efer_reload                    0         0
exits                    3976310        31
fpu_reload                887620         0
halt_exits                 10703         8
halt_wakeup                 2158         0
host_state_reload        1181477         8
hypercalls                     0         0
insn_emulation           2182043        22
insn_emulation_fail            0         0
invlpg                         0         0
io_exits                  999410         0
irq_exits                  48939         1
irq_injections             39173         8
irq_window                  4996         0
largepages                     0         0
mmio_exits                228426         0
mmu_cache_miss              4738         0
mmu_flooded                    0         0
mmu_pde_zapped                 0         0
mmu_pte_updated                0         0
mmu_pte_write              44720         0
mmu_recycled                   0         0
mmu_shadow_zapped           4430         0
mmu_unsync                     0         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                  258020         0
pf_guest                       0         0
remote_tlb_flush             104         0
request_irq                    0         0
signal_exits                   3         0
tlb_flush                      0         0


Is there any room for improvement?

Comment 33 Bug Zapper 2010-11-04 11:34:56 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 34 Bug Zapper 2010-12-05 07:03:34 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.