Red Hat Bugzilla – Bug 886933
High disk usage when both libvirt and virt-manager are opened
Last modified: 2013-02-21 02:28:38 EST
Description of problem:
When both libvirtd and virt-manager are running and one or more virtual machines are started, the host system continuously rewrites some .xml files under the /var/run/libvirt/qemu system directory. These repeated writes hit the host's physical storage at the same rate as virt-manager's statistics update interval, generating, depending on which statistics virt-manager is collecting (cpu/mem only, or disk and network as well), about 500-800 KB of write traffic per started virtual machine. This means that a system with 2 virtual machines will issue ~1.5 MB of write traffic to the physical disks per second, while a system with 20 VMs will issue traffic in excess of 15 MB/s. This is independent of the guest OS state - the writes happen even while the guest is sitting inside grub.

Version-Release number of selected component (if applicable):
libvirt-python-0.9.10-21.el6_3.6.x86_64
libvirt-client-0.9.10-21.el6_3.6.x86_64
libvirt-0.9.10-21.el6_3.6.x86_64
virt-manager-0.9.0-14.el6.x86_64

How reproducible:

Steps to Reproduce:
1. open a command prompt and issue the dstat --disk command
2. start libvirtd and open virt-manager
3. start a Linux virtual machine and stop it at the grub screen
4. see how dstat reports high disk usage

Actual results:
This high write traffic can impair host and guest disk performance.

Expected results:
As disk bandwidth is a precious resource, libvirtd and virt-manager should not put so much load on the host's disks.

Additional info:
Moving /var/run/libvirt/qemu inside /dev/shm (or mounting a tmpfs in its location) bypasses the problem by moving the repeated writes into main system RAM. As the FHS indicates that the content of /var/run should be renewed on each boot, the solutions above seem acceptable as a temporary workaround; a sketch of the tmpfs mount follows below.
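For reference, a minimal sketch of the tmpfs workaround described above; the 16m size and the fstab line are illustrative assumptions, not values taken from this report:

# one-shot mount over the status directory (stop the guests first)
mount -t tmpfs -o size=16m tmpfs /var/run/libvirt/qemu

# optional /etc/fstab entry to make the mount survive reboots (assumed layout)
tmpfs  /var/run/libvirt/qemu  tmpfs  size=16m  0  0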
The upstream fix was done in 0.9.12:

commit 31796e2c1c8187b6b76a58d43f3bc28e030223ee
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Fri Apr 6 19:42:34 2012 +0200

    qemu: Avoid excessive calls to qemuDomainObjSaveJob()

    As reported by Daniel Berrangé, we have a huge performance regression
    for virDomainGetInfo() due to the change which makes virDomainEndJob()
    save the XML status file every time it is called. Previous to that
    change, 2000 calls to virDomainGetInfo() took ~2.5 seconds. After that
    change, 2000 calls to virDomainGetInfo() take 2 *minutes* 45 secs. We
    made the change to be able to recover from libvirtd restart in the
    middle of a job. However, only destroy and async jobs are taken care
    of. Thus it makes more sense to only save domain state XML when these
    jobs are started/stopped.

Further to that, in latest libvirt+kvm, we don't even need to query the balloon driver when calling virDomainGetInfo(), since QEMU gives us async notifications of balloon changes. This avoids the performance issue entirely.
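One way to confirm that a given libvirt source tree contains this commit (a sketch that assumes a git checkout of upstream libvirt):

git describe --contains 31796e2c1c8187b6b76a58d43f3bc28e030223ee
# prints the earliest tag containing the commit if the fix is present, and errors out otherwise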
Hi, thanks for the quick reply. Will this fix be backported to RHEL 6.x? In the meantime, is it safe to use tmpfs to store the /var/run/libvirt/qemu directory? Thanks.
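Not an official answer, but one quick way to see whether an installed RHEL libvirt build carries the backport is to search the package changelog for this bug number (assuming the errata references it, which is the usual convention):

rpm -q --changelog libvirt | grep 886933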
pkgs:
libvirt-0.10.2-12.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.337.el6.x86_64
kernel-2.6.32-341.el6.x86_64

steps:
1. set log level to 1

log_outputs="1:file:/tmp/libvirtd.log"

then restart libvirtd

2. prepare two guests paused at grub

# virsh list
 Id    Name                           State
----------------------------------------------------
 17    T1                             paused
 18    libvirt_test_api               paused

3. prepare a script

# vim getinfo.sh
#!/bin/sh

while true; do
    count=$(grep virDomainGetInfo -n /tmp/libvirtd.log | wc -l)
    if [ "$count" == "2000" ]; then
        exit
    fi
done

# time ./getinfo.sh

real    16m38.126s
user    10m28.336s
sys     8m12.531s

2000 calls to virDomainGetInfo() on 2 guests take 16 *minutes* 38 secs.

4. calculate disk write traffic

# dstat -d 1 60

A manual calculation shows the write traffic is about 6k per second.
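The getinfo.sh script above only counts calls issued by whatever is polling libvirtd (virt-manager in this setup); a hedged alternative is to generate the 2000 calls directly, since virsh dominfo goes through virDomainGetInfo() (guest names taken from the virsh list output above):

# issue 2000 virDomainGetInfo() calls, 1000 per guest, and time them
time for i in $(seq 1 1000); do
    virsh dominfo T1 > /dev/null
    virsh dominfo libvirt_test_api > /dev/null
done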
(In reply to comment #4)
> 2. prepare two guests paused at grub
> # virsh list
>  Id    Name                           State
> ----------------------------------------------------
>  17    T1                             paused
>  18    libvirt_test_api               paused

virt-manager is running here.
pkgs:
libvirt-0.9.10-21.el6_3.7.x86_64

steps: same as comment #4

3.
# time ./getinfo.sh

real    16m35.678s
user    10m27.248s
sys     7m54.820s

2000 calls to virDomainGetInfo() on 2 guests take 16 *minutes* 35 secs.

4. write traffic is 328k per second.

Also tested with libvirt-0.9.10-21.el6_3.6.x86_64; the result is similar to 3.7.
I have reproduced the bug as described in the Steps in the bug description with the package libvirt-0.9.10-21.el6_3.6.x86_64. However, I get about 800k of write traffic per second for 8 VMs after the eight guests have been running for a long time (stopped at the grub kernel selection menu, of course), and when they are simply in the running state the traffic is about 200k per VM. It is not as severe as the bug description. virt-manager is kept open.

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     a                              running
 2     b                              running
 3     c                              running
 4     d                              running
 5     r6u1                           running
 6     r6u2                           running
 7     r6u3                           running
 8     test                           running

They are all hanging at the grub kernel selection menu.

# dstat --disk
-dsk/total-
 read  writ
   0   744k
   0   776k
   0   772k
   0   748k
   0   744k
   0   748k
   0   756k
   0   748k
   0   756k
   0   796k
   0   804k
   0   772k
   0   772k
   0   748k
   0   848k
   0   776k
   0   760k
   0   752k
   0   840k
   0   828k
   0   828k
   0   912k
   0   772k
   0   752k
   0   744k
   0   752k
   0   744k
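Rather than eyeballing the writ column, the per-second average can be computed from dstat's CSV output; a rough sketch in which the number of skipped header lines (7) and the 60-sample run are assumptions about this dstat version:

dstat --disk --output /tmp/disk.csv 1 60 > /dev/null
# column 2 of the CSV is bytes written per second; skip the CSV header block
awk -F, 'NR > 7 && $2 != "" { sum += $2; n++ } END { printf "avg write: %.0f bytes/s\n", sum/n }' /tmp/disk.csv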
Move to VERIFIED per Comment 4.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0276.html