Bug 572187 - Poor I/O performance of virtio block device
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5
Hardware: All
OS: Linux
low
medium
Target Milestone: rc
: ---
Assignee: chellwig@redhat.com
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: Rhel5KvmTier2
 
Reported: 2010-03-10 14:07 UTC by jason wang
Modified: 2013-01-09 22:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-02 09:27:37 UTC
Target Upstream Version:
Embargoed:


Attachments
Performance comparison between ide and virtio (iozone) (122.39 KB, application/octet-stream)
2010-03-10 14:19 UTC, jason wang

Description jason wang 2010-03-10 14:07:52 UTC
Description of problem:
The throughput of virtio is consistently lower than that of ide in the dbench test.

kvm_version \ drive format | virtio  |   ide   |  v/i |
---------------------------+---------+---------+------+
155                        | 778.247 | 824.418 |  <   |
---------------------------+---------+---------+------+
158                        | 780.043 | 837.585 |  <   |
---------------------------+---------+---------+------+
159                        | 788.497 | 828.245 |  <   |
---------------------------+---------+---------+------+
160                        | 795.518 | 822.073 |  <   |
---------------------------+---------+---------+------+
161                        | 780.396 | 889.323 |  <   |
---------------------------+---------+---------+------+
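The gap in the table above can be expressed as a virtio/ide ratio per kvm version. The following is a minimal sketch that only restates the reported numbers; the function name is illustrative, and the unit is assumed to be MB/sec as dbench normally reports it (the report does not state the unit).

```python
# Throughput figures copied from the table above: version -> (virtio, ide).
# Unit assumed to be MB/sec (not stated in the report).
RESULTS = {
    155: (778.247, 824.418),
    158: (780.043, 837.585),
    159: (788.497, 828.245),
    160: (795.518, 822.073),
    161: (780.396, 889.323),
}


def virtio_to_ide_ratio(virtio, ide):
    """Return virtio throughput as a fraction of ide throughput."""
    return virtio / ide


for version, (virtio, ide) in sorted(RESULTS.items()):
    ratio = virtio_to_ide_ratio(virtio, ide)
    # (1 - ratio) is how far virtio falls short of ide, as a fraction.
    print(f"kvm-{version}: virtio/ide = {ratio:.3f} "
          f"({(1 - ratio) * 100:.1f}% below ide)")
```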

Version-Release number of selected component (if applicable):
kvm version 161

How reproducible:
100%

Steps to Reproduce:
1. Install a host with kvm
2. Boot the virtual machine
3. Create a 26M client.txt file as the loadfile
4. Run dbench in the guest and record its result:
   dbench 2 -D . -c /home/devel/autotest/client/tests/dbench/src/client.txt -t 600
[Steps 1-4 were done by kvm-autotest]
5. Repeat the above steps 10 times
6. Get the average throughput
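Steps 5 and 6 could be scripted roughly as follows. This is a sketch under assumptions: it expects each run's captured dbench output to contain a summary line of the form "Throughput <number> MB/sec ...", and the sample strings are hypothetical rather than taken from the actual runs.

```python
import re
from statistics import mean

# Assumed shape of the dbench summary line, e.g. "Throughput 780.0 MB/sec 2 procs".
THROUGHPUT_RE = re.compile(r"^Throughput\s+([0-9.]+)\s+MB/sec", re.MULTILINE)


def parse_throughput(dbench_output):
    """Extract the throughput figure from one run's captured output."""
    match = THROUGHPUT_RE.search(dbench_output)
    if match is None:
        raise ValueError("no Throughput line found in dbench output")
    return float(match.group(1))


def average_throughput(outputs):
    """Average the throughput over several runs (step 6)."""
    return mean([parse_throughput(text) for text in outputs])


if __name__ == "__main__":
    # Hypothetical captured outputs from two runs, for illustration only.
    runs = [
        "Throughput 780.0 MB/sec 2 procs",
        "some log line\nThroughput 790.0 MB/sec 2 procs\n",
    ]
    print(average_throughput(runs))
```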
  
Actual results:
The performance of ide is better than virtio.

Expected results:
Virtio should out-perform ide.


Additional info:
qemu-kvm command line:
/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/qemu -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/images/RHEL-Server-5.5-64-virtio.qcow2,if=virtio,cache=off,boot=on -net nic,vlan=0,model=virtio,macaddr=00:AE:77:04:AF:00 -net tap,vlan=0,ifname=virtio_0_6001,script=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 2048 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :0

All images are on the local disk. The images were checked with qemu-img before and after dbench; no errors were found.

Comment 1 jason wang 2010-03-10 14:17:24 UTC
I also noticed that the iozone test (iozone -a) took three times as long with the virtio block device as with the ide block device.

Of the 1625 tests in total:
virtio outperforms in 870 tests
ide    outperforms in 755 tests

qemu-kvm cmdline:
ide:
/home/devel/autotest/client/tests/kvm/qemu -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=/home/devel/autotest/client/tests/kvm/images/RHEL-Server-5.4-64.qcow2,if=ide,cache=off,boot=on -net nic,vlan=0,model=e1000,macaddr=00:9B:7D:F9:71:03 -net tap,vlan=0,ifname=e1000_0_6001,script=/home/devel/autotest/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :0
virtio:
/home/devel/autotest/client/tests/kvm/qemu -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=/home/devel/autotest/client/tests/kvm/images/RHEL-Server-5.4-64-virtio.qcow2,if=virtio,cache=off,boot=on -net nic,vlan=0,model=virtio,macaddr=00:9B:7D:F9:71:03 -net tap,vlan=0,ifname=virtio_0_6001,script=/home/devel/autotest/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :0

Comment 2 jason wang 2010-03-10 14:19:08 UTC
Created attachment 399093
Performance comparison between ide and virtio (iozone)

Comment 3 Yaniv Kaul 2010-03-10 15:07:14 UTC
As I'm always asking, I'll ask again:
What is the CPU usage of the VMs during the tests? If IDE is taking twice as much CPU as virtio (which I *do not* believe is the case), it's important to know.
We need CPU / MBps.
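The requested metric could be computed as below. This is only a sketch: the function name is illustrative and the input values are hypothetical placeholders, since no CPU usage had been reported at this point in the thread.

```python
def cpu_per_mbps(cpu_percent, throughput_mbps):
    """Return CPU cost per unit of throughput (percent CPU per MB/s)."""
    if throughput_mbps <= 0:
        raise ValueError("throughput must be positive")
    return cpu_percent / throughput_mbps


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not measurements from this bug.
    samples = {"ide": (40.0, 800.0), "virtio": (20.0, 780.0)}
    for name, (cpu, mbps) in samples.items():
        print(f"{name}: {cpu_per_mbps(cpu, mbps):.4f} %CPU per MB/s")
```

A lower value means less CPU spent per MB/s, so a device with slightly lower throughput can still be the more efficient one.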

Comment 4 Dor Laor 2010-03-11 09:39:24 UTC
Christoph, why do we suffer from it?

Jason, did you use the deadline scheduler on the host?

Comment 5 chellwig@redhat.com 2010-03-11 10:31:21 UTC
RHEL5.4 is a codebase without any of the block optimizations done over the last year. There could be tons of reasons, mostly related to very different I/O patterns.

Comment 6 Dor Laor 2010-03-11 10:48:17 UTC
What's the guest kernel version?

Comment 7 jason wang 2010-03-11 12:05:34 UTC
I did not use the deadline elevator in these tests.
I used kernel 2.6.18-190 in the dbench test and kernel 2.6.18-164.2.1 in the iozone test. When I upgraded the 5.4z kernel to 2.6.18-164.14.1, the time spent on ide and virtio for iozone was almost equal.

I will re-test with the deadline I/O scheduler.

Comment 10 Yaniv Kaul 2010-03-12 22:17:11 UTC
(In reply to comment #3)
> As I'm always asking, I'll ask again:
> What is the CPU usage of the VMs during the tests? If IDE is taking twice as
> much CPU as virtio (which I *do not* believe is the case), it's important to
> know.
> We need CPU / MBps.    


Yes, I know I keep asking for this. I'll just keep asking for this - unless we know the VMs are in constant 100% CPU - which from past experience, I'm pretty sure they aren't.

Comment 12 jason wang 2010-03-15 03:28:56 UTC
I've re-tested iozone with raw-format images:
Guest kernel 2.6.18-189.el5
qemu-kvm cmdline:  
ide:
/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/qemu -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/images/RHEL-Server-5.5-64-virtio.raw,if=ide,cache=writethrough,boot=on -net nic,vlan=0,model=e1000,macaddr=00:AE:8F:F2:B3:00 -net tap,vlan=0,ifname=e1000_0_6001,script=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :0
virtio:
/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/qemu -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/images/RHEL-Server-5.5-64-virtio.raw,if=virtio,cache=writethrough,boot=on -net nic,vlan=0,model=e1000,macaddr=00:AE:8F:F2:B3:00 -net tap,vlan=0,ifname=e1000_0_6001,script=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :0

Host: quad-core Q9400

------------------------+--------------------+-----------------+----------+
elevator \ drive format | virtio outperforms | ide outperforms |  equal   |
------------------------+--------------------+-----------------+----------+
default scheduler       |        510         |      1078       |    50    |
------------------------+--------------------+-----------------+----------+
deadline                |        573         |      1012       |    53    |
------------------------+--------------------+-----------------+----------+


[Default scheduler]
https://virtlab3.englab.nay.redhat.com/job/7190/details/
profilers for ide: http://fileshare.englab.nay.redhat.com/logs//7190/default/297834/profiling/iteration.1/
profilers for virtio: http://fileshare.englab.nay.redhat.com/logs//7190/default/297835/profiling/iteration.1/

Basic comparison:

----------+---------------+---------+
          | cswitch/s avg | intr/s  |
----------+---------------+---------+
ide       | 8356.41       | 1412.42 |
----------+---------------+---------+
virtio    | 17389.98      | 1329.98 |
----------+---------------+---------+

The context-switch rate for virtio is too high, about 2.1 times that of ide!

cpu stat for ide:
09:31:17 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:          all      0.54      0.00      7.71     17.25      0.00     74.51
Average:            0      0.22      0.00      6.59      0.10      0.00     93.10
Average:            1      0.52      0.00      7.21      0.21      0.00     92.06
Average:            2      0.35      0.00     11.70     67.94      0.00     20.01
Average:            3      1.07      0.00      5.34      0.74      0.00     92.85

cpu stat for virtio:
09:31:17 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:          all      1.00      0.00      7.38     58.01      0.00     33.61
Average:            0      1.22      0.00      6.95     40.75      0.00     51.07
Average:            1      1.22      0.00      8.03     45.46      0.00     45.29
Average:            2      0.69      0.00      8.43     75.85      0.00     15.02
Average:            3      0.87      0.00      6.10     69.99      0.00     23.04


20 worst cases for virtio (iozone file size and record length in KB):

---------+----------+-----------------+------------+------------+---+-----------------------
 size KB |   reclen |            test |        ide |     virtio |   | degradation of virtio
---------+----------+-----------------+------------+------------+---+-----------------------
     128 |       64 |           fread |  4233807.0 |  1755594.0 | < | -141.1609404%
  524288 |     1024 |           fread |  2734539.0 |  1307778.0 | < | -109.098103807%
     128 |        4 |           write |   506097.0 |   304727.0 | < | -66.0820997155%
  524288 |      512 |           fread |  2902005.0 |  1777590.0 | < | -63.255025062%
     128 |      128 |           write |   590094.0 |   374176.0 | < | -57.7049303002%
      64 |        4 |           write |   380864.0 |   242312.0 | < | -57.1791739575%
     256 |        4 |           write |   507006.0 |   335041.0 | < | -51.3265540635%
     256 |      256 |           write |   582983.0 |   390905.0 | < | -49.1367467799%
      64 |       32 |           write |   546928.0 |   378715.0 | < | -44.416777788%
      64 |       64 |           write |   566551.0 |   393135.0 | < | -44.1110559986%
     512 |        4 |           write |   514514.0 |   362885.0 | < | -41.7843118343%
   65536 |       64 |           write |   646374.0 |   456372.0 | < | -41.6331413847%
     512 |        4 |        bkwdread |  2024389.0 |  1455126.0 | < | -39.1212169943%
     256 |        8 |   recordrewrite |  2210228.0 |  1600674.0 | < | -38.0810833436%
  524288 |     8192 |           fread |  1634936.0 |  1187518.0 | < | -37.6767341632%
     256 |      256 |        bkwdread |  4496299.0 |  3275543.0 | < | -37.2688131403%
     512 |       16 |      strideread |  3533174.0 |  2584820.0 | < | -36.6893632825%
  131072 |       64 |           write |   666778.0 |   494078.0 | < | -34.9539951182%
     128 |       32 |           write |   612303.0 |   456986.0 | < | -33.9872556271%
     256 |       64 |         rewrite |  1970871.0 |  1471270.0 | < | -33.9571254766%


[Deadline scheduler]
https://virtlab3.englab.nay.redhat.com/job/7189/details/
profiler for ide: http://fileshare.englab.nay.redhat.com/logs//7189/default/297690/profiling/iteration.1/
profiler for virtio: http://fileshare.englab.nay.redhat.com/logs//7189/default/297691/profiling/iteration.1/

Basic comparison:

----------+---------------+---------+
          | cswitch/s avg | intr/s  |
----------+---------------+---------+
ide       | 8180.89       | 1349.21 |
----------+---------------+---------+
virtio    | 15424.09      | 1266.01 |
----------+---------------+---------+

The context-switch rate for virtio is too high, about 1.9 times that of ide!
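To put a number on the context-switch observation for both schedulers, the averages from the two comparison tables can be turned into ratios. A minimal sketch; the dictionary layout and function name are illustrative.

```python
# Average context switches per second, copied from the two comparison tables.
CSWITCH = {
    "default": {"ide": 8356.41, "virtio": 17389.98},
    "deadline": {"ide": 8180.89, "virtio": 15424.09},
}


def cswitch_ratio(scheduler):
    """Return virtio context switches per second divided by the ide figure."""
    row = CSWITCH[scheduler]
    return row["virtio"] / row["ide"]


for scheduler in CSWITCH:
    print(f"{scheduler}: virtio/ide context-switch ratio = "
          f"{cswitch_ratio(scheduler):.2f}")
```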

cpu stat for ide:

09:31:17 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:          all      0.45      0.00      6.36     18.65      0.00     74.54
Average:            0      0.24      0.00      3.97      0.08      0.00     95.70
Average:            1      0.27      0.00      8.65      0.13      0.00     90.94
Average:            2      0.30      0.00      9.56     73.92      0.00     16.22
Average:            3      0.99      0.00      3.27      0.45      0.00     95.28

cpu stat for virtio:

09:31:17 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:          all      0.85      0.00      6.05     60.97      0.00     32.13
Average:            0      0.88      0.00      5.37     44.83      0.00     48.93
Average:            1      0.91      0.00      5.89     49.09      0.00     44.11
Average:            2      0.66      0.00      7.36     79.74      0.00     12.24
Average:            3      0.94      0.00      5.56     70.25      0.00     23.25
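The CPU statistics are easier to compare once each row is reduced to busy time (%user + %nice + %system + %steal) versus %iowait. The sketch below parses rows in the layout shown above; the sample strings are the "all" rows from the deadline run, and the function name is illustrative.

```python
def parse_cpu_rows(text):
    """Parse 'Average:' rows into {cpu: {"busy", "iowait", "idle"}}."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 8 fields: "Average:", the CPU id, and six percentages.
        if len(fields) != 8 or fields[0] != "Average:":
            continue
        user, nice, system, iowait, steal, idle = map(float, fields[2:])
        rows[fields[1]] = {
            "busy": user + nice + system + steal,
            "iowait": iowait,
            "idle": idle,
        }
    return rows


# "all" rows copied from the deadline-scheduler statistics above.
IDE = "Average:          all      0.45      0.00      6.36     18.65      0.00     74.54"
VIRTIO = "Average:          all      0.85      0.00      6.05     60.97      0.00     32.13"

for name, text in (("ide", IDE), ("virtio", VIRTIO)):
    stats = parse_cpu_rows(text)["all"]
    print(f"{name}: busy={stats['busy']:.2f}% iowait={stats['iowait']:.2f}%")
```

On these rows the busy time is nearly identical for both devices, while %iowait differs by more than a factor of three.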

20 worst cases for virtio (iozone file size and record length in KB):

---------+----------+-----------------+------------+------------+---+-----------------------
 size KB |   reclen |            test |        ide |     virtio |   | degradation of virtio
---------+----------+-----------------+------------+------------+---+-----------------------
   32768 |       64 |           write |   618674.0 |   325709.0 | < | -89.9468544007%
      64 |       64 |           write |   560635.0 |   338589.0 | < | -65.5798032423%
    1024 |     1024 |          reread |  3941039.0 |  2455943.0 | < | -60.46948158%
      64 |        4 |           write |   394870.0 |   246089.0 | < | -60.4582082092%
      64 |       32 |           write |   565358.0 |   361380.0 | < | -56.4441861752%
     128 |      128 |           write |   554157.0 |   365515.0 | < | -51.6099202495%
     256 |      256 |           write |   592636.0 |   392620.0 | < | -50.9439152361%
     128 |        4 |           write |   453896.0 |   309113.0 | < | -46.838211269%
     256 |        4 |           write |   486779.0 |   337251.0 | < | -44.33730367%
      64 |        8 |         rewrite |  1492919.0 |  1066042.0 | < | -40.0431690309%
  524288 |     2048 |            read |  1876961.0 |  1406404.0 | < | -33.4581670701%
      64 |       16 |           write |   546928.0 |   410573.0 | < | -33.2109028114%
    1024 |        4 |           write |   509497.0 |   387604.0 | < | -31.447817876%
     256 |      128 |        bkwdread |  4404088.0 |  3368013.0 | < | -30.762203115%
     128 |        4 |       randwrite |  1543594.0 |  1185654.0 | < | -30.1892457665%
     128 |        8 |       randwrite |  1939522.0 |  1504659.0 | < | -28.9010998505%
     512 |        4 |           write |   501419.0 |   389969.0 | < | -28.5791947565%
   65536 |       64 |           write |   651321.0 |   508942.0 | < | -27.9754864012%
     128 |       32 |       randwrite |  2211113.0 |  1749872.0 | < | -26.3585565116%
     128 |       32 |   recordrewrite |  2132084.0 |  1710838.0 | < | -24.6222026866%

Comment 13 Yaniv Kaul 2010-03-15 09:27:13 UTC
I'm not sure I am reading the results correctly, but don't the results indicate that virtio is consuming less CPU and waiting more for I/O than IDE?

In that case, it would be interesting to repeat the tests with multiple VMs, running concurrently (we have such a test).
Also, are we sure our backend storage is strong enough? What is the storage used? Is it a local disk?! Please use a normal storage backend, if that's indeed the case (Fibre Channel, with multiple disks and spindles, RAID 0 would be nice).

