Created attachment 1045567 [details]
series of demonstrative benchmarks

Description of problem:
Write performance with VirtIO drivers on Windows 7 guests has plummeted since 0.1-65. With 0.1-65 drivers installed in the guest, Crystal Diskmark produces 4K Q32T1 and 4K random writes in the 10-30MB/sec range (qcow2 storage, pool of mirrored conventional disks underneath). With 0.1-94 (latest stable), 0.1-96, 0.1-100, or 0.1-105 (latest), write performance on the guest plummets to less than 1MB/sec on same hardware in same configuration.

Version-Release number of selected component (if applicable):
verified on all versions since and including 0.1-94, verified not a problem on 0.1-65

How reproducible:
install Windows 7 VM, VirtIO drivers, .qcow2 storage, writeback cache. Install Crystal DiskMark benchmark tool on guest from http://crystalmark.info/software/CrystalDiskMark/index-e.html and test with 2 runs on 500MB storage. VirtIO drivers 0.1-65 will perform an order of magnitude or more better on writes than 0.1-94 through 0.1-105 do.

Steps to Reproduce:
1. install Win7 guest, writeback cache, VirtIO, .qcow2
2. install http://crystalmark.info/software/CrystalDiskMark/index-e.html on guest
3. test with settings of 2 and 500MB on various VirtIO guest drivers, observe write performance plummet with 0.1-94 or newer as compared to 0.1-65

Additional info:
Hi Jim, Can you please provide the following information; - kernel and qemu versions, - qemu command line. Thanks, Vadim
Hi, I am also seeing this issue. uname -a Linux angelbob 3.16.0-41-generic #57~14.04.1-Ubuntu SMP Thu Jun 18 18:01:13 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux kvm --version QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.13), Copyright (c) 2003-2008 Fabrice Bellard libvirt+ 5025 1 69 02:41 ? 05:19:35 qemu-system-x86_64 -enable-kvm -name SharedSQL -S -machine pc-i440fx-trusty,accel=kvm,usb=off -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 2cd0fd1f-8a93-4fb1-9463-2ec91f12bd21 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/SharedSQL.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/mnt/storage/SharedSQL-00.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/mnt/storage/ISOs/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=drive-ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:64:64:38,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 
-spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8
me@locutus:~$ uname -a Linux locutus 3.16.0-41-generic #55~14.04.1-Ubuntu SMP Sun Jun 14 18:43:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux me@locutus:~$ apt-cache policy qemu-kvm qemu-kvm: Installed: 2.0.0+dfsg-2ubuntu1.13 me@locutus:~$ ps wwaux | grep win7 | grep -v grep libvirt+ 349 45.5 6.9 6739924 2285080 ? Sl Jul02 693:37 qemu-system-x86_64 -enable-kvm -name win7 -S -machine pc-i440fx-trusty,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 741cfd2a-afea-a154-93df-f72d4a4175c3 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device ahci,id=ahci0,bus=pci.0,addr=0x7 -drive file=/data/images/win7/win7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/data/iso/virtio-win-0.1.96.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=writeback -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:e4:b0:0d,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -device VGA,id=video0,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
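When comparing setups like the ones pasted above, it can help to pull just the disk settings out of the long qemu command line. The following is a minimal sketch, not part of any qemu or libvirt tooling: the helper name `drive_options` is hypothetical, and it assumes plain `key=value` pairs separated by commas (it does not handle the `,,` escape that qemu allows inside file names).

```python
import shlex


def drive_options(cmdline: str) -> list[dict[str, str]]:
    """Return one dict of key=value options per -drive argument.

    Hypothetical helper for illustration; assumes no escaped commas.
    """
    tokens = shlex.split(cmdline)
    drives = []
    # Pair every token with the one that follows it, so that each
    # "-drive" flag is seen together with its option string.
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-drive":
            opts = dict(
                item.split("=", 1) for item in value.split(",") if "=" in item
            )
            drives.append(opts)
    return drives


# Shortened excerpt of the win7 command line from this comment.
cmdline = (
    "qemu-system-x86_64 -enable-kvm -name win7 "
    "-drive file=/data/images/win7/win7.qcow2,if=none,id=drive-virtio-disk0,"
    "format=qcow2,cache=writeback "
    "-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0"
)

for drive in drive_options(cmdline):
    print(drive["file"], drive.get("format"), drive.get("cache"))
```

Run against the two command lines in this report, this makes it easy to see that both guests use format=qcow2 with cache=writeback on a virtio-blk-pci device.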
If you need me to, I can boot from a CentOS or Fedora USB stick and start the same VM on the same storage in order to rule out "it's a debian/ubuntu problem" concerns. That probably wouldn't be 'til next week though. If you want me to do that, let me know the distro and version of your choice.
Tried to reproduce the issue in our test bed, but failed to reproduce it.

1. Host version:
kernel-3.10.0-290.el7.x86_64
qemu-kvm-rhev-2.3.0-4.el7.x86_64

2. Qemu command line:
/usr/libexec/qemu-kvm \
-name 'virt-tests-vm1' \
-nodefaults \
-drive file='/home/kvm_autotest_root/images/win7-64-sp1-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=off,format=qcow2,aio=native \
-device ide-drive,bus=ide.0,unit=0,drive=drive-virtio-disk1,bootindex=0 \
-drive file='/home/kvm_autotest_root/images/storage2.qcow2',index=1,if=none,id=drive-virtio-disk2,media=disk,cache=writeback,snapshot=off,format=qcow2,aio=threads \
-device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk2,bootindex=1 \
-device virtio-net-pci,netdev=idRoVepv,mac='9a:37:37:37:37:8e',bus=pci.0,addr=0xd,id='iddWbJLZ' \
-netdev tap,id=idRoVepv \
-m 4096 \
-smp 2,cores=1,threads=1,sockets=2 \
-cpu 'Westmere' \
-M pc \
-monitor stdio

3. Results:
Crystal Diskmark produces 4K random writes:

- SATA disk backend:
0.1-65: 18.32 ~ 20.92 MB/sec
0.1-105: 14.36 ~ 19.63 MB/sec

- Fusion-io iofx backend:
0.1-65: 32.63 ~ 39.25 MB/sec
0.1-105: 31.63 ~ 42.14 MB/sec
(In reply to Xiaomei Gao from comment #5)
> Tried to reproduce the issue in our test bed, but failed to reproduce it.
>
> 1. Host version:
> kernel-3.10.0-290.el7.x86_64
> qemu-kvm-rhev-2.3.0-4.el7.x86_64
>
> 2. Qemu command line:
> /usr/libexec/qemu-kvm \
> -name 'virt-tests-vm1' \
> -nodefaults \
> -drive
> file='/home/kvm_autotest_root/images/win7-64-sp1-virtio.qcow2',index=0,
> if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=off,
> format=qcow2,aio=native \
> -device ide-drive,bus=ide.0,unit=0,drive=drive-virtio-disk1,bootindex=0 \
> -drive
> file='/home/kvm_autotest_root/images/storage2.qcow2',index=1,if=none,
> id=drive-virtio-disk2,media=disk,cache=writeback,snapshot=off,format=qcow2,
> aio=threads \
> -device
> virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk2,bootindex=1 \
> -device
> virtio-net-pci,netdev=idRoVepv,mac='9a:37:37:37:37:8e',bus=pci.0,addr=0xd,
> id='iddWbJLZ' \
> -netdev tap,id=idRoVepv \
> -m 4096 \
> -smp 2,cores=1,threads=1,sockets=2 \
> -cpu 'Westmere' \
> -M pc \
> -monitor stdio
>
> 3. Results:
> Crystal Diskmark produces 4K random writes:
>
> - SATA disk backend:
> 0.1-65: 18.32 ~ 20.92 MB/sec
> 0.1-105: 14.36 ~ 19.63 MB/sec
>
> - Fusion-io iofx backend:
> 0.1-65: 32.63 ~ 39.25 MB/sec
> 0.1-105: 31.63 ~ 42.14 MB/sec

Thanks,
Can you try raw instead of qcow2?
Btw, I also cannot reproduce any performance degradation when testing 65 vs 105 with IoMeter on my test setup - 3.18.7-100.fc20.x86_64 , qemu 2.3.50.

Best regards,
Vadim.
(In reply to Jim Salter from comment #4)
> If you need me to, I can boot from a CentOS or Fedora USB stick and start
> the same VM on the same storage in order to rule out "it's a debian/ubuntu
> problem" concerns. That probably wouldn't be 'til next week though.
>
> If you want me to do that, let me know the distro and version of your choice.

Thanks Jim,
Any recent Fedora, CentOS, or RHEL platform will be good. It will be easier for us to trace and fix this problem if it is reproducible on any of the RH-related products.
(In reply to Vadim Rozenfeld from comment #6)
> (In reply to Xiaomei Gao from comment #5)
> Can you try raw instead of qcow2?
> Btw, I also cannot reproduce any performance degradation when testing 65 vs
> 105 with IoMeter on my test setup - 3.18.7-100.fc20.x86_64 , qemu 2.3.50.

Hi Vadim,

It has nothing to do with the image format. This time we can reproduce the issue with the Crystal Diskmark build in the attachment on both qcow2 and raw formats, and there is an obvious degradation between 65 and 105.

4K Q32T1: Random 4KiB Read/Write with multi Queues & Threads
Write Results:
build 65: 65.72 MB/s
build 105: 1.317 MB/s

4K: Random 4KiB Read Write with single Queue & Thread
Write Results:
build 65: 35.41 MB/s
build 105: 0.321 MB/s

Bisecting the versions shows that build 82 is where the regression begins.
build 81: 60.55 MB/s
build 82: 1.648 MB/s
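For anyone repeating the bisection on their own hardware, the search can be sketched as a plain binary search over build numbers. This is only an illustration: `first_bad_build` and `simulated_is_bad` are hypothetical names, the contiguous list of builds 65 to 105 is an assumption (not every number is necessarily a published build), and the real check would mean installing each driver build in the guest and running the 4K write test by hand.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Binary search for the first build that shows the regression.

    Assumes builds is sorted, the first entry is known good, the last
    entry is known bad, and every build after the first bad one is bad.
    """
    lo, hi = 0, len(builds) - 1  # builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return builds[hi]


def simulated_is_bad(build: int) -> bool:
    """Stand-in for a real benchmark run.

    Mirrors the result reported in this comment (build 81 fast,
    build 82 slow) instead of measuring anything.
    """
    return build >= 82


builds = list(range(65, 106))  # assumed candidate builds, 65 through 105
print(first_bad_build(builds, simulated_is_bad))  # prints 82
```

With 41 candidate builds this needs at most six benchmark runs rather than one per build.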
Created attachment 1050258 [details] CrystalDiskMark4_1_0-en.exe
(In reply to Xiaomei Gao from comment #8)
> (In reply to Vadim Rozenfeld from comment #6)
> > (In reply to Xiaomei Gao from comment #5)
> > Can you try raw instead of qcow2?
> > Btw, I also cannot reproduce any performance degradation when testing 65 vs
> > 105 with IoMeter on my test setup - 3.18.7-100.fc20.x86_64 , qemu 2.3.50.
>
> Hi Vadim,
>
> It is nothing with image format. We can reproduce the issue by the Crystal
> Diskmark in the attachment on both qcow2 and raw format this time, and there
> is obvious degradation between 65 and 105.
>
> 4K Q32T1: Random 4KiB Read/Write with multi Queues & Threads
> Write Results:
> build 65: 65.72 MB/s
> build 105: 1.317 MB/s
>
>
> 4K: Random 4KiB Read Write with single Queue & Thread
> Write Results:
> build 65: 35.41 MB/s
> build 105: 0.321 MB/s
>
> Bisect the version, it is build 82 where the regression begins.
> build 81: 60.55 MB/s
> build 82: 1.648 MB/s

Can you try disabling write-cache buffer flushing on the virtio-blk device (Computer->virtio-blk volume->Right click->properties->Hardware->Red Hat VirtIO SCSI Disk Device->Properties->Policies->Turn off Windows write-cache buffer flushing on the device) and see if it helps?

Thanks,
Vadim.
(In reply to Vadim Rozenfeld from comment #10)
> Can you try disabling write-cache buffer flushing on virtio-blk device
> (Computer->virtio-blk volume->Right click->properties->Hardware->Red Hat
> VirtIO SCSI Disk Device->Properties->Policies->Turn off Windows write-cache
> buffer flushing on the device and see if it helps?

Sure. Turning off Windows write-cache buffer flushing on the device does boost performance.

Build 105: (write results)
4K Q32T1: 59.48 MB/S
4K: 31.43 MB/S
Vadim, what now? Any further progress on this bug?
(In reply to Jim Salter from comment #12)
> Vadim, what now? Any further progress on this bug?

This significant performance degradation is a negative side effect of adding Force Unit Access (FUA) support (https://bugzilla.redhat.com/show_bug.cgi?id=837324) to the driver. Disabling Windows write-cache buffer flushing is the only way to get acceptable performance if an application tries to perform FUA writes. But in that case the FUA flag will, of course, not be set. This trick is relevant to Win7/WS2008(R2) platforms only. Win8 and newer OSes use a FLUSH request instead of FUA.
Sorry about that, but there is no way to fix this bug and still keep FUA working.
Vadim.
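To get a feel for why forcing every small write to stable storage costs so much, here is a rough sketch that times 4 KiB random writes with and without a flush after each write. It is an analogy only: `os.fsync` after each write is used as a stand-in for a FUA write, it does not exercise the viostor driver or set the FUA flag, and the function name, file size, and write count are arbitrary assumptions. The numbers depend entirely on the disk and cache settings underneath.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4 KiB, matching the "4K" CrystalDiskMark tests


def random_writes(path: str, file_size: int, count: int, flush_each: bool) -> float:
    """Write `count` random-offset 4 KiB blocks and return MB/s.

    With flush_each=True every write is followed by os.fsync(), which is
    used here only as an approximation of a write-through (FUA-like) write.
    """
    buf = os.urandom(BLOCK)
    blocks = file_size // BLOCK
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        os.ftruncate(fd, file_size)
        start = time.perf_counter()
        for _ in range(count):
            os.lseek(fd, random.randrange(blocks) * BLOCK, os.SEEK_SET)
            os.write(fd, buf)
            if flush_each:
                os.fsync(fd)  # wait until this write reaches the device
        os.fsync(fd)  # both modes end with everything flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count * BLOCK / elapsed / 1e6


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fua_demo.bin")
        for flush_each in (False, True):
            rate = random_writes(path, 16 * 1024 * 1024, 200, flush_each)
            label = "flush after every write" if flush_each else "flush once at end"
            print(f"{label}: {rate:.2f} MB/s")
```

On conventional disks the per-write flush case is typically far slower, which is the same kind of trade-off described above: the data is safer on each write, but small random writes can no longer be absorbed by a cache.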
If Windows 2012 R2 no longer uses FUA, then what could be causing the performance problems on those OSes? I've done all my testing on Windows 2012 R2 and Windows 8.1, and I'm seeing performance issues between version 1.81 and 1.94.
If you look back up the thread, Michael Bellerue was also testing with Windows 2012 R2.
Our QE engineers just completed a new performance testing cycle, trying to find a regression between versions 86 and 106. It seems we don't see any regression, but we are using IoMeter as our main performance evaluation tool, and our setup may be different from yours. I will ask QE to perform one more run comparing versions 81 and 106.
Vadim.
Hi Jim,
Is this still an issue for you? Could you try with the latest virtio-win version?
Thanks.
there is another issue https://bugzilla.redhat.com/show_bug.cgi?id=1023894 Bug 1023894 - [virtio-win][viostor] Write/Randwrite IOPS is poor when block size is 256k and iodepth is 64, which also requires running some performance tests