Bug 1487936
| Summary: | Test Windows VM performance with different Hyper-V Enlightenment flags | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Vadim Rozenfeld <vrozenfe> |
| Component: | virtio-win | Assignee: | Vadim Rozenfeld <vrozenfe> |
| virtio-win sub component: | others | QA Contact: | Yanhui Ma <yama> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | ailan, chayang, juzhang, knoel, lijin, michen, vrozenfe, wquan, yama |
| Version: | 8.0 | Keywords: | TestOnly |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-05-19 13:31:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1624786 | ||
Description
Vadim Rozenfeld
2017-09-03 13:33:25 UTC

The following build lowers the minimum allowed Hyper-V spinlock retries limit to 0x3ff, which lets us test three different settings — the KVM default, the Microsoft Hyper-V default, and the AMD-recommended value — with retry counts of 0x1fff, 0xfff, and 0x7ff respectively.

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13995876

Some background is here:
https://bugzilla.redhat.com/show_bug.cgi?id=1465938#c15
https://bugzilla.redhat.com/show_bug.cgi?id=1465938#c18

Comment 8 (lijin):
Hi Vadim, could you explain which kind of performance testing (net? blk? scsi?) QE should schedule?

Comment 9 (Vadim Rozenfeld, in reply to comment #8):
Hi Li Jin. We can give IoMeter a try on virtio-scsi: 4K blocks only, 32 outstanding IOs, and the number of workers equal to the number of vCPUs, to create some heavy load, and then see whether the number of spinlock retries (0x1fff/0xfff/0x7ff) has any positive or negative impact on the IOPS numbers. Thanks, Vadim.

Comment 10 (lijin):
Hi yama, could you help to handle comment #9? Thanks.

Yanhui Ma (in reply to comment #10):
OK, will arrange the test.

Comment 12 (Yanhui Ma):
Hi Vadim, could you please provide the qemu build from comment 2 again? I downloaded and saved it before, but cannot find it now. Thanks, Yanhui

Vadim Rozenfeld (in reply to comment #12):
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=14128809
Best regards, Vadim.

Comment 14 (Yanhui Ma, in reply to comment #9):
Tested a win2016_x86_64 guest on a RHEL 7.4 host with the following versions:

qemu-kvm-rhev-2.9.0-16.el7_4.8.bz1487936.x86_64
kernel-3.10.0-693.el7.x86_64
virtio-win-1.9.3-1.el7

Test steps:

1. Create the data disk:

    qemu-img create -f raw /home/kvm_autotest_root/images/storage2.raw 40G

2. Boot win2016 with the following command line:

    /usr/libexec/qemu-kvm \
        -S \
        -name 'avocado-vt-vm1' \
        -sandbox off \
        -machine pc \
        -nodefaults \
        -vga std \
        -device pvpanic,ioport=0x505,id=idlflHYp \
        -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
        -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
        -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
        -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
        -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03 \
        -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=raw,file=/home/kvm_autotest_root/images/win2016-64-virtio-scsi.raw \
        -device scsi-hd,id=image1,drive=drive_image1 \
        -drive id=drive_disk1,if=none,snapshot=off,aio=native,cache=none,format=raw,file=/home/kvm_autotest_root/images/storage2.raw \
        -device scsi-hd,id=disk1,drive=drive_disk1 \
        -device virtio-net-pci,mac=9a:37:37:37:37:5e,id=idH3OCSN,vectors=4,netdev=idPpViz1,bus=pci.0,addr=04 \
        -netdev tap,id=idPpViz1,vhost=on \
        -m 4096 \
        -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
        -cpu 'SandyBridge',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
        -drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
        -device scsi-cd,id=cd1,drive=drive_cd1 \
        -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
        -vnc :0 \
        -rtc base=localtime,clock=host,driftfix=slew \
        -boot order=cdn,once=c,menu=off,strict=off \
        -enable-kvm -monitor stdio

3. Run Iometer on the raw disk in the guest: two workers, 32 outstanding IOs; access specifications 4k sequential read, 4k sequential write, 4k random read, 4k random write; run time for every access specification: 4 minutes; ramp-up time: 60 seconds.

4. Results:

hv_spinlocks=0x1fff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 7771.787023 | 30.358543 |
| 4k sequential write | 4801.576204 | 18.756157 |
| 4k random read | 4468.191568 | 17.453873 |
| 4k random write | 506.11127 | 1.976997 |

hv_spinlocks=0xfff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 7710.57683 | 30.119441 |
| 4k sequential write | 4785.200039 | 18.692188 |
| 4k random read | 5982.518312 | 23.369212 |
| 4k random write | 567.173302 | 2.215521 |

hv_spinlocks=0x7ff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 8453.741483 | 33.022428 |
| 4k sequential write | 4596.968928 | 17.95691 |
| 4k random read | 8349.170211 | 32.613946 |
| 4k random write | 684.25718 | 2.67288 |

It seems that with hv_spinlocks=0x7ff, MBps shows a slight improvement. Because storage2.raw sits on a SAS HDD rather than an SSD, overall performance is low.

Comment 15 (Vadim Rozenfeld, in reply to comment #14):
Thanks a lot. The results are quite interesting. Any chance to re-spin this test on top of an NVMe backend? Best, Vadim.

This is the kind of useful testing that is nice to run from time to time. I'm moving it to 7.6 as a reminder to re-spin the testing with the next RHEL version.

Yanhui Ma (in reply to comment #15):
Test packages are the same as in comment 14, and test steps are the same as in comment 14 except that win2016 is booted with the following command line:

    /usr/libexec/qemu-kvm \
        ...
        -drive file='/dev/nvme0n1',if=none,id=virtio-scsi2-id1,media=disk,cache=none,snapshot=off,format=raw,aio=native \
        -device scsi-hd,drive=virtio-scsi2-id1 \
        ...
Here are the results on the NVMe backend:

hv_spinlocks=0x1fff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 8412.399327 | 32.860935 |
| 4k sequential write | 8987.859169 | 35.108825 |
| 4k random read | 8372.248921 | 32.704097 |
| 4k random write | 8885.385109 | 34.708536 |

hv_spinlocks=0xfff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 8311.813162 | 32.46802 |
| 4k sequential write | 8396.661969 | 32.799461 |
| 4k random read | 8182.455121 | 31.962715 |
| 4k random write | 8781.237213 | 34.301708 |

hv_spinlocks=0x7ff

| | IOps | MBps |
|---|---|---|
| 4k sequential read | 8179.53502 | 31.951309 |
| 4k sequential write | 8524.368062 | 33.298313 |
| 4k random read | 8155.86899 | 31.858863 |
| 4k random write | 8613.346035 | 33.645883 |

This work is still in progress. More Hyper-V related functionality is going to be added in 7.7 and 8.0. I'm going to keep this bug open to watch the changes/improvements in RHEL 8.
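For future re-spins, note that the three runs in this bug differ only in the hv_spinlocks value on the -cpu line. A minimal sketch of scripting the sweep follows; the binary path and CPU flags are taken from the command line in comment 14, the rest of the device/drive setup is elided, and the actual guest boot is left commented out since it depends on the local images:

```shell
#!/bin/sh
# Sweep the three spinlock-retry settings compared in this bug.
# QEMU binary and -cpu flags come from comment 14; the drive/device
# options are omitted here, so the launch itself is only sketched.
QEMU=/usr/libexec/qemu-kvm

for retries in 0x1fff 0xfff 0x7ff; do
    cpu="SandyBridge,+kvm_pv_unhalt,hv_spinlocks=${retries},hv_vapic,hv_time"
    echo "run ${retries}: -cpu ${cpu}"
    # "$QEMU" -enable-kvm -machine pc -m 4096 \
    #     -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    #     -cpu "$cpu" ...   # plus the drive/device options from comment 14
done
```

Each iteration would boot the same guest image and run the same Iometer profile, so the only variable between runs is the retry limit.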
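Reading the two result sets side by side: on the SAS HDD the smaller retry limits help random reads substantially, while on NVMe the three settings land within a few percent of each other. The relative deltas can be checked with a quick awk pass over the 4k random-read IOPS figures quoted in the tables above (only those reported numbers are used; nothing else is assumed):

```shell
# Percent change in 4k random-read IOPS relative to the 0x1fff baseline,
# using the figures from the SAS and NVMe result tables in this bug.
awk 'function pct(base, v) { return (v - base) / base * 100 }
BEGIN {
    printf "SAS  0xfff: %+.1f%%\n", pct(4468.191568, 5982.518312)
    printf "SAS  0x7ff: %+.1f%%\n", pct(4468.191568, 8349.170211)
    printf "NVMe 0xfff: %+.1f%%\n", pct(8372.248921, 8182.455121)
    printf "NVMe 0x7ff: %+.1f%%\n", pct(8372.248921, 8155.868990)
}'
```

This prints roughly +33.9% and +86.9% for the SAS runs and about -2.3% and -2.6% for NVMe, consistent with the observation that the smaller retry limits only pay off on the slower backend.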