Bug 1729077 - flag 'hv_vapic' doesn't improve Windows' performance evidently
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Vitaly Kuznetsov
QA Contact: Yu Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-11 11:08 UTC by liunana
Modified: 2021-01-08 16:52 UTC (History)
8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1727238
Environment:
Last Closed: 2021-01-08 16:52:51 UTC
Type: Bug
Target Upstream Version:



Description liunana 2019-07-11 11:08:19 UTC
Description of problem:
flag 'hv_vapic' doesn't improve Windows' performance evidently


Version-Release number of selected component (if applicable):
Host
   qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf.x86_64
   kernel-4.18.0-112.el8.x86_64
   seabios-bin-1.11.1-4.module+el8.1.0+3531+2918145b.noarch
Guest
   en_windows_10_business_editions_version_1903_x64_dvd_37200948.iso


How reproducible:
3/3


Steps to Reproduce:
1. boot guest with command [1] without flag 'hv_vapic'
2. Use the IOmeter tool to observe the storage performance
   a. Download the tool inside the guest:
     http://sourceforge.net/projects/iometer/files/iometer-stable/1.1.0/iometer-1.1.0-win64.x86_64-bin.zip/download
   b. Open IOmeter and configure:
      "Disk Target" ==> "D:"
      "Access Specifications" ==> "4KiB 100% Read"
      "Test Setup" ==> "30 Minutes"
   c. Start the test
3. Shut down the guest, then boot the same guest again with "-cpu Skylake-Client-IBRS,+kvm_pv_unhalt,hv_vapic" and repeat step 2
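The A/B comparison in the steps above can be sketched as follows; the only change between the two boots is appending hv_vapic to the -cpu string, with the rest of command [1] unchanged (cpu_flags is an illustrative helper, not part of the original report):

```shell
# Build the -cpu value for each run; everything else in command [1] stays the same.
cpu_flags() {
    base="Skylake-Client-IBRS,+kvm_pv_unhalt"
    if [ "$1" = "with" ]; then
        echo "$base,hv_vapic"     # run 2: enlightened APIC enabled
    else
        echo "$base"              # run 1: baseline
    fi
}

echo "run 1: -cpu $(cpu_flags without)"
echo "run 2: -cpu $(cpu_flags with)"
```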

Actual results:

[test 1] -- two workers in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 24.24%
   IOPS ==> 5388.54

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 23.80%
   IOPS ==> 5400.17

[test 2] -- two workers in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 24.39%
   IOPS ==> 5459.93

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 24.01%
   IOPS ==> 5438

[test 3] -- one worker in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 24.60%
   IOPS ==> 5423.28

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 24.00%
   IOPS ==> 5393.71

Expected results:
flag 'hv_vapic' can improve Windows' performance evidently


Additional info:
[1]
/usr/libexec/qemu-kvm -name win10-edk2 -M q35 -enable-kvm \
-cpu SandyBridge-IBRS,+kvm_pv_unhalt,hv_time \
-monitor stdio \
-nodefaults -rtc base=utc \
-m 4G \
-boot menu=on,splash-time=12000 \
-global driver=cfi.pflash01,property=secure,value=on \
-drive file=/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd,if=pflash,format=raw,readonly=on,unit=0 \
-drive file=/home/1-win10-edk2/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1,readonly=off \
-smp 2,sockets=1,cores=2,threads=2,maxcpus=4 \
-object secret,id=sec0,data=redhat \
-blockdev node-name=back_image,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/1-win10-edk2/win10.luks,aio=threads \
-blockdev node-name=drive-virtio-disk0,driver=luks,cache.direct=on,cache.no-flush=off,file=back_image,key-secret=sec0 \
-device pcie-root-port,id=root0,slot=0 \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=disk0,bus=root0 \
-device pcie-root-port,id=root1,slot=1 \
-device virtio-net-pci,mac=70:5a:0f:38:cd:a3,id=idhRa7sf,vectors=4,netdev=idNIlYmb,bus=root1 -netdev tap,id=idNIlYmb,vhost=on \
-drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/iso/windows/virtio-win-prewhql-0.1-172.iso \
-device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
-device ich9-usb-uhci6 \
-device usb-tablet,id=mouse \
-device qxl-vga,id=video1 \
-spice port=5901,disable-ticketing \
-device virtio-serial-pci,id=virtio-serial1 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial1.0,nr=3,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \

Comment 2 liunana 2019-11-19 03:29:22 UTC
Update status of RHEL8.2 fast train:

Test Environments:
  4.18.0-148.el8.x86_64
  qemu-kvm-4.2.0-0.module+el8.2.0+4714+8670762e.x86_64
  seabios-1.12.0-5.module+el8.2.0+4673+ff4b3b61.x86_64
  en_windows_server_2019_updated_march_2019_x64_dvd_2ae967ab.iso

Test results:
   storage performance without any flag
   PROCESSOR,CPU 1 ==> 24.84%
   IOPS ==> 7652.10

   storage performance with the flag "+kvm_pv_unhalt,hv_vapic"
   PROCESSOR,CPU  ==> 21.94%
   IOPS ==> 7658.34
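Worth noting: in the numbers above the IOPS barely move, but the CPU load drops, so the effect of the flags shows up as efficiency rather than throughput. A rough sketch of that view (iops_per_cpu is an illustrative helper; the figures are the ones from this comment):

```shell
# IOPS achieved per percent of guest CPU: higher means fewer cycles per I/O.
iops_per_cpu() {
    awk -v iops="$1" -v cpu="$2" 'BEGIN { printf "%.1f\n", iops / cpu }'
}

iops_per_cpu 7652.10 24.84   # without any flag
iops_per_cpu 7658.34 21.94   # with "+kvm_pv_unhalt,hv_vapic"
```

By this measure the flagged run does roughly 13% more I/O per CPU cycle, even though raw IOPS are flat.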

Comment 4 Ademar Reis 2020-02-05 23:00:38 UTC
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 6 Vitaly Kuznetsov 2020-03-02 09:06:04 UTC
Vadim, do you by any chance remember how 'hv_vapic' feature was tested when it was introduced? Or, maybe, you know how to construct a good test for it? Thanks!

Comment 7 Vadim Rozenfeld 2020-03-02 10:24:49 UTC
(In reply to Vitaly Kuznetsov from comment #6)
> Vadim, do you by any chance remember how 'hv_vapic' feature was tested when
> it was introduced? Or, maybe, you know how to construct a good test for it?
> Thanks!

Honestly, at that time the only tool I used to check any Hyper-V-related performance
improvements was IOmeter. I was testing 512B read/write I/O against a FAT-formatted volume
with a 512B sector size.

Best,
Vadim.

Comment 8 Vitaly Kuznetsov 2020-03-03 12:18:31 UTC
I checked, and the old trick seems to work; the IO test shows a moderate improvement.

What I did was:
1) Create a new raw volume on the host on tmpfs (important):
# qemu-img create -f raw /tmp/disk.raw 8G 
2) Start Windows guest, I used WS2016, the command line was:
qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -name guest=win10 -cpu host,hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer,-vmx -smp 6 -m 16384 -drive file=/var/lib/libvirt/images/WindowsServer2016_Gen1.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/tmp/disk.raw,format=raw,if=none,id=drive-ide1-0-0 -device ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 -net nic,model=e1000e -net bridge,br=br0 -vnc :0
3) Partition hard drive, create an NTFS partition (D: in my case)
4) Install FIO (https://bsdio.com/fio/)
5) Create fio job, I used the following:

[global]
name=fio-rand-RW
filename=fio-rand-RW
directory=D\:\
rw=randwrite
bs=512B
direct=1
numjobs=6
time_based=1
runtime=300

[file1]
size=1G
iodepth=16

6) Note, the job uses the same 'numjobs' as the number of vCPUs the guest has
7) Run the job, 'fio job.fio'
8) Reboot the guest without 'hv_vapic', let it calm down, run the same job, and compare the results.

You may also want to do vCPU pinning (you can use libvirt for that). In my testing I'm seeing >10% improvement.

Comment 9 Yu Wang 2020-03-05 09:51:01 UTC
(In reply to Vitaly Kuznetsov from comment #8)
> I checked and the old trick seems to work, IO test shows moderate
> improvement.
> 
> What I did was:
> 1) Create a new raw volume on the host on tmpfs (important):
> # qemu-img create -f raw /tmp/disk.raw 8G 
> 2) Start Windows guest, I used WS2016, the command line was:
> qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -name
> guest=win10 -cpu
> host,hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer,-vmx -smp 6
> -m 16384 -drive
> file=/var/lib/libvirt/images/WindowsServer2016_Gen1.qcow2,format=qcow2,
> if=none,id=drive-ide0-0-0 -device
> ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
> file=/tmp/disk.raw,format=raw,if=none,id=drive-ide1-0-0 -device
> ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 -net
> nic,model=e1000e -net bridge,br=br0 -vnc :0
> 3) Partition hard drive, create an NTFS partition (D: in my case)
> 4) Install FIO (https://bsdio.com/fio/)
> 5) Create fio job, I used the following:
> 
> [global]
> name=fio-rand-RW
> filename=fio-rand-RW
> directory=D\:\
> rw=randwrite
> bs=512B
> direct=1
> numjobs=6
> time_based=1
> runtime=300
> 
> [file1]
> size=1G
> iodepth=16
> 
> 6) Note, the job uses the same 'numjobs' as the number of vCPUs the guest has
> 7) Run the job, 'fio job.fio'
> 8) Reboot the guest without 'hv_vapic', let it calm down and run the same
> job, compare the result.

Tried as above (win10-64):
without the flags, bw=7505KiB/s
with the flags "hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer", bw=8209KiB/s
About 9.38% improvement.

And I have some questions:
1. This case is testing hv_vapic; is it enough to test with only hv_vapic, not
"hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer"?

2. "Create a new raw volume on the host on tmpfs (important)"
We must use a tmpfs filesystem to test, right?

I did this as below; is it right?
1 # mount tmpfs /mnt/tmpfs -t tmpfs
2 # qemu-img create -f raw /mnt/tmpfs/data.raw 10G

And why must we use tmpfs? Is qcow2 OK for this case?


> 
> You may also want to do vCPU pinning (can use libvirt for that). 

Do you mean we must do vCPU pinning for our test? We usually test with the qemu command line, not libvirt.

> In my testing I'm seeing >10% improvement.

In my testing it increases by 9%, and sometimes 8%; what percentage should we achieve at least?

Thanks
Yu Wang

Comment 10 Vitaly Kuznetsov 2020-03-05 10:16:36 UTC
(In reply to Yu Wang from comment #9)
> About 9.38% improvement.

Sounds great)

> 
> And I have some question:
> 1 this case is testing for hv_vapic, is it enough that only testing with
> hv_vapic, not 
> "hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer"

Should be but I haven't tried myself.

> 
> 2 Create a new raw volume on the host on tmpfs (important)
> we must use tmpfs filesystem to test,right? 
> 
> I do this as below, is it right?
> 1 #mount tmpfs /mnt/tmpfs -t tmpfs
> 2 #qemu-img create -f raw /mnt/tmpfs/data.raw 10G
> 
> And why we must use tmpfs? Is qcow2 ok for this case ?

We use tmpfs to avoid being blocked on real storage, where we would likely not see any
improvement. If you can get fast storage (an NVMe SSD, for example), it should be
equally good.

'qcow2' will work after it's fully populated. When created, it is small and grows over
time, so you may get unstable test results before it reaches the desired capacity.
'raw', on the other hand, doesn't have this problem, as it is fully populated upon
creation.
> 
> In my testing, it increase 9% and sometimes 8%, what percentage we should
> achieve at least?

It probably depends on the environment and may depend on the Windows version used.
In case you need a number for test automation, I'd say let's set it fairly low
(e.g. 5%); this way we know that the feature works, and we'll catch regressions
if they ever happen.
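The suggested 5% floor could be automated along these lines (a sketch; check_improvement is an illustrative name, and the sample numbers are the bandwidths from comment 9):

```shell
# Compare fio bandwidth with and without hv_vapic against a minimum
# relative-improvement threshold; non-zero exit status means regression.
check_improvement() {
    with=$1; without=$2; floor=$3
    awk -v w="$with" -v wo="$without" -v f="$floor" 'BEGIN {
        pct = (w - wo) / wo * 100      # relative improvement in percent
        printf "%.2f%%\n", pct
        exit (pct >= f) ? 0 : 1
    }'
}

check_improvement 8209 7505 5 && echo PASS
```

With the comment-9 numbers this prints 9.38% and PASS.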

Comment 11 Yu Wang 2020-03-06 10:04:14 UTC
Hi Vitaly

> You may also want to do vCPU pinning (can use libvirt for that). 

Do you mean vCPU pinning is a must for our test or not?
We use "numactl --physcpubind=1,2,3,4" for CPU pinning; is that right?

Another question:
>-drive file=/tmp/disk.raw,format=raw,if=none,id=drive-ide1-0-0 -device ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 

Do you suggest testing with an IDE disk, or is virtio-scsi/virtio-blk OK?
Since we use our own driver for virtio-scsi/virtio-blk, not a Microsoft built-in driver,
will it influence this hv flag's performance?

I retested this case today; the result is not as good as expected (without CPU pinning):

     KiB/s              with hv_vapic         without hv_vapic
------------------------------------------------------------------------------
win2019/ide                 9603                  9256
win10-64/ide                7835                  7565
win2019/virtio-scsi        11100                 10400
win10-64/virtio-scsi        9153                  8803

I used vCPU pinning to test it, but the improvement is low, or there is even no improvement:
win2019/virtio-scsi        11800                 11900


Boot cmd (full):
numactl --physcpubind=1,2,3,4,5,6 /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,port=0x1,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 6144  \
    -smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
    -cpu 'Skylake-Server',+kvm_pv_unhalt,hv_vapic  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idOFBXlG \
    -chardev socket,server,path=/var/tmp/avocado_y_i74ftz1,nowait,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20200210-015430-r3CRLBJ0,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200210-015430-r3CRLBJ0,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-2,addr=0x1.0x1,port=0x2,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
    -drive file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie.0-root-port-5,addr=0x0 \
    -drive id=drive_image2,if=none,snapshot=off,aio=threads,format=raw,file=/mnt/tmpfs/data.raw \
    -device scsi-hd,id=image2,drive=drive_image2 \
    -device pcie-root-port,id=pcie-root-port-4,addr=0x1.0x3,port=0x4,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:24:b5:18:61:aa,id=idb3Oo7Z,mq=on,vectors=14,netdev=id1yLkRM,bus=pcie-root-port-4,addr=0x0  \
    -netdev tap,id=id1yLkRM,vhost=on,queues=6 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=on,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,port=0x5,bus=pcie.0,addr=0x3,chassis=5 \
    -monitor stdio \


Thanks
Yu Wang

Comment 12 Vitaly Kuznetsov 2020-03-06 10:43:38 UTC
(In reply to Yu Wang from comment #11)
> Hi Vitaly
> 
> > You may also want to do vCPU pinning (can use libvirt for that). 
> 
> Do you mean vCPU pinning is a must for our test or not?
> we use "numactl  --physcpubind=1,2,3,4" for cpu pinning, is that right?

It is not a must, but it may give you more stable results, as with any performance-related testing.

> 
> Another question:
> >-drive file=/tmp/disk.raw,format=raw,if=none,id=drive-ide1-0-0 -device ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 
> 
> Do you suggest testing with ide disk? Or virtio-scsi/virtio-blk is ok?
> Since we use our own driver for virtio-scsi/virtio-blk, not a microsoft
> build-in driver, will it influence this hv flag performance?

Modern devices may not generate that many interrupts, which is why I was using IDE
as the simplest possible test. It's also possible to use some legacy networking
device instead of storage, I guess, but I haven't tested that.

Comment 14 Yu Wang 2020-03-11 02:17:33 UTC
Could you have a look at comment#13? The result is not as good as expected on my side.

Thanks a lot
Yu Wang

Comment 15 Vitaly Kuznetsov 2020-03-11 11:28:19 UTC
Is there any difference between the testing you've done for https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c9 and https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c13?
Is it the same guest and the same hardware on the host? It's not very easy to achieve very stable test results with Windows guests, unfortunately. It would probably be
possible to write a synthetic test (e.g. for kvm-unit-tests) for the feature, but this won't tell us much about genuine Windows behavior across versions.

Comment 16 Yu Wang 2020-03-13 09:14:42 UTC
(In reply to Vitaly Kuznetsov from comment #15)
> Is there any difference between testing you've done for
> https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c9 and
> https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c13?
> Is it the same guest and the same hardware on the host? It's not very easy
> to achieve very stable test results with Windows guests, unfortunately. It
> would probably be
> possible to write a synthetic test (e.g. for kvm-unit-tests) for the feature
> but this won't tell us much about genuine Windows behavior across versions.

The only difference is that I used "hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer"
as in comment#8, and I re-tried on win10-64 with all the flags above, with only hv_vapic, and with no flag:

            all flags    only hv_vapic    no flag
win10-64      3448           2500          2303

So it seems that with more flags the performance is better; the performance with only hv_vapic
versus no flag did not increase obviously.


Thanks
Yu Wang

numactl --physcpubind=1,2,3,4,5,6 /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,port=0x1,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 6144  \
    -smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
    -cpu 'Skylake-Server',hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idOFBXlG \
    -chardev socket,server,path=/var/tmp/avocado_y_i74ftz1,nowait,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20200210-015430-r3CRLBJ0,path=/var/tmp/avocado_y_i74ftz1,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20200210-015430-r3CRLBJ0,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-2,addr=0x1.0x1,port=0x2,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
    -drive file=/home/kvm_autotest_root/images/win10-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
    -drive file=/mnt/tmpfs/data.raw,format=raw,if=none,id=drive-ide1-0-0 -device ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 \
    -device pcie-root-port,id=pcie-root-port-4,addr=0x1.0x3,port=0x4,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:24:b5:18:61:aa,id=idb3Oo7Z,mq=on,vectors=14,netdev=id1yLkRM,bus=pcie-root-port-4,addr=0x0  \
    -netdev tap,id=id1yLkRM,vhost=on,queues=6 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=on,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,port=0x5,bus=pcie.0,addr=0x3,chassis=5 \
    -monitor stdio \

Comment 17 Vitaly Kuznetsov 2020-03-13 10:24:11 UTC
(In reply to Yu Wang from comment #16)
> (In reply to Vitaly Kuznetsov from comment #15)
> > Is there any difference between testing you've done for
> > https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c9 and
> > https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c13?
> > Is it the same guest and the same hardware on the host? It's not very easy
> > to achieve very stable test results with Windows guests, unfortunately. It
> > would probably be
> > possible to write a synthetic test (e.g. for kvm-unit-tests) for the feature
> > but this won't tell us much about genuine Windows behavior across versions.
> 
> The only difference is that I used
> "hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer"
> as in comment#8, and I re-tried on win10-64 with all the flags above, with
> only hv_vapic, and with no flag:
> 
>             all flags    only hv_vapic    no flag
> win10-64      3448           2500          2303
> 
> So it seems that with more flags the performance is better; the performance
> with only hv_vapic versus no flag did not increase obviously.
> 

With no hv_time/hv_stimer we get way more vmexits, and this may explain the result
you're getting: e.g. an exit for EOI happens and a timer gets injected, but when
the exit is not needed (with hv_vapic) we will still take it.

That said, I think it makes sense to change the test to use 'all' and 'all but hv_vapic'
flags, to actually see what our users are seeing (as no one will likely run with
'hv_vapic' only).
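The 'all' vs 'all but hv_vapic' comparison suggested here can be spelled out as two -cpu values; the flag set below is taken from the command line in comment 16, and deriving one string from the other guards against the two runs accidentally differing in more than hv_vapic:

```shell
# Full enlightenment set, as used in comment 16.
ALL="hv_vpindex,hv_time,hv_relaxed,hv_vapic,hv_synic,hv_stimer"
# Same set minus hv_vapic, derived rather than retyped.
ALL_BUT_VAPIC=$(echo "$ALL" | sed 's/hv_vapic,//')

echo "run A: -cpu Skylake-Server,$ALL"
echo "run B: -cpu Skylake-Server,$ALL_BUT_VAPIC"
```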

Comment 18 Yu Wang 2020-03-13 11:36:15 UTC
Test with "all flags", "all but hv_vapic", and "no flags".

Results are as below:

                   all         all but hv_vapic      none
Win10-64        6742/6899         6926/7028         2288/2790


So it's almost the same with "all" and "all but hv_vapic", but both give
higher performance than "no flags".

Comment 19 Yu Wang 2020-03-18 06:46:39 UTC
Test with "all flags" and "all but hv_vapic" on fast train
(hv_evmcs depends on hv_vapic, so no hv_evmcs either)

Results are as below (two runs):

                   all         all but hv_vapic   
Win10-64        3151/3409        2750/2881  
Win2016         2688/2439        2245/2200    

almost 10% improvement

Steps as https://bugzilla.redhat.com/show_bug.cgi?id=1727238#c19


Thanks
Yu Wang

Comment 20 Vitaly Kuznetsov 2020-03-18 09:27:48 UTC
(In reply to Yu Wang from comment #19)
> Test with "all flags" and "all but hv_vapic" on fast train
> (hv_evmcs depends on hv_vapic, so no hv_evmcs either)
> 
> Results are as below (two runs):
> 
>                    all         all but hv_vapic   
> Win10-64        3151/3409        2750/2881  
> Win2016         2688/2439        2245/2200    
> 
> almost 10% improvement
> 

Looks good to me, thanks!

Comment 21 Yu Wang 2020-03-19 11:23:33 UTC
According to comment#20, changing this bug to VERIFIED.

Thanks
Yu Wang

Comment 28 Jeff Nelson 2021-01-08 16:52:51 UTC
Changing this TestOnly BZ to CLOSED CURRENTRELEASE. Please reopen if the issue is not resolved.

