RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you are a Red Hat customer, please continue to file support cases via the Red Hat customer portal; otherwise, please head to the "RHEL project" in Red Hat Jira and file new tickets there. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023; bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against the components "kernel", "kernel-rt", and "kpatch" are migrated only if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user-management inquiry; the e-mail creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue is found under "Links", has a little "two-footprint" icon next to it, and directs you to the "RHEL project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). The same link is also available in a blue banner at the top of the page informing you that the bug has been migrated.
Bug 1727238 - flag 'hv_vapic' does not noticeably improve Windows guest performance
Summary: flag 'hv_vapic' does not noticeably improve Windows guest performance
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Vitaly Kuznetsov
QA Contact: Yu Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-05 07:08 UTC by liunana
Modified: 2023-06-16 10:21 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1729077, 1733205
Environment:
Last Closed: 2020-05-13 09:12:57 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
System ID: Red Hat Issue Tracker RHELPLAN-26911
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2023-06-16 10:21:17 UTC

Description liunana 2019-07-05 07:08:43 UTC
Description of problem:
flag 'hv_vapic' does not noticeably improve Windows guest performance


Version-Release number of selected component (if applicable):
Host
   qemu-kvm-2.12.0-79.module+el8.1.0+3531+2918145b.x86_64
   kernel-4.18.0-107.el8.x86_64
   seabios-bin-1.11.1-4.module+el8.1.0+3531+2918145b.noarch
Guest
   en_windows_server_2016_updated_feb_2018_x64_dvd_11636692.iso


How reproducible:
3/3


Steps to Reproduce:
1. Boot the guest with command [1] (see Additional info) without the 'hv_vapic' flag
2. Observe the storage performance with the IOmeter tool
   a. Download the tool inside the guest
      http://sourceforge.net/projects/iometer/files/iometer-stable/1.1.0/iometer-1.1.0-win64.x86_64-bin.zip/download
   b. Open IOmeter and configure
      "Disk Target" ==> "D:"
      "Access Specifications" ==> "4KiB 100% Read"
      "Test Setup" ==> "30 Minutes"
   c. Start the test
3. Shut down the guest, then boot it again with "-cpu Skylake-Client-IBRS,+kvm_pv_unhalt,hv_vapic" and repeat step 2 (only the -cpu option changes; see the sketch below)
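
For reference, a minimal sketch of the only intended difference between the two boots (everything else in command [1] under Additional info stays identical; note that the baseline command there does not carry '+kvm_pv_unhalt', the flag spellings are as used in this report):

   # boot 1: baseline, no Hyper-V enlightenment
   -cpu Skylake-Client-IBRS
   # boot 2: same command line, APIC enlightenment added
   -cpu Skylake-Client-IBRS,+kvm_pv_unhalt,hv_vapic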

Actual results:

[test 1] -- two workers (work1) in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 45.43%
   IOPS ==> 12509

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 45.51%
   IOPS ==> 12591

[test 2] -- two workers (work1) in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 45.32%
   IOPS ==> 12784

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 44.30%
   IOPS ==> 12647

[test 3] -- one worker (work1) in IOmeter:
   storage performance without any flag
   PROCESSOR,CPU  ==> 26.95%
   IOPS ==> 9903

   storage performance with the flag "hv_vapic"
   PROCESSOR,CPU  ==> 26.88%
   IOPS ==> 9875
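
For scale, the deltas above are all within about 1%, i.e. likely inside run-to-run noise (a quick arithmetic check of the IOPS numbers from the three tests, using awk):

   $ awk 'BEGIN { printf "test1 %+.2f%%  test2 %+.2f%%  test3 %+.2f%%\n",
         (12591-12509)/12509*100, (12647-12784)/12784*100, (9875-9903)/9903*100 }'
   test1 +0.66%  test2 -1.07%  test3 -0.28%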

Expected results:
flag 'hv_vapic' noticeably improves Windows guest performance


Additional info:
[1]
/usr/libexec/qemu-kvm -name win2016 -M q35 -enable-kvm \
-cpu Skylake-Client-IBRS \
-monitor stdio \
-nodefaults -rtc base=utc \
-m 4G \
-smp 1,sockets=1,cores=1,threads=1 \
-object secret,id=sec0,data=redhat \
-blockdev node-name=back_image,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/3-win2016/w2016.luks,aio=threads \
-blockdev node-name=drive-virtio-disk0,driver=luks,cache.direct=on,cache.no-flush=off,file=back_image,key-secret=sec0 \
-device pcie-root-port,id=root0,slot=0 \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=disk0,bus=root0 \
-blockdev node-name=back_image2,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/3-win2016/win2016-disk.qcow2,aio=threads \
-blockdev node-name=drive-virtio-disk1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=back_image2 \
-device pcie-root-port,id=root3,slot=3 \
-device virtio-blk-pci,drive=drive-virtio-disk1,id=disk1,bus=root3 \
-device pcie-root-port,id=root1,slot=1 \
-device virtio-net-pci,mac=70:5a:0f:38:cd:a8,id=idhRa7sf,vectors=4,netdev=idNIlYmb,bus=root1 -netdev tap,id=idNIlYmb,vhost=on \
-drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/iso/windows/virtio-win-prewhql-0.1-172.iso \
-device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
-device ich9-usb-uhci6 \
-device usb-tablet,id=mouse \
-device qxl-vga,id=video1 \
-spice port=5901,disable-ticketing \
-device virtio-serial-pci,id=virtio-serial1 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial1.0,nr=3,chardev=charchannel0,id=channel0,name=com.redhat.spice.0

Comment 6 Ademar Reis 2020-02-05 23:00:18 UTC
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

Comment 10 Vitaly Kuznetsov 2020-03-03 12:19:48 UTC
Please try the test sequence from https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c8

Comment 11 Yu Wang 2020-03-13 04:00:19 UTC
Tried with slow train (win2019),
detailed steps according to https://bugzilla.redhat.com/show_bug.cgi?id=1729077

It does not noticeably increase the performance.

without flag, 5497
with    flag, 5729


Thanks
Yu Wang

Comment 12 Vitaly Kuznetsov 2020-03-13 10:25:45 UTC
Just like https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c17, please try with the other
hv_* flags (compare 'all' and 'all but hv_vapic'); the result is supposed to be a bit
better.

Comment 13 Yu Wang 2020-03-13 11:39:05 UTC
Tested with "all flags", "all but hv_vapic" and "no flags".

Results are as below:

                   all         all but hv_vapic      none
Win2019        5217/5309         5763/6023         4291/4293


So the results are almost the same with "all" and "all but hv_vapic", but both
give higher performance than "no flags".
(almost the same as fast train)


qemu-kvm-2.12.0-98.module+el8.2.0+5698+10a84757.x86_64
kernel-4.18.0-179.el8.x86_64
seabios-1.13.0-1.module+el8.2.0+5520+4e5817f3.x86_64

Thanks
Yu Wang

Comment 14 Vitaly Kuznetsov 2020-03-13 11:59:44 UTC
(In reply to Yu Wang from comment #13)
> Tested with "all flags", "all but hv_vapic" and "no flags".
> 
> Results are as below:
> 
>                    all         all but hv_vapic      none
> Win2019        5217/5309         5763/6023         4291/4293
> 
> 

What are these numbers, i.e. why do we have two for each test? Is this 
read/write or did you do two runs?

In any case,

 5217/5309 -> 5763/6023

looks like a good result to me (5763/5217 ≈ +10%, 6023/5309 ≈ +13%) - but I hope
I'm actually comparing apples to apples here.

Comparing with 'none' is also interesting but it doesn't tell you
the impact of any particular feature (I think that hv_time and
hv_stimer have the biggest impact).

Comment 15 Yu Wang 2020-03-16 01:04:10 UTC
(In reply to Vitaly Kuznetsov from comment #14)
> (In reply to Yu Wang from comment #13)
> > Tested with "all flags", "all but hv_vapic" and "no flags".
> > 
> > Results are as below:
> > 
> >                    all         all but hv_vapic      none
> > Win2019        5217/5309         5763/6023         4291/4293
> > 
> > 
> 
> What are these numbers, i.e. why do we have two for each test? Is this 
> read/write or did you do two runs?
> 

This means two runs.

> In any case,
> 
>  5217/5309 -> 5763/6023
> 
> looks like a good result to me (10%/13%) - but I hope I'm actually
> comparing apples to apples here.


But "all but hv_vapic" means "without hv_vapic".
The performance without hv_vapic is better than with hv_vapic.

> 
> Comparing with 'none' is also interesting but it doesn't tell you
> the impact of any particular feature (I think that hv_time and
> hv_stimer have the biggest impact).

Thanks
Yu Wang

Comment 16 Vitaly Kuznetsov 2020-03-16 07:47:14 UTC
(In reply to Yu Wang from comment #15)
> 
> But, “all but hv_vapic” means that "without hv_vapic".
> The performance without hv_vapic is better than with hv_vapic.
> 

This is very wrong. Let's try the test again to double-check;
please make sure:
1) The power profile on the host is set to 'performance'
2) HyperThreading is disabled
3) There is nothing else running, just the one QEMU
4) In case the host has NUMA, pin the guest to CPUs from ONE NUMA node,
 e.g. if the server has Node0: CPUs 0-7, Node1: CPUs 8-15,
 set the number of vCPUs to 8 and pin them to CPUs 0-7
 (1:1 pinning is even better; see the sketch after this list)
5) Make sure the Windows guest has enough RAM (>8G) so no swapping is ongoing
6) Run Windows. Make sure there's nothing going on other than your test,
 e.g. that it is not trying to update itself or do some other background work
 (run Task Manager and watch the CPU load inside the guest; it should be close
  to zero) - this needs to be monitored every time you run your test. Just
  disabling Windows Update is not enough; there are multiple other things
  Windows can be busy with.
7) Do the test
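
A minimal host-preparation sketch for items 1, 2 and 4 (assumptions only: a RHEL 8 host with tuned installed, runtime SMT control in sysfs, and a guest sized to node 0's CPUs 0-7 - adjust to the real topology):

   # 1) power profile: switch to a throughput-oriented performance profile
   tuned-adm profile throughput-performance
   # 2) disable HyperThreading (SMT) at runtime
   echo off > /sys/devices/system/cpu/smt/control
   # 4) inspect the topology, then pin the whole QEMU process to one node
   #    (coarse pinning; 1:1 would pin each vCPU thread individually)
   numactl --hardware
   taskset -c 0-7 /usr/libexec/qemu-kvm -smp 8,sockets=1,cores=8,threads=1 ...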

In case you're getting consistent results where hv_vapic makes the result
worse, I'll need access to the host to investigate.

Comment 19 Yu Wang 2020-03-18 06:47:07 UTC
I tried with the latest method; it increases the performance with hv_vapic.

Results are as below (two runs per test):

                   all         all but hv_vapic 
Win2019        2998/2863         2211/2463      
Win2016        2257/2447         1967/1629

That is more than a 20% improvement.

Tested with fast train, the result is almost a 10% improvement,
refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c19

And according to https://bugzilla.redhat.com/show_bug.cgi?id=1729077#c10
("in case you need a number for test automation I'd say let's set it fairly low
(e.g. 5%)"), I will use a 5% improvement as our testing standard.


Please confirm the steps for our test; correct me if anything is not right.
Steps:

1) Create a new raw volume on the host on tmpfs
# qemu-img create -f raw /tmp/disk.raw 5G 

2) Start Windows guest(WS2019), ide/raw

    -smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
    -cpu 'Broadwell',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_time,hv_frequencies,hv_runtime,hv_vapic,+kvm_pv_unhalt \
    -drive file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
    -drive file=/mnt/tmpfs/data.raw,format=raw,if=none,id=drive-ide1-0-0 -device ide-hd,bus=ide.1,unit=0,drive=drive-ide1-0-0,id=ide1-0-0,bootindex=2 \

3) Partition hard drive, create an NTFS partition

4) Install FIO (https://bsdio.com/fio/)

5) Create the fio job; I used the following (numjobs=1, iodepth=1):

[global]
name=fio-rand-RW
filename=fio-rand-RW
directory=D\:\
rw=randwrite
bs=512B
direct=1
numjobs=1
time_based=1
runtime=300

[file1]
size=1G
iodepth=1

6) Let it calm down, then run the job: 'fio job.fio'

7) Reboot the guest without 'hv_vapic', let it calm down and run the same job

8) Compare the results (expect more than a 5% improvement; see the sketch below)
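
If it helps automation, a tiny sketch of the step-8 comparison (the input numbers here are just the approximate Win2019 averages from comment#19; the 5% bar is the standard proposed above):

   $ awk -v hv=2930 -v nohv=2337 'BEGIN {
         d = (hv - nohv) / nohv * 100
         printf "hv_vapic improvement: %+.1f%% -> %s\n", d, (d >= 5 ? "PASS" : "FAIL")
     }'
   hv_vapic improvement: +25.4% -> PASS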

Comment 20 Yu Wang 2020-03-19 01:05:14 UTC
Hi Vitaly,

Could you help check comment#19?
If you accept the steps and results, I will verify this bug.


Thanks a lot
Yu Wang

Comment 21 Vitaly Kuznetsov 2020-03-19 07:58:33 UTC
Steps look good. I would add one tiny thing to step 3:
disable write caching on the hard drive in Windows 
(device manager -> Disk drives -> Pick the second QEMU HARDDISK -> Policies)
just in case. Maybe this is not really needed for FIO but hard to say
without trying different Windows versions. I'd just disable it.

Comment 22 Yu Wang 2020-03-19 11:23:25 UTC
Got it, I will add it to the test plan. Thanks a lot.

According to comment#19 and comment#21, changing this bug to verified.

Thanks
Yu Wang

