Bug 1807280 - [Hyper-V Enlightenment] flag 'hv_time' can't improve performance evidently
Summary: [Hyper-V Enlightenment] flag 'hv_time' can't improve performance evidently
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 8.3
Assignee: Virtualization Maintenance
QA Contact: Yu Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-26 01:22 UTC by Yu Wang
Modified: 2020-04-14 08:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-03 12:24:57 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
gettime_cycles.c (543 bytes, text/x-csrc)
2020-02-26 01:22 UTC, Yu Wang
QPF test app (212.50 KB, application/x-ms-dos-executable)
2020-03-02 10:11 UTC, Vadim Rozenfeld

Description Yu Wang 2020-02-26 01:22:38 UTC
Created attachment 1665793 [details]
gettime_cycles.c

Description of problem:

The cycle count reported by gettime_cycles is not much smaller when the 'hv_time' flag is added.

Version-Release number of selected component (if applicable):

qemu-kvm-4.2.0-11.module+el8.2.0+5837+4c1442ec.x86_64
or qemu-kvm-2.12.0-98.module+el8.2.0+5698+10a84757.x86_64
kernel-4.18.0-175.el8.x86_64
Guest: win10-64, win2012r2

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest without "hv_time" ($cpu= $half_of_host)
    -m 6144  \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2  \
    -cpu 'Skylake-Server',+kvm_pv_unhalt \
    -no-hpet \

2. Install gcc in the guest with Cygwin, following its install guide
(download the Cygwin installer from https://www.cygwin.com/):
2.1) View ==> Full
2.2) Install gcc-g++: GNU Compiler Collection (C++)

3. Compile the test program:
$ gcc gettime_cycles.c  -o gettime_cycles.exe -lpthread

4. Execute the gettime_cycles.exe in the Cygwin.
$ ./gettime_cycles.exe

5. Boot the guest with "hv_time" ($cpu= $half_of_host)
    -m 6144  \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2  \
    -cpu 'Skylake-Server',+kvm_pv_unhalt,hv_time \
    -no-hpet \

6. Repeat step 4 and record the result again.
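
The attachment gettime_cycles.c is not reproduced in this report, so the following is only an assumed sketch of the kind of measurement it performs: read the TSC, call clock_gettime() in a loop, read the TSC again, and report the average cost per call. The names read_ticks and avg_ticks_per_gettime are hypothetical and not taken from the attachment; the non-x86 fallback counts nanoseconds rather than cycles and exists only so the sketch builds elsewhere.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* On x86, count CPU cycles with the time stamp counter. */
static inline uint64_t read_ticks(void)
{
    return __rdtsc();
}
#else
/* Assumed fallback for non-x86 builds: nanoseconds, not cycles. */
static inline uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
#endif

/* Average number of ticks spent in one clock_gettime() call.
 * Returns 0 when iterations is 0. */
uint64_t avg_ticks_per_gettime(unsigned iterations)
{
    struct timespec ts;
    uint64_t start, end;
    unsigned i;

    if (iterations == 0)
        return 0;

    start = read_ticks();
    for (i = 0; i < iterations; i++)
        clock_gettime(CLOCK_REALTIME, &ts);
    end = read_ticks();

    return (end - start) / iterations;
}
```

Calling avg_ticks_per_gettime(100000) from main() and printing the result once per boot configuration (steps 4 and 6) would give two numbers to compare. Averaging over many calls keeps the overhead of the two tick reads from dominating.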


Actual results:
After step4, it shows 9811
After step6, it shows 9555

Expected results:

The result should be much smaller when the flag is added.

Additional info:
1. Tried with q35+ovmf, q35+seabios and pc+seabios; all hit this issue.
2. Tried on both fast train and slow train; both hit this issue.
3. Full command line:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 6144  \
    -smp 24,maxcpus=24,cores=12,threads=1,sockets=2  \
    -cpu 'Skylake-Server',+kvm_pv_unhalt,hv_time \
    -no-hpet \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=win2012-64r2-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -vnc :1  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -blockdev node-name=file_ovmf_code,driver=file,read-only=on,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd \
    -blockdev node-name=file_ovmf_vars,driver=file,filename=win2012-64r2-virtio-scsi.qcow2.fd \
    -enable-kvm \
    -monitor stdio \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:44:8f:fa:54:ea,id=id5vlMkk,mq=on,vectors=14,netdev=idu29Vvi,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idu29Vvi,vhost=on,queues=6 \
    -qmp tcp:0:4666,server,nowait

Comment 1 Vitaly Kuznetsov 2020-02-26 09:48:19 UTC
Cc: Vadim

I tried this on my Ivy Bridge and hv_time works as expected. My QEMU command line was:

~/qemu/x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -name guest=win10 -cpu host,-vmx,+kvm_pv_unhalt,hv_time -smp 6 -m 16384 -drive file=/var/lib/libvirt/images/WindowsServer2016_Gen1.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -m 8G -net nic,model=e1000e -net bridge,br=br0 -vnc :0

clock_gettime() takes around 420 CPU cycles. This is the expected result.


What's the clocksource in use on the host? It should be 'tsc':

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
tsc

Also, take a look at your dmesg and check that there's nothing about clocksource being unstable.

Comment 2 Vadim Rozenfeld 2020-02-26 10:51:15 UTC
The simplest way to check from the Windows side whether hv_time is active is to
write a small program that calls the QueryPerformanceFrequency (QPF) function:
https://docs.microsoft.com/en-us/windows/win32/api/profileapi/nf-profileapi-queryperformancefrequency
With hv_time active, the reported frequency will be 10 MHz.

Even with the hv_time parameter on the QEMU command line, Windows can disable this option and use
another timestamp source if, for example, "useplatformclock" was specified.

Best,
Vadim.
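
A minimal sketch of such a check, assuming the 10 MHz criterion from this comment. The helper names qpf_indicates_hv_time and report_qpf are hypothetical; QueryPerformanceFrequency is the real Win32 call, and the part that uses it is compiled only on Windows. This is not the attached QPF test app, whose source is not included here.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#include <stdio.h>
#endif

/* Frequency expected from QPF when hv_time is in use, per this comment. */
#define HV_REFERENCE_FREQ_HZ 10000000ULL

/* True when the QPF frequency matches the 10 MHz Hyper-V reference rate. */
bool qpf_indicates_hv_time(uint64_t freq_hz)
{
    return freq_hz == HV_REFERENCE_FREQ_HZ;
}

#ifdef _WIN32
/* Windows only: print the QPF frequency.
 * Returns 1 if it is 10 MHz, 0 if not, -1 if the query fails. */
int report_qpf(void)
{
    LARGE_INTEGER freq;

    if (!QueryPerformanceFrequency(&freq))
        return -1;

    printf("Frequency = %g MHz\n", (double)freq.QuadPart / 1e6);
    return qpf_indicates_hv_time((uint64_t)freq.QuadPart) ? 1 : 0;
}
#endif
```

The comparison is kept separate from the Win32 call so that the criterion can be reused, for example by an automated test that parses the printed frequency.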

Comment 3 Vadim Rozenfeld 2020-03-02 10:11:35 UTC
Created attachment 1666929 [details]
QPF test app

Comment 4 Vadim Rozenfeld 2020-03-02 10:13:15 UTC
(In reply to Vadim Rozenfeld from comment #3)
> Created attachment 1666929 [details]
> QPF test app

Can QE run this app on the system and post the result
back?

Best,
Vadim.

Comment 6 Vitaly Kuznetsov 2020-03-02 16:14:18 UTC
QPF test app reports:
Frequency = 3.57954 MHz

so this is not the Hyper-V TSC page clocksource.

I copied this image over to my SandyBridge server, where WS2016 works well,
but it didn't help; apparently this Win10 image is somehow different.

I also traced KVM to see whether Windows enables the TSC page, and apparently it does:

 qemu-system-x86-15441 [002] 19678.928019: kvm_exit:             reason MSR_READ rip 0xfffff80145f487a0 info 0 0
 qemu-system-x86-15441 [002] 19678.928020: kvm_msr:              msr_read 40000021 = 0x0
 qemu-system-x86-15441 [002] 19678.928024: kvm_exit:             reason MSR_WRITE rip 0xfffff80145f4886b info 0 0
 qemu-system-x86-15441 [002] 19678.928025: kvm_msr:              msr_write 40000021 = 0xd001

Comment 7 Vadim Rozenfeld 2020-03-02 21:32:29 UTC
Please make sure that useplatformclock is off. This parameter disables hv_time when it is
turned on.

Comment 8 Yu Wang 2020-03-03 07:54:01 UTC
(In reply to Vadim Rozenfeld from comment #7)
> please make sure that useplatformclock  is off. This parameter disables
> hv_time when it is 
> turned on.

useplatformclock is set to "yes" in the guest.
After turning useplatformclock off, hv_time takes effect.

It may be a problem with our automation-installed image: after an automated install, useplatformclock is "yes".
After a manual install, useplatformclock is not set at all.

BTW, do you know which setting or installed tool could influence useplatformclock? I have no idea.

Thanks
Yu Wang

Comment 9 Vadim Rozenfeld 2020-03-03 08:51:20 UTC
(In reply to Yu Wang from comment #8)
> BTW, do you know which setting or tools-installed will influence the
> useplatformclock? I have no idea.

I have no idea either. Basically, on pre-Vista platforms useplatformclock
can be set by modifying boot.ini. On more modern platforms, if I'm not mistaken,
the BCD editor (bcdedit) is the only official way to modify boot-time parameters.

Best,
Vadim.


Comment 10 Yu Wang 2020-03-03 10:01:48 UTC
(In reply to Vadim Rozenfeld from comment #9)
> I have no idea either. Basically on pre Vista platforms useplatformclock
> can be set by modifying boot.ini. For more modern platforms, if I'm not
> mistaken,
> bcd editor is the only official option to modify boot time parameters.
> 
Got it. I checked our automation install log: it sets useplatformclock
to yes after installation for another purpose. I will make a note of this for this case.

Thanks a lot!
Yu Wang



Comment 11 Vitaly Kuznetsov 2020-03-03 12:24:57 UTC
'bcdedit /set useplatformclock No' in the guest resolves the issue.

Comment 12 Yu Wang 2020-04-10 01:11:54 UTC
Hi Vitaly,

I have a question about this case.

Since the expected result is "The result is much smaller when adding the flag", how should we judge "much smaller"?
E.g., should the cycle count with hv_time be less than 50%/30%/10% of the count without hv_time?

Thanks
Yu Wang

Comment 13 Vitaly Kuznetsov 2020-04-14 08:00:23 UTC
(In reply to Yu Wang from comment #12)
> 
> Since "The result is much smaller when adding the flag",how to judge "much
> smaller" ?

In my testing it is somewhere around 9000 cycles without
'hv_time' and 400 with it, so if you need a hard threshold for
an automated test, set it to e.g. 10% (so we are 10x faster with
the feature).
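
A sketch of how an automated test could apply that 10% criterion. The function name and the max_percent parameter are hypothetical; only the 10% figure and the sample cycle counts come from this report.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the cycle count measured with 'hv_time' is at most
 * max_percent of the count measured without it. */
bool hv_time_is_effective(uint64_t cycles_without, uint64_t cycles_with,
                          unsigned max_percent)
{
    if (cycles_without == 0)
        return false;   /* no usable baseline */

    /* Cross-multiply so the comparison stays in integer arithmetic. */
    return cycles_with * 100 <= cycles_without * (uint64_t)max_percent;
}
```

With max_percent set to 10, the pair from the original description (9811 without, 9555 with) fails the check, while the pair quoted above (about 9000 without, 400 with) passes.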

