Red Hat Bugzilla – Bug 814051
[virtio-win][performance] performance regression in 2/8 RX TCP session tests w/ rx checksumming in Win2008 R2 guest
Last modified: 2013-11-21 18:56:29 EST
Created attachment 578515
rx checksum off vs on result w/virtio-win-prewhql-0.1-24
Description of problem:
The attached results compare performance without and with rx checksum. They lead to the following conclusions:
1. In single TCP_STREAM session tests, the normalized result (measured as rx_throughput/host_cpu%) in the guest improves by over 15% w/ rx checksum enabled.
2. In 4 TCP_STREAM session tests, the normalized results are almost equal (ratio close to one).
3. In 2/8 TCP_STREAM session tests, the normalized result drops by around 10% or more w/ rx checksum enabled. This also means lower total rx throughput at higher or almost equal host CPU consumption w/ rx checksumming enabled.
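The normalized metric used in the comparison above can be sketched as follows. This is only an illustration, not the script that produced the attached results. The function names and the sample numbers are hypothetical placeholders and are not taken from the attachment.

```python
def normalized(rx_throughput: float, host_cpu_percent: float) -> float:
    """Throughput per percent of host CPU (rx_throughput/host_cpu%)."""
    if host_cpu_percent <= 0:
        raise ValueError("host_cpu_percent must be positive")
    return rx_throughput / host_cpu_percent


def relative_change(off: float, on: float) -> float:
    """Percent change of the normalized result when rx checksum is turned on."""
    return (on - off) / off * 100.0


# Hypothetical placeholder numbers, NOT measurements from this bug.
off = normalized(rx_throughput=4000.0, host_cpu_percent=50.0)
on = normalized(rx_throughput=3600.0, host_cpu_percent=50.0)
print(f"normalized change with rx checksum on: {relative_change(off, on):+.1f}%")
```

A negative value corresponds to the kind of drop reported for the 2/8 session tests, and a positive value to the kind of gain reported for the single session test.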
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Turn off GRO on the host.
# ethtool -k eth2
Offload parameters for eth2:
2. Boot the guest on the host and pin the vhost and vCPU threads to the same NUMA node.
numactl -m 1 /usr/libexec/qemu-kvm -name vm1 -drive
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -device
-netdev tap,id=idsVdEtL,vhost=on -m 4096 -smp 2,cores=1,threads=1,sockets=2
-cpu qemu64,+sse2 -spice port=8000,disable-ticketing -vga qxl -rtc
base=localtime,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off -M
rhel6.3.0 -usb -device usb-tablet -enable-kvm -monitor stdio
taskset -p 20 $vhost_thread
taskset -p 40 $vcpu1_thread
taskset -p 80 $vcpu2_thread
3. Turn on the rx checksum feature in the guest.
4. Run netserver on the guest and run the netperf test from the external host.
Collect the first batch of results.
5. Turn off the checksum feature in the guest.
6. Run netserver on the guest and run the netperf test from the external host.
Collect the second batch of results.
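The taskset masks used in step 2 are hexadecimal CPU bitmasks (taskset reads the mask as hex, with or without a leading 0x). The sketch below decodes them into CPU indices. The helper name cpus_from_mask is made up for illustration. Which NUMA node those CPUs belong to depends on the host topology, which this report does not show.

```python
def cpus_from_mask(mask_hex: str) -> list:
    """Return the CPU indices whose bits are set in a hex affinity mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Masks as passed to "taskset -p" in the reproduction steps.
for name, mask in [("vhost", "20"), ("vcpu1", "40"), ("vcpu2", "80")]:
    print(name, "-> CPUs", cpus_from_mask(mask))
```

Each of the three masks has a single bit set, so every thread is pinned to exactly one CPU (5, 6 and 7 respectively).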
Performance regression in 2/8 RX TCP session tests w/ rx checksumming.
This requires some more research, so I postpone this bug to 6.4.
Anyhow, our default is checksum=off.
It seems that the pinning and the checksum calculation somehow do not work well together.
Still, I would expect that with "rx-checksumming: on", the CPU consumption will be lower.
Several suggestions from Michael Tsirkin regarding the research:
1. Disable rx checksumming on the host
2. Try to run the test on a different host
3. Try to run a busy loop on the guest during the test (Michael referenced a different bug with benchmarks).
(In reply to comment #3)
> Several suggestions from Michael Tsirkin regarding the research:
> 1. Disable rx checksumming on the host
I will test with rx checksumming disabled on the host.
> 2. Try to run the test on a different host
This is hard for me to try, since the two 10Gb network cards are dedicated to two specific hosts.
> 3. Try to run a busy loop on the guest during the test (Michael referenced a
> different bug with benchmarks).
Could you describe this research in more detail?
Too late for 6.4. Deferring again, to 6.5
Please retest with build 59.
(In reply to comment #8)
> Please retest with build 59.
> Best regards,
The tests on build 59 are running.
According to the results in comment #10, the issue does not exist on build 59. Changing the bug to closed.
Moving status to VERIFIED based on comment #10
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.