Bug 602282 - Win2003-64 virtio net got 30% lower network throughput compared with rhel5 guest
Summary: Win2003-64 virtio net got 30% lower network throughput compared with rhel5 guest
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: virtio-win
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Yvugenfi@redhat.com
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: Rhel5KvmTier2
 
Reported: 2010-06-09 14:19 UTC by Keqin Hong
Modified: 2013-01-09 22:42 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-21 09:04:52 UTC
Target Upstream Version:
Embargoed:


Attachments
win2k3_64_nuttcp log for 30 sec (1.35 KB, text/plain), 2010-06-09 14:19 UTC, Keqin Hong
rhel5 nuttcp log for 30 secs (1.39 KB, text/plain), 2010-06-09 14:20 UTC, Keqin Hong
kvm_stat of 6sec interval (8.89 KB, application/octet-stream), 2010-06-11 13:32 UTC, Keqin Hong

Description Keqin Hong 2010-06-09 14:19:31 UTC
Created attachment 422566 [details]
win2k3_64_nuttcp log for 30 sec

Description of problem:

Running the nuttcp benchmark, the Win2003-64 guest got about 30% lower network throughput than the rhel5 guest.

Version-Release number of selected component (if applicable):
rhev-hypervisor-5.5-2.2.4.el5rhev
vdsm 2.2.0.62
kvm-83-164.el5_5.10

How reproducible:
100%


Setup:
1. A rhel5 host working as the nuttcp server
2. A rhev-h host running both the Win2003 and rhel5 guests
3. The rhev-h host connected to the rhel5 host with a direct machine-to-machine link; no switch/hub was used

Steps to Reproduce:
1. Start "nuttcp -s" on rhel5 host
2. Start only the Win2003 guest with a virtio NIC, and run "nuttcp -t -R1024M -T30s -i1 $rhel5host"
3. Start only the rhel5 guest with a virtio NIC, and run "nuttcp -t -R1024M -T30s -i1 $rhel5host"
4. Compare the network throughputs from steps 2 and 3

Actual results:
The Win2003-64 guest (560.9239 Mbps) got about 30% lower network throughput than the rhel5 guest (820.7432 Mbps).

Expected results:
No difference of this magnitude between the Windows virtio NIC and the RHEL virtio NIC.

Additional info:    

nuttcp-5.1.11

CLI for Windows:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -startdate 2010-06-09T19:59:52 -name Net_win03_64 -smp 1,cores=1 -k en-us -m 2048 -boot c -net nic,vlan=1,macaddr=00:1a:4a:a8:02:20,model=virtio -net tap,vlan=1,ifname=virtio_11_1,script=no -net nic,vlan=2,macaddr=00:1a:4a:a8:02:23,model=virtio -net tap,vlan=2,ifname=virtio_11_2,script=no -drive file=/rhev/data-center/6bfdfd58-c65c-4f26-8699-3f26b6266722/684efde0-002d-4590-b898-5ee571876796/images/8342d3f2-6382-4951-a607-6aee3476a291/02ff8dc4-1036-4a0d-9d9b-b95918095a08,media=disk,if=virtio,cache=off,serial=51-a607-6aee3476a291,boot=on,format=raw,werror=stop -pidfile /var/vdsm/50c9a0da-13e6-4d46-bb2a-acb4ca2594f9.pid -vnc 0:11,password -cpu qemu64,+sse2,+cx16 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4,serial=58A72E46-7236-DF11-BBDA-2905B918001F_00:1b:21:55:b3:b8,uuid=50c9a0da-13e6-4d46-bb2a-acb4ca2594f9 -vmchannel di:0200,unix:/var/vdsm/50c9a0da-13e6-4d46-bb2a-acb4ca2594f9.guest.socket,server -monitor unix:/var/vdsm/50c9a0da-13e6-4d46-bb2a-acb4ca2594f9.monitor.socket,server

CLI for RHEL:
/usr/libexec/qemu-kvm -no-hpet -no-kvm-pit-reinjection -usbdevice tablet -rtc-td-hack -startdate 2010-06-09T05:48:15 -name networ_rhel5u5 -smp 1,cores=1 -k en-us -m 1024 -boot c -net nic,vlan=1,macaddr=00:1a:4a:a8:02:21,model=virtio -net tap,vlan=1,ifname=virtio_10_1,script=no -net nic,vlan=2,macaddr=00:1a:4a:a8:02:22,model=virtio -net tap,vlan=2,ifname=virtio_10_2,script=no -drive file=/rhev/data-center/6bfdfd58-c65c-4f26-8699-3f26b6266722/684efde0-002d-4590-b898-5ee571876796/images/e3651fcf-1142-40fd-94ea-8b0cf1166ee3/72c91367-c018-40b4-8a33-5017fb744ce2,media=disk,if=virtio,cache=off,serial=fd-94ea-8b0cf1166ee3,boot=on,format=raw,werror=stop -pidfile /var/vdsm/9cdfedd2-4581-42c8-a9cb-3e76243ec634.pid -vnc 0:10,password -cpu qemu64,+sse2,+cx16 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4,serial=58A72E46-7236-DF11-BBDA-2905B918001F_00:1b:21:55:b3:b8,uuid=9cdfedd2-4581-42c8-a9cb-3e76243ec634 -vmchannel di:0200,unix:/var/vdsm/9cdfedd2-4581-42c8-a9cb-3e76243ec634.guest.socket,server -monitor unix:/var/vdsm/9cdfedd2-4581-42c8-a9cb-3e76243ec634.monitor.socket,server

Comment 1 Keqin Hong 2010-06-09 14:20:17 UTC
Created attachment 422567 [details]
rhel5 nuttcp log for 30 secs

Comment 3 Yaniv Kaul 2010-06-09 14:35:43 UTC
Have you verified the nuttcp implementation on Windows is not 30% less efficient than the Linux implementation?
I warmly suggest running more tests before reaching a conclusion here (though I suspect you are right and Windows is slower than Linux).
For Windows I suggest using http://www.microsoft.com/whdc/device/network/tcp_tool.mspx .

Comment 4 Keqin Hong 2010-06-10 01:54:49 UTC
(In reply to comment #3)
> Have you verified the nuttcp implementation on Windows is not 30% less
> efficient than the Linux implementation?
> I warmly suggest running more tests before reaching a conclusion here (though I
> suspect you are right and Windows is slower than Linux).
> For Windows I suggest using
> http://www.microsoft.com/whdc/device/network/tcp_tool.mspx .    

Thank you for the suggestion. I will test with more perf tools, then put results here later.

Comment 5 XinSun 2010-06-10 11:02:53 UTC
According to comment 3, today I used the NTttcp tool to re-test this issue. I tried both directions:
(1) Windows2003 guest ------->  Winxp remote host :   540M throughput(Mbit/s)
(2) Winxp remote host ------->  Windows2003 guest :   550M throughput(Mbit/s)
For sender, use command as:  NTttcps.exe -m 1,0,192.168.5.1 -a 2
For receiver, use command as:  NTttcpr.exe -m 1,0,192.168.5.1 -a 6
From the results, it seems the Windows guest still gets about 550 Mbps.

Comment 6 Dor Laor 2010-06-10 11:33:42 UTC
Did you use the registry config from http://www.linux-kvm.org/page/WindowsGuestDrivers/kvmnet/registry ?

Is TSO on in the guest and the host?
What's the cpu consumption on the host?
kvm_stat?
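
For reference, one way to collect this data on the host (standard RHEL tools; the interface name is illustrative):
  ethtool -k eth0 | grep -i "segmentation offload"   # TSO state on the host NIC
  top -b -n 1 | head -20                             # snapshot of host CPU usage
  kvm_stat                                           # KVM event counters (refreshes periodically)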

Comment 7 Yaniv Kaul 2010-06-10 12:53:32 UTC
(In reply to comment #5)
> According to comment 3, today I use NTttcp tool to re-test this issue again. I
> try two direction testing: 
> (1) Windows2003 guest ------->  Winxp remote host :   540M throughput(Mbit/s)
> (2) Winxp remote host ------->  Windows2003 guest :   550M throughput(Mbit/s)

Who is saturated? Is the guest at 100% CPU?

> For sender, use command as:  NTttcps.exe -m 1,0,192.168.5.1 -a 2
> For receiver, use command as:  NTttcpr.exe -m 1,0,192.168.5.1 -a 6

We are also using '-l 256k -p'

> From the results, seems that the windows guest still get 550Mbps bandwidth.
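
For example, adding those flags to the sender command from comment 5 would look like the following (a sketch only; exact flag placement per the NTttcp documentation):
  NTttcps.exe -m 1,0,192.168.5.1 -a 2 -l 256k -p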

Comment 8 Keqin Hong 2010-06-10 12:57:55 UTC
I guess XinSun will answer comment 6.

Below are the iperf results I got. From the table below, we can see that Windows guests achieve good network throughput with a 128KB or 256KB TCP window size, whether vhost is on or off.


qemu-kvm-0.12.1.2-2.71.el6.x86_64

Setup:
Host A: RHEL6 host running iperf server
# iperf -s -w $TCPWinSize

Host B: hosting guests which run iperf client
# iperf -c $serverip -w $TCPWinSize

Both Host A and Host B have a "Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express" NIC.
----------------------------------------------------------------------------------------
guest\TCP Window Size  16KB(Mb/s)   32KB(Mb/s)   64KB(Mb/s)   128KB(Mb/s)  256KB(Mb/s)
----------------------------------------------------------------------------------------
win2008r2-vhost-off    38           148          429          728          780
win2008r2-vhost-on     148          381          667          806          822
win7-64-vhost-off      75.5         157          409          711          790
win7-64-vhost-on       145          369          689          833          859
rhel6-vhost_off        73.8         63.8         217          702          906
rhel6-vhost_on         172          378          638          901          891
rhel5.5-64-vhost-off   81.8         164          346          794          896
rhel5.5-64-vhost-on    206          395          669          804          838
rhel4.9-64-vhost-off   87.6         135          317          578          597
rhel4.9-64-vhost-on    184          303          401          656          691
----------------------------------------------------------------------------------------

Again, from the table above, we can see that network throughput increases significantly with vhost on when the TCP window size is below 128 KB. When the TCP window size is above 128 KB, enabling vhost brings little additional benefit, especially for RHEL guests.
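
For reference, "vhost on/off" in these runs refers to the tap backend option on the RHEL6 host. A minimal sketch of the relevant qemu-kvm arguments (netdev id, tap name and MAC address are illustrative):
  -netdev tap,id=hostnet0,ifname=tap0,script=no,vhost=on -device virtio-net-pci,netdev=hostnet0,mac=00:1a:4a:a8:02:30
Use vhost=off (or omit the option) for the vhost-off runs.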

Comment 9 Yaniv Kaul 2010-06-10 13:21:00 UTC
(In reply to comment #8)
> I guess XinSun will give answer to Comment 6.
> 
> Below was what the iperf benchmark I got. From the result below, we can see
> that Windows could gain great network throughput with 128KB or 256KB TCP window
> size no matter vhost is on or off.

Again - where's the bottleneck? If the VM is at 100%, can you see whether the time is spent in user or kernel space? In any case, if possible, can you try with iperf 2.05b1? At least on the Linux side it should be easy to compile. On Windows you might need to try with cygwin, which might be more complex.

BTW, it sounds wrong to test with only a 1 Gb/sec interface! How do you expect to go above 1 Gb/sec?

Comment 10 Keqin Hong 2010-06-11 02:39:32 UTC
(In reply to comment #9)
>> Below was what the iperf benchmark I got. From the result below, we can see
>> that Windows could gain great network throughput with 128KB or 256KB TCP window
>> size no matter vhost is on or off.
> Again - where's the bottleneck? If the VM is in 100%, can you see if it's in
> the user or kernel?
Sorry, I am not sure I understand. What I can tell from the table in comment 8 is that network throughput is strongly affected by the TCP window size (considering only TCP performance here) up to a certain point (<256KB). On RHEL6 I cannot set a TCP window size greater than 256K using "iperf -w $SIZE", though Windows can.
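
For reference, the 256K ceiling on Linux usually comes from the kernel's socket buffer limits rather than from iperf itself. A sketch of raising them so that "iperf -w" accepts larger windows (values are illustrative; run as root on the RHEL6 guest):
  sysctl -w net.core.rmem_max=4194304
  sysctl -w net.core.wmem_max=4194304
  sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
  sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"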

Here is the CPU usage inside RHEL6-64 guest when running iperf client
"top - 09:55:32 up 13 min,  3 users,  load average: 0.06, 0.03, 0.00
Tasks: 100 total,   1 running,  99 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  8.9%sy,  0.0%ni, 76.8%id,  0.0%wa,  1.3%hi, 12.6%si,  0.0%st
Mem:   2055508k total,   229536k used,  1825972k free,    10872k buffers
Swap:  4128760k total,        0k used,  4128760k free,    60752k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 1739 root      20   0 97.7m 1284 1096 S 41.1  0.1   0:06.37 iperf              
 1743 root      20   0 14936 1136  880 R  0.7  0.1   0:00.05 top                
   10 root      20   0     0    0    0 S  0.3  0.0   0:00.48 events/1           
    1 root      20   0 19236 1408 1140 S  0.0  0.1   0:00.81 init       
"
and the corresponding CPU usage of the host running the VM
"
top - 09:50:40 up 3 days, 17:51,  3 users,  load average: 0.17, 0.06, 0.01
Tasks:   2 total,   1 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 14.8%us, 10.6%sy,  0.0%ni, 70.7%id,  0.9%wa,  0.5%hi,  2.6%si,  0.0%st
Mem:   7786200k total,  6797204k used,   988996k free,    38944k buffers
Swap:  9764856k total,        0k used,  9764856k free,  5832556k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
16254 root      20   0 2426m 349m 3228 S 79.2  4.6   1:27.54 qemu-kvm           
 2018 root      20   0     0    0    0 R 30.6  0.0   4:56.69 vhost  
"
> In any case, if possible, can you try with iperf 2.05b1? At
> least on the Linux side, should be easy to compile. On Windows, you might need
> to try with cygwin, and might be more complex.
I used iperf version 2.0.4 compiled from src at http://sourceforge.net/projects/iperf/.  On Windows, I used iperf version 1.7.0 from http://www.noc.ucf.edu/Tools/Iperf/, which doesn't require cygwin. I will try iperf 2.05b1, but expect no big difference.
> 
> BTW, it sounds wrong to test with a (only) 1GB/sec interface! How do you expect
> to go above 1Gb/sec?    
Testing only a 1 Gigabit NIC might not be sufficient. However, I don't think it is wrong to produce results for such an environment.

Comment 11 Keqin Hong 2010-06-11 12:30:34 UTC
(In reply to comment #6)
> Did you use the registry config from
> http://www.linux-kvm.org/page/WindowsGuestDrivers/kvmnet/registry ?
> 
> Is TSO on in the guest and the host?
Both are on.
host# ethtool -k eth5
...
tcp segmentation offload: on
..
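For reference, TSO can be toggled on the host NIC with ethtool if it ever needs to be changed, e.g.:
  ethtool -K eth5 tso on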

> What's the cpu consumption on the host?
> kvm_stat?    
Host CPU consumption and kvm_stat will be posted later for the new test results. Please note that the following data is NOT related to comment 5; it comes from new tests.

Comment 12 Keqin Hong 2010-06-11 12:33:57 UTC
(In reply to comment #6)
> Did you use the registry config from
> http://www.linux-kvm.org/page/WindowsGuestDrivers/kvmnet/registry ?
The results in comment 5 were collected without the registry settings mentioned right above, but the new tests later did use them (see below).
> 
> Is TSO on in the guest and the host?
> What's the cpu consumption on the host?
> kvm_stat?

Comment 13 Keqin Hong 2010-06-11 13:32:11 UTC
Created attachment 423271 [details]
kvm_stat of 6sec interval

Comment 14 Keqin Hong 2010-06-11 13:33:28 UTC
Setup:
Host A: WinXP-64
Host B: RHEV-H hosting Win2003-64 guest

1. Win2003 guest as sender running NTttcps.exe, and WinXP as receiver running NTttcpr.exe
sender:   NTttcps.exe -m 1,0,$IP_WinXP -a 2 $sender_option
receiver: NTttcpr.exe -m 1,0,$IP_WinXP -a 6 $recv_option

---------------------------------------------------------------------------
ID  sender_option      recv_option       guest_cpu           Throughput (Mbps)
---------------------------------------------------------------------------
1   --                 --                89.42%              523.394
2   -l 256k            --                92.45%              551.426
3   -l 256k            -rb 128k          94.29%              738.787
4   -l 256k            -rb 256k          95.86%              861.297     *
5   -l 256k            -rb 2048k         94.61%              804.663
6   -l 1024k           -rb 2048k         89.83%              776.489
7   -l 512k            -rb 512k          93.57%              818.409
8   -l 128k            -rb 128k          94.29%              712.764
---------------------------------------------------------------------------
From the table we can see the Win2003 guest got its maximum throughput in case 4. Also notice that
case 1 uses exactly the options mentioned in comment 5.
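
For reference, plugging the case 4 options into the templates above gives the full commands for the best result:
  sender:   NTttcps.exe -m 1,0,$IP_WinXP -a 2 -l 256k
  receiver: NTttcpr.exe -m 1,0,$IP_WinXP -a 6 -rb 256k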

During these runs, Host B (which ran the Win2003 guest) CPU usage was as follows:
"
top - 10:13:19 up 1 day,  1:07,  2 users,  load average: 0.67, 0.36, 0.27
Tasks: 226 total,   1 running, 225 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.5%us,  4.2%sy,  0.0%ni, 89.3%id,  0.0%wa,  0.1%hi,  1.0%si,  0.0%st
Mem:  32835872k total,  2899552k used, 29936320k free,   112996k buffers
Swap: 24809464k total,        0k used, 24809464k free,   492388k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 9519 vdsm      16   0 2272m 2.0g 3872 S 125.1  6.5  62:21.19 qemu-kvm          
 7665 ntp       15   0 19188 4888 3788 S  0.0  0.0   0:00.00 ntpd               
 7998 vdsm      10  -5  368m  12m 3004 S  0.3  0.0   5:37.21 vdsm               
 4047 root      RT   0 87620 3676 2800 S  0.0  0.0   0:00.22 multipathd         
24366 root      15   0 87972 3344 2620 S  0.0  0.0   0:00.07 sshd               
 8825 root       0 -15  4476 2432 1656 S  0.0  0.0   0:00.00 iscsid             
 7563 haldaemo  15   0 31196 4380 1640 S  0.0  0.0   0:00.64 hald               
 7994 vdsm      20  -5 82048 4988 1364 S  0.0  0.0   0:00.02 vdsm               
26117 root      15   0 10900 1472 1136 S  0.0  0.0   0:00.00 bash   
"
kvm_stat output at a 6 sec interval can be seen in the attachment of comment 13.

Comment 15 Keqin Hong 2010-06-11 14:02:30 UTC
Running the iperf client in the Win2003 guest and the iperf server on the WinXP host, both with a 256K TCP window size, I got a throughput of 752 Mbits/sec, which was lower than the maximum value (861.297) listed in comment 14. (I believe this is related to the different ways the Winsock APIs are used. Maybe I am wrong?)

The nuttcp benchmark used in the description (comment 0) does not make the best use of the Winsock API, and it also goes through the cygwin library. I think Yaniv was right in comment 3 that nuttcp is not efficient on Windows.

All in all, based on the NTttcp results, the performance of virtio networking on Windows is generally good. Could I close this as NOTABUG?

Comment 16 Keqin Hong 2010-06-11 14:03:57 UTC
FYI:
(http://www.myri.com/serve/cache/511.html#windows)
Benchmarking Network Performance on Windows?

The performance of socket applications under the Windows operating system is very sensitive to the underlying socket API.

A key aspect of getting good performance is using the Winsock2 API. Winsock2 introduces overlapped communication and allows multiple outstanding send or recv requests at a time. Sockets need to be created using WSASocket with the overlapped flag (WSA_FLAG_OVERLAPPED).

The network benchmarking program

    * ntttcp 

is a good example of a benchmark program which uses the Winsock2 API. NTTTCP is a closed-source benchmark available from Microsoft at

    * http://www.microsoft.com/whdc/device/network/TCP_tool.mspx 

and is based on the original ttcp benchmark. Performance results can vary and are dependent on CPU type and the Windows operating system version. Refer to Myri-10G 10-Gigabit Ethernet Performance Measurements web page for further details.

In contrast, some UNIX network benchmarking tools like

    * iperf,
    * netperf,
    * nuttcp,
    * ttcp,
    * NetIO 

and others do not use the Winsock2 API and may not perform well (and sometimes not even function correctly) on Windows.

Performance can be even worse when an additional intermediate library, such as cygwin.dll, is required to run the application.

You can check your source code for WSASend and WSARecv calls, which indicate use of the Winsock2 API.
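
For instance, a quick way to scan a benchmark's source tree for these calls (assuming the sources are checked out locally):
  grep -rnE 'WSASocket|WSASend|WSARecv' ./src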

Comment 17 Mark Wagner 2010-06-11 14:39:25 UTC
I think the underlying problem may be that your buffer size is the same as or smaller than your message size. That would imply that the buffer can hold at most one message.

Can you try making the buffer size much bigger (a factor of 4 or more) or using a much smaller message size, say 8K?

Either of those cases, with the large buffer size, should allow multiple messages to be handled more efficiently.

Comment 18 Keqin Hong 2010-06-12 03:14:50 UTC
(In reply to comment #17)
> I think the underlying problem may be that your buffer size is the same or
> smaller than your message size.  That would imply that the buffer can only hold
> at most one message. 
> 
> Can you try making the buffer size much bigger (factor of 4 or more) or using a
> much smaller message size, say 8K ?
Have you seen that these cases are already included in comment 14? Case 4 uses the same message size (-l 256k) and receiver buffer size (-rb 256k); case 5 uses a receiver buffer (-rb 2048k) 8 times the message size (-l 256k). Yet case 4 gives higher throughput than case 5, so bigger is not always better, IMHO.

> 
> Either of those cases with the large buffer size should allow for multiple
> messages to get handled more efficiently.

Comment 19 Yvugenfi@redhat.com 2010-06-12 12:45:10 UTC
In addition - registry setting for improved performance:


http://www.linux-kvm.org/page/WindowsGuestDrivers/kvmnet/registry

Comment 20 Keqin Hong 2010-06-21 09:04:52 UTC
Closed according to comments 14-16 and 19.

