Description of problem:
We need tx array support in tun for accelerating rx pps in guest:
Author: Jason Wang <firstname.lastname@example.org>
Date: Thu Jun 30 14:45:36 2016 +0800
tun: switch to use skb array for tx
We used to queue tx packets in sk_receive_queue; this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.
This patch tries to address this by:
- switch from sk_receive_queue to a skb_array, and resize it when
tx_queue_len was changed.
- introduce a new proto_ops peek_len which was used for peeking the
skb length.
- implement a tun version of peek_len for vhost_net to use and convert
vhost_net to use peek_len if possible.
Pktgen test shows about 15.3% improvement on guest receiving pps for small packets:
Before: ~1300000pps
After : ~1500000pps
Signed-off-by: Jason Wang <email@example.com>
Signed-off-by: David S. Miller <firstname.lastname@example.org>
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Note for QE:
- Since this touches tun, it would be better to test something like VPN to make sure it does not break anything.
Move back to ASSIGNED.
The percentage of improvement for rhel7 is the same as Jason mentioned in
upstream commit log.
There is a pps gap between the beaker machine, Jason's data, and my development machine (5x slower).
Beaker: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache
128G DDR4 2133MHz
Before: ~119283 pps
After: ~140998 pps
Upstream: ~221630 pps
Mine: 4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache
16G DDR4 2133
Upstream: ~150000 pps
This may be caused by hardware platform differences.
(In reply to Wei from comment #6)
> The percentage of improvement for rhel7 is the same as Jason mentioned in
> upstream commit log.
> Get pps gap between beaker machine and Jason's data and my developing
> machine(5x slower).
> Beaker: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache
> 128G DDR4 2133MHz
> Before: ~119283 pps
> After: ~140998 pps
> Upstream: ~221630 pps
> Mine: 4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache
> 16G DDR4 2133
> Upstream: ~150000 pps
> This maybe caused by hardware platform difference.
What's your networking configuration and qemu command line?
I'm sending packets (pktgen) from the local host to the tap interface directly. The guest is running l2fwd with the uio driver. All thread (vhost, guest vcpu) bindings are correct.
My qemu command line:
./x86_64-softmmu/qemu-system-x86_64 /vm-tmp/uio-fedora-22-guest-DMAR-tmpfs.qcow2 -netdev tap,id=hn1,script=/etc/qemu-ifup-wei,vhost=on \
-enable-kvm -vnc 0.0.0.0:2 -smp 3 -m 10G \
-cpu qemu64,+ssse3,+sse4.1,+sse4.2 -serial stdio
pktgen log & tap statistics:
[root@hp-bl660cgen9-01 home]# ./pktgen-thread1.sh -i tap1 -d 22.214.171.124 -m 52:54:00:11:22:12
Running... ctrl^C to stop
Result device: tap1
Params: count 100000000 min_pkt_size: 60 max_pkt_size: 60
Result: OK: 173386108(c173373857+d12251) usec, 100000000 (60byte,0frags)
576747pps 276Mb/sec (276838560bps) errors: 0
[root@hp-bl660cgen9-01 home]# ./05-calc-pps.sh tap1
tap1 TX 141440 pkts/s TX Dropped: 386656 pkts/s
tap1 RX 0 pkts/s RX Dropped: 0 pkts/s
[root@hp-bl660cgen9-01 home]# ./05-calc-pps.sh tap2
tap2 TX 0 pkts/s TX Dropped: 0 pkts/s
tap2 RX 140998 pkts/s RX Dropped: 0 pkts/s
Wei, can you try not using l2fwd in the guest (just let the kernel drop the packets in the guest) and post the result here? That's what I test for the tx array.
I ran benchmarks on different platforms and got different performance, all with upstream code.
Beaker Server1: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache ~250k pps
T450s laptop: 4 Cores i7-5600U CPU @ 2.60GHz/ 4M L3 Cache ~500k pps
Desktop: 4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache ~1.5m pps
The performance gap was caused by the debug DMA options in the kernel config, which I had generated from my desktop. I did a new round of tests and also got another server in beaker to try with the RHEL config. Here are the updated numbers, with an upstream kernel for both host and guest.
Beaker Server1: 16 Cores E5-5530 @ 2.4GHz / 8M L3 Cache ~1.2M pps
Beaker Server2: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache ~1.4M pps
T450s laptop: 4 Cores i7-5600U CPU @ 2.60GHz / 4M L3 Cache ~1.5M pps
Desktop: 4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache ~2M pps
RHEL7.4 performance data:
Beaker Server1: 16 Cores E5-5530 @ 2.4GHz / 8M L3 Cache
Guest kernel: 4.9 upstream
Running dpdk in uio mode in the guest.
Sending packets to the tap device directly with pktgen on the host.
before: ~0.97 Mpps
after: ~1.16 Mpps
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing
Patch(es) available on kernel-3.10.0-656.el7
It is fine to keep this out of the release notes, because this bz is a performance improvement rather than a new feature.
Functional test; results as below.
1. boot up a guest
/usr/libexec/qemu-kvm -name rhel7.4 -cpu IvyBridge -m 4096 -realtime mlock=off -smp 4 \
-drive file=/home/rhel7.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,snapshot=off -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,queues=2 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a1:d0:5f,vectors=6,mq=on,host_mtu=9000 \
-monitor stdio -device qxl-vga,id=video0 -serial unix:/tmp/console,server,nowait -vnc :1 -spice port=5900,disable-ticketing
2. install pkcs11-helper and openvpn in guest from brewweb
3. install redhat-internal-cert and redhat-internal-openvpn-profiles in guest from https://redhat.service-now.com/rh_ess/kb_view.do?sysparm_article=KB0005424
4. run 'openvpn --config /etc/openvpn/ovpn-bne-udp.conf' in the guest (since the pek2 vpn server could not be connected)
5. check tun in guest
redhat0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1360
inet 10.64.54.50 netmask 255.255.254.0 destination 10.64.54.50
inet6 fe80::ba22:5534:6b03:e397 prefixlen 64 scopeid 0x20<link>
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 100 (UNSPEC)
RX packets 112 bytes 14230 (13.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 106 bytes 49263 (48.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
6. ping redhat internal website from guest
# ping mail.corp.redhat.com -c 3 -I redhat0
PING mail.corp.redhat.com (10.4.203.66) from 10.64.54.22 redhat0: 56(84) bytes of data.
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=1 ttl=247 time=352 ms
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=2 ttl=247 time=352 ms
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=3 ttl=247 time=353 ms
--- mail.corp.redhat.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 352.395/352.637/353.078/0.312 ms
7. ping redhat internal host from guest
# ping 10.64.54.20 -c 3 -I redhat0
PING 10.64.54.20 (10.64.54.20) from 10.64.54.22 redhat0: 56(84) bytes of data.
64 bytes from 10.64.54.20: icmp_seq=1 ttl=63 time=363 ms
From 10.64.54.1 icmp_seq=2 Redirect Host(New nexthop: 10.64.54.20)
From 10.64.54.1: icmp_seq=2 Redirect Host(New nexthop: 10.64.54.20)
64 bytes from 10.64.54.20: icmp_seq=2 ttl=63 time=361 ms
--- 10.64.54.20 ping statistics ---
2 packets transmitted, 2 received, +1 errors, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 361.664/362.629/363.594/0.965 ms
And there is no error message in host or guest dmesg.
According to https://bugzilla.redhat.com/show_bug.cgi?id=1352741#c20
see step 7
I used openvpn with bne-udp.conf in both guest and host, since pek2 could not be connected currently.
But while I ping the host from the guest using tun, I got some error messages of 'Redirect Host...'.
I'm not sure whether it's caused by the fact that I used the bne openvpn or by this bz.
Could you help to take a look?
(In reply to xiywang from comment #21)
> Hi Jason,
> According to https://bugzilla.redhat.com/show_bug.cgi?id=1352741#c20
> see step 7
> I used openvpn with bne-udp.conf in both guest and host, since pek2 could
> not be connected currently.
> But while I ping host from guest using tun, I got some err msg of 'Redirect
> Host...'.
> I'm not sure it's caused by the fact that I used bne openvpn or by this bz.
> Could you help to take a look?
This does not look like a problem of this bug. To be safe, do you see this on -655 as well?
Tested on 3.10.0-655.el7.x86_64, same behavior.
So it should not be an issue related to this bug.
# ping 10.64.242.69
PING 10.64.242.69 (10.64.242.69) 56(84) bytes of data.
64 bytes from 10.64.242.69: icmp_seq=1 ttl=63 time=188 ms
From 10.64.242.1 icmp_seq=2 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=2 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=2 ttl=63 time=191 ms
From 10.64.242.1 icmp_seq=3 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=3 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=3 ttl=63 time=187 ms
From 10.64.242.1 icmp_seq=4 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=4 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=4 ttl=63 time=189 ms
From 10.64.242.1 icmp_seq=5 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=5 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=5 ttl=63 time=188 ms
--- 10.64.242.69 ping statistics ---
5 packets transmitted, 5 received, +4 errors, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 187.904/188.865/191.128/1.317 ms
Could you help to do a performance test?
(In reply to xiywang from comment #24)
> Hi Wenli,
> Could you help to do performance test?
The tx performance in tun indeed improves with kernel-3.10.0-656.
1. Run pktgen on the tap device on the host.
2. Gather the pps result in the guest.
Verified at both the functional level and the performance level. Set to Verified.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.