Bug 1352741 - tx array support in tun
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 7.4
Assigned To: Wei
QA Contact: xiywang
Docs Contact: Jiri Herrmann
Keywords: FutureFeature
Depends On:
Blocks: 1395265 1401433 1414006
 
Reported: 2016-07-04 21:20 EDT by jason wang
Modified: 2017-08-01 20:39 EDT
CC List: 15 users

See Also:
Fixed In Version: kernel-3.10.0-656.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 16:15:05 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: none


External Trackers
Tracker ID:   Red Hat Product Errata RHSA-2017:1842
Priority:     normal
Status:       SHIPPED_LIVE
Summary:      Important: kernel security, bug fix, and enhancement update
Last Updated: 2017-08-01 14:22:09 EDT

Description jason wang 2016-07-04 21:20:22 EDT
Description of problem:

We need tx array support in tun to accelerate rx pps in the guest:

commit 1576d98605998fb59d121a39581129e134217182
Author: Jason Wang <jasowang@redhat.com>
Date:   Thu Jun 30 14:45:36 2016 +0800

    tun: switch to use skb array for tx
    
    We used to queue tx packets in sk_receive_queue, this is less
    efficient since it requires spinlocks to synchronize between producer
    and consumer.
    
    This patch tries to address this by:
    
    - switch from sk_receive_queue to a skb_array, and resize it when
      tx_queue_len was changed.
    - introduce a new proto_ops peek_len which was used for peeking the
      skb length.
    - implement a tun version of peek_len for vhost_net to use and convert
      vhost_net to use peek_len if possible.
    
    Pktgen test shows about 15.3% improvement on guest receiving pps for small
    buffers:
    
    Before: ~1300000pps
    After : ~1500000pps
    
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
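
For context: the new ring is sized from the tun device's tx_queue_len and, per the commit message above, is resized whenever that value changes. Assuming a tap device named tap0 (an example name only), the resize path can be exercised from userspace with plain iproute2:

    # tap0 is a placeholder device name; any txqueuelen change on a tun/tap
    # device goes through the skb_array resize path described above
    ip link set dev tap0 txqueuelen 500
    ip link show dev tap0 | grep qlen   # confirm the new qlen took effect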


Comment 2 jason wang 2016-07-07 22:48:01 EDT
Note for QE:

- Since this touches tun, it would be better to test something like a VPN to make sure it does not break anything.

Thanks
Comment 4 jason wang 2016-08-02 21:56:18 EDT
Move back to ASSIGNED.
Comment 6 Wei 2017-01-04 08:30:11 EST
The percentage improvement for RHEL 7 is the same as Jason mentioned in the
upstream commit log.

There is a pps gap between the Beaker machine, Jason's data, and my development machine (about 5x slower).
Beaker: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache
                128G DDR4 2133MHz
        Before:   ~119283 pps
        After:    ~140998 pps
        Upstream: ~221630 pps

Mine:   4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache
                16G DDR4 2133
        Upstream: ~150000 pps

This may be caused by hardware platform differences.
Comment 7 jason wang 2017-01-04 08:40:13 EST
(In reply to Wei from comment #6)
> The percentage improvement for RHEL 7 is the same as Jason mentioned in the
> upstream commit log.
> 
> There is a pps gap between the Beaker machine, Jason's data, and my
> development machine (about 5x slower).
> Beaker: 48 Cores E5-4650 v3 @ 2.10GHz / 30M L3 Cache
>                 128G DDR4 2133MHz
>         Before:   ~119283 pps
>         After:    ~140998 pps
>         Upstream: ~221630 pps
> 
> Mine:   4 Cores i5-6500 CPU @ 3.20GHz / 6M L3 Cache
>                 16G DDR4 2133
>         Upstream: ~150000 pps
> 
> This may be caused by hardware platform differences.

What's your networking configuration and qemu command line?
Comment 8 Wei 2017-01-04 08:58:38 EST
I'm sending packets (pktgen) from the local host to the tap interface directly. The guest is running l2fwd with the uio driver. All thread (vhost, guest vCPU) bindings are correct.

My qemu command line:
./x86_64-softmmu/qemu-system-x86_64 /vm-tmp/uio-fedora-22-guest-DMAR-tmpfs.qcow2 \
-netdev tap,id=hn1,script=/etc/qemu-ifup-wei,vhost=on \
-device virtio-net-pci,netdev=hn1,mac=52:54:00:11:22:10 \
-netdev tap,id=hn2,script=/etc/qemu-ifup-private1,vhost=on \
-device virtio-net-pci,netdev=hn2,mac=52:54:00:11:22:12 \
-netdev tap,id=hn3,script=/etc/qemu-ifup-private2,vhost=on \
-device virtio-net-pci,netdev=hn3,mac=52:54:00:11:22:13 \
-enable-kvm -vnc 0.0.0.0:2 -smp 3 -m 10G \
-cpu qemu64,+ssse3,+sse4.1,+sse4.2 -serial stdio \
-machine q35
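
(pktgen-thread1.sh itself is not attached; assuming it drives the standard in-kernel pktgen via /proc/net/pktgen, the injection step is roughly the sketch below, with the device, destination IP, and MAC matching the run logged in the next comment:)

    # rough pktgen sketch -- assumed equivalent of pktgen-thread1.sh, not the original script
    modprobe pktgen
    echo "rem_device_all"            > /proc/net/pktgen/kpktgend_0
    echo "add_device tap1"           > /proc/net/pktgen/kpktgend_0
    echo "count 100000000"           > /proc/net/pktgen/tap1
    echo "clone_skb 0"               > /proc/net/pktgen/tap1
    echo "pkt_size 60"               > /proc/net/pktgen/tap1
    echo "delay 0"                   > /proc/net/pktgen/tap1
    echo "dst 192.169.1.102"         > /proc/net/pktgen/tap1
    echo "dst_mac 52:54:00:11:22:12" > /proc/net/pktgen/tap1
    echo "start"                     > /proc/net/pktgen/pgctrl   # blocks until all packets are sent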
Comment 9 Wei 2017-01-04 09:09:13 EST
pktgen log & tap statistics:
[root@hp-bl660cgen9-01 home]# ./pktgen-thread1.sh -i tap1 -d 192.169.1.102 -m 52:54:00:11:22:12
Running... ctrl^C to stop
Done
Result device: tap1
Params: count 100000000  min_pkt_size: 60  max_pkt_size: 60
Result: OK: 173386108(c173373857+d12251) usec, 100000000 (60byte,0frags)
  576747pps 276Mb/sec (276838560bps) errors: 0

tap1(Rx):
[root@hp-bl660cgen9-01 home]# ./05-calc-pps.sh tap1
tap1 TX  141440 pkts/s TX Dropped: 386656 pkts/s
tap1 RX  0 pkts/s RX Dropped: 0 pkts/s

tap2(Tx):
[root@hp-bl660cgen9-01 home]# ./05-calc-pps.sh tap2
tap2 TX  0 pkts/s TX Dropped: 0 pkts/s
tap2 RX  140998 pkts/s RX Dropped: 0 pkts/s
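
(05-calc-pps.sh is likewise not attached; a minimal equivalent that samples the standard /sys/class/net statistics counters over one second would look like this:)

    #!/bin/bash
    # usage: ./05-calc-pps.sh <device>  -- assumed equivalent, not the original script
    DEV=$1
    S=/sys/class/net/$DEV/statistics
    tx1=$(cat $S/tx_packets); txd1=$(cat $S/tx_dropped)
    rx1=$(cat $S/rx_packets); rxd1=$(cat $S/rx_dropped)
    sleep 1
    tx2=$(cat $S/tx_packets); txd2=$(cat $S/tx_dropped)
    rx2=$(cat $S/rx_packets); rxd2=$(cat $S/rx_dropped)
    echo "$DEV TX  $((tx2-tx1)) pkts/s TX Dropped: $((txd2-txd1)) pkts/s"
    echo "$DEV RX  $((rx2-rx1)) pkts/s RX Dropped: $((rxd2-rxd1)) pkts/s"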
Comment 10 jason wang 2017-01-04 22:05:24 EST
Wei, can you try not using l2fwd in the guest (just let the kernel drop the packets in the guest) and post the result here? That's what I tested for the tx array.

Thanks
Comment 11 Wei 2017-01-05 12:32:02 EST
I ran benchmarks on different platforms and got different performance numbers, all with upstream code.

Beaker Server1: 48 Cores E5-4650 v3 @ 2.10GHz /  30M L3 Cache   ~250k pps
T450s laptop:   4  Cores i7-5600U CPU @ 2.60GHz/ 4M  L3 Cache   ~500k pps
Desktop:        4  Cores i5-6500 CPU @ 3.20GHz / 6M  L3 Cache   ~1.5m pps
Comment 12 Wei 2017-01-10 12:52:38 EST
The performance gap is caused by the DMA debug options in the kernel config I generated on my desktop. I did a new round of tests and also got another server in Beaker to try with the RHEL config. Here are the updated numbers with the upstream kernel on both host and guest.

Beaker Server1: 16 Cores E5-5530      @ 2.4GHz  / 8M  L3 Cache   ~1.2M pps
Beaker Server2: 48 Cores E5-4650 v3   @ 2.10GHz / 30M L3 Cache   ~1.4M pps
T450s laptop:   4  Cores i7-5600U CPU @ 2.60GHz / 4M  L3 Cache   ~1.5M pps
Desktop:        4  Cores i5-6500 CPU  @ 3.20GHz / 6M  L3 Cache   ~2M pps
Comment 13 Wei 2017-01-23 08:02:54 EST
RHEL7.4 performance data:

Test environment:
Beaker Server1: 16 Cores E5-5530      @ 2.4GHz  / 8M L3 Cache
Guest kernel: 4.9 upstream
Running DPDK in uio mode in the guest.
Sending packets to the tap device directly with pktgen on the host.

pps:
    before: ~0.97 Mpps
    after:  ~1.16 Mpps
Comment 14 Rafael Aquini 2017-04-20 14:06:15 EDT
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing
Comment 16 Rafael Aquini 2017-04-21 08:34:19 EDT
Patch(es) available on kernel-3.10.0-656.el7
Comment 19 Wei 2017-04-24 11:49:57 EDT
Hi Jiri,
It is fine to keep this out of the release notes because this BZ is a performance improvement rather than a new feature.
Comment 20 xiywang 2017-05-09 03:50:18 EDT
Functional test, results as below.
host & guest: 3.10.0-663.el7.x86_64
qemu-kvm-rhev-2.9.0-2.el7.x86_64

1. boot up a guest
/usr/libexec/qemu-kvm -name rhel7.4 -cpu IvyBridge -m 4096 -realtime mlock=off -smp 4 \
-drive file=/home/rhel7.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,snapshot=off -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,queues=2 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a1:d0:5f,vectors=6,mq=on,host_mtu=9000 \
-monitor stdio -device qxl-vga,id=video0 -serial unix:/tmp/console,server,nowait -vnc :1 -spice port=5900,disable-ticketing

2. install pkcs11-helper and openvpn in guest from brewweb

3. install redhat-internal-cert and redhat-internal-openvpn-profiles in guest from https://redhat.service-now.com/rh_ess/kb_view.do?sysparm_article=KB0005424

4. run 'openvpn --config /etc/openvpn/ovpn-bne-udp.conf' in the guest (since the pek2 VPN server could not be connected)

5. check tun in guest
# ifconfig
redhat0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>  mtu 1360
        inet 10.64.54.50  netmask 255.255.254.0  destination 10.64.54.50
        inet6 fe80::ba22:5534:6b03:e397  prefixlen 64  scopeid 0x20<link>
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 100  (UNSPEC)
        RX packets 112  bytes 14230 (13.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 106  bytes 49263 (48.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

6. ping redhat internal website from guest
# ping mail.corp.redhat.com -c 3 -I redhat0
PING mail.corp.redhat.com (10.4.203.66) from 10.64.54.22 redhat0: 56(84) bytes of data.
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=1 ttl=247 time=352 ms
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=2 ttl=247 time=352 ms
64 bytes from mail.corp.redhat.com (10.4.203.66): icmp_seq=3 ttl=247 time=353 ms

--- mail.corp.redhat.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 352.395/352.637/353.078/0.312 ms

7. ping redhat internal host from guest
# ping 10.64.54.20 -c 3 -I redhat0
PING 10.64.54.20 (10.64.54.20) from 10.64.54.22 redhat0: 56(84) bytes of data.
64 bytes from 10.64.54.20: icmp_seq=1 ttl=63 time=363 ms
From 10.64.54.1 icmp_seq=2 Redirect Host(New nexthop: 10.64.54.20)
From 10.64.54.1: icmp_seq=2 Redirect Host(New nexthop: 10.64.54.20)
64 bytes from 10.64.54.20: icmp_seq=2 ttl=63 time=361 ms

--- 10.64.54.20 ping statistics ---
2 packets transmitted, 2 received, +1 errors, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 361.664/362.629/363.594/0.965 ms

No error messages appear in the host or guest dmesg.
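
(A quick way to watch for drops on the tun interface while the VPN traffic is running, using only standard counters; redhat0 is the tun device created by openvpn above:)

    ip -s link show dev redhat0               # per-interface packet and drop counters
    cat /sys/class/net/redhat0/tx_queue_len   # current queue length (100 in the ifconfig output above)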
Comment 21 xiywang 2017-05-09 03:55:20 EDT
Hi Jason,

According to https://bugzilla.redhat.com/show_bug.cgi?id=1352741#c20 
see step 7

I used openvpn with bne-udp.conf in both the guest and the host, since pek2 could not be connected to currently.
But when I ping the host from the guest over the tun device, I get some 'Redirect Host...' error messages.
I'm not sure whether this is caused by using the bne openvpn profile or by this BZ.
Could you help take a look?

Thanks,
Xiyue
Comment 22 jason wang 2017-05-09 04:57:15 EDT
(In reply to xiywang from comment #21)
> Hi Jason,
> 
> According to https://bugzilla.redhat.com/show_bug.cgi?id=1352741#c20 
> see step 7
> 
> I used openvpn with bne-udp.conf in both the guest and the host, since pek2
> could not be connected to currently.
> But when I ping the host from the guest over the tun device, I get some
> 'Redirect Host...' error messages.
> I'm not sure whether this is caused by using the bne openvpn profile or by this BZ.
> Could you help take a look?
> 
> Thanks,
> Xiyue

This does not look like a problem caused by this bug. To be safe, do you see this on 655?

Thanks
Comment 23 xiywang 2017-05-10 22:06:00 EDT
Tested on 3.10.0-655.el7.x86_64; same behavior.
So it should not be an issue related to this bug.

# ping 10.64.242.69
PING 10.64.242.69 (10.64.242.69) 56(84) bytes of data.
64 bytes from 10.64.242.69: icmp_seq=1 ttl=63 time=188 ms
From 10.64.242.1 icmp_seq=2 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=2 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=2 ttl=63 time=191 ms
From 10.64.242.1 icmp_seq=3 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=3 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=3 ttl=63 time=187 ms
From 10.64.242.1 icmp_seq=4 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=4 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=4 ttl=63 time=189 ms
From 10.64.242.1 icmp_seq=5 Redirect Host(New nexthop: 10.64.242.69)
From 10.64.242.1: icmp_seq=5 Redirect Host(New nexthop: 10.64.242.69)
64 bytes from 10.64.242.69: icmp_seq=5 ttl=63 time=188 ms
^C
--- 10.64.242.69 ping statistics ---
5 packets transmitted, 5 received, +4 errors, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 187.904/188.865/191.128/1.317 ms
Comment 24 xiywang 2017-05-22 22:30:04 EDT
Hi Wenli,

Could you help to do the performance test?

Thanks,
Xiyue
Comment 25 Quan Wenli 2017-05-23 23:19:59 EDT
(In reply to xiywang from comment #24)
> Hi Wenli,
> 
> Could you help to do the performance test?
> 
> Thanks,
> Xiyue
The tx performance in tun has indeed improved with kernel-3.10.0-656.

Steps:
1. Run pktgen on the tap device on the host.
2. Gather the pps result in the guest.

kernel         pkts/s
------------+---------------
3.10.0-655      977662 
------------+---------------
3.10.0-656     1041984
------------+---------------
Comment 26 xiywang 2017-05-24 02:05:29 EDT
Verified at both the functional and performance levels. Set to VERIFIED.
Comment 29 errata-xmlrpc 2017-08-01 16:15:05 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842
Comment 30 errata-xmlrpc 2017-08-01 20:39:46 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842
