Bug 1414627 - Rx batching support for tun
Summary: Rx batching support for tun
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Martin Kletzander
QA Contact: yalzhang@redhat.com
Docs Contact: Yehuda Zimmerman
URL:
Whiteboard:
Depends On: 1401433
Blocks: 1395265 1445257
 
Reported: 2017-01-19 03:47 UTC by jason wang
Modified: 2017-08-02 07:29 UTC (History)

Fixed In Version: libvirt-3.2.0-13.el7
Doc Type: Enhancement
Doc Text:
Added support for rx batching on tun/tap devices

With this release, rx batching for tun/tap devices is supported. This enables receiving bundled network frames, which can improve performance.
Clone Of:
: 1445257 (view as bug list)
Environment:
Last Closed: 2017-08-01 17:21:45 UTC
Target Upstream Version:



Links
System ID: Red Hat Product Errata RHEA-2017:1846
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2017-08-01 18:02:50 UTC

Description jason wang 2017-01-19 03:47:03 UTC
Description of problem:

We will backport rx batching for tun to 7.4. It batches several packets before submitting them to the kernel. It can be configured with ethtool -C $tap rx-frames N, where N is the maximum number of packets that tun may batch. The default value is 0 and the maximum value is 64.

Configurations that have a long xmit path may benefit from this because of improved cache utilization.
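
As a rough illustration of the configuration described above, the following Python sketch builds the ethtool command line for a tap device. The helper name build_rx_frames_cmd and the device name vnet0 are made up for this example; only the "ethtool -C <dev> rx-frames N" form comes from the description.

```python
from typing import List


def build_rx_frames_cmd(tap: str, frames: int) -> List[str]:
    """Return the argv for: ethtool -C <tap> rx-frames <frames>.

    Hypothetical helper for illustration; it only assembles the command,
    it does not run it.
    """
    if frames < 0:
        raise ValueError("rx-frames must be non-negative")
    return ["ethtool", "-C", tap, "rx-frames", str(frames)]


# Example: the command that would request batches of up to 16 packets.
print(" ".join(build_rx_frames_cmd("vnet0", 16)))
```

The argv list could be handed to subprocess.run() on a host where the tap device exists; that step is left out here because it needs root and a real device.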


Comment 8 Jaroslav Suchanek 2017-04-20 13:38:37 UTC
V2 posted upstream:
https://www.redhat.com/archives/libvir-list/2017-April/msg00939.html

Comment 9 Martin Kletzander 2017-04-21 11:44:58 UTC
Fixed upstream with commits v3.2.0-206-g652ef9bc8c72..v3.2.0-208-gfcef44728dff:

commit fcef44728dff9cb708d00d17f5e0b44aa513f27b
Author: Martin Kletzander <mkletzan>
Date:   2017-04-07 17:54:12 +0200

    Set coalesce settings for domain interfaces
    
commit 523c9960621eaf307ae8d4ae2735fb66f89d5634
Author: Martin Kletzander <mkletzan>
Date:   2017-04-07 17:46:32 +0200

    conf, docs: Add support for coalesce setting(s)
    
commit 652ef9bc8c72a7118436a8c95c5742ec6a6d12a3
Author: Martin Kletzander <mkletzan>
Date:   2017-04-07 17:38:06 +0200

    util: Add virNetDevSetCoalesce function

Comment 12 Martin Kletzander 2017-04-25 11:57:52 UTC
I would probably say "Added support for rx batching on tun/tap devices"
for the first line of the Doc Text, so I fixed that.  But other than that it looks good to me.  Thanks.

Comment 14 yalzhang@redhat.com 2017-06-06 11:38:41 UTC
I have tested it on a host with the packages below:

libvirt-3.2.0-7.el7.x86_64
kernel-3.10.0-671.el7.x86_64

To check that I understand the bug, I ran a performance test with pktgen using the steps below. Please check whether I got the point. Thank you in advance.

1. Edit 2 guests so that they have no coalesce settings, then start both guests.
  <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
    </interface>

2. on vm1 and vm2, load the pktgen module

# modprobe pktgen
# modinfo pktgen
filename:       /lib/modules/3.10.0-675.el7.x86_64/kernel/net/core/pktgen.ko.xz
version:        2.74
license:        GPL
description:    Packet Generator tool
author:         Robert Olsson <robert.olsson.se>
rhelversion:    7.4
srcversion:     BF9522B0DBF61EAB7EF44C7
depends:        
intree:         Y
vermagic:       3.10.0-675.el7.x86_64 SMP mod_unload modversions 
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        84:DD:2F:C3:34:C0:13:EA:D7:FA:0F:4E:EF:0B:BB:A3:A4:96:46:26
sig_hashalgo:   sha256
parm:           pg_count_d:Default number of packets to inject (int)
parm:           pg_delay_d:Default delay between packets (nanoseconds) (int)
parm:           pg_clone_skb_d:Default number of copies of the same packet (int)
parm:           debug:Enable debugging of pktgen module (int)

3. On vm1, run the test script and stop it after about 1 minute; at the same time, monitor the interfaces on the host.

# cat /proc/net/pktgen/eth0
Params: count 0  min_pkt_size: 60  max_pkt_size: 60
     frags: 0  delay: 0  clone_skb: 1000  ifname: eth0
     flows: 0 flowlen: 0
     queue_map_min: 0  queue_map_max: 0
     dst_min: 192.168.122.48  dst_max: 
        src_min:   src_max: 
     src_mac: 52:54:00:8e:31:1e dst_mac: 52:54:00:6e:f4:f1
     udp_src_min: 9  udp_src_max: 9  udp_dst_min: 9  udp_dst_max: 9
     src_mac_count: 0  dst_mac_count: 0
     Flags: 
Current:
     pkts-sofar: 58636535  errors: 0
     started: 4589110046us  stopped: 4796286358us idle: 77600us
     seq_num: 58636536  cur_dst_mac_offset: 0  cur_src_mac_offset: 0
     cur_saddr: 192.168.122.77  cur_daddr: 192.168.122.48
     cur_udp_dst: 9  cur_udp_src: 9
     cur_queue_map: 0
     flows: 0
Result: OK: 207176311(c207098711+d77600) usec, 58636535 (60byte,0frags)
  283027pps 135Mb/sec (135852960bps) errors: 0


# sar -n DEV 4
Linux 3.10.0-671.el7.x86_64 (server74) 	06/06/2017 	_x86_64_	(4 CPU)

06:52:50 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
06:52:55 PM   enp0s25      1.00      0.40      0.12      0.20      0.00      0.00      0.20
06:52:55 PM    virbr3      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:52:55 PM     vnet1      1.00 296917.80      0.09  17397.52      0.00      0.00      0.00
06:52:55 PM virbr3-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:52:55 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:52:55 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:52:55 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:52:55 PM     vnet0 298327.80      1.40  17480.14      0.11      0.00      0.00      0.00
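
The pps and bps figures in the pktgen Result line above can be reproduced from the packet count, the elapsed time and the packet size. The sketch below assumes pktgen rounds down with integer arithmetic; that assumption matches the numbers reported here but is not taken from the pktgen source, and the function name is made up for this example.

```python
def pktgen_rates(packets: int, elapsed_usec: int, pkt_size: int) -> tuple:
    """Recompute packets/s and bits/s from a pktgen Result line.

    Assumption: integer (floor) division, as the reported values suggest.
    """
    pps = packets * 1_000_000 // elapsed_usec
    bps = pps * pkt_size * 8
    return pps, bps


# Values from the run without coalesce settings:
# "OK: 207176311(...) usec, 58636535 (60byte,0frags)"
pps, bps = pktgen_rates(58636535, 207176311, 60)
print(pps, "pps", bps, "bps")
```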

4. Destroy the 2 guests, add the content below to the interface XML, then start the 2 guests.

     <coalesce>
        <rx>
          <frames max='128'/>
        </rx>
      </coalesce>

# ethtool -c vnet0 | grep rx-frames:
rx-frames: 64
# ethtool -c vnet1 | grep rx-frames:
rx-frames: 64

5. On vm1, run the test script, then check the result and monitor the interfaces on the host.

# cat /proc/net/pktgen/eth0
Params: count 0  min_pkt_size: 60  max_pkt_size: 60
     frags: 0  delay: 0  clone_skb: 1000  ifname: eth0
     flows: 0 flowlen: 0
     queue_map_min: 0  queue_map_max: 0
     dst_min: 192.168.122.48  dst_max: 
        src_min:   src_max: 
     src_mac: 52:54:00:8e:31:1e dst_mac: 52:54:00:6e:f4:f1
     udp_src_min: 9  udp_src_max: 9  udp_dst_min: 9  udp_dst_max: 9
     src_mac_count: 0  dst_mac_count: 0
     Flags: 
Current:
     pkts-sofar: 36929996  errors: 0
     started: 129442039us  stopped: 243491182us idle: 106549us
     seq_num: 36929997  cur_dst_mac_offset: 0  cur_src_mac_offset: 0
     cur_saddr: 192.168.122.77  cur_daddr: 192.168.122.48
     cur_udp_dst: 9  cur_udp_src: 9
     cur_queue_map: 0
     flows: 0
Result: OK: 114049143(c113942593+d106549) usec, 36929996 (60byte,0frags)
  323807pps 155Mb/sec (155427360bps) errors: 0

#  sar -n DEV 4
Linux 3.10.0-671.el7.x86_64 (server74) 	06/06/2017 	_x86_64_	(4 CPU)
07:04:02 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
07:04:06 PM   enp0s25      0.75      0.50      0.05      0.25      0.00      0.00      0.25
07:04:06 PM    virbr3      0.00      0.00      0.00      0.00      0.00      0.00      0.00
07:04:06 PM     vnet1      1.00 332754.50      0.09  19497.33      0.00      0.00      0.00
07:04:06 PM virbr3-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
07:04:06 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
07:04:06 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
07:04:06 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
07:04:06 PM     vnet0 333396.50      1.50  19534.95      0.11      0.00      0.00      0.00


After this test, I got the following results:

 rx-frames |  pkts/s
 ----------+-----------
         0 | 283027pps
         1 | 296906pps
         4 | 313498pps
        16 | 322192pps
        64 | 323807pps
       128 | 331654pps  -> the rx-frames of vnet0 and vnet1 is 64
       256 | 323908pps  -> the rx-frames of vnet0 and vnet1 is 64
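
To put the table in relative terms, the small Python sketch below computes the gain of each setting over the rx-frames 0 baseline. It only uses the pps values reported above; the rows for 128 and 256 are left out because the tap devices ended up with rx-frames 64 in those runs. Function and variable names are illustrative.

```python
BASELINE_PPS = 283027  # measured with rx-frames = 0

# rx-frames value -> measured packets per second (from the table above)
results = {1: 296906, 4: 313498, 16: 322192, 64: 323807}


def gain_percent(pps: int, baseline: int = BASELINE_PPS) -> float:
    """Relative throughput gain over the baseline, in percent."""
    return (pps - baseline) * 100.0 / baseline


for frames, pps in results.items():
    print(f"rx-frames={frames:>2}: {gain_percent(pps):+.1f}%")
```

These are single runs on one host, so the percentages are an indication rather than a benchmark.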

Once "frames max" is set to a value greater than 64 in the guest's XML, the rx-frames of the tap device stays at 64. Is this expected?

# virsh dumpxml rhel7 | grep /coalesce -B4
      <coalesce>
        <rx>
          <frames max='65'/>
        </rx>
      </coalesce>

# ethtool -c vnet0 | grep rx-frames:
rx-frames: 64
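
The "ethtool -c ... | grep rx-frames:" check used above can also be scripted. The following Python sketch extracts the rx-frames value from captured ethtool -c output; the function name and the sample text are illustrative assumptions, and the sample is not a capture from this host.

```python
import re
from typing import Optional


def parse_rx_frames(ethtool_output: str) -> Optional[int]:
    """Return the value of the 'rx-frames:' line, or None if it is absent.

    The anchored pattern avoids matching similarly named fields such as
    'rx-frames-irq:'.
    """
    match = re.search(r"^rx-frames:\s*(\d+)\s*$", ethtool_output, re.MULTILINE)
    return int(match.group(1)) if match else None


# Illustrative sample text only.
sample = "Coalesce parameters for vnet0:\nrx-usecs: 0\nrx-frames: 64\nrx-frames-irq: 0\n"
print(parse_rx_frames(sample))
```

Returning None keeps the "not supported" case distinguishable from a configured value of 0.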

Comment 15 Martin Kletzander 2017-06-06 14:21:39 UTC
(In reply to yalzhang from comment #14)
I can't reply for kernel, but from libvirt's POV it's expected because we set the number and then retrieve it back so that we report the right information in the XML (when you do dumpxml on a running domain).  So the limitation comes from the kernel.  The same happens if you set higher number using 'ethtool -C'.

Comment 16 jason wang 2017-06-07 02:44:26 UTC
(In reply to Martin Kletzander from comment #15)
> (In reply to yalzhang from comment #14)
> I can't reply for kernel, but from libvirt's POV it's expected because we
> set the number and then retrieve it back so that we report the right
> information in the XML (when you do dumpxml on a running domain).  So the
> limitation comes from the kernel.  The same happens if you set higher number
> using 'ethtool -C'.

Yes, we limit it to 64 to prevent bh (bottom halves) from being disabled for too long, so it's expected.

Thanks
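
The behaviour discussed in comments 14 to 16 (the requested value is set, then read back, and anything above 64 comes back as 64) can be pictured with a toy model. This is only a sketch under the assumptions stated in this bug: FakeTap and apply_and_report are invented names, and the code is neither the kernel nor the libvirt implementation.

```python
TUN_RX_BATCH_MAX = 64  # upper limit stated in this bug


class FakeTap:
    """Stand-in for a tap device that caps rx-frames like the tun driver does."""

    def __init__(self) -> None:
        self.rx_frames = 0  # default value per the description

    def set_rx_frames(self, requested: int) -> None:
        # Values above the limit are silently reduced to the limit.
        self.rx_frames = min(requested, TUN_RX_BATCH_MAX)

    def get_rx_frames(self) -> int:
        return self.rx_frames


def apply_and_report(tap: FakeTap, requested: int) -> int:
    """Set the value, then read it back so the effective value is reported."""
    tap.set_rx_frames(requested)
    return tap.get_rx_frames()


print(apply_and_report(FakeTap(), 65))
```

Reading the value back after setting it is what makes the live XML show 64 instead of the 65 that was requested.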

Comment 17 yalzhang@redhat.com 2017-06-07 08:47:38 UTC
Tested with the packages below:

libvirt-3.2.0-7.el7.x86_64
qemu-kvm-rhev-2.9.0-8.el7.x86_64
kernel-3.10.0-671.el7.x86_64

scenarios:
1. set coalesce in interface type='network'

Q1: For hostdev and macvtap networks, the coalesce settings exist in the guest XML but are meaningless. Does libvirt plan to check for and reject such settings?

2. set coalesce in interface type='bridge'  ---> PASS

3. hotplug an interface device with coalesce setting  ---> PASS

4. boundary test and illegal value test; the legal range is 0~4294967295  ---> PASS

5. update the coalesce setting on the fly by update-device  ---> FAIL

6. for model types other than virtio, such as rtl8139, e1000 and e1000e, setting coalesce is accepted but is meaningless.

Q2: Does libvirt plan to check for and reject such settings?

7. for virtio with driver name=qemu, setting coalesce is accepted but is meaningless, as it needs vhost as the backend.

Q3: Does libvirt plan to check for and reject such settings?
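
The legal range from scenario 4 (0~4294967295, i.e. an unsigned 32-bit value) can be expressed as a simple check. This is a sketch of the range test only; the function name is made up, and libvirt's own parser may differ in what input forms it accepts.

```python
import re

UINT32_MAX = 4294967295  # 2**32 - 1, the upper bound given in scenario 4


def is_legal_frames_max(text: str) -> bool:
    """True if text is a plain decimal number within 0..UINT32_MAX."""
    if re.fullmatch(r"[0-9]+", text) is None:
        return False
    return int(text) <= UINT32_MAX


for candidate in ("0", "64", "4294967295", "4294967296", "-1", "abc"):
    print(candidate, is_legal_frames_max(candidate))
```

Note that a value being legal in the XML does not mean it takes effect as written: values above 64 are still reduced to 64 on the tap device, as shown in comment 14.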



Scenario 1:

1. set coalesce in interface=network with macvtap network

# virsh net-dumpxml direct-macvtap
<network>
  <name>direct-macvtap</name>
  <uuid>06848478-25db-4edc-8a85-27a0778b3f50</uuid>
  <forward dev='enp0s25' mode='bridge'>
    <interface dev='enp0s25'/>
  </forward>
</network>

# virsh start rhel7.4
Domain rhel7.4 started

# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='direct'>
      <mac address='52:54:00:f1:e7:e1'/>
      <source network='direct-macvtap' dev='enp0s25' mode='bridge'/>
      <target dev='macvtap0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='64'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# ethtool -c macvtap0
Coalesce parameters for macvtap0:
Cannot get device coalesce settings: Operation not supported

Scenario 5:
1. start a guest with coalesce setting
# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='network'>
      <mac address='52:54:00:2c:24:22'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='64'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# cat interface1.xml
   <interface type='network'>
      <mac address='52:54:00:2c:24:22'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='32'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

2. Update the coalesce setting with update-device; the command reports success, but nothing changes.

# virsh update-device rhel7.4 interface1.xml
Device updated successfully

# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='network'>
      <mac address='52:54:00:2c:24:22'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='64'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# ethtool -c vnet0 | grep rx-frames:
rx-frames: 64

Comment 18 yalzhang@redhat.com 2017-06-08 01:31:09 UTC
one more scenario:

8. migration from rhel7.4 -> rhel7.4  ---> PASS
   migration from rhel7.4 -> rhel7.3  ---> PASS, the coalesce setting is dropped from the guest's XML after migrating to a libvirt version which does not support it.

Comment 19 Martin Kletzander 2017-06-12 21:37:06 UTC
A1: No, we did not reject it by decision and now we cannot reject this anymore since there were two releases already and that would break compatibility.

A2: Same here.  It would be logic-duplicating from somewhere else.  Does the live XML contain the data as well?  This could be the only bug to be fixed.

A3: We cannot change it when parsing, but we can check for it when starting the domain.  We can do that actually for the previous cases as well, but please, if that is an issue for you, create another bug for it (and right away assign it to me) so that we don't postpone this fixed one due to some error messages raised when starting a domain.  Feel free to mention all three issues there in that case.

Comment 20 yalzhang@redhat.com 2017-06-13 01:47:54 UTC
Hi Martin, thank you for your confirmation. I have filed Bug 1460862 for the above 3 scenarios.

For scenario 5 in comment #17, update-device takes no effect but returns success. I think this is an issue that needs to be fixed. Please check, thank you.

Comment 21 Martin Kletzander 2017-06-14 19:35:41 UTC
I will send an additional patch for that and I'll see if the additional checks can be put in as well.

Comment 22 Martin Kletzander 2017-06-16 11:17:20 UTC
Fixed upstream with:

commit 307a205e25ad7db7c895c42ab2e8f59f3839c058 (origin/master, origin/HEAD)
Author: Martin Kletzander <mkletzan>
Date:   2017-06-15 14:22:26 +0200

    qemu: Allow live-updates of coalesce settings

Comment 24 yalzhang@redhat.com 2017-06-21 05:43:31 UTC
Tested on libvirt-3.2.0-11.el7.x86_64; there are still some issues, see step 2.

1. 
# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='32'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# cat net.xml
   <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='64'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

2. try to update rx_frames from current 32 to 64, failed.

# virsh update-device rhel7.4 net.xml
Device updated successfully

# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='32'/> ==============> no changes
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# ethtool -c vnet0 | grep rx-frames:
rx-frames: 32  =================> no changes


3. try to update rx_frames from current 32 to unconfigured 0, succeed

# cat net0.xml
   <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# virsh update-device rhel7.4 net0.xml
Device updated successfully

# virsh dumpxml rhel7.4 | grep /interface -B7
    <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>  ======> no coalesce settings
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# ethtool -c vnet0 | grep rx-frames: 
rx-frames: 0

4. try to update the rx-frames from unconfigured 0 to 16, succeed

# cat net1.xml
   <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <coalesce>
        <rx>
          <frames max='16'/>
        </rx>
      </coalesce>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# virsh update-device rhel7.4  net1.xml
Device updated successfully

# virsh dumpxml rhel7.4 | grep /interface -B12
    <interface type='network'>
      <mac address='52:54:00:04:b2:d8'/>
      <source network='default' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <coalesce>
        <rx>
          <frames max='16'/>
        </rx>
      </coalesce>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# ethtool -c vnet0 | grep rx-frames:
rx-frames: 16
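
The checks in steps 2 to 4 compare the <frames max='...'/> value in the device XML before and after update-device. A minimal Python sketch of that comparison follows; it treats a missing <coalesce> element as 0, as in step 3, and uses invented function names. It is not how libvirt implements the live update.

```python
import xml.etree.ElementTree as ET


def rx_frames_max(interface_xml: str) -> int:
    """Return coalesce/rx/frames@max from an <interface> element, 0 if unset."""
    root = ET.fromstring(interface_xml)
    frames = root.find("./coalesce/rx/frames")
    if frames is None:
        return 0
    return int(frames.get("max", "0"))


def coalesce_changed(old_xml: str, new_xml: str) -> bool:
    """True if the requested rx frames value differs between two definitions."""
    return rx_frames_max(old_xml) != rx_frames_max(new_xml)


with_16 = (
    "<interface type='network'>"
    "<coalesce><rx><frames max='16'/></rx></coalesce>"
    "<model type='virtio'/>"
    "</interface>"
)
without = "<interface type='network'><model type='virtio'/></interface>"

print(rx_frames_max(with_16), rx_frames_max(without), coalesce_changed(without, with_16))
```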

Comment 25 yalzhang@redhat.com 2017-06-21 05:48:00 UTC
Would you like to fix it in the coming build for rhel7.4, or should I just verify this bug and track the small issue in a new bug? Sorry for the late reply, I should have tested the patch a few days ago...

Comment 26 Martin Kletzander 2017-06-21 07:46:52 UTC
No problem, thanks for that, that's definitely a bug and my fault.  Yet another fix is this one:

commit ff7bae6e4fb74a52239d53af3672900c69801508
Author: Martin Kletzander <mkletzan>
Date:   2017-06-21 09:00:58 +0200

    qemu: Change coalesce settings on hotplug when they are different

Comment 29 yalzhang@redhat.com 2017-06-22 01:29:21 UTC
Tested on libvirt-3.2.0-14.el7.x86_64 with the scenarios in comment 24 and comment 17; the results are as expected, so I am verifying this bug.

Comment 30 errata-xmlrpc 2017-08-01 17:21:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
