Bug 508861
Summary: | kvm: add tap send buffer limit to help UDP networking | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Mark McLoughlin <markmc> | ||||
Component: | kvm | Assignee: | Mark McLoughlin <markmc> | ||||
Status: | CLOSED ERRATA | QA Contact: | Lawrence Lim <llim> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 5.4 | CC: | herbert.xu, jiabwang, lihuang, mwagner, sghosh, shuang, syeghiay, tburke, tools-bugs, virt-maint, ykaul | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | kvm-83-90.el5 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2009-09-02 09:27:45 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | 495863 | ||||||
Bug Blocks: | |||||||
Description
Mark McLoughlin
2009-06-30 09:54:15 UTC
Created attachment 349933
net-add-net-tap-sndbuf-with-a-sensible-default.patch
1. command used:
#netperf -H 10.66.70.31 -t UDP_STREAM -l 10 -- -m 2048

2. reproduce on kvm-83-83.el5, but not as serious as comment #0
129024 1024 10.00 162582 0 133.17
129024 10.00 162274 132.91
129024 2048 10.00 93171 0 152.63
129024 10.00 93158 152.61
129024 65507 10.00 5041 0 264.12
129024 10.00 5026 263.33

3. check on kvm-83-90.el5
129024 1024 10.00 231046 0 189.26
129024 10.00 231046 189.26
129024 2048 10.00 99867 0 163.60
129024 10.00 99867 163.60
129024 65507 10.00 5272 0 276.22
129024 10.00 5272 276.22

Can I set this issue to *VERIFIED*, according to the test result?

shuang: yep, that looks good - no packets were dropped and performance was improved

issue reproduce on kvm-83-94.el5, packets were dropped from 1473

start vm with virtio network interface

guest->host
for i in 32 64 128 256 512 1024 1278 1407 1472 1473 1475 2048 4096 8192 16834 32768; do netperf -t UDP_STREAM -f m -H 192.168.20.6 -P 0 -l 10 -- -m $i; done
129024 32 10.00 1514106 0 38.76
129024 10.00 1514106 38.76
129024 64 10.00 1536076 0 78.64
129024 10.00 1536076 78.64
129024 128 10.00 1361436 0 139.40
129024 10.00 1361436 139.40
129024 256 10.00 1359981 0 278.51
129024 10.00 1359981 278.51
129024 512 10.00 1304934 0 534.37
129024 10.00 1304934 534.37
129024 1024 10.00 992948 0 813.29
129024 10.00 992948 813.29
129024 1278 10.00 867703 0 887.02
129024 10.00 867703 887.02
129024 1407 10.00 816792 0 919.26
129024 10.00 816792 919.26
129024 1472 10.00 793871 0 934.75
129024 10.00 793871 934.75
129024 1473 10.00 941008 0 1108.76
129024 10.00 551144 649.40
129024 1475 10.00 877103 0 1034.87
129024 10.00 505451 596.37
129024 2048 10.00 789477 0 1293.39
129024 10.00 276077 452.29
129024 4096 10.00 595492 0 1951.04
129024 10.00 89496 293.22
129024 8192 10.00 310392 0 2033.95
129024 10.00 38422 251.77
129024 16834 10.00 158573 0 2135.31
129024 10.00 12290 165.49
129024 32768 10.00 85726 0 2246.81
129024 10.00 1121 29.38

kvm-83-90.el5:
129024 32 10.00 1718361 0 43.99
129024 10.00 1718361 43.99
129024 64 10.00 1414834 0 72.43
129024 10.00 1414834 72.43
129024 128 10.00 1675115 0 171.51
129024 10.00 1675115 171.51
129024 256 10.00 1250219 0 256.00
129024 10.00 1250183 255.99
129024 512 10.00 1196838 0 490.18
129024 10.00 1196838 490.18
129024 1024 10.00 464854 0 380.74
129024 10.00 464854 380.74
129024 1278 10.00 390095 0 398.76
129024 10.00 390095 398.76
129024 1407 10.00 365242 0 411.04
129024 10.00 365242 411.04
129024 1472 10.00 365555 0 430.45
129024 10.00 365555 430.45
129024 1473 10.00 300405 0 353.93
129024 10.00 300405 353.93
129024 1475 10.00 299979 0 353.91
129024 10.00 299979 353.91
129024 2048 10.00 244993 0 401.37
129024 10.00 244993 401.37
129024 4096 10.00 130086 0 426.19
129024 10.00 130086 426.19
129024 8192 10.00 112190 0 735.19
129024 10.00 112190 735.19
129024 16834 10.00 35714 0 480.88
129024 10.00 35714 480.88
129024 32768 10.00 20015 0 524.64
129024 10.00 20015 524.64

host->host:
129024 32 10.00 2836039 0 72.60
129024 10.00 2831649 72.49
129024 64 10.00 2825320 0 144.64
129024 10.00 2821296 144.44
129024 128 10.00 2099014 0 214.92
129024 10.00 2099014 214.92
129024 256 10.00 2065180 0 422.92
129024 10.00 2065180 422.92
129024 512 10.00 1618174 0 662.78
129024 10.00 1618174 662.78
129024 1024 10.00 993023 0 813.45
129024 10.00 993023 813.45
129024 1278 10.00 864798 0 884.16
129024 10.00 864798 884.16
129024 1407 10.00 813930 0 916.08
129024 10.00 813930 916.08
129024 1472 10.00 791482 0 931.97
129024 10.00 791482 931.97
129024 1473 10.00 645177 0 760.22
129024 10.00 645177 760.22
129024 1475 10.00 644466 0 760.43
129024 10.00 644466 760.43
129024 1056 10.00 969495 0 819.01
129024 10.00 969495 819.01
129024 2048 10.00 526325 0 862.31
129024 10.00 526325 862.31
129024 4096 10.00 277663 0 909.76
129024 10.00 277663 909.76
129024 8192 10.00 141111 0 924.70
129024 10.00 141111 924.70
129024 16834 10.00 68699 0 925.11
129024 10.00 68699 925.11
129024 32768 10.00 35372 0 927.20
129024 10.00 35372 927.20

Hi Mark, is it because of cancelling the tx
timer?

Since it is not a super blocker, I tend to postpone it to 5.5

(In reply to comment #11)
> issue reproduce on kvm-83-94.el5, packets were dropped from 1473
>
> start vm with virtio network interface
>
> guest->host

This bug is not about guest->host UDP packets being dropped, it is about guest->external UDP packets being dropped

With guest->host, the guest can send packets faster than the host can receive them and the host drops them. This is a known issue and the fix for this bug does not help it.

With guest->external, without the fix for this bug, you'll see the dropped packets accounted for in 'tc -s qdisc' output for the NIC whose txqueuelen we're exceeding

With guest->host, you'll see the dropped packets accounted for in the output of 'awk '/^Udp: / { print $4; }' /proc/net/snmp'. This is Udp/InErrors and means that we are exceeding the receiver's socket buffer (see net.core.rmem_default)

Please re-test guest->external and move back to VERIFIED if there hasn't been a regression since comment #9

And just to explain further why shuang's figures look like a regression, but they're not:

with kvm-83-90.el5 we see:

129024 32768 10.00 20015 0 524.64
129024 10.00 20015 524.64

i.e. the guest is only managing to send 524Mbit/s to the host

in kvm-83-94.el5 we removed the tx mitigation timer (bug #504647) allowing the guest to send much much faster:

129024 32768 10.00 85726 0 2246.81
129024 10.00 1121 29.38

except that because it's sending so fast now, the host is dropping heaps of packets

But again, the send buffer limit only helps guest->external, not guest->host

the result above on comment#11 is tested on guest->external.
and I test again:
1. stop iptables on external machine:
#service iptables stop
2. on external machine:
[root@dhcp-66-70-31 ~]# sysctl net.bridge.bridge-nf-call-iptables=0
net.bridge.bridge-nf-call-iptables = 0
[root@dhcp-66-70-31 ~]# sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 0
3. Start vm at another machine with virtio network interface and run:
#for i in 32 64 128 256 512 1024 1278 1407 1472 1473 1475 2048 4096 8192 16834 32768; do netperf -t UDP_STREAM -f m -H 192.168.20.6 -P 0 -l 10 -- -m $i; done
result:
129024 32 10.00 1884503 0 48.24
129024 10.00 1272852 32.58
129024 64 10.00 1810618 0 92.68
129024 10.00 959737 49.13
129024 128 10.00 1726797 0 176.81
129024 10.00 645245 66.07
129024 256 10.00 2394939 0 490.47
129024 10.00 388075 79.48
129024 512 10.00 2557121 0 1047.14
129024 10.00 217027 88.87
129024 1024 10.00 115435 0 94.56
129024 10.00 115435 94.56
129024 1278 10.00 93703 0 95.78
129024 10.00 93703 95.78
129024 1407 10.00 84475 0 95.07
129024 10.00 84475 95.07
129024 1472 10.00 81074 0 95.46
129024 10.00 81074 95.46
129024 1473 10.00 1388354 0 1635.83
129024 10.00 11133 13.12
udp_send: data send error: Message too long
129024 4096 10.00 911530 0 2986.44
129024 10.00 3170 10.39
129024 8192 10.00 477551 0 3129.31
129024 10.00 429 2.81

run
Start vm at another machine with virtio network interface and run:
#for i in 32 64 128 256 512 1024 1278 1407 1472 1473 1475 2048 4096 8192 16834 32768; do netperf -t UDP_STREAM -f m -H 10.66.70.31 -P 0 -l 10 -- -m $i; done

comment #11, comment #12 and this one are tested with crossover

kvm-83-94.el5
#for i in 32 64 128 256 512 1024 1278 1407 1472 1473 1475 2048 4096 8192 16834 32768; do netperf -t UDP_STREAM -f m -H 192.168.20.8 -P 0 -l 10 -- -m $i; done
129024 32 10.00 1570558 0 40.20
129024 10.00 1570558 40.20
129024 64 10.00 1615932 0 82.72
129024 10.00 1615932 82.72
129024 128 10.00 1405154 0 143.88
129024 10.00 1405154 143.88
129024 256 10.00 1325752 0 271.49
129024 10.00 1325752 271.49
129024 512 10.00 1355074 0 555.02
129024 10.00 1355074 555.02
129024 1024 10.00 993922 0 814.07
129024 10.00 993922 814.07
129024 1278 10.00 871561 0 890.93
129024 10.00 871561 890.93
129024 1407 10.00 821584 0 924.75
129024 10.00 821584 924.75
129024 1472 10.00 799208 0 941.09
129024 10.00 799208 941.09
129024 1473 10.00 926606 0 1091.74
129024 10.00 545836 643.11
129024 1056 10.00 970946 0 820.15
129024 10.00 970946 820.15
129024 2048 10.00 795668 0 1303.47
129024 10.00 272712 446.76
129024 4096 10.00 545874 0 1788.40
129024 10.00 83105 272.27
129024 8192 10.00 292311 0 1915.35
129024 10.00 35432 232.17
129024 16834 10.00 152800 0 2057.62
129024 10.00 11023 148.44
129024 32768 10.00 78703 0 2062.89
129024 10.00 686 17.98

(In reply to comment #16)
> 2. on external machine:
> [root@dhcp-66-70-31 ~]# sysctl net.bridge.bridge-nf-call-iptables=0
> net.bridge.bridge-nf-call-iptables = 0
> [root@dhcp-66-70-31 ~]# sysctl net.bridge.bridge-nf-call-iptables
> net.bridge.bridge-nf-call-iptables = 0

Please run these two sysctl commands on the host - i.e. the machine the VM is running on

Also do the following:

1) On the external machine run:
$> awk '/^Udp: / { print $4; }' /proc/net/snmp

2) On the host machine (i.e. the machine the vm is running on) run:
$> tc -s qdisc

3) Run e.g.
$> netperf -t UDP_STREAM -f m -H 192.168.20.8 -P 0 -l 10 -- -m 16834

4) Repeat (1) and (2)

stop iptables and run sysctl net.bridge.bridge-nf-call-iptables=0 on both host and external machine.

#awk '/^Udp: / { print $4; }' /proc/net/snmp
1. before transfer
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 8 297 0 305
2. after transfer
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 13306163 489 968 497

#tc -s qdisc
1. before transfer
qdisc pfifo_fast 0: dev eth0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 42216 bytes 345 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
2. after transfer
qdisc pfifo_fast 0: dev eth0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 11277062809 bytes 17277606 pkt (dropped 776750, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev tap0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 31530 bytes 254 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

129024 32 10.00 2210611 0 56.58
129024 10.00 2210250 56.58
129024 64 10.00 2371876 0 121.42
129024 10.00 2371753 121.42
129024 128 10.00 1808021 0 185.12
129024 10.00 1808021 185.12
129024 256 10.00 1770731 0 362.62
129024 10.00 1770731 362.62
129024 512 10.00 2444342 0 1001.07
129024 10.00 1667592 682.95
129024 1024 10.00 991550 0 812.12
129024 10.00 991550 812.12
129024 1472 10.00 790218 0 930.40
129024 10.00 790218 930.40
129024 1473 10.00 646686 0 761.94
129024 10.00 646686 761.94
129024 2048 10.00 526457 0 862.43
129024 10.00 526457 862.43
129024 4096 10.00 277568 0 909.44
129024 10.00 277568 909.44
129024 8192 10.00 141145 0 924.82
129024 10.00 141145 924.82
129024 16834 10.00 68698 0 924.99
129024 10.00 68698 924.99
129024 32768 10.00 35486 0 930.09
129024 10.00 35486 930.09

lihuang makes a good point - bridge-nf-call-iptables=0 needs to be the default for rhev-h. I'll file a new bug

(In reply to comment #20)
> 129024 512 10.00 2444342 0 1001.07
> 129024 10.00 1667592 682.95

this data point is strange; but all the other data points show the fix is working, I think we have enough to mark this as VERIFIED

setting to *VERIFIED* according to comment #20 and comment #21.

(In reply to comment #21)
> lihuang makes a good point - bridge-nf-call-iptables=0 needs to be the default
> for rhev-h. I'll file a new bug

Filed as bug #514905

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA.
For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1272.html
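
For anyone re-running the tests in this report, the sender/receiver pairs that netperf prints can be reduced to a loss percentage with a few lines of awk. This is only a minimal sketch: it assumes netperf is invoked with -P 0 and -t UDP_STREAM as in the comments above, so that each test prints a six-field sender line followed by a four-field receiver line. The function name netperf_loss is hypothetical and not part of netperf.

```shell
# netperf_loss (hypothetical helper): summarise "netperf -P 0 -t UDP_STREAM" output.
# Assumed input, two lines per test:
#   sender:   <socket size> <message size> <elapsed> <messages sent> <errors> <throughput>
#   receiver: <socket size> <elapsed> <messages received> <throughput>
# Any other line (for example "udp_send: data send error: ...") is ignored.
# Output, one line per test: <message size> <sent> <received> <loss percent>
netperf_loss() {
    awk '
        # Sender line: remember the message size and the number of messages sent.
        NF == 6 && $1 ~ /^[0-9]+$/ { size = $2; sent = $4; have = 1; next }
        # Receiver line: compare against the preceding sender line.
        NF == 4 && $1 ~ /^[0-9]+$/ && have {
            recv = $3
            loss = (sent > 0) ? (sent - recv) * 100 / sent : 0
            printf "%s %d %d %.1f\n", size, sent, recv, loss
            have = 0
        }
    '
}
```

Feeding it the 32768-byte guest->host pair quoted above for kvm-83-94.el5 (85726 messages sent, 1121 received), for instance with `printf '129024 32768 10.00 85726 0 2246.81\n129024 10.00 1121 29.38\n' | netperf_loss`, prints `32768 85726 1121 98.7`. The number only says how much was lost; whether the loss happened at the host NIC's tx queue or at the receiver's socket buffer still has to be read from 'tc -s qdisc' and /proc/net/snmp as described in the comments.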